Projects: Difference between revisions
Revision as of 12:38, 28 September 2017
ClioWire platform
On the basis of existing code, develop an API for the other groups (posting pulses, searching pulses). Develop bots to rewrite pulses based on other sources.
The initial code base may be chosen from among the following options: Social, Mastodon, etc.
Knowledge required: Python, PHP, JavaScript, MySQL
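As a rough illustration of the two operations the API would expose to the other groups, here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration: the names `Pulse`, `PulseStore`, `post` and `search` are hypothetical, and a real implementation would sit on top of whichever code base is chosen and its database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pulse:
    """A single pulse (hypothetical structure: an author and a short text)."""
    author: str
    text: str


@dataclass
class PulseStore:
    """Hypothetical in-memory stand-in for the platform's storage layer."""
    pulses: List[Pulse] = field(default_factory=list)

    def post(self, author: str, text: str) -> Pulse:
        """Store a new pulse and return it."""
        pulse = Pulse(author, text)
        self.pulses.append(pulse)
        return pulse

    def search(self, query: str) -> List[Pulse]:
        """Return all pulses whose text contains the query, ignoring case."""
        needle = query.lower()
        return [p for p in self.pulses if needle in p.text.lower()]
```

A bot or another group's script would then call `store.post("newspaper-bot", "...")` to publish and `store.search("Venice")` to retrieve matching pulses.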
Decomposition in elementary units of the secondary sources
Decomposition in elementary units of cadasters
Decomposition in elementary units of image banks
Newspaper mining
The goal is to find all the sentences in a large newspaper archive that contain at least two named entities. These sentences should be posted as pulses.
The named entity detection has already been done. The only challenge is to retrieve the corresponding sentences in the digitized transcriptions.
Resp. Maud
Skills: Python or Java
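One possible way to approach the retrieval step is sketched below in Python. It assumes, purely for illustration, that the existing named entity detection is available as character offsets `(start, end)` into the transcription text; the actual format of the archive and of the entity annotations is not specified here. The sentence splitter is deliberately naive (it cuts after `.`, `!` or `?` followed by whitespace) and would likely need to be replaced for noisy OCR text. The function names are hypothetical.

```python
import re
from bisect import bisect_right

# Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentence_spans(text):
    """Return the (start, end) character spans of the sentences in text."""
    spans = []
    start = 0
    for match in _SENTENCE_END.finditer(text):
        spans.append((start, match.start()))
        start = match.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def sentences_with_entities(text, entity_offsets, minimum=2):
    """Return the sentences that fully contain at least `minimum` entities.

    entity_offsets is assumed to be an iterable of (start, end) character
    offsets produced by the earlier named entity detection.
    """
    spans = sentence_spans(text)
    starts = [s for s, _ in spans]
    counts = [0] * len(spans)
    for begin, end in entity_offsets:
        # Find the last sentence starting at or before the entity.
        i = bisect_right(starts, begin) - 1
        if i >= 0 and end <= spans[i][1]:
            counts[i] += 1
    return [text[s:e] for (s, e), n in zip(spans, counts) if n >= minimum]
```

Each returned sentence could then be posted as a pulse. Entities that straddle a sentence boundary are ignored in this sketch, which is a design choice rather than a requirement.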