Projects: Difference between revisions

From FDHwiki



Revision as of 12:42, 28 September 2017

ClioWire platform

Building on existing code, develop an API for the other groups (posting pulses, searching pulses). Develop bots that rewrite pulses based on other sources.
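As a rough illustration of the two operations named above (posting and searching pulses), here is a minimal in-memory sketch in Python. The names `Pulse`, `PulseStore`, `post` and `search` are hypothetical and are not taken from the ClioWire code base; a real implementation would sit on top of whichever platform is chosen and its database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pulse:
    """A single short message (hypothetical structure)."""
    author: str
    text: str


@dataclass
class PulseStore:
    """Hypothetical in-memory stand-in for the platform's storage layer."""
    pulses: List[Pulse] = field(default_factory=list)

    def post(self, author: str, text: str) -> Pulse:
        # Create a pulse and keep it in insertion order.
        pulse = Pulse(author=author, text=text)
        self.pulses.append(pulse)
        return pulse

    def search(self, query: str) -> List[Pulse]:
        # Case-insensitive substring search over the pulse text.
        needle = query.lower()
        return [p for p in self.pulses if needle in p.text.lower()]
```

A bot from another group could then call `store.post("cadaster-bot", "...")` and later `store.search("...")`, without knowing how pulses are stored.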

The initial code base may be chosen from the following options: Social, Mastodon, etc.

Knowledge required: Python, PHP, JavaScript, MySQL

Decomposition in elementary units of the secondary sources

Decomposition in elementary units of cadasters

Decomposition in elementary units of image banks

The goal is to transform the CINI metadata, which has already been OCRed, into pulses. One challenge is handling OCR errors and disambiguating the resulting values.
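One possible way to approach the OCR errors is fuzzy matching against a list of known names. The sketch below uses Python's standard `difflib` module. The function name `normalize_name`, the list of known names and the 0.8 cutoff are assumptions made for illustration, not part of the project specification.

```python
import difflib
from typing import List, Optional


def normalize_name(ocr_value: str, known_names: List[str],
                   cutoff: float = 0.8) -> Optional[str]:
    """Map a noisy OCR string to a known name, or None if unclear.

    None is returned both when nothing is close enough and when
    several candidates are close (the value needs disambiguation).
    """
    # Compare in lower case, but return the original spelling.
    lowered = {name.lower(): name for name in known_names}
    matches = difflib.get_close_matches(
        ocr_value.strip().lower(), list(lowered), n=2, cutoff=cutoff
    )
    if len(matches) != 1:
        return None
    return lowered[matches[0]]
```

For instance, with `["Tiziano Vecellio", "Jacopo Tintoretto"]` as known names, the OCR output `"Tiziano Vecelllo"` is mapped to `"Tiziano Vecellio"`, while a value close to two different names is left for manual disambiguation.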

Newspaper mining

The goal is to find all the sentences in a large newspaper archive that contain at least two named entities. These sentences should be posted as pulses.

The named entity detection has already been done. The only challenge is to retrieve the corresponding sentences in the digitized transcriptions.
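The retrieval step could be sketched as follows in Python, assuming that the existing named entity detection provides character offsets (start, end) into the transcription; that format is an assumption, as is the function name `sentences_with_entities`. The sentence splitter is deliberately naive (it cuts at `.`, `!` and `?`) and would mis-split abbreviations, so a proper tokenizer would likely be needed on real newspaper text.

```python
import re
from typing import List, Tuple


def sentences_with_entities(text: str,
                            entity_spans: List[Tuple[int, int]],
                            min_entities: int = 2) -> List[str]:
    """Return sentences that fully contain at least min_entities spans."""
    results = []
    # Naive splitting: a sentence is a run of characters up to ., ! or ?
    for match in re.finditer(r"[^.!?]+[.!?]?", text):
        start, end = match.span()
        # Count the entities whose span lies inside this sentence.
        count = sum(1 for s, e in entity_spans if start <= s and e <= end)
        if count >= min_entities:
            results.append(match.group().strip())
    return results
```

Each returned sentence could then be posted as one pulse.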

Responsible: Maud

Skills: Python or Java