Projects: Difference between revisions
Revision as of 13:15, 28 September 2017
Platform management and development: ClioWire platform
Installation and customization of the platform. Simple API. Scripts for searching, creating, and inference. Bot for rewriting pulses.
Initial code base will be Mastodon.
Knowledge required: Python, JavaScript, basic Linux administration.
Resp. Vincent and Orlin
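Since the initial code base will be Mastodon, a bot could probably post a pulse through Mastodon's status endpoint (`POST /api/v1/statuses` with a bearer token). The sketch below only builds the request and does not send it. The instance URL `https://cliowire.example`, the token value, and the helper name `build_pulse_request` are assumptions made for illustration, and a customized ClioWire API may differ from stock Mastodon.

```python
import urllib.parse
import urllib.request


def build_pulse_request(base_url, access_token, text):
    """Build (but do not send) an HTTP request that posts `text` as a status.

    Assumes the platform keeps Mastodon's POST /api/v1/statuses endpoint.
    """
    # Form-encode the body; Mastodon reads the text from the "status" field.
    body = urllib.parse.urlencode({"status": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/statuses",
        data=body,
        method="POST",
        headers={"Authorization": "Bearer " + access_token},
    )


# Hypothetical instance and token, for illustration only.
request = build_pulse_request(
    "https://cliowire.example/", "MY_TOKEN", "Titian was in Venice in 1516."
)
# urllib.request.urlopen(request) would actually send the pulse.
```

Keeping request construction separate from sending makes the bot easy to test without a running instance.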
Decomposition in elementary units of the secondary sources
The goal is to extract, from a collection of 3000 scanned books, all the sentences containing at least two named entities and to transform them into pulses.
Resp. Giovanni
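The selection rule (keep a sentence only if it mentions at least two named entities) can be sketched as follows. This is a minimal illustration, not the project's method: it assumes the entities are already available as a list of names, uses a naive punctuation-based sentence splitter, and the function names are hypothetical. A real pipeline would use a proper named entity recognizer and sentence segmenter.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentences_with_entities(text, entities, minimum=2):
    """Return (sentence, entities found) pairs with at least `minimum` entities."""
    selected = []
    for sentence in split_sentences(text):
        # Whole-word match of each known entity name inside the sentence.
        found = [
            name for name in entities
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        ]
        if len(found) >= minimum:
            selected.append((sentence, found))
    return selected


text = "Marco Polo left Venice in 1271. He travelled for years. Titian painted in Venice."
entities = ["Marco Polo", "Venice", "Titian"]
candidates = sentences_with_entities(text, entities)
```

Each selected sentence would then become the text of one pulse.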
Decomposition in elementary units of primary sources
This group will look for named entities in digitized manuscripts and post pulses about these mentions. The group will use word-spotting methods.
Supervisor: Sofia
Skills: Java
Decomposition in elementary units of image banks
The goal is to transform the metadata of CINI, which has been OCRed, into pulses. One challenge is dealing with OCR errors and the disambiguation they may require.
Supervision: Lia
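One possible way to handle OCR errors is to match each OCRed value against a list of known names and keep only close matches. The sketch below uses the standard library's `difflib.get_close_matches`. The authority list, the 0.8 cutoff, and the function name `candidate_names` are assumptions chosen for illustration, not part of the project description.

```python
import difflib


def candidate_names(ocr_value, authority, cutoff=0.8, limit=3):
    """Return known names similar to an OCRed string, best match first.

    An empty list means no plausible match. More than one entry means the
    value is ambiguous and needs further disambiguation.
    """
    return difflib.get_close_matches(ocr_value, authority, n=limit, cutoff=cutoff)


# Hypothetical authority list of artist names.
authority = ["Tiziano Vecellio", "Tintoretto", "Paolo Veronese"]
matches = candidate_names("Tizian0 Vecelllo", authority)
```

Returning every candidate above the cutoff, rather than only the best one, leaves the disambiguation decision to a later step.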
Newspaper, Wikipedia, Semantic Web mining
The goal is to find all the sentences in a large newspaper archive that contain at least two named entities. These sentences should be posted as pulses.
The named entity detection has already been done. The only challenge is to retrieve the corresponding sentences in the digitized transcriptions.
In addition, this group should look for ways of importing elements of knowledge on a massive scale from other sources (DBpedia, RDF databases).
Resp. Maud
Skills: Python or Java
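Because the entities are already detected, retrieving the sentences is mostly a matter of aligning entity positions with sentence boundaries. The sketch below assumes that each mention is available as a `(start, end)` character offset into the transcription. That format, the naive sentence delimiter, and the function names are assumptions; the archive's actual annotation format may differ.

```python
import re


def sentence_spans(text):
    """Yield (start, end) character spans of naively delimited sentences."""
    # A sentence starts at a non-space character and runs to the next
    # run of '.', '!' or '?' (or to the end of the text).
    for match in re.finditer(r"\S[^.!?]*[.!?]*", text):
        yield match.span()


def sentences_with_mentions(text, mentions, minimum=2):
    """Return the sentences that fully contain at least `minimum` mentions."""
    results = []
    for start, end in sentence_spans(text):
        inside = [m for m in mentions if start <= m[0] and m[1] <= end]
        if len(inside) >= minimum:
            results.append(text[start:end])
    return results


text = "Garibaldi arrived in Geneva yesterday. The weather was mild. Cavour wrote to Mazzini."
# Stand-in for precomputed offsets: derived here from the names themselves.
names = ["Garibaldi", "Geneva", "Cavour", "Mazzini"]
mentions = [(text.index(n), text.index(n) + len(n)) for n in names]
pulses = sentences_with_mentions(text, mentions)
```

Working with offsets avoids re-running entity detection and keeps each pulse traceable to its position in the transcription.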