Revision as of 07:47, 4 October 2017

ClioWire: Platform management and development

This group will manage the experimental platform of the course. They will have to run the platform and develop additional features for processing and presenting the pulses. The initial code base is Mastodon.

The group will write bots for rewriting pulses, progressively converging towards the articulation/datafication of the pulses.
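A minimal sketch of how such a bot could prepare a rewritten pulse for posting, assuming the platform keeps Mastodon's standard status endpoint (`POST /api/v1/statuses` with a bearer token). The host name, the token, and the `rewrite_pulse` rule are hypothetical placeholders; the request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def rewrite_pulse(text):
    """Hypothetical rewriting rule: collapse runs of whitespace."""
    return " ".join(text.split())


def build_pulse_request(base_url, access_token, text):
    """Build (but do not send) the POST request that would publish `text`."""
    # Mastodon accepts the status text as a form-encoded `status` field.
    data = urllib.parse.urlencode({"status": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/statuses",
        data=data,
        headers={"Authorization": "Bearer " + access_token},
        method="POST",
    )


# Placeholder host and token, for illustration only.
req = build_pulse_request(
    "https://cliowire.example",
    "TOKEN",
    rewrite_pulse("  Marco  Polo born in   Venice "),
)
```

Passing `req` to `urllib.request.urlopen` would actually publish the pulse; a real bot would also need error handling and rate limiting.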

Knowledge required: Python, JavaScript, basic Linux administration.

Resp. Vincent and Orlin

Extracting information from secondary sources

The goal is to extract, from a collection of 3000 scanned books about Venice, all the sentences containing at least two named entities and to transform them into pulses. This should constitute a de facto set of relevant information drawn from a large base of Venetian documents.
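The filtering step can be sketched as follows. This is only an illustration under strong assumptions: named entities are recognized by lookup in a hypothetical list of known names (a stand-in for a real named entity recognizer), sentences are split naively on end punctuation, and "two entities" is read as two distinct names.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def entities_in(sentence, known_entities):
    """Return the known entity names that occur as whole words in the sentence."""
    return [
        name
        for name in known_entities
        if re.search(r"\b" + re.escape(name) + r"\b", sentence)
    ]


def sentences_to_pulses(text, known_entities, min_entities=2):
    """Keep the sentences mentioning at least `min_entities` distinct names."""
    return [
        sentence
        for sentence in split_sentences(text)
        if len(entities_in(sentence, known_entities)) >= min_entities
    ]


sample = "Titian worked in Venice. The weather was mild. Tintoretto admired Titian."
pulses = sentences_to_pulses(sample, ["Titian", "Venice", "Tintoretto"])
```

On the sample text, only the first and third sentences are kept, since the second mentions no known name.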

Resp. Giovanni

Decomposition in elementary units of primary sources

This group will look for named entities in digitized manuscripts and post pulses about these mentions. The group will use word-spotting methods.

Supervisor: Sofia

Skills: Java

Decomposition in elementary units of image banks

The goal is to transform the metadata of CINI, which have been OCRed, into pulses. One challenge is dealing with OCR errors and, where needed, disambiguation.
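One possible way to absorb OCR errors is to match each OCRed value against a list of accepted spellings. The sketch below uses approximate string matching from Python's standard `difflib` module; the authority list and the 0.8 similarity cutoff are assumptions chosen for illustration, not values from the project.

```python
import difflib


def normalize_name(ocr_value, authority_names, cutoff=0.8):
    """Return the closest accepted spelling, or None when nothing is close enough.

    A None result signals a value that needs manual review or another
    disambiguation strategy.
    """
    matches = difflib.get_close_matches(ocr_value, authority_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical authority list of artist names.
AUTHORITY = ["Tintoretto", "Tiziano", "Tiepolo"]
corrected = normalize_name("Tintoret1o", AUTHORITY)
```

Here the OCR confusion of `t` with `1` is repaired because "Tintoret1o" is still very similar to "Tintoretto". Raising the cutoff trades recall for fewer wrong corrections.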

Supervision: Lia

Newspaper, Wikipedia, Semantic Web mining

The goal is to find all the sentences in a large newspaper archive that contain at least two named entities. These sentences should be posted as pulses.

The named entity detection has already been done. The only challenge is to retrieve the corresponding sentences in the digitized transcriptions.
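The retrieval step might look like the sketch below. It assumes, purely for illustration, that the existing detection output gives each mention as a `(start, end)` pair of character offsets into the transcription; the actual format of the archive's annotations may differ. Sentence boundaries are found naively from end punctuation.

```python
import re


def sentence_spans(text):
    """Return (start, end) character spans of sentences, split on '.', '!' or '?'."""
    spans = []
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        start, end = match.span()
        chunk = text[start:end]
        if chunk.strip():
            # Skip the whitespace that separates this sentence from the previous one.
            lead = len(chunk) - len(chunk.lstrip())
            spans.append((start + lead, end))
    return spans


def sentences_with_mentions(text, mentions, min_mentions=2):
    """Return the sentences whose span contains at least `min_mentions` mentions."""
    results = []
    for start, end in sentence_spans(text):
        inside = [m for m in mentions if start <= m[0] and m[1] <= end]
        if len(inside) >= min_mentions:
            results.append(text[start:end])
    return results


article = "Napoleon entered Venice in 1797. Rain fell. Manin led Venice."
# Assumed offsets for: Napoleon, Venice, Manin, Venice.
article_mentions = [(0, 8), (17, 23), (44, 49), (54, 60)]
found = sentences_with_mentions(article, article_mentions)
```

Working from offsets rather than re-searching for the names avoids ambiguity when the same name occurs several times in one article.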

In addition, this group should look for ways to import elements of knowledge in bulk from other sources (DBpedia, RDF databases).
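As one illustration of such an import, RDF data published in the N-Triples format (one `subject predicate object .` statement per line) could be turned into short textual pulses. The sketch below is a simplified reader, not a full N-Triples parser: it ignores blank nodes and escaped quotes inside literals, and the way a pulse is worded from the IRI local names is an assumption.

```python
import re

# Simplified pattern: IRI subject, IRI predicate, then an IRI or a plain
# literal with an optional language tag or datatype.
TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+'
    r'(?:<([^>]+)>|"([^"]*)"(?:@[\w-]+|\^\^<[^>]+>)?)\s*\.\s*$'
)


def local_name(iri):
    """Return the last path or fragment segment of an IRI, with '_' as spaces."""
    return re.split(r"[/#]", iri)[-1].replace("_", " ")


def triple_to_pulse(line):
    """Turn one N-Triples line into a pulse string, or None if it does not match."""
    match = TRIPLE.match(line.strip())
    if match is None:
        return None
    subject, predicate, object_iri, object_literal = match.groups()
    obj = local_name(object_iri) if object_iri is not None else object_literal
    return f"{local_name(subject)} {local_name(predicate)} {obj}"


line = (
    "<http://dbpedia.org/resource/Titian> "
    "<http://dbpedia.org/ontology/birthPlace> "
    "<http://dbpedia.org/resource/Pieve_di_Cadore> ."
)
pulse = triple_to_pulse(line)
```

For production use, a dedicated RDF library would be a safer choice than a hand-written pattern.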

Resp. Maud

Skills: Python or Java

@@ Line 1: / Line 1: @@
-= Platform management and development: ClioWire platform =
+= ClioWire: Platform management and development =
-Installation and customization of the platform.
+This group will manage the experimental platform of the course. They will have to run platform and develop additional features for processing and presenting the pulses.
-Simple API.
+The initial code base is Mastodon.
-Script for searching, creating, inference
-Bot for rewritting pulses.
-Initial code base  will be Mastodon.
+The group will write bots for rewritting pulses and progressively converging towards articulation/datafication of the pulses.
 Knowledge required : Python, Javascript, basic linux administration.
@@ Line 12: / Line 10: @@
 Resp. Vincent and Orlin
-= Decomposition in elementary units of the secondary sources =
+= Extracting information from secondary sources =
-The goal is to extract from a collection of 3000 scanned books all the sentences containing at least two named entities and transforming them into pulses.
+The goal is to extract from a collection of 3000 scanned books about Venice all the sentences containing at least two named entities and transforming them into pulses.
+This should consiste a de facto set of relevant information taking a large base of Venetian documents.
 Resp. Giovanni
@@ Line 38: / Line 37: @@
 The goal is to find all the sentences in a large newspaper archive that contains at least 2 names entities.
 These sentences should be posted as pulses.
 The named entity detection have already been done. The only challenge to retrieve the corresponding sentences in the digitized transcriptions.

Projects: Difference between revisions

