Deciphering Venetian handwriting

From FDHwiki

Introduction

Planning

Week Task
09 Segment patches of text in the Sommarioni: (page id, patch)
10 Mapping transcription (Excel file) -> page id (proof of concept)
11 Mapping transcription (Excel file) -> page id (on the whole dataset)
12 Depending on the quality of the results: improve the page id mapping, more precise matching, web viewer
13 Final results, final evaluation & final report writing
14 Final project presentation

Week 09

  • Input  : Sommarioni images
  • Output : Patches of pixels containing text, with the coordinates of each patch in the Sommarioni page
  • Step 1 : Segment handwritten text regions in the Sommarioni images
  • Step 2 : Extract the patches
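
The two steps above can be sketched as follows. This is a minimal illustration, not the project's actual method: it assumes that some segmentation step has already produced a binary mask (1 = text pixel) for a page, and it groups text pixels into boxes with a plain connected-component search. The names text_boxes and extract_patches are hypothetical, and images are represented as nested lists so that the sketch needs no external library.

```python
from collections import deque


def text_boxes(mask):
    """Return bounding boxes (x, y, w, h) of 4-connected regions of text pixels.

    `mask` is a list of rows; a truthy cell marks a text pixel (assumption:
    the mask comes from a separate segmentation step not shown here).
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over one connected region.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def extract_patches(page_id, image, mask):
    """Return (page_id, (x, y, w, h), patch) tuples, one per text region."""
    patches = []
    for x, y, w, h in text_boxes(mask):
        patch = [row[x:x + w] for row in image[y:y + h]]
        patches.append((page_id, (x, y, w, h), patch))
    return patches
```

On real scans, neighbouring components would probably need to be merged into lines or blocks before extraction; that refinement is left out here.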

Week 10

  • Input  : transcription (Excel file), tuples (page id, patch) extracted in week 09
  • Output : line in the transcription -> page id
  • Step 1 : Run HTR on each patch and clean the output : (patch, text)
  • Step 2 : Find matching pairs between recognized text and transcription lines
  • Step 3 : Write a new Excel file with the added page id column
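
Step 2 could look roughly like the sketch below. It assumes the HTR output is already available as (page id, text) pairs and the transcription lines as plain strings; reading and writing the Excel file is not shown. The similarity measure (difflib.SequenceMatcher from the Python standard library) and the 0.6 threshold are illustrative choices, not the ones used by the project, and match_rows_to_pages is a hypothetical name.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case and collapse whitespace so that layout noise is ignored."""
    return " ".join(text.lower().split())


def match_rows_to_pages(rows, recognized, threshold=0.6):
    """Map each transcription line to the page id of its most similar text.

    rows:       list of transcription strings
    recognized: list of (page_id, recognized_text) pairs from the HTR step
    threshold:  minimum similarity ratio to accept a match (assumed value)
    Returns a list of page ids, with None where no match is good enough.
    """
    result = []
    for row in rows:
        best_page, best_score = None, 0.0
        for page_id, text in recognized:
            score = SequenceMatcher(None, normalize(row), normalize(text)).ratio()
            if score > best_score:
                best_page, best_score = page_id, score
        result.append(best_page if best_score >= threshold else None)
    return result
```

The returned list lines up with the transcription rows, so it can be written back as the new page id column. Comparing every line with every recognized text is quadratic, which is acceptable for a proof of concept but may need indexing on the whole dataset.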

Week 11

  • Step 1 : Apply the pipeline validated in week 10 to the whole dataset
  • Step 2 : Evaluate the quality and, based on that, decide on the tasks for the following weeks
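
One possible way to make the quality evaluation concrete is to check the predicted page ids against a small hand-verified sample. The sketch below assumes such a sample exists as a dictionary from row id to page id; the two measures (coverage and precision) and the function name evaluate are illustrative and not taken from the project.

```python
def evaluate(predicted, truth):
    """Compare predicted page ids with a hand-checked sample.

    predicted: dict row_id -> page_id, or None when no match was found
    truth:     dict row_id -> page_id for the manually verified rows
    Returns coverage (share of checked rows that received a page id) and
    precision (share of those assigned page ids that are correct).
    """
    checked = [row for row in truth if row in predicted]
    answered = [row for row in checked if predicted[row] is not None]
    correct = [row for row in answered if predicted[row] == truth[row]]
    coverage = len(answered) / len(checked) if checked else 0.0
    precision = len(correct) / len(answered) if answered else 0.0
    return {"coverage": coverage, "precision": precision}
```

Low coverage would point towards improving segmentation or HTR, while low precision would point towards a stricter or more precise matching step.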

Week 12

  • Depending on the quality of the matching
 * Improve image segmentation
 * More precise matching (Excel cell) -> (page id, patch), in order to have the precise box of each written text
 * Use an IIIF image viewer to present the project results more attractively
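
If the viewer is backed by a server that implements the IIIF Image API, a matched box can be displayed by requesting just that region of the page image. The sketch below builds such a request following the API's URL pattern {identifier}/{region}/{size}/{rotation}/{quality}.{format}. The server address and image identifier in the usage example are hypothetical, and iiif_region_url is an illustrative helper, not part of any library.

```python
from urllib.parse import quote


def iiif_region_url(base_url, identifier, box, size="max", fmt="jpg"):
    """Build an IIIF Image API URL for one rectangular region of an image.

    base_url:   server and prefix of the image service (assumption)
    identifier: image identifier; it is percent-encoded, including slashes
    box:        (x, y, w, h) in pixels of the full-size image
    size:       "max" as in Image API 3.0; 2.x servers commonly use "full"
    """
    x, y, w, h = box
    region = f"{x},{y},{w},{h}"
    encoded_id = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/{size}/0/default.{fmt}"


# Usage with a made-up server and identifier:
url = iiif_region_url(
    "https://iiif.example.org/image/", "sommarioni/page 12", (100, 200, 300, 50)
)
```

Storing the (x, y, w, h) box next to each transcription line is therefore enough to link a line to its image region in such a viewer.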

Historical introduction to the source

Methodology

Quality assessment

Links