Europeana: A New Spatiotemporal Search Engine

Revision as of 10:05, 10 November 2022

Introduction

Motivation

Project Plan and Milestones

Date Task Completion
By Week 3
  • Brainstorm project ideas.
  • Prepare slides for initial project idea presentation.
By Week 5
  • Discuss the differences between image analysis and text analysis in terms of related algorithms, processing toolkits, implementation difficulties and display methods.
  • Decide to focus on text processing.
  • Select a subset collection from the "Newspaper collection" of Europeana for our project.
  • Check the content of "La clef du cabinet des princes de l'Europe" and tentatively select three topics to focus on.
By Week 6
  • Each team member reads some pages of the journal to get an overall understanding of it.
  • Find that the accuracy of the OCR results is unsatisfactory and decide to improve them before text analysis.
  • Request the data.
By Week 7
  • Research OCR methods and find some for Italian italics.
  • Get the text by web analysis.
  • Use DeepL to translate the text from French to English and back to French, then check the results.
  • Reproduce the OCR method from the literature and find that recognition has improved.
By Week 8
  • To be filled
By Week 9
  • Preprocess the data. (OCR or grammar checker)
  • Prototype and database design.
By Week 10
  • To be filled
By Week 11
  • Content analysis.
By Week 12
  • To be filled
By Week 13
  • Build the website.
By Week 14
  • Final report.
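
The Week 7 round-trip translation check (French to English and back to French) could be scored automatically rather than only by eye. The following is a minimal sketch under stated assumptions, not the project's actual code: `round_trip_similarity` and `identity_translate` are hypothetical names, and the `translate` argument stands in for whatever translation service is used (no real DeepL call is made here). It compares the original text with its back-translation using the standard-library `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


def round_trip_similarity(text: str, translate: Translator) -> float:
    """Translate FR -> EN -> FR and return a similarity score in [0, 1].

    A low score suggests the passage changed a lot during the round trip,
    which may point to OCR errors worth checking by hand.
    """
    english = translate(text, "FR", "EN")
    back_to_french = translate(english, "EN", "FR")
    return SequenceMatcher(None, text, back_to_french).ratio()


def identity_translate(text: str, source_lang: str, target_lang: str) -> str:
    """Placeholder translator (assumption): returns the text unchanged."""
    return text


if __name__ == "__main__":
    sample = "La clef du cabinet des princes de l'Europe"
    print(round_trip_similarity(sample, identity_translate))
```

Passing the translator in as a function keeps the scoring logic independent of any particular service, so the placeholder can be swapped for a real client without changing the comparison.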

Github Repository

https://github.com/XinyiDyee/Europeana-Search-Engine

Reference