Project Timeline
| Timeframe | Task | Completion |
|---|---|---|
| Week 4 | Explore postcard search results on Europeana's website; Study the Europeana API documentation and obtain an access key; Extract postcard data using the Europeana API | ✅ |
| Week 5 | Clean the data using the metadata; Analyze the Europeana postcard data; Prepare sample image sets and explore prediction methods | ✅ |
| Week 6 | Decide to focus on postcards with text; Test and evaluate the effectiveness of multiple OCR models | ✅ |
| Week 7 | Use OCR and NER for prediction; Test and evaluate the effectiveness of multiple NER tools; Explore alternative prediction methods | ✅ |
| Week 8 | Introduce ChatGPT for prediction (OCR + GPT-3.5 + NER); Try making predictions directly with GPT-4 | ✅ |
| Week 9 | Optimize the GPT-3.5 prompt for better results; Compare the results of OCR + GPT-3.5 (optimized prompts) with those of GPT-4 | ✅ |
| Week 10 | Complete the pipeline for the entire prediction process; Prepare a sample set to evaluate its effectiveness | ✅ |
| Week 11 | Explore visualization methods; Refine and analyze the test set | ✅ |
| Week 12 | Use the TA's annotation tool for test set evaluation; Build the visualization platform | |
| Week 13 | Test and refine the web application; Analyze the results of the test set evaluation | |
| Week 14 | Prepare the final report and presentation | |
GitHub Repository
Europeana-mapping-postcards
References