Rolandi Librettos
Revision as of 13:12, 19 November 2020
Introduction
Wiki page of Group 10 on Rolandi Librettos
Project Planning
The draft project plan, with the tasks assigned to each week, is given below:
| Timeframe | Task | Completion |
|---|---|---|
| Week 4 | | |
| 07.10 | Evaluating which APIs to use (IIIF) | ✅ |
| | Write a scraper to scrape IIIF manifests from the Libretto website | |
| Week 5 | | |
| 14.10 | Processing of images: apply Tesseract OCR | ✅ |
| | Extracting dates and cleaning the dataset to create the initial DataFrame | |
| Week 6 | | |
| 21.10 | Design and develop initial structure for the visualization (using dates data) | ✅ |
| | Running a sanity check on the initial DataFrame by hand | |
| | Matching the list of cities extracted from OCR using search techniques | |
| Week 7 | | |
| 28.10 | Remove irrelevant backgrounds of images | ✅ |
| | Extract age and gender from images | |
| | Design data model | |
| | Extract tags, names, birth and death years out of metadata | |
| Week 8 | | |
| 04.11 | Get coordinates for each city and translate city names | ✅ |
| | Extract additional metadata (opera title, maestro) from the libretto title | |
| | Setting up the map and slider in the visualization, ordered by year | |
| Week 9 | | |
| 11.11 | Adding metadata to the visualization through an information pane | ✅ |
| | Checking in with the Cini Foundation | |
| | Preparing the Wiki outline and the midterm presentation | |
| Week 10 | | |
| 18.11 | Compiling a list of musical theatres | ⬜️ |
| | Getting better recall and precision on the city information | |
| | Identifying composers and getting a performer's information | |
| | Extracting corresponding information from the MediaWiki API for entities (theatres etc.) | |
| Week 11 | | |
| 25.11 | Integrate the visualization's zoom functionality with the data pipeline to see intra-level info | ⬜️ |
| | Linking similar entities together (which directors performed the same play in different cities?) | |
| Week 12 | | |
| 02.12 | Serving the website and computing performance metrics for our data analysis | ⬜️ |
| | Communicate with and get feedback from the Cini Foundation | |
| | Continuously working on the report and the presentation | |
| Week 13 | | |
| 09.12 | Finishing the project website and remaining work, and presenting our results | ⬜️ |
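
The Week 6 task of matching OCR-extracted cities against a known list could be approached in several ways. The snippet below is only a minimal sketch of one possibility, using fuzzy string matching from Python's standard `difflib` module. It is not the group's actual method: the `KNOWN_CITIES` list, the `match_city` helper, and the `0.8` cutoff are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference list; the project's real list of cities is not shown here.
KNOWN_CITIES = ["Venezia", "Milano", "Roma", "Napoli", "Firenze", "Bologna"]


def match_city(ocr_token, known_cities=KNOWN_CITIES, cutoff=0.8):
    """Return the closest known city for a noisy OCR token, or None.

    The cutoff (0..1) is an assumed similarity threshold; a higher value
    favours precision, a lower value favours recall.
    """
    # Compare case-insensitively, but return the canonical spelling.
    lowered = {city.lower(): city for city in known_cities}
    hits = get_close_matches(
        ocr_token.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[hits[0]] if hits else None


if __name__ == "__main__":
    for token in ["Venezla", " MILANO ", "Xyz"]:
        print(repr(token), "->", match_city(token))
```

Tuning the cutoff against a hand-checked sample would be one way to trade off the recall and precision mentioned in the Week 10 tasks.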