Rolandi Librettos
Revision as of 13:12, 19 November 2020
Introduction
Wiki page of Group 10 on Rolandi Librettos
Project Planning
The draft project plan, with the tasks assigned to each week, is given below:
| Timeframe | Task | Completion |
|---|---|---|
| **Week 4** | | |
| 07.10 | Evaluating which APIs to use (IIIF) | ✅ |
| | Writing a scraper to collect IIIF manifests from the Libretto website | |
| **Week 5** | | |
| 14.10 | Processing images: applying Tesseract OCR | ✅ |
| | Extracting dates and cleaning the dataset to create the initial DataFrame | |
| **Week 6** | | |
| 21.10 | Designing and developing the initial structure for the visualization (using the date data) | ✅ |
| | Running a sanity check on the initial DataFrame by hand | |
| | Matching the list of cities extracted from OCR using search techniques | |
| **Week 7** | | |
| 28.10 | Removing irrelevant backgrounds from images | ✅ |
| | Extracting age and gender from images | |
| | Designing the data model | |
| | Extracting tags, names, and birth and death years from the metadata | |
| **Week 8** | | |
| 04.11 | Getting coordinates for each city and translating city names | ✅ |
| | Extracting additional metadata (opera title, maestro) from the title of each libretto | |
| | Setting up the map and slider in the visualization, ordered by year | |
| **Week 9** | | |
| 11.11 | Adding metadata to the visualization through an information pane | ✅ |
| | Checking in with the Cini Foundation | |
| | Preparing the Wiki outline and the midterm presentation | |
| **Week 10** | | |
| 18.11 | Compiling a list of musical theatres | ⬜️ |
| | Improving recall and precision on the city information | |
| | Identifying composers and retrieving performers' information | |
| | Extracting corresponding information from the MediaWiki API for entities (theatres etc.) | |
| **Week 11** | | |
| 25.11 | Integrating the visualization's zoom functionality with the data pipeline to show intra-level information | ⬜️ |
| | Linking similar entities together (which directors performed the same play in different cities?) | |
| **Week 12** | | |
| 02.12 | Serving the website and computing performance metrics for our data analysis | ⬜️ |
| | Communicating with the Cini Foundation and getting feedback | |
| | Continuing work on the report and the presentation | |
| **Week 13** | | |
| 09.12 | Finishing the project website and remaining work; presenting our results | ⬜️ |
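
The plan lists extracting dates and matching city names from OCR output among the tasks for weeks 5 and 6. The sketch below shows one way such a step might look, using only the Python standard library. It is not the group's actual pipeline: `KNOWN_CITIES`, `extract_year`, `match_city`, the year range, the similarity cutoff, and the sample string are all hypothetical choices made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical reference list; the project's real city list is not given here.
KNOWN_CITIES = ["Venezia", "Milano", "Roma", "Napoli", "Firenze", "Bologna"]

# Assumption: publication years fall between 1500 and 1999.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2})\b")


def extract_year(ocr_text):
    """Return the first four-digit year (1500-1999) in the text, or None."""
    match = YEAR_PATTERN.search(ocr_text)
    return int(match.group(1)) if match else None


def match_city(ocr_text, cities=KNOWN_CITIES, cutoff=0.8):
    """Return the first known city that fuzzily matches an OCR token, or None."""
    lowered = {city.lower(): city for city in cities}
    # Split into alphabetic tokens; skip very short ones to limit false matches.
    for token in re.findall(r"[A-Za-zÀ-ÿ]+", ocr_text):
        if len(token) < 4:
            continue
        hits = get_close_matches(token.lower(), list(lowered), n=1, cutoff=cutoff)
        if hits:
            return lowered[hits[0]]
    return None


# Made-up OCR output with a typical misread ("Venezla" for "Venezia").
sample = "Da rappresentarsi in Venezla nel Teatro l'anno 1745"
year = extract_year(sample)
city = match_city(sample)
```

Fuzzy matching tolerates single-character OCR errors, but the cutoff trades recall against precision, so any value chosen would need checking by hand against a sample of the data.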