Rolandi Librettos: Difference between revisions
Revision as of 13:00, 19 November 2020
Introduction
Wiki page of Group 10 on Rolandi Librettos
Project Planning
The draft project plan and the tasks assigned to each week are listed below:
| Timeframe | Task | Completion |
|---|---|---|
| Week 4 | | |
| 07.10 | Evaluating which APIs to use (IIIF) | ✓ |
| 07.10 | Write a scraper to collect IIIF manifests from the Libretto website | ✓ |
| Week 5 | | |
| 14.10 | Processing of images: apply Tesseract OCR | ✓ |
| 14.10 | Extract dates and clean the dataset to create the initial DataFrame | ✓ |
| Week 6 | | |
| 21.10 | Design and develop the initial structure for the visualization (using dates data) | ✓ |
| 21.10 | Running a sanity check on the initial DataFrame by hand | ✓ |
| 21.10 | Matching the list of cities extracted from OCR using search techniques | ✓ |
| Week 7 | | |
| 28.10 | Remove irrelevant backgrounds of images | ✓ |
| 28.10 | Extract age and gender from images | ✓ |
| 28.10 | Design data model | ✓ |
| 28.10 | Extract tags, names, birth and death years out of metadata | ✓ |
| Week 8 | | |
| 04.11 | Get coordinates for each city and translate city names | ✓ |
| 04.11 | Extract additional metadata (opera title, maestro) from the title of each libretto | ✓ |
| 04.11 | Set up the map and slider in the visualization and order by year | ✓ |
| Week 9 | | |
| 11.11 | Add metadata to the visualization through an information pane | ✓ |
| 11.11 | Checking in with the Cini Foundation | ✓ |
| 11.11 | Preparing the Wiki outline and the midterm presentation | ✓ |
| Week 10 | | |
| 18.11 | Compile a list of musical theatres and visualize them | ✓ |
| 18.11 | Getting better recall and precision on the city information | ✓ |
| 18.11 | Identifying composers and getting performers' information | ✓ |
| 18.11 | Extract corresponding information from the MediaWiki API for entities (theatres etc.) | ✓ |
| Week 11 | | |
| 25.11 | Integrate the visualization's zoom functionality with the data pipeline to show intra-level info | ✓ |
| 25.11 | Linking similar entities together (which directors performed the same play in different cities?) | ✓ |
| Week 12 | | |
| 02.12 | Serve the website and compute performance metrics for our data analysis | ✓ |
| 02.12 | Communicate with the Cini Foundation and get feedback | ✓ |
| 02.12 | Continuously working on the report and the presentation | ✓ |
| Week 13 | | |
| 09.12 | Finish the project website and remaining work; present our results | ✓ |