Rolandi Librettos: Difference between revisions

From FDHwiki

Revision as of 13:03, 19 November 2020

Introduction

Wiki page of Group 10 on Rolandi Librettos


Project Planning

The draft project plan, with the tasks assigned to each week, is given below:

{| class="wikitable"
|+ Weekly working plan
! Timeframe !! Task !! Completion
|-
| rowspan="2" | Week 4, ''07.10''
| Evaluate which APIs to use (IIIF)
| rowspan="2" align="center" |
|-
| Write a scraper to collect IIIF manifests from the Libretto website
|-
| rowspan="2" | Week 5, ''14.10''
| Process images: apply Tesseract OCR
| rowspan="2" align="center" |
|-
| Extract dates and clean the dataset to create the initial DataFrame
|-
| rowspan="3" | Week 6, ''21.10''
| Design and develop the initial structure for the visualization (using dates data)
| rowspan="3" align="center" |
|-
| Run a sanity check on the initial DataFrame by hand
|-
| Match the list of cities extracted from OCR using search techniques
|-
| rowspan="4" | Week 7, ''28.10''
| Remove irrelevant backgrounds from images
| rowspan="4" align="center" |
|-
| Extract age and gender from images
|-
| Design the data model
|-
| Extract tags, names, birth and death years from the metadata
|-
| rowspan="3" | Week 8, ''04.11''
| Get coordinates for each city and translate city names
| rowspan="3" align="center" |
|-
| Extract additional metadata (opera title, maestro) from the libretto title
|-
| Set up the map and slider in the visualization and order by year
|-
| rowspan="3" | Week 9, ''11.11''
| Add metadata to the visualization in an information pane
| rowspan="3" align="center" |
|-
| Check in with the Cini Foundation
|-
| Prepare the Wiki outline and the midterm presentation
|-
| rowspan="4" | Week 10, ''18.11''
| Compile a list of musical theatres and visualize them
| rowspan="4" align="center" | ⬜️
|-
| Improve recall and precision on the city information
|-
| Identify composers and retrieve information about performers
|-
| Extract corresponding information from the MediaWiki API for entities (theatres etc.)
|-
| rowspan="2" | Week 11, ''25.11''
| Integrate the visualization's zoom functionality with the data pipeline to show intra-level info
| rowspan="2" align="center" | ⬜️
|-
| Link similar entities together (which directors performed the same play in different cities?)
|-
| rowspan="3" | Week 12, ''02.12''
| Serve the website and compute performance metrics for our data analysis
| rowspan="3" align="center" | ⬜️
|-
| Communicate with the Cini Foundation and get feedback
|-
| Continue working on the report and the presentation
|-
| Week 13, ''09.12''
| Finish the project website and remaining work; present our results
| align="center" | ⬜️
|}



Historical Source

Methodology

Collecting data
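
The project plan lists scraping IIIF manifests as a Week 4 task, but the group's actual scraper is not documented on this page. As a rough illustration only, the sketch below assumes the manifests follow the IIIF Presentation API 2.x layout (sequences, canvases, images) and collects the image URL of every page. The function name `image_urls_from_manifest` and the sample manifest are hypothetical and do not come from the Rolandi collection.

```python
from typing import Any


def image_urls_from_manifest(manifest: dict[str, Any]) -> list[str]:
    """Collect the image URL of every canvas in a parsed manifest.

    Assumes the IIIF Presentation 2.x structure:
    manifest -> sequences -> canvases -> images -> resource -> "@id".
    """
    urls: list[str] = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                if "@id" in resource:
                    urls.append(resource["@id"])
    return urls


# Hypothetical, hand-written manifest fragment (not real Rolandi data).
sample_manifest = {
    "@type": "sc:Manifest",
    "label": "Example libretto",
    "sequences": [
        {
            "canvases": [
                {"images": [{"resource": {"@id": "https://example.org/iiif/p1/full/full/0/default.jpg"}}]},
                {"images": [{"resource": {"@id": "https://example.org/iiif/p2/full/full/0/default.jpg"}}]},
            ]
        }
    ],
}

for url in image_urls_from_manifest(sample_manifest):
    print(url)
```

In practice the manifest would first be downloaded and parsed with `json.loads`; that step is left out here because the manifest URLs are not given on this page.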

Metadata extraction
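
The Week 5 plan mentions extracting dates from the OCR output to build the initial DataFrame. The method actually used is not described on this page, so the following is only a minimal sketch under stated assumptions: the publication year appears in the OCR text as a four-digit Arabic numeral between 1500 and 1999, and the first such number is taken as the year. The name `extract_year` and the sample strings are hypothetical. Dates printed in Roman numerals, which may well occur on title pages, are not handled.

```python
import re
from typing import Optional

# Assumption: years of interest lie between 1500 and 1999.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2})\b")


def extract_year(ocr_text: str) -> Optional[int]:
    """Return the first four-digit year found in the OCR text, or None."""
    match = YEAR_PATTERN.search(ocr_text)
    return int(match.group(1)) if match else None


# Hypothetical OCR snippets, written by hand for illustration.
samples = [
    "IN VENEZIA, 1745. Con licenza de' superiori.",
    "Dramma per musica da rappresentarsi nel teatro",
]

for text in samples:
    print(extract_year(text))
```

Returning `None` for pages without a recognizable year keeps missing values explicit, which makes a later manual sanity check of the dataset easier.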

Visualization

Quality assessment

Overall pipeline

Basic features extraction

Efficiency of algorithms

Results

Website

Links