Rolandi Librettos: Difference between revisions

From FDHwiki
==Project Planning==

The draft of the project and the tasks for each week are assigned below:

{| class="wikitable" style="margin:auto;"
|+ Weekly working plan
|-
! Timeframe
! Task
! Completion
|-
| colspan="3" align="center" | '''''Week 4'''''
|-
| rowspan="2" | ''07.10''
| Evaluating which APIs to use (IIIF)
| rowspan="2" align="center" | ✓
|-
| Write a scraper to scrape IIIF manifests from the Libretto website
|-
| colspan="3" align="center" | '''''Week 5'''''
|-
| rowspan="2" | ''14.10''
| Processing of images: apply Tesseract OCR
| rowspan="2" align="center" |
|-
| Extract dates and clean the dataset to create the initial DataFrame
|-
| colspan="3" align="center" | '''''Week 6'''''
|-
| rowspan="3" | ''21.10''
| Design and develop initial structure for the visualization (using dates data)
| rowspan="3" align="center" | ✓
|-
| Running a sanity check on the initial DataFrame by hand
|-
| Matching list of cities extracted from OCR using search techniques
|-
| colspan="3" align="center" | '''''Week 7'''''
|-
| rowspan="4" | ''28.10''
| Remove irrelevant backgrounds of images
| rowspan="4" align="center" | ✓
|-
| Extract age and gender from images
|-
| Design data model
|-
| Extract tags, names, birth and death years out of metadata
|-
| colspan="3" align="center" | '''''Week 8'''''
|-
| rowspan="3" | ''04.11''
| Get coordinates for each city and translation of city names
| rowspan="3" align="center" | ✓
|-
| Extract additional metadata (opera title, maestro) from the libretto title
|-
| Setting up the map and slider in the visualization and ordering by year
|-
| colspan="3" align="center" | '''''Week 9'''''
|-
| rowspan="3" | ''11.11''
| Adding metadata to the visualization through an information pane
| rowspan="3" align="center" | ✓
|-
| Checking in with the Cini Foundation
|-
| Preparing the Wiki outline and the midterm presentation
|-
| colspan="3" align="center" | '''''Week 10'''''
|-
| rowspan="4" | ''18.11''
| Compiling a list of musical theatres and visualizing them
| rowspan="4" align="center" | ✓
|-
| Getting better recall and precision on the city information
|-
| Identifying composers and getting a performer's information
|-
| Extracting corresponding information for the MediaWiki API for entities (theatres etc.)
|-
| colspan="3" align="center" | '''''Week 11'''''
|-
| rowspan="2" | ''25.11''
| Integrate visualization's zoom functionality with the data pipeline to see intra-level info
| rowspan="2" align="center" | ✓
|-
| Linking similar entities together (which directors performed the same play in different cities?)
|-
| colspan="3" align="center" | '''''Week 12'''''
|-
| rowspan="3" | ''02.12''
| Serving the website and computing performance metrics for our data analysis
| rowspan="3" align="center" | ✓
|-
| Communicate with and get feedback from the Cini Foundation
|-
| Continuously working on the report and the presentation
|-
| colspan="3" align="center" | '''''Week 13'''''
|-
| ''09.12''
| Finishing off the project website and remaining work; presenting our results
| align="center" | ✓
|}


[[File:Fdh.png|350px|right|thumb| Just to show how to add images ]]

Revision as of 13:00, 19 November 2020

Introduction

Wiki page of Group 10 on Rolandi Librettos


Project Planning

The draft of the project and the tasks for each week are assigned below:

Weekly working plan
Timeframe Task Completion
Week 4
07.10 Evaluating which APIs to use (IIIF)
Write a scraper to scrape IIIF manifests from the Libretto website
Week 5
14.10 Processing of images: apply Tesseract OCR
Extract dates and clean the dataset to create the initial DataFrame
Week 6
21.10 Design and develop initial structure for the visualization (using dates data)
Running a sanity check on the initial DataFrame by hand
Matching list of cities extracted from OCR using search techniques
Week 7
28.10 Remove irrelevant backgrounds of images
Extract age and gender from images
Design data model
Extract tags, names, birth and death years out of metadata
Week 8
04.11 Get coordinates for each city and translation of city names
Extract additional metadata (opera title, maestro) from the libretto title
Setting up the map and slider in the visualization and ordering by year
Week 9
11.11 Adding metadata to the visualization through an information pane
Checking in with the Cini Foundation
Preparing the Wiki outline and the midterm presentation
Week 10
18.11 Compiling a list of musical theatres and visualizing them
Getting better recall and precision on the city information
Identifying composers and getting a performer's information
Extracting corresponding information for the MediaWiki API for entities (theatres etc.)
Week 11
25.11 Integrate visualization's zoom functionality with the data pipeline to see intra-level info
Linking similar entities together (which directors performed the same play in different cities?)
Week 12
02.12 Serving the website and computing performance metrics for our data analysis
Communicate with and get feedback from the Cini Foundation
Continuously working on the report and the presentation
Week 13
09.12 Finishing off the project website and remaining work; presenting our results
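
The Week 4 task of scraping IIIF manifests could start from something like the sketch below. It is only an illustration, assuming the manifests follow the IIIF Presentation API 2.x layout (sequences, canvases, images, resource). The function name `image_urls_from_manifest`, the sample manifest, and the `example.org` URLs are hypothetical and are not taken from the project.

```python
from __future__ import annotations

from typing import Any


def image_urls_from_manifest(manifest: dict[str, Any]) -> list[str]:
    """Collect the image resource URL of every canvas in a manifest.

    Assumes the IIIF Presentation 2.x nesting:
    sequences -> canvases -> images -> resource -> "@id".
    Missing keys are skipped instead of raising an error.
    """
    urls: list[str] = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                url = image.get("resource", {}).get("@id")
                if url:
                    urls.append(url)
    return urls


# Hypothetical, hand-written manifest fragment used only for illustration.
sample_manifest = {
    "@id": "https://example.org/iiif/libretto-0001/manifest.json",
    "label": "Example libretto",
    "sequences": [
        {
            "canvases": [
                {"images": [{"resource": {
                    "@id": "https://example.org/iiif/libretto-0001/page-1/full/full/0/default.jpg"}}]},
                {"images": [{"resource": {
                    "@id": "https://example.org/iiif/libretto-0001/page-2/full/full/0/default.jpg"}}]},
            ]
        }
    ],
}

if __name__ == "__main__":
    # Print one image URL per page of the sample libretto.
    for url in image_urls_from_manifest(sample_manifest):
        print(url)
```

In a real scraper the manifest dictionary would come from parsing the JSON downloaded for each libretto, and the collected image URLs would then feed the OCR step planned for Week 5.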


Just to show how to add images

Historical Source

Methodology

Collecting data

Metadata extraction

Visualization

Quality assessment

Overall pipeline

Basic features extraction

Efficiency of algorithms

Results

Website

Links