Humans of Paris 1900

Revision as of 14:41, 20 November 2019

Project plan

The final goal is to give the user a way to navigate the collection and understand who the people in the photos are and what lives they lived. We are developing a website that provides the user with 4 core functionalities.

  1. An overview of the most famous individuals in the collection
  2. A search page to explore individuals based on tag matching
  3. A cluster view of individuals in the database, based on their looks
  4. A way to upload one's own image and find one's doppelgänger in the collection
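
The doppelgänger lookup in item 4 can be read as a nearest-neighbour search over face encodings. The following is a minimal sketch under that assumption, not the project's implementation: the three-dimensional vectors, the names `person_a` to `person_c` and the function `find_doppelganger` are hypothetical, and real face encodings would come from the face recognition library and have many more dimensions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_doppelganger(query, collection):
    """Return (name, distance) of the stored encoding closest to `query`.

    `collection` maps a person's name to a face encoding (hypothetical layout).
    """
    if not collection:
        raise ValueError("collection is empty")
    name = min(collection, key=lambda n: euclidean(query, collection[n]))
    return name, euclidean(query, collection[name])


# Toy encodings, for illustration only.
collection = {
    "person_a": [0.1, 0.9, 0.3],
    "person_b": [0.8, 0.2, 0.5],
    "person_c": [0.4, 0.4, 0.4],
}

match, distance = find_doppelganger([0.75, 0.25, 0.5], collection)
print(match, round(distance, 3))
```

A linear scan like this is enough for a small collection; the same distance could also feed the cluster view in item 3.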

To make this vision a reality, we went through 4 stages: fetching the data, processing it, storing and serving it, and front-end development.

Fetching is done using the wrapper provided by Raphaël Barman. Preprocessing covers both the metadata, from which we extract tags, and the images, which we crop and from which we obtain a vector encoding of each individual's face as well as an estimate of [https://pypi.org/project/py-agender/ age and gender]. Storing and serving the data involves [https://www.sqlite.org/index.html SQLite] and the [https://docs.djangoproject.com/en/2.2/topics/db/ Django ORM]. Front-end development will use Bootstrap, with D3.js for displaying the FaceMap.
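
To illustrate how stored tags could support the tag-matching search, here is a minimal sketch. It uses Python's built-in `sqlite3` module directly instead of the Django ORM so that it runs on its own; the two-table schema, the sample rows and the function `search_by_tags` are assumptions made for illustration, not the project's data model.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE tag (
            person_id INTEGER NOT NULL REFERENCES person(id),
            label TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO person (id, name) VALUES (?, ?)",
        [(1, "Person A"), (2, "Person B")],
    )
    conn.executemany(
        "INSERT INTO tag (person_id, label) VALUES (?, ?)",
        [(1, "actor"), (1, "singer"), (2, "painter")],
    )
    conn.commit()
    return conn


def search_by_tags(conn, labels):
    """Return names of persons that carry every label in `labels`."""
    labels = list(dict.fromkeys(labels))  # drop duplicates, keep order
    if not labels:
        return []
    placeholders = ",".join("?" for _ in labels)
    rows = conn.execute(
        f"""
        SELECT p.name
        FROM person p JOIN tag t ON t.person_id = p.id
        WHERE t.label IN ({placeholders})
        GROUP BY p.id
        HAVING COUNT(DISTINCT t.label) = ?
        ORDER BY p.name
        """,
        (*labels, len(labels)),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(search_by_tags(conn, ["actor", "singer"]))
```

With the Django ORM, the same tables would be declared as model classes and the query expressed through the model manager, but the matching logic would stay the same.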

We organized the tasks according to the following schedule and milestones.


Milestones

Weekly working plan
Timeframe       | Task                                                         | Completion
Week 4 (07.10)  | Understanding Gallica                                        |
                | Query Gallica API                                            |
Week 5 (14.10)  | Start preprocessing images                                   |
                | Choose suitable Wikipedia API                                |
Week 6 (21.10)  | Choose face recognition library                              |
                | Get facial vectors                                           |
                | Try database design with Docker & Flask                      |
Week 7 (28.10)  | Remove irrelevant backgrounds of images                      |
                | Extract age and gender from images                           |
                | Design data model                                            |
                | Extract tags, names, birth and death years from metadata     |
Week 8 (04.11)  | Set up database environment                                  |
                | Set up mockup user interface                                 |
                | Prepare midterm presentation                                 |
Week 9 (11.11)  | Get tags, names, birth and death years in ready-to-use format |
                | Handle Wikipedia false positives                             |
                | Integrate face recognition functionalities into database     |
Week 10 (18.11) | Create draft of the website (frontend)                       |
                | Create FaceMap using D3                                      |
Week 11 (25.11) | Integrate all functionalities                                |
                | Finalize project website                                     |
Week 12 (02.12) | Write project report                                         |
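
The Week 7 task of extracting names, birth and death years from the metadata can be sketched with a regular expression. This sketch assumes a metadata string of the form `Name (birth-death)`; that format, the sample strings and the function `parse_person` are hypothetical, and the actual Gallica metadata may need different handling.

```python
import re

# Assumed format: "<name> (<birth year>-<death year>)", death year optional.
PATTERN = re.compile(
    r"^(?P<name>.*?)\s*\((?P<birth>\d{4})-(?P<death>\d{4})?\)\s*$"
)


def parse_person(raw):
    """Split a metadata string into name, birth year and death year.

    Years are returned as int, or None when they are missing.
    """
    match = PATTERN.match(raw)
    if match is None:
        # No recognisable lifespan: keep the text as the name.
        return {"name": raw.strip(), "birth": None, "death": None}
    death = match.group("death")
    return {
        "name": match.group("name").strip(),
        "birth": int(match.group("birth")),
        "death": int(death) if death else None,
    }


print(parse_person("Doe, Jane (1850-1912)"))
print(parse_person("Doe, John (1860-)"))
print(parse_person("Unknown sitter"))
```

Returning `None` for missing years keeps incomplete records usable, so they can still be stored and tagged.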