France: Exploring Historical Cookbooks: Difference between revisions
Xiaotian.su (talk | contribs) (→Links) |
Xiaotian.su (talk | contribs) |
Revision as of 08:36, 21 December 2022
Introduction
Research questions
1. What were the main ingredients used in 1900 in France?
2. Can we observe a difference per region?
Project Plan and Milestones
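Date | Tasks | Completion |
---|---|---|
Week 3 | Find multiple French cookbooks, in French or English, from different periods; prepare slides for the initial project idea presentation. | ✓ |
Week 4 | Compare different cookbooks, consider the OCR scan quality, and think of possible research questions; discuss the goal and implementation of the project with the TAs. | ✓ |
Week 5 | Decide on one French cookbook; scan the physical book. | ✓ |
Week 6-7 | Run OCR on the scanned pages; start to construct the dataset. | ✓ |
Week 8-9 | Prepare for the midterm presentation; construct the dataset and design a data structure to store the information. | ✓ |
Week 10 | Set up the GitHub repository; finish the creation of the dataset. | ✓ |
Week 11 | Fix bugs in the extraction script and take exceptional cases into consideration; create categories for ingredients. | ✓ |
Week 12 | Perform the data processing of the ingredients; exploratory analysis on the dataset. | ✓ |
Week 13 | Further improve the dataset; overall analysis and per-region analysis. | ✓ |
Week 14 | Prepare the final presentation; finish the Wikipedia page. | ✓ |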
Methodology
Data collection
To start, we scanned a physical French cookbook.
Then we ran a basic OCR pass on the scanned files. Here is a sample output from the OCR.
Data digitization
Template output of the digitization
Data processing
In our project, we will extract the following information from the recipes:
- quantity
- unit
- ingredient
Units: 'litre', 'litres', 'l', 'cl', 'dl', 'kg', 'g', 'pincée', 'cuil.', 'cuil. café', 'cuil. soupe', 'cuil. à soupe', 'petite cuil.', 'grande cuil.', 'verre', 'verres', 'petit verre', 'verre à liqueur', 'verres à liqueur', 'tasse', 'tasses', 'bout.', 'bouteille', 'bouteilles', 'grande boîte', 'gousse', 'gousses', 'branche', 'branches', 'membre', 'membres', 'tronçon', 'tronçons', 'tranche', 'tranches', 'tube', 'tubes'
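The extraction of quantity, unit, and ingredient can be sketched with a regular expression built from the unit list. This is only a minimal illustration, not the project's actual extraction script: the function name `parse_ingredient_line`, the assumed line shape ("quantity unit de ingredient"), and the sample lines are all hypothetical.

```python
import re

# Unit vocabulary copied from the list above.
UNITS = [
    'litre', 'litres', 'l', 'cl', 'dl', 'kg', 'g', 'pincée', 'cuil.',
    'cuil. café', 'cuil. soupe', 'cuil. à soupe', 'petite cuil.',
    'grande cuil.', 'verre', 'verres', 'petit verre', 'verre à liqueur',
    'verres à liqueur', 'tasse', 'tasses', 'bout.', 'bouteille',
    'bouteilles', 'grande boîte', 'gousse', 'gousses', 'branche',
    'branches', 'membre', 'membres', 'tronçon', 'tronçons', 'tranche',
    'tranches', 'tube', 'tubes',
]

# Try longer units first so 'cuil. à soupe' is preferred over 'cuil.'.
_UNIT_ALT = "|".join(re.escape(u) for u in sorted(UNITS, key=len, reverse=True))

# Assumed line shape: [quantity] [unit] [de / d'] ingredient
LINE_RE = re.compile(
    rf"^\s*(?P<quantity>\d+(?:[.,]\d+)?)?\s*"
    rf"(?:(?P<unit>{_UNIT_ALT})(?=\s|$)\s*)?"
    rf"(?:de\s+|d['’])?(?P<ingredient>.+?)\s*$",
    re.IGNORECASE,
)


def parse_ingredient_line(line):
    """Split one ingredient line into quantity, unit and ingredient.

    Returns None for empty lines. Quantity and unit are None when absent.
    """
    match = LINE_RE.match(line)
    if match is None:
        return None
    quantity = match.group("quantity")
    unit = match.group("unit")
    return {
        # French decimals use a comma, so normalise before converting.
        "quantity": float(quantity.replace(",", ".")) if quantity else None,
        "unit": unit.lower() if unit else None,
        "ingredient": match.group("ingredient"),
    }


if __name__ == "__main__":
    for sample in ["250 g de farine", "2 gousses d'ail", "3 oeufs", "sel"]:
        print(sample, "->", parse_ingredient_line(sample))
```

Lines that do not follow the assumed shape (fractions such as "1/2", ranges, or quantities written in words) would need additional handling.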
Data analysis
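Both research questions (main ingredients overall, and differences per region) come down to frequency counts over the extracted ingredients. The sketch below shows one possible way to compute them. The record layout (`region` and `ingredients` keys), the function name `top_ingredients`, and the toy records are assumptions for illustration only, not the project's data or results.

```python
from collections import Counter, defaultdict


def top_ingredients(records, n=3):
    """Return the n most frequent ingredients overall and per region.

    Each ingredient is counted once per recipe, so a recipe that lists
    the same ingredient twice does not inflate its frequency.
    """
    overall = Counter()
    by_region = defaultdict(Counter)
    for record in records:
        # sorted() keeps the iteration order deterministic.
        for ingredient in sorted(set(record["ingredients"])):
            overall[ingredient] += 1
            by_region[record["region"]][ingredient] += 1
    per_region = {region: counts.most_common(n)
                  for region, counts in by_region.items()}
    return overall.most_common(n), per_region


if __name__ == "__main__":
    # Toy records, invented for the example.
    toy_records = [
        {"region": "Bretagne", "ingredients": ["beurre", "farine", "beurre"]},
        {"region": "Bretagne", "ingredients": ["beurre", "sel"]},
        {"region": "Provence", "ingredients": ["ail", "huile d'olive"]},
    ]
    overall, per_region = top_ingredients(toy_records, n=2)
    print(overall)
    print(per_region)
```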
Data visualization
Links
GitHub repository: Historical Cookbook
Future work
- Build a search engine that displays the recipes, with filters to search them by name, region, or ingredients
- Build a user-friendly interface to visualize the results of the analysis
- Compare with other cookbooks from different periods or different countries