France: Exploring Historical Cookbooks
Revision as of 21:52, 21 December 2022
Introduction
This project explores historical French cookbooks containing recipes from the 19th century. By analyzing these cookbooks, we identify the most frequently used ingredients and food categories by region. We also examine the co-occurrence of ingredients and food categories.
Research questions
1. What were the main ingredients used in 1900 in France?
2. Can we observe a difference per region?
Project Plan and Milestones
Date | Tasks | Completion |
---|---|---|
Week 3 | Milestone 1: Project proposals | ✓ |
Week 4 | | ✓ |
Week 5 | | ✓ |
Week 6-7 | | ✓ |
Week 8-9 | Milestone 2: Midterm presentation | ✓ |
Week 10 | | ✓ |
Week 11 | | ✓ |
Week 12 | | ✓ |
Week 13 | | ✓ |
Week 14 | Milestone 3: Final presentation | ✓ |
Methodology
Data collection and digitalization
To start, we scanned a physical French cookbook from the 19th century in which the ingredients are listed in the margin of each page.
We then ran basic OCR on the scanned files. The sample OCR output shows a noticeable mismatch between recipes and their ingredients.
Data processing
In our project, we extract and construct the following information from the recipes:
- quantity: the amount of the ingredient
- unit: the unit of measurement for the ingredient
- ingredient: the entity that appears in the recipe
- category: the major category the ingredient belongs to
Units
Type | Unit |
---|---|
Spoons | cuil. à café, cuil. café, cuil. à soupe, cuil. soupe, petite cuil., grande cuil., cuil. |
Glasses | petit verre, verre à liqueur, verres à liqueur, verres, verre, tasses, tasse |
Bottles | bout., bouteilles, bouteille |
Containers | grande boîte, boîtes, boîte, tubes, tube |
Spices & Aromatic plants | gousses, gousse, branches, branche, bâtons, bâton, pincée |
Meat related | membres, membre, tronçons, tronçon, tranches, tranche |
Standard measures | litres, litre, cl, dl, kg, g, l |
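As an illustration of how the quantity, unit, and ingredient fields could be pulled out of a margin line, here is a minimal sketch. It assumes each line has the shape "quantity unit de ingredient", which the page does not state. The function name `parse_ingredient_line` and the regular expression are hypothetical and are not taken from the project's code. Only the unit spellings come from the table above.

```python
import re

# Unit spellings copied from the table above.
UNITS = [
    "cuil. à café", "cuil. café", "cuil. à soupe", "cuil. soupe",
    "petite cuil.", "grande cuil.", "cuil.",
    "petit verre", "verre à liqueur", "verres à liqueur", "verres", "verre",
    "tasses", "tasse", "bout.", "bouteilles", "bouteille",
    "grande boîte", "boîtes", "boîte", "tubes", "tube",
    "gousses", "gousse", "branches", "branche", "bâtons", "bâton", "pincée",
    "membres", "membre", "tronçons", "tronçon", "tranches", "tranche",
    "litres", "litre", "cl", "dl", "kg", "g", "l",
]

# Try longer spellings first so "cuil. à soupe" wins over the bare "cuil.".
_UNIT_PATTERN = "|".join(
    re.escape(unit) for unit in sorted(UNITS, key=len, reverse=True)
)

# Assumed line shape: [quantity] [unit] [de / d'] ingredient
LINE_RE = re.compile(
    rf"^\s*(?P<quantity>\d+(?:[.,]\d+)?)?\s*"
    rf"(?:(?P<unit>{_UNIT_PATTERN})(?=\s|$)\s*)?"
    rf"(?:de\s+|d['’]\s*)?"
    rf"(?P<ingredient>\S.*?)\s*$",
    re.IGNORECASE,
)


def parse_ingredient_line(line):
    """Split one ingredient line into quantity, unit, and ingredient."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    quantity = match.group("quantity")
    return {
        # Accept both "0.5" and the French decimal comma "0,5".
        "quantity": float(quantity.replace(",", ".")) if quantity else None,
        "unit": match.group("unit"),
        "ingredient": match.group("ingredient"),
    }


print(parse_ingredient_line("250 g de beurre"))
print(parse_ingredient_line("2 gousses d'ail"))
```

Lines without a quantity or unit, such as "sel", still parse. Both fields are then `None`. Real OCR output would likely need extra cleaning before a pattern like this applies.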
Categories
We map the ingredients to several major categories.
Category | Ingredient |
---|---|
Viande (Meat) | viande, oie, canard, oiseau, lard, bœuf, veau, poule, poulet, poularde, volaille, porc, caille, caneton, mouton, cochon, coq, chevreuil, lièvre, levraut, lapin, faisan, gibier, jambon, chorizo, cervelas, agneau, escargot, grenouille |
Poisson (Fish) | poisson, brochet, carpe, morue, lamproie, lotte, maquereau, omble, rouget, sardine, thon, truite, anchois, anguille, merlan, sole, barbue, turbot, raie, perche, saumon, colin, goujon, loup, congre, rascasse, grondin, merlu, merluza, hareng, alose, brême |
Fruit de mer (Seafood) | crevette, langouste, moule, écrevisse, palourde, homard, chiperon, seiche, huître, coquille, poulpe |
Alcool (Alcohol) | alcool, bière, vin, cidre, fine, liqueur |
Plante aromatique (Aromatic plant) | bouquet garni, ail, anis, aromate, angélique, basilic, persil, sarriette, cerfeuil, ciboule, ciboulette, clou de girofle, clous de girofle, girofle, cive, câpre, estragon, feuille de vigne, fines herbes, laurier, menthe, pissenlit, romarin, thym |
Epice (Spice) | cannelle, coriandre, curry, safran, poivre, sel, moutarde, muscade, paprika, piment, sauge, serpolet, épices |
Produit laitier (Dairy product) | lait, crème, fromage, gruyère, parmesan |
Légume (Vegetable) | artichaut, asperge, aubergine, bette, betterave, cardon, chou, cornichon, courgette, cresson, céleri, fenouil, légume, navet, panais, poireau, pomme de terre, pommes de terre, potiron, rave, salade, tomate, échalote, épinard |
Fruit (Fruit) | abricot, banane, cerise, coing, fraise, framboise, groseille, raisin, olive, orange, pomme |
Agrume (Citrus) | citron, cédrat, fleur d'oranger, fleurs d'oranger |
Céréale (Cereal) | farine, pain, pâte, riz |
Légumineuse (Legume) | févette, haricot, pois |
Fruit sec (Nut) | amande, noix, noisette |
Champignon (Mushroom) | champignon, truffe, cèpe, girofle, morille, levure, oronge, duelle |
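The page does not describe how ingredients are matched against this table, so the following is only one possible sketch. It assumes a keyword lookup that tolerates a plural "s" and prefers the longest keyword. The names `CATEGORY_KEYWORDS` and `categorize` are hypothetical, and only a subset of the table is reproduced to keep the example short.

```python
import re

# A subset of the category table above; the full table would be entered the same way.
CATEGORY_KEYWORDS = {
    "Alcool": ["alcool", "bière", "vin", "cidre", "fine", "liqueur"],
    "Plante aromatique": ["bouquet garni", "ail", "persil", "fines herbes",
                          "laurier", "thym"],
    "Céréale": ["farine", "pain", "pâte", "riz"],
    "Fruit sec": ["amande", "noix", "noisette"],
}

# Flatten to (keyword, category) pairs, longest keyword first, so that
# "fines herbes" is tested before the shorter keyword "fine".
_KEYWORDS = sorted(
    ((keyword, category)
     for category, keywords in CATEGORY_KEYWORDS.items()
     for keyword in keywords),
    key=lambda pair: len(pair[0]),
    reverse=True,
)


def categorize(ingredient):
    """Return the first category whose keyword occurs as a whole word, else None."""
    text = ingredient.lower()
    for keyword, category in _KEYWORDS:
        # \b keeps "pain" from matching inside longer words; "s?" allows plurals.
        if re.search(rf"\b{re.escape(keyword)}s?\b", text):
            return category
    return None


print(categorize("vin blanc"))
print(categorize("fines herbes"))
```

Matching the longest keyword first is a design choice for this sketch. Without it, "fines herbes" would be caught by the Alcool keyword "fine". Ingredients that match no keyword return `None` and would need manual review.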
Region
We map subregions to 6 major regions in 19th-century France.
Region | Subregion |
---|---|
Paris, Ile-de-France, Val de Loire | Paris, Ile-de-France, Orléans, Touraine |
Pays de l’Ouest | Anjou, Bretagne, Poitou Vendée, Charentes |
Sud-Ouest & Pyrénées | Bordelais, Gascogne, Pays Basque, Roussillon, Périgord, Languedoc |
Sud-Est & Méditerranée | Provence, Nice, Corse, Dauphiné, Savoie, Lyon, Auvergne, Limousin |
Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine | Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine |
Nord & Normandie | Nord, Normandie |
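The region table can be turned into a lookup from subregion to region by inverting it. The sketch below shows one way to do that. The variable names `REGIONS` and `SUBREGION_TO_REGION` are hypothetical, and the data is copied from the table above.

```python
# Region table from above: major region -> list of subregions.
REGIONS = {
    "Paris, Ile-de-France, Val de Loire": [
        "Paris", "Ile-de-France", "Orléans", "Touraine"],
    "Pays de l’Ouest": [
        "Anjou", "Bretagne", "Poitou Vendée", "Charentes"],
    "Sud-Ouest & Pyrénées": [
        "Bordelais", "Gascogne", "Pays Basque", "Roussillon", "Périgord",
        "Languedoc"],
    "Sud-Est & Méditerranée": [
        "Provence", "Nice", "Corse", "Dauphiné", "Savoie", "Lyon", "Auvergne",
        "Limousin"],
    "Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine": [
        "Bourgogne", "Champagne", "Bresse", "Franche-Comté", "Alsace",
        "Lorraine"],
    "Nord & Normandie": ["Nord", "Normandie"],
}

# Invert the table so each subregion points to its major region.
SUBREGION_TO_REGION = {
    subregion: region
    for region, subregions in REGIONS.items()
    for subregion in subregions
}

print(len(REGIONS), "regions,", len(SUBREGION_TO_REGION), "subregions")
print(SUBREGION_TO_REGION["Lyon"])
```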
Data analysis and visualization
Dataset Overview
We have a total of 352 different recipes from 6 regions and 30 subregions.
Top 10 most used ingredients.
Rank | Ingredient | Number of occurrences | Picture |
---|---|---|---|
1) | Beurre | 180 | |
2) | Sel | 167 | |
3) | Poivre | 146 | |
4) | Œufs | 101 | |
5) | Oignons | 95 | |
6) | Farine | 89 | |
7) | Persil | 82 | |
8) | Ail | 76 | |
9) | Vin blanc | 69 | |
10) | Bouquet garni | 46 |
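A ranking like the one above can be produced with a simple frequency count. The sketch below assumes each ingredient is counted at most once per recipe, which the page does not state. The function name `top_ingredients` and the toy recipes are hypothetical and are not the project's data.

```python
from collections import Counter


def top_ingredients(recipes, n=10):
    """Return the n most frequent ingredients as (ingredient, count) pairs.

    `recipes` is an iterable of ingredient lists, one list per recipe.
    """
    counts = Counter()
    for ingredients in recipes:
        # Assumption: an ingredient repeated within one recipe counts once.
        counts.update(set(ingredients))
    return counts.most_common(n)


# Toy data for illustration only.
toy_recipes = [
    ["beurre", "sel"],
    ["beurre", "poivre", "beurre"],
    ["sel", "beurre"],
]
print(top_ingredients(toy_recipes, n=2))
```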
Region Analysis
The heatmap of categories by region shows that "Plante aromatique" and "Epice" are used frequently in all six major regions, while "Fruit sec" is the least frequently used category.
[Figure: Heatmap of categories by region.]
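The data behind a category-by-region heatmap is a count matrix with one row per region and one column per category. The sketch below shows how such a matrix could be built from raw occurrences. The function name `category_counts_by_region` and the toy rows are hypothetical, and the project's actual plotting code is not shown on this page.

```python
from collections import Counter


def category_counts_by_region(rows):
    """Build a region x category count matrix.

    `rows` is an iterable of (region, category) pairs, one pair per
    ingredient occurrence. Returns (regions, categories, matrix) where
    matrix[i][j] is the count for regions[i] and categories[j].
    """
    counts = Counter(rows)
    regions = sorted({region for region, _ in counts})
    categories = sorted({category for _, category in counts})
    # A Counter returns 0 for pairs that never occurred.
    matrix = [[counts[(region, category)] for category in categories]
              for region in regions]
    return regions, categories, matrix


# Toy data for illustration only.
toy_rows = [
    ("Nord & Normandie", "Epice"),
    ("Nord & Normandie", "Epice"),
    ("Nord & Normandie", "Fruit sec"),
    ("Pays de l’Ouest", "Epice"),
]
regions, categories, matrix = category_counts_by_region(toy_rows)
print(regions)
print(categories)
print(matrix)
```

The resulting matrix holds raw counts. It could then be passed to a plotting library of choice to draw the heatmap.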
Subregion Analysis
[Figure: Heatmap of categories by subregion.]
Co-occurrence Analysis
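Co-occurrence can be measured by counting how often two ingredients (or two categories) appear in the same recipe. The sketch below is one straightforward way to do this and is not necessarily the method used in the project. The function name `cooccurrence_counts` and the toy recipes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(recipes):
    """Count how many recipes contain each unordered pair of items.

    `recipes` is an iterable of item lists (ingredients or categories).
    Keys are alphabetically ordered tuples, e.g. ("beurre", "sel").
    """
    pairs = Counter()
    for items in recipes:
        # Deduplicate and sort so each pair is counted once per recipe
        # and always appears under the same key.
        for first, second in combinations(sorted(set(items)), 2):
            pairs[(first, second)] += 1
    return pairs


# Toy data for illustration only.
toy_recipes = [
    ["beurre", "sel", "poivre"],
    ["sel", "beurre"],
    ["ail", "sel"],
]
counts = cooccurrence_counts(toy_recipes)
print(counts.most_common(3))
```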
Discussion and limitations
Like most research, this project has limitations. For example, the data analysis is only a rough count of category occurrences and does not take quantities into account.
Future work
- Build a search engine that would display the recipes and add filters to search them by name, region or ingredients
- Build a user-friendly interface to visualize the results of the analysis
- Compare the results with cookbooks from other periods or countries