France: Exploring Historical Cookbooks
Revision as of 22:48, 21 December 2022
Cuisine holds an important place in the cultural heritage of France. In the 21st century, the great classics of French cuisine can be found in starred restaurants in most French cities and around the world. Above all, however, French cuisine owes its current prestige to the regional cuisines developed over several hundred years, each taking advantage of the geographical and cultural specificities of its region.
This is at least the point of view of Mr. Curnonsky, who travelled the regions of France throughout his life at the beginning of the 20th century in search of the traditional regional recipes that are the pillars of the French cuisine we know today. His book Recettes des Provinces de France, written in 1962 [Figure 1], gathers many traditional recipes that he collected himself all around France.
At a time when so much knowledge is shared on the web, it has become easy to find information on the history of French cuisine and many contemporary recipes. However, a significant amount of culinary knowledge and practice is still stored in books that are much more difficult to access. Digitalizing this knowledge would make it possible both to share it with as many people as possible and to apply the latest computational techniques for more in-depth analyses.
This project is therefore an exploration of a historical French cookbook. Our main focus is the digitalization of a historical cookbook and its challenges, from the physical book to a clean, structured dataset. In addition, we analyze the collected data to better understand the French cuisine of the previous century. We use the cookbook by Mr. Curnonsky mentioned above as an example to answer our research questions.
Research questions
1. What were the main ingredients used in 1900 in France?
2. Can we observe a difference per region?
Project Plan and Milestones
Date | Tasks | Completion |
Week 3 | Milestone 1: Project proposals | ✓ |
Week 4 | | ✓ |
Week 5 | | ✓ |
Week 6-7 | | ✓ |
Week 8-9 | Milestone 2: Midterm presentation | ✓ |
Week 10 | | ✓ |
Week 11 | | ✓ |
Week 12 | | ✓ |
Week 13 | | ✓ |
Week 14 | Milestone 3: Final presentation | ✓ |
Data collection and digitalization
To begin, we scanned a physical French cookbook. This one is from the 19th century and lists its ingredients in the margin of each page.
We then ran basic OCR on the scanned files. In the sample OCR output, there is a noticeable mismatch between recipes and their ingredients.
Data processing
From each recipe, we extract and construct the following information:
- quantity: the amount of the ingredient
- unit: the unit of measurement for the quantity
- ingredient: the ingredient named in the recipe
- category: the major category the ingredient belongs to
Type | Unit |
Spoons | cuil. à café, cuil. café, cuil. à soupe, cuil. soupe, petite cuil., grande cuil., cuil. |
Glasses | petit verre, verre à liqueur, verres à liqueur, verres, verre, tasses, tasse |
Bottles | bout., bouteilles, bouteille |
Containers | grande boîte, boîtes, boîte, tubes, tube |
Spices & Aromatic plants | gousses, gousse, branches, branche, bâtons, bâton, pincée |
Meat related | membres, membre, tronçons, tronçon, tranches, tranche |
Standard measures | litres, litre, cl, dl, kg, g, l |
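As an illustration of how the four fields could be pulled out of an OCR'd ingredient line, here is a minimal Python sketch. It is not the project's actual pipeline: the function name `parse_ingredient_line`, the regular expression, and the assumed line shape "quantity unit de ingredient" are all hypothetical. Only the unit spellings come from the table above.

```python
import re

# Unit spellings copied from the table above.
UNITS = [
    "cuil. à café", "cuil. café", "cuil. à soupe", "cuil. soupe",
    "petite cuil.", "grande cuil.", "cuil.",
    "petit verre", "verre à liqueur", "verres à liqueur", "verres", "verre",
    "tasses", "tasse", "bout.", "bouteilles", "bouteille",
    "grande boîte", "boîtes", "boîte", "tubes", "tube",
    "gousses", "gousse", "branches", "branche", "bâtons", "bâton", "pincée",
    "membres", "membre", "tronçons", "tronçon", "tranches", "tranche",
    "litres", "litre", "cl", "dl", "kg", "g", "l",
]

# Try longer spellings first so "cuil. à soupe" wins over the bare "cuil.".
UNIT_PATTERN = "|".join(
    re.escape(unit) for unit in sorted(UNITS, key=len, reverse=True)
)

# Assumed line shape: [quantity] [unit] [de / d'] ingredient
LINE_RE = re.compile(
    r"^\s*(?P<quantity>\d+(?:[.,]\d+)?(?:/\d+)?)?\s*"
    # The lookahead stops "g" or "l" from matching the start of a word
    # such as "gros" or "lapin".
    r"(?:(?P<unit>" + UNIT_PATTERN + r")(?=\s|$)\s*)?"
    r"(?:d['’]|de\s+)?"
    r"(?P<ingredient>.+?)\s*$",
    re.IGNORECASE,
)


def _to_number(text):
    """Convert '250', '1,5' or '1/2' to a float."""
    if "/" in text:
        numerator, denominator = text.split("/")
        return float(numerator.replace(",", ".")) / float(denominator)
    return float(text.replace(",", "."))


def parse_ingredient_line(line):
    """Split one ingredient line into quantity, unit and ingredient."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    raw_quantity = match.group("quantity")
    unit = match.group("unit")
    return {
        "quantity": _to_number(raw_quantity) if raw_quantity else None,
        "unit": unit.lower() if unit else None,
        "ingredient": match.group("ingredient").lower(),
    }


print(parse_ingredient_line("2 cuil. à soupe de farine"))
```

Lines without a quantity or unit (for example a bare "sel") still parse, with the missing fields set to `None`. Real OCR output would likely need extra cleaning before a pattern like this applies.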
We map the ingredients to the following major categories:
Category | Ingredient |
Viande (Meat) | viande, oie, canard, oiseau, lard, bœuf, veau, poule, poulet, poularde, volaille, porc, caille, caneton, mouton, cochon, coq, chevreuil, lièvre, levraut, lapin, faisan, gibier, jambon, chorizo, cervelas, agneau, escargot, grenouille |
Poisson (Fish) | poisson, brochet, carpe, morue, lamproie, lotte, maquereau, omble, rouget, sardine, thon, truite, anchois, anguille, merlan, sole, barbue, turbot, raie, perche, saumon, colin, goujon, loup, congre, rascasse, grondin, merlu, merluza, hareng, alose, brême |
Fruit de mer (Seafood) | crevette, langouste, moule, écrevisse, palourde, homard, chiperon, seiche, huître, coquille, poulpe |
Alcool (Alcohol) | alcool, bière, vin, cidre, fine, liqueur |
Plante aromatique (Aromatic plant) | bouquet garni, ail, anis, aromate, angélique, basilic, persil, sarriette, cerfeuil, ciboule, ciboulette, clou de girofle, clous de girofle, girofle, cive, câpre, estragon, feuille de vigne, fines herbes, laurier, menthe, pissenlit, romarin, thym |
Epice (Spice) | cannelle, coriandre, curry, safran, poivre, sel, moutarde, muscade, paprika, piment, sauge, serpolet, épices |
Produit laitier (Dairy product) | lait, crème, fromage, gruyère, parmesan |
Légume (Vegetable) | artichaut, asperge, aubergine, bette, betterave, cardon, chou, cornichon, courgette, cresson, céleri, fenouil, légume, navet, panais, poireau, pomme de terre, pommes de terre, potiron, rave, salade, tomate, échalote, épinard |
Fruit (Fruit) | abricot, banane, cerise, coing, fraise, framboise, groseille, raisin, olive, orange, pomme |
Agrume (Citrus) | citron, cédrat, fleur d'oranger, fleurs d'oranger |
Céréale (Cereal) | farine, pain, pâte, riz |
Légumineuse (Legume) | févette, haricot, pois |
Fruit sec (Nut) | amande, noix, noisette |
Champignon (Mushroom) | champignon, truffe, cèpe, girolle, morille, levure, oronge, duelle |
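The category lookup described by this table can be sketched as a keyword match. The code below is an assumption about how such a mapping might work, not the project's implementation: the function name `categorize`, the longest-keyword rule, and the plural handling are hypothetical, and only a short excerpt of the table is included.

```python
import re

# Excerpt of the category table above; the full lists would be used in practice.
CATEGORIES = {
    "Viande": ["lard", "bœuf", "veau", "poulet", "porc", "lapin", "jambon"],
    "Poisson": ["brochet", "carpe", "morue", "truite", "sole", "saumon"],
    "Alcool": ["bière", "vin", "cidre", "liqueur"],
    "Plante aromatique": ["bouquet garni", "ail", "persil", "laurier", "thym"],
    "Epice": ["cannelle", "safran", "poivre", "sel", "muscade"],
    "Produit laitier": ["lait", "crème", "fromage", "gruyère"],
    "Légume": ["chou", "navet", "poireau", "pomme de terre",
               "pommes de terre", "tomate"],
    "Fruit": ["cerise", "raisin", "orange", "pomme"],
    "Céréale": ["farine", "pain", "riz"],
}

# Invert the table: keyword -> category.
KEYWORD_TO_CATEGORY = {
    keyword: category
    for category, keywords in CATEGORIES.items()
    for keyword in keywords
}


def categorize(ingredient):
    """Return the category of the longest keyword found in the ingredient."""
    text = ingredient.lower()
    best = None
    for keyword, category in KEYWORD_TO_CATEGORY.items():
        # Match whole words only, allowing a trailing plural "s" or "x",
        # so that "ail" is not found inside "lait".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?:s|x)?(?!\w)"
        if re.search(pattern, text):
            if best is None or len(keyword) > len(best[0]):
                best = (keyword, category)
    return best[1] if best else None


print(categorize("pommes de terre"))  # matched by the longer keyword, not "pomme"
```

Preferring the longest keyword is one way to keep "pommes de terre" in "Légume" rather than "Fruit". Ingredients with no matching keyword return `None` and would need manual review.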
We map subregions to 6 major regions in 19th-century France.
Region | Subregion |
Paris, Ile-de-France, Val de Loire | Paris, Ile-de-France, Orléans, Touraine |
Pays de l’Ouest | Anjou, Bretagne, Poitou Vendée, Charentes |
Sud-Ouest & Pyrénées | Bordelais, Gascogne, Pays Basque, Roussillon, Périgord, Languedoc |
Sud-Est & Méditerranée | Provence, Nice, Corse, Dauphiné, Savoie, Lyon, Auvergne, Limousin |
Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine | Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine |
Nord & Normandie | Nord, Normandie |
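The region table translates directly into a lookup structure. The following Python sketch is only one possible encoding: the names `REGIONS`, `SUBREGION_TO_REGION` and `region_of` are hypothetical, while the region and subregion labels are taken from the table.

```python
# Major region -> subregions, as listed in the table above.
REGIONS = {
    "Paris, Ile-de-France, Val de Loire": [
        "Paris", "Ile-de-France", "Orléans", "Touraine",
    ],
    "Pays de l’Ouest": ["Anjou", "Bretagne", "Poitou Vendée", "Charentes"],
    "Sud-Ouest & Pyrénées": [
        "Bordelais", "Gascogne", "Pays Basque", "Roussillon",
        "Périgord", "Languedoc",
    ],
    "Sud-Est & Méditerranée": [
        "Provence", "Nice", "Corse", "Dauphiné", "Savoie", "Lyon",
        "Auvergne", "Limousin",
    ],
    "Bourgogne, Champagne, Bresse, Franche-Comté, Alsace, Lorraine": [
        "Bourgogne", "Champagne", "Bresse", "Franche-Comté",
        "Alsace", "Lorraine",
    ],
    "Nord & Normandie": ["Nord", "Normandie"],
}

# Inverted index: normalized subregion name -> major region.
SUBREGION_TO_REGION = {
    subregion.casefold(): region
    for region, subregions in REGIONS.items()
    for subregion in subregions
}


def region_of(subregion):
    """Return the major region of a subregion, or None if it is unknown."""
    return SUBREGION_TO_REGION.get(subregion.strip().casefold())


print(len(REGIONS), "regions,", len(SUBREGION_TO_REGION), "subregions")
print(region_of("Savoie"))
```

Normalizing with `casefold()` and `strip()` is a defensive choice for OCR'd headings, which may differ in case or carry stray whitespace.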
Data analysis and visualization
Dataset Overview
The dataset contains 352 different recipes from 30 subregions, grouped into 6 major regions.
Top 10 most used ingredients.
Rank | Ingredient | Number of occurrences |
1) | Beurre | 180 |
2) | Sel | 167 |
3) | Poivre | 146 |
4) | Œufs | 101 |
5) | Oignons | 95 |
6) | Farine | 89 |
7) | Persil | 82 |
8) | Ail | 76 |
9) | Vin blanc | 69 |
10) | Bouquet garni | 46 |
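A ranking like this one can be produced with a simple frequency count. The sketch below assumes each recipe is a dictionary with an `"ingredients"` list and that an ingredient is counted once per recipe. Both are assumptions, since the source does not say how occurrences were counted, and the toy recipes are invented for illustration.

```python
from collections import Counter


def top_ingredients(recipes, n=10):
    """Rank ingredients by the number of recipes they appear in."""
    counts = Counter()
    for recipe in recipes:
        # set() counts an ingredient once per recipe, even if listed twice.
        counts.update(set(recipe["ingredients"]))
    return counts.most_common(n)


# Toy data, not taken from the cookbook.
toy_recipes = [
    {"name": "A", "ingredients": ["beurre", "sel", "farine"]},
    {"name": "B", "ingredients": ["beurre", "sel", "beurre"]},
    {"name": "C", "ingredients": ["beurre", "ail"]},
]

for rank, (ingredient, count) in enumerate(top_ingredients(toy_recipes, 2), 1):
    print(rank, ingredient, count)
```

Counting every mention instead of once per recipe would only require dropping the `set()` call.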
Region Analysis
The graph shows that "Plante aromatique" and "Epice" are used frequently in all six major regions, while "Fruit sec" is the least frequently used category.
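One way to compare category usage across regions is to compute, for each region, the share of recipes that use each category. The sketch below is an assumption about the metric, which the source does not specify. The function name `category_share_by_region`, the record layout and the toy data are hypothetical.

```python
from collections import defaultdict


def category_share_by_region(recipes):
    """Return {region: {category: share of the region's recipes using it}}."""
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for recipe in recipes:
        region = recipe["region"]
        totals[region] += 1
        # Count each category once per recipe.
        for category in set(recipe["categories"]):
            hits[region][category] += 1
    return {
        region: {
            category: count / totals[region]
            for category, count in hits[region].items()
        }
        for region in totals
    }


# Toy data, not taken from the cookbook.
toy_recipes = [
    {"region": "Nord & Normandie", "categories": ["Epice", "Viande"]},
    {"region": "Nord & Normandie", "categories": ["Epice", "Epice"]},
    {"region": "Pays de l’Ouest", "categories": ["Poisson"]},
]

shares = category_share_by_region(toy_recipes)
print(shares["Nord & Normandie"])
```

Using shares rather than raw counts keeps regions with many recipes from dominating the comparison.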
Subregion Analysis
Co-occurrence Analysis
"Plante aromatique" and "Epice" co-occur most often. Each of them also frequently co-occurs with "Viande", "Légume", and "Céréale".
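Co-occurrence between categories can be counted by enumerating the unordered pairs of categories in each recipe. This is a minimal sketch under the assumption that each recipe is reduced to its list of categories. The function name `category_cooccurrence` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def category_cooccurrence(recipes):
    """Count how many recipes contain each unordered pair of categories."""
    pairs = Counter()
    for categories in recipes:
        # Sorting makes ("Epice", "Viande") and ("Viande", "Epice") the same key.
        for first, second in combinations(sorted(set(categories)), 2):
            pairs[(first, second)] += 1
    return pairs


# Toy data, not taken from the cookbook.
toy_recipes = [
    ["Plante aromatique", "Epice", "Viande"],
    ["Epice", "Plante aromatique"],
    ["Viande", "Légume"],
]

pairs = category_cooccurrence(toy_recipes)
print(pairs.most_common(1))
```

The resulting counts can be arranged into a symmetric matrix for a heatmap-style visualization.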
Discussion and limitations
Like most research, this project has its limitations. For example, the data analysis relies on a rough count of category occurrences and does not take ingredient quantities into account.
Future work
- Build a search engine that would display the recipes and add filters to search them by name, region or ingredients
- Build a user-friendly interface to visualize the results of the analysis
- Compare the results with other cookbooks from different periods or different countries