Chinese Cookbook
Introduction
Motivation and description of the deliverables
Project Plan and Milestones
Weekly Project Plan
Date | Data Collection | Data Processing | Data Analysis | Web Construction |
---|---|---|---|---|
Week 3 | Search historical Chinese cookbooks and compare them | | | |
Week 4 | Choose one historical Chinese cookbook | | | |
Week 5 | Get data from the website | First clean and sort the data | | |
Week 6 | | Construct the dataset of ingredients | | |
Week 7 | | Categorize data | | |
Week 8 | | | Analyse cooking method, effect, category, and ingredient frequency | Start web construction |
Week 9 | | | Analyse ingredient and ingredient category pairing | Continue web construction |
Week 10 | | | Analyse effect and ingredient pairing | Continue web construction |
Week 11 | | Modern Chinese to English translation | | Continue web construction |
Week 12 | | Modern Chinese to English translation | | Continue web construction |
Week 13 | | | | Finalize and improve the website |
Week 14 | | | | Prepare the Wikipedia page & final presentation |
Milestone 1
- Prepare a project proposal and the goal and objective of the project
- Get Chinese cooking book data from the website
Milestone 2
- Clean the data and construct the datasets for the Chinese cooking book
- Translate from Ancient Chinese to Modern Chinese
- Categorize the data depending on ingredient, effect, category, and cooking method
Milestone 3
- Data Analysis
- Web construction and recipe filtering and recommendation system
- Prepare final presentation and Wikipedia page
Methods
Data Collection
"Yinshanzhengyao" was published in 1330 during the Yuan dynasty, and all existing editions derive from the Ming dynasty edition of 1456. Although a scanned version of the book is available online, Optical Character Recognition (OCR) is difficult because the text is in ancient Chinese and the pages include illustrations. Fortunately, the Chinese Text Project (中國哲學書電子書計劃) provides open access to ancient Chinese books for both Chinese and non-Chinese scholars. Its database currently contains over thirty thousand books, making it the largest database of historical Chinese literature.
"Yinshanzhengyao" is among the books included in this database. Taking advantage of the database's well-defined structure, we scrape the data from the website. Because the project focuses on the recipes in the book, we extract only the recipe content, which covers the following chapters:
- Strange Delicacies of Combined Flavours 1, 2, 3
- Various Hot Beverages and Concentrates
- Foods that Cure Various Illnesses
In total, there are 210 recipes, each accompanied by information on its effects, ingredients with quantities, and step-by-step instructions. Because the book is a medical text, the "effect" of a recipe describes the benefits of the food and the precautions to be taken, which gives insight into the medicinal properties of the recipes.
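The extraction step can be illustrated with a minimal sketch that uses only the Python standard library. It assumes the page has already been downloaded as an HTML string and that the Chinese text sits in table cells with the class `ctext`. That class name, and the helper names `CellTextParser` and `extract_cells`, are assumptions for illustration and not necessarily what the project's scraper uses.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text of every <td> cell carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class  # assumed class of the cells holding the text
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and self.css_class in classes:
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) is appended to the same cell.
        if self.in_cell:
            self.cells[-1] += data


def extract_cells(html, css_class="ctext"):
    """Return the stripped, non-empty text of all matching cells."""
    parser = CellTextParser(css_class)
    parser.feed(html)
    parser.close()
    return [cell.strip() for cell in parser.cells if cell.strip()]
```

Calling `extract_cells(page_html)` on a chapter page would then yield one string per text cell, which the processing step can group into recipes.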
Data Processing
Construct dataset
After acquiring the data from the website, the first task is to clean it and structure it into the following format: Food_Name, Effect, Ingredients, Steps. Because the website does not cleanly separate these four pieces of information, we wrote custom functions to construct this dataframe. Below are examples of recipe dataframes.
Food_Name_en | Effect_en | Ingredients_en | Steps_en | Food_Name | Effect | Ingredients | Steps |
---|---|---|---|---|---|---|---|
Ghee Liquor | It cures asthenia and removes wind-wetness. | Ghee (one bowl). | Mix ingredient with a cup of liquor and drink warm. It is proven effective. | 醍醐酒 | 治虚弱,去风湿。 | 醍醐 一 盏 | 上件,以酒一杯和匀,温饮之,效验。 |
Chinese Mallow Gruel | It cures urine which is retained and does not pass. | Mallow leaves. (It does not matter if they are many or few. Wash, select and clean.) | Boil ingredient into a gruel. Add the five spices. Eat on an empty stomach. | 葵菜羹 | 治小便癃闭不通。 | 葵菜叶 不以多少,洗择净 | 右煮作羹,入五味,空腹食之。 |
Donkey’s Meat Soup | It cures wind mania and depression and pacifies heart qi. | Meat of a black donkey. (The quantity does not matter. Cut up.) | Cook ingredient until overcooked in fermented black beans. When done add the five spices. Eat on an empty stomach. | 驴肉汤 | 治风狂,忧愁不乐,安心气。 | 乌驴肉 不以多少,切 | 上件,于豆豉中,烂煮熟,入五味,空心食之。 |
Sprouting Chinese Foxglove Chicken | It treats pain of the back and loins, deficiency and injury of bone and marrow, inability to stand upright for long periods, heaviness of body and qi shortage, night sweating and lack of appetite, and occasional vomiting and dysentery. | Sprouting Chinese foxglove (half a jin), sweetmeats (five liang), and black chicken (one). | [Of the] ingredients first take the chicken, pluck, remove the giblets and clean. Cut up finely. Combine the Chinese foxglove and the sugar together evenly. Put the mixture inside the intestinal cavity of the chicken. Put it into a copper pot. Then put the copper pot into a cauldron and steam. When the dish has been cooked completely, remove the chicken and eat. Do not use salt or vinegar. Eat the meat. When it is gone also drink the broth. | 生地黄鸡 | 治腰背疼痛,骨髓虚损,不能久立,身重气乏,盗汗,少食,时复吐利。 | 生地黄 半斤 饴糖 五两 乌鸡 一 枚 | 右三味,先将鸡去毛、肠肚净,细切,地黄与糖相和匀,内鸡腹中,以铜器中放之,复置甑中蒸炊,饭熟成,取食之。不用盐醋,唯食肉尽却饮汁。 |
Pig Kidney Congee | It cures kidney xulao damage, debility and ache of waist and knee. | Pig kidney (one, remove fatty tissue and slice), non-glutinous rice (three ho), tsaoko cardamom (three), prepared mandarin orange peel (one qian; remove white), grain-of-paradise (two qian). | [Of] ingredients, first take the pig kidney, the prepared mandarin orange peel etc. and boil to make a juice. Strain and remove the dregs. Add a small amount of liquor. Then add the rice to make a congee. Eat on an empty stomach. | 猪肾粥 | 治肾虚劳损,腰膝无力,疼痛。 | 猪肾 一对,去脂膜,切 粳米 三合 草果 二钱 陈皮 一钱,去白 缩砂 二钱 | 上件,先将猪肾、陈皮等煮成汁,滤去滓,入酒少许,次下米成粥,空心食之。 |
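The splitting into Food_Name, Effect, Ingredients, and Steps could look roughly like the sketch below. It relies on two heuristics inferred from the example rows above, not on the project's actual functions: the steps open with "上件" or "右", and the effect is a single sentence ending in "。" that directly follows the recipe name. The function name `split_recipe` is hypothetical.

```python
# Assumption: the instructions of a recipe open with one of these markers.
STEP_MARKERS = ("上件", "右")


def split_recipe(lines):
    """Split the text lines of one recipe into the four dataset fields."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    record = {"Food_Name": lines[0], "Effect": "", "Ingredients": "", "Steps": ""}
    ingredients, steps = [], []
    for ln in lines[1:]:
        if steps or ln.startswith(STEP_MARKERS):
            # Once the steps have started, every remaining line belongs to them.
            steps.append(ln)
        elif not ingredients and not record["Effect"] and ln.endswith("。"):
            # Assumption: the effect is the first full sentence after the name.
            record["Effect"] = ln
        else:
            ingredients.append(ln)
    record["Ingredients"] = " ".join(ingredients)
    record["Steps"] = "".join(steps)
    return record
```

Recipes that break these heuristics, for instance an ingredient whose name begins with "右", would need manual correction.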
Categorize
To facilitate further data analysis and implement the search function for the website, the initial step involves categorizing our data. This categorization encompasses 15 recipe categories, 10 cooking methods, 13 ingredient categories, and 9 effect categories. The detailed categories are as follows:
- Recipe categories: 'Paste', 'Pan-fry', 'Dish', 'Thick soup', 'Congee', 'Meat', 'Soup', 'Noodles', 'Rice noodles', 'Pancake', 'Thick liquid', 'Oil', 'Tea', 'Wonton', 'Steamed bun'
Category_en | Food_Name_en | Category | Food_Name |
---|---|---|---|
Pancake | Cow’s Milk Buns | 饼 | 牛奶子烧饼 |
Dish | Boiled Sheep’s Breast | 菜品 | 熬羊胸子 |
Steamed bun | Eggplant Manta | 馒头 | 茄子馒头 |
Soup | Carp Soup | 汤 | 鲤鱼汤 |
Tea | Jade Mortar Tea | 茶 | 玉磨茶 |
- Cooking methods: 'Boil', 'Simmer', 'Steam', 'Pan-fry', 'Bake', 'Braise', 'Broil', 'Stir-fry', 'Roast', 'Deep-fry'
Method_en | Food_Name_en | Method | Food_Name |
---|---|---|---|
Boil | Bream Gruel | 煮 | 鲫鱼羹 |
Simmer | Qima Congee | 熬 | 乞马粥 |
Steam | Sheep’s Head Hash | 蒸 | 羊头脍 |
Pan-fry | Cherry [Prunus pseudocerasus] Concentrate | 煎 | 樱桃煎 |
Bake | Dried Beef | 焙 | 牛肉脯 |
Braise | Turmeric[–Colored] Fish | 淹 | 姜黄鱼 |
Broil | Broiled Sheep’s Heart | 炙 | 炙羊心 |
Stir-fry | Cotton Rose–[Petal] Chicken | 炒 | 芙蓉鸡 |
Roast | Roast Wild Goose (Roast Cormorant and Roast Duck are the same) | 烧 | 烧雁 |
Deep-fry | Scalded Jasa’a (a delicacy) | 炸 | 炸䐑儿 |
- Ingredient categories: 'Chinese Medicinal Material', 'Plant', 'Spice', 'Fruit', 'Vegetable', 'Seafood', 'Meat', 'Dairy Product', 'Grain', 'Juice', 'Condiment', 'Tea', 'Other'
Ingredient_en | Category | Ingredient |
---|---|---|
ginseng | chinese_medicinal_material | 人参 |
Purple perilla leaf | plant | 紫苏叶 |
tsaoko cardamom | spice | 草果 |
Cherries | fruit | 樱桃 |
Chinese radish | vegetable | 萝卜 |
- Effect categories: 'Gastrointestinal Issues', 'Neurological and Mental Health', 'General Health and Wellness', 'Musculoskeletal Issues', 'Speech-related', 'Heat-clearing', 'Genitourinary Issues', 'Respiratory Issues', 'Others'
Effect_en | Category | Effect |
---|---|---|
stop thirst cure cough | Gastrointestinal Issues | 生津止渴 |
this cures apoplexy | Neurological and Mental Health | 治中风 |
augments qi | General Health and Wellness | 益气和中 |
the medullae strengthens sinew and bone | Musculoskeletal Issues | 壮筋骨 |
obstruction of the large intestine | Speech-related | 语言蹇涩 |
cures heat of the center | Heat-clearing | 治中热 |
The following table presents five example recipes along with all of their categorized information.
Recipe_name | Category | Cooking Method | Effect Category | Ingredient Category |
---|---|---|---|---|
Sprouting Chinese Foxglove Chicken | Dish | 'Steam' | 'Musculoskeletal Issues', 'General Health and Wellness' | 'meat', 'condiment', 'chinese_medicinal_material' |
Carp Soup | Soup | 'Braise', 'Boil' | 'Gastrointestinal Issues', 'Genitourinary Issues', 'Others' | 'seafood', 'spice', 'plant' |
Oil Rape Shoots Broth | Thick soup | 'Simmer' | 'General Health and Wellness' | 'spice', 'meat' |
Barley Samsa Noodles | Rice noodles | 'Stir-fry', 'Simmer' | 'General Health and Wellness', 'Gastrointestinal Issues' | 'plant', 'meat' |
Sheep Bone Congee | Congee | 'Simmer' | 'General Health and Wellness', 'Musculoskeletal Issues' | 'condiment', 'spice', 'plant' |
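One simple way to assign cooking-method labels like those above is keyword matching on the Steps text. The sketch below is an illustration under that assumption, not a description of how the project's categories were actually produced. The character-to-label pairs are taken from the cooking-method table; `METHOD_KEYWORDS` and `cooking_methods` are hypothetical names.

```python
# Character-to-label pairs taken from the cooking-method table above.
METHOD_KEYWORDS = {
    "煮": "Boil",
    "熬": "Simmer",
    "蒸": "Steam",
    "煎": "Pan-fry",
    "焙": "Bake",
    "淹": "Braise",
    "炙": "Broil",
    "炒": "Stir-fry",
    "烧": "Roast",
    "炸": "Deep-fry",
}


def cooking_methods(steps):
    """Return the labels of all method characters that occur in the steps text."""
    return [label for char, label in METHOD_KEYWORDS.items() if char in steps]
```

A recipe can receive several labels, as in the table above, because every matching character contributes one. Matching single characters is crude, so the results would still need a manual check.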
Translation: Ancient Chinese to Modern Chinese
The text is originally in ancient Chinese, whereas contemporary readers use modern Chinese. For thorough data analysis, we therefore translate the recipes from ancient Chinese to modern Chinese. We compared a dedicated ancient-to-modern Chinese translation model (Figure 2) against ChatGPT 3.5 (Figure 3) and found that the translations generated by ChatGPT are more fluent and closer to contemporary usage. Based on this observation, we use ChatGPT as our primary translation tool. Since translating the Food Name and Ingredients would be meaningless, only the Effect and Steps are translated.
Ancient Chinese recipes
Food_Name | Effect | Ingredients | Steps |
---|---|---|---|
醍醐酒 | 治虚弱,去风湿。 | 醍醐 一 盏 | 上件,以酒一杯和匀,温饮之,效验。 |
羊蜜膏 | 治虚劳,腰痛,咳嗽,肺痿,骨蒸。 | 熟羊脂 五两 熟羊髓 五两 白沙蜜 五两,炼净 生姜汁 一合 生地黄汁 五合 | 右五味,先以羊脂煎令沸,次下羊髓又令沸,次下蜜、地黄、生姜汁,不住手搅,微火熬数沸成膏。每日空心温酒调一匙头。或作羹汤,或作粥食之亦可。 |
驴肉汤 | 治风狂,忧愁不乐,安心气。 | 乌驴肉 不以多少,切 | 上件,于豆豉中,烂煮熟,入五味,空心食之。 |
生地黄鸡 | 治腰背疼痛,骨髓虚损,不能久立,身重气乏,盗汗,少食,时复吐利。 | 生地黄 半斤 饴糖 五两 乌鸡 一 枚 | 右三味,先将鸡去毛、肠肚净,细切,地黄与糖相和匀,内鸡腹中,以铜器中放之,复置甑中蒸炊,饭熟成,取食之。不用盐醋,唯食肉尽却饮汁。 |
猪肾粥 | 治肾虚劳损,腰膝无力,疼痛。 | 猪肾 一对,去脂膜,切 粳米 三合 草果 二钱 陈皮 一钱,去白 缩砂 二钱 | 上件,先将猪肾、陈皮等煮成汁,滤去滓,入酒少许,次下米成粥,空心食之。 |
Modern Chinese recipes
Food_Name | Effect | Ingredients | Steps |
---|---|---|---|
醍醐酒 | 用于治疗虚弱、风湿。 | 醍醐 一 盏 | 用一杯酒和匀,温热饮用,具有特定效果。 |
羊蜜膏 | 用于治疗虚弱、腰痛、咳嗽、肺痿和骨髓虚蒸。 | 熟羊脂 五两 熟羊髓 五两 白沙蜜 五两,炼净 生姜汁 一合 生地黄汁 五合 | 五种食材,首先用羊脂煎热,然后加入羊髓,再次加热。接着加入蜜、地黄和生姜汁,不停搅拌,用小火熬煮数次,直到成为膏状。每天空腹温酒,加入一勺膏头。可以做成羹汤,也可以做成粥来食用。 |
驴肉汤 | 能够治疗精神紊乱、情绪不佳、安定心神。 | 乌驴肉 不以多少,切 | 上述食材,于豆豉中,烂煮熟,入五味,空心食之。 |
生地黄鸡 | 用于治疗腰背疼痛、骨髓虚损、不能久立、身体沉重、精力不足、盗汗、食欲不振、时常呕吐和腹泻。 | 生地黄 半斤 饴糖 五两 乌鸡 一 枚 | 三种食材,首先将鸡去毛,清理肠胃,切成细块。将地黄与糖混合均匀,然后放入鸡的腹中,放在铜器中,再置于甑中蒸煮,直到饭熟。然后取出食用,无需加盐和醋,只吃掉肉,不饮下汤。 |
猪肾粥 | 用于治疗肾虚劳损、腰膝疼痛。 | 猪肾 一对,去脂膜,切 粳米 三合 草果 二钱 陈皮 一钱,去白 缩砂 二钱 | 首先将猪肾和陈皮煮成汁,去除杂质,加入少量酒,然后加入大米煮成粥,空腹食用。 |
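The rule that only Effect and Steps are translated can be expressed as a small wrapper around whichever translation backend is used. In this sketch, `translate` is a hypothetical callable that takes an ancient Chinese string and returns modern Chinese; the call to the actual model is deliberately left out.

```python
def translate_recipe(record, translate, fields=("Effect", "Steps")):
    """Return a copy of the record with only the chosen fields translated."""
    translated = dict(record)  # leave the original record untouched
    for field in fields:
        if translated.get(field):
            translated[field] = translate(translated[field])
    return translated
```

Food_Name and Ingredients pass through unchanged, which matches the two tables above, where those columns are identical.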
English Translation
For the English translation, we found a digital document online containing an English version of the book (YinShanZhengYao_english.pdf). We extracted the English recipes from this document and built an English recipe database, then compiled a combined recipe dataset containing both the English and Chinese versions. To categorize the English data, we reused the existing Chinese categorization and developed a mapping between Chinese and English categories. All tables above that contain English content are the result of mapping to the Chinese dataframe generated earlier.
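A mapping of this kind could be sketched as a lookup from Chinese to English labels, applied row by row. The category pairs below come from the recipe-category table above. The structure of `zh_rows` and `en_names`, and the name `attach_english`, are assumptions for illustration.

```python
# Chinese-to-English category pairs, taken from the recipe-category table.
CATEGORY_ZH_TO_EN = {
    "饼": "Pancake",
    "菜品": "Dish",
    "馒头": "Steamed bun",
    "汤": "Soup",
    "茶": "Tea",
}


def attach_english(zh_rows, en_names, category_map=CATEGORY_ZH_TO_EN):
    """Add English category and recipe name columns to Chinese rows.

    zh_rows: list of dicts with "Category" and "Food_Name" keys (assumed layout).
    en_names: dict from Chinese recipe name to English recipe name.
    Missing entries are left as None so that gaps stay visible.
    """
    rows = []
    for row in zh_rows:
        rows.append({
            "Category_en": category_map.get(row["Category"]),
            "Food_Name_en": en_names.get(row["Food_Name"]),
            **row,
        })
    return rows
```

Returning `None` for unmapped labels instead of raising an error makes it easy to list the categories or recipe names that still lack an English counterpart.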
Data Analysis
Website
Quality assessment
Discussion and limitations
Links
- GitHub