Indian Food Recipes Dataset


Original:

When I browsed for a Food Recipes (Especially Indian Food) Dataset, I could not find one (that I could use) online. So, I decided to create one.

The dataset has the following fields (self-explanatory): ['RecipeName', 'TranslatedRecipeName', 'Ingredients', 'TranslatedIngredients', 'Prep', 'Cook', 'Total', 'Servings', 'Cuisine', 'Course', 'Diet', 'Instructions', 'TranslatedInstructions']. The dataset contains a csv and an xls file. The Hindi content sometimes does not display correctly in the csv file.
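If the Hindi text looks broken when the csv is opened, one thing worth trying is reading the file with an explicit encoding. The sketch below is only an illustration: it assumes the csv is UTF-8 encoded (possibly with a byte-order mark), which the description does not confirm, and `load_recipes` is a hypothetical helper name.

```python
import csv

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "RecipeName", "TranslatedRecipeName", "Ingredients", "TranslatedIngredients",
    "Prep", "Cook", "Total", "Servings", "Cuisine", "Course", "Diet",
    "Instructions", "TranslatedInstructions",
]


def load_recipes(path, encoding="utf-8-sig"):
    """Read the recipe csv into a list of dicts, one per row.

    Assumption: the file is UTF-8; "utf-8-sig" also tolerates a leading BOM.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in EXPECTED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return list(reader)
```

Calling `load_recipes` with the path to the downloaded csv returns plain dictionaries keyed by the field names above, so the Hindi and translated columns can be compared side by side.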

You might be wondering what the columns with the prefix 'Translated' are. A lot of entries in the dataset were in Hindi. To translate such entries to English for consistency, I used 'googletrans', a Python library that implements the Google Translate API underneath.
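The idea behind the 'Translated' columns can be sketched roughly as follows. This is not the author's actual code (that lives in the GitHub repository mentioned below); it is a minimal illustration that assumes Hindi entries can be recognised by Devanagari characters, and it takes the translator as a plain function argument. `contains_devanagari`, `fill_translated` and `FIELD_PAIRS` are hypothetical names; in practice `translate` could be a thin wrapper around googletrans, whose exact call depends on the installed version.

```python
# Source column -> column that should hold the English text.
FIELD_PAIRS = [
    ("RecipeName", "TranslatedRecipeName"),
    ("Ingredients", "TranslatedIngredients"),
    ("Instructions", "TranslatedInstructions"),
]


def contains_devanagari(text):
    """Return True if any character is in the Devanagari block (U+0900 to U+097F)."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def fill_translated(row, translate):
    """Return a copy of `row` with the Translated* fields filled in.

    `translate` is any callable mapping a Hindi string to an English string
    (assumption: supplied by the caller, e.g. a wrapper around googletrans).
    Entries without Devanagari characters are copied through unchanged.
    """
    result = dict(row)
    for source, target in FIELD_PAIRS:
        text = row.get(source) or ""
        result[target] = translate(text) if contains_devanagari(text) else text
    return result
```

Passing the translator in as an argument keeps the row logic testable without network access, and lets the translation backend be swapped without touching the rest of the code.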

The code for the crawler, cleaning and transformation is in my GitHub repository (@kanishk307).

The dataset was created from the Archana's Kitchen website. It is a great website that hosts a ton of useful content, and it is well worth a visit if you are interested.

The dataset can be used to answer a lot of questions related to food recipes. You can explore serving sizes, the time required to prepare a dish, the most common ingredients, and the different cuisines, diets and courses. I hope this dataset helps the analytics community.
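As a starting point for that kind of exploration, here is a small standard-library sketch. It assumes the rows are dictionaries keyed by the field names above and that 'Total' holds a number of minutes, which the description does not spell out; `count_values` and `median_minutes` are hypothetical helper names.

```python
from collections import Counter
from statistics import median


def count_values(rows, column):
    """Count how often each non-empty value appears in `column`."""
    values = ((row.get(column) or "").strip() for row in rows)
    return Counter(value for value in values if value)


def median_minutes(rows, column="Total"):
    """Median of a time column, skipping entries that are not numeric.

    Assumption: the column stores minutes as plain numbers.
    """
    numbers = []
    for row in rows:
        try:
            numbers.append(float(row.get(column)))
        except (TypeError, ValueError):
            continue  # empty or non-numeric entry
    return median(numbers) if numbers else None
```

For example, `count_values(rows, "Cuisine").most_common(10)` lists the ten most frequent cuisines, and the same call with `"Diet"` or `"Course"` covers the other categorical fields.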

 


You can download the dataset from the official site; I have also shared a copy on Baidu Netdisk.

Link: Get the dataset
