Original text:
This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the US, GB, DE, CA, and FR regions (USA, Great Britain, Germany, Canada, and France, respectively), with up to 200 listed trending videos per day.
EDIT: Now includes data from RU, MX, KR, JP and IN regions (Russia, Mexico, South Korea, Japan and India respectively) over the same time period.
Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.
The data also includes a category_id field, whose meaning varies between regions. To retrieve the category for a specific video, look up its category_id in the JSON file associated with that region. One such file is included for each region in the dataset.
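The category lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not spell out the layout of the JSON file, so the code assumes a top-level `items` list whose entries carry an `id` and a `snippet.title`, and the helper names `build_category_lookup` and `attach_category_names` are made up for this example. Small inline samples stand in for one region's CSV and JSON files so the snippet runs on its own.

```python
import csv
import io
import json


def build_category_lookup(category_json: dict) -> dict:
    """Map each category id (as int) to its title.

    Assumption: the JSON looks like
    {"items": [{"id": "10", "snippet": {"title": "Music"}}, ...]}.
    """
    return {
        int(item["id"]): item["snippet"]["title"]
        for item in category_json.get("items", [])
    }


def attach_category_names(rows, lookup):
    """Return copies of the rows with a 'category_name' key added.

    Rows whose category_id is missing from the lookup get None.
    """
    labelled = []
    for row in rows:
        name = lookup.get(int(row["category_id"]))
        labelled.append({**row, "category_name": name})
    return labelled


# Tiny inline stand-ins for one region's files (hypothetical sample data).
SAMPLE_JSON = '{"items": [{"id": "10", "snippet": {"title": "Music"}}]}'
SAMPLE_CSV = "title,category_id,views\nSome video,10,12345\nOther video,99,678\n"

lookup = build_category_lookup(json.loads(SAMPLE_JSON))
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
labelled = attach_category_names(rows, lookup)

for row in labelled:
    print(row["title"], "->", row["category_name"])
```

With the real files, the inline samples would be replaced by `open(...)` calls on one region's CSV and its matching JSON. Because category_id varies between regions, each region's rows should be joined only against that region's own JSON file.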
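You can download the dataset from the official site; I have also shared a copy on Baidu Netdisk. Follow my WeChat official account and reply "2020100802" to get the download link.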