Python Web Scraping Example: Scraping Songs from All Kugou Music Charts


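Preface

The text and images in this article are for learning and discussion only and have no commercial purpose. Copyright belongs to the original author; if there is any problem, please contact us promptly so we can address it.

Goal

Scrape the songs on every chart of the Kugou Music site.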

Target URL

https://www.kugou.com/yy/html/rank.html?from=homepage

Environment

Python 3.6.5

PyCharm

Scraper code

Import the libraries

import requests
import re
import parsel

Request the site

# Request headers copied from a browser session; the cookie values are session-specific
headers = {
    'authority': 'wwwapi.kugou.com',
    'cookie': 'kg_mid=ac3836df72c523f46a85d8a5fd90fe59; kg_dfid=3ve7aQ2XyGmN0yE3uv3WcaHs; Hm_lvt_aedee6983d4cfc62f509129360d6bb3d=1600260110,1602312707; kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; kg_mid_temp=ac3836df72c523f46a85d8a5fd90fe59; Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1602312738',
    'referer': 'https://www.kugou.com/song/',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
}
# Fetch the chart index page
url = 'https://www.kugou.com/yy/html/rank.html'
response = requests.get(url=url, headers=headers)
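
The loop later in this article iterates over html_data, a sequence of (page_url, name) pairs, but the text does not show how that sequence is built from the chart index page requested above. A minimal sketch of one way to do it with re follows. The sample HTML, its attribute layout, the example.com URLs, and the helper name extract_chart_links are all assumptions for illustration; the real Kugou markup may differ.

```python
import re

# Hypothetical fragment of a chart index page (not the real Kugou markup).
SAMPLE_HTML = '''
<ul>
  <li><a title="Chart A" href="https://example.com/rank/1.html">Chart A</a></li>
  <li><a title="Chart B" href="https://example.com/rank/2.html">Chart B</a></li>
</ul>
'''


def extract_chart_links(html):
    """Return a list of (page_url, name) tuples taken from anchor tags."""
    # Assumes each anchor carries title="..." followed by href="...".
    pattern = r'<a\s+title="(.*?)"\s+href="(.*?)"'
    return [(href, title) for title, href in re.findall(pattern, html)]


html_data = extract_chart_links(SAMPLE_HTML)
for page_url, name in html_data:
    print(page_url, name)
```

Against the live site the same function would be called with response.text, after adjusting the pattern to whatever the page actually contains.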

Parse the site data

def func(url):
    # Fetch one chart page and pull the song metadata out of its inline data
    response = requests.get(url=url, headers=headers)
    response.encoding = response.apparent_encoding
    hashs = re.findall('"Hash":"(.*?)"', response.text, re.S)
    album_ids = re.findall('"album_id":(.*?),"', response.text, re.S)
    FileNames = re.findall('"FileName":"(.*?)"', response.text, re.S)
    data = zip(hashs, album_ids, FileNames)
    for i in data:
        song_hash = i[0]
        album_id = i[1]
        # Decode the \uXXXX escapes in the file name
        FileName = i[2].encode('utf-8').decode('unicode_escape')
        # print(song_hash, album_id, FileName)

        # Endpoint and query parameters for this song's playback data
        download_url = 'https://wwwapi.kugou.com/yy/index.php'
        params = {
            'r': 'play/getdata',
            'callback': 'jQuery19107150201841602037_1602314563329',
            'hash': song_hash,
            'album_id': album_id,
            'dfid': '3ve7aQ2XyGmN0yE3uv3WcaHs',
            'mid': 'ac3836df72c523f46a85d8a5fd90fe59',
            'platid': '4',
            '_': '1602312793005',
        }
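
Because the query parameters include a callback name, the endpoint's reply is presumably JSONP, that is, JSON wrapped in a function call, and the text does not show how that reply is consumed. The sketch below strips such a wrapper and parses the JSON. The helper name parse_jsonp, the sample body, and the key names data and play_url are assumptions for illustration and are not confirmed by this article.

```python
import json
import re


def parse_jsonp(text):
    """Strip a JSONP wrapper such as callback({...}); and return the parsed JSON."""
    match = re.search(r'^[^(]*\((.*)\)\s*;?\s*$', text.strip(), re.S)
    if match is None:
        raise ValueError('not a JSONP payload')
    return json.loads(match.group(1))


# Hypothetical response body; the key names are assumed, not taken from the article.
sample = 'jQuery123_456({"status": 1, "data": {"play_url": "https://example.com/song.mp3"}});'
payload = parse_jsonp(sample)
print(payload['data']['play_url'])
```

With the wrapper removed, fields can be read as dictionary keys instead of being matched with regular expressions.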

# html_data: an iterable of (page_url, name) pairs, one per chart
for i in html_data:
    page_url = i[0]
    name = i[1]
    print(page_url)
    print('==========================Scraping songs from {}========================'.format(name))
    func(page_url)
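
To see what func extracts from each chart page, its three regular expressions can be tried offline on a small string. The sample below is a hypothetical excerpt that uses only the field names appearing in those expressions. It decodes the \uXXXX escapes with json.loads rather than unicode_escape, a swapped-in technique that also handles names already containing literal non-ASCII characters.

```python
import json
import re

# Hypothetical excerpt of a chart page's inline data (field names from func's regexes).
sample = r'[{"Hash":"ABC123","album_id":42,"FileName":"\u6b4c\u624b - \u6b4c\u540d"}]'


def extract_songs(text):
    """Return (hash, album_id, file_name) tuples found in the page text."""
    hashes = re.findall(r'"Hash":"(.*?)"', text)
    album_ids = re.findall(r'"album_id":(.*?),"', text)
    names = re.findall(r'"FileName":"(.*?)"', text)
    return [
        # Re-wrap the raw name in quotes so json.loads decodes its escapes.
        (song_hash, album_id, json.loads('"' + name + '"'))
        for song_hash, album_id, name in zip(hashes, album_ids, names)
    ]


songs = extract_songs(sample)
print(songs)
```

Note that zip pairs the three lists by position, so a page where one field is missing for some song would silently misalign the results.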

Save the data

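def download(url, title):
    # '保存地址' ("save path") is a placeholder: replace it with the output
    # directory, including the trailing path separator
    filename = '保存地址' + title + '.mp3'
    response = requests.get(url=url, headers=headers)
    with open(filename, mode='wb') as f:
        f.write(response.content)
    print(title)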
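
Song titles can contain characters such as / or ? that are not valid in file names on some platforms, which would make open fail. The sketch below shows one possible way to clean the title and make sure the target directory exists before writing. The helper names safe_filename and save_bytes and the default directory songs are hypothetical, and the bytes written are a stand-in for response.content.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(title, extension='.mp3'):
    """Replace characters that are invalid in file names on common platforms."""
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', title).strip()
    return (cleaned or 'untitled') + extension


def save_bytes(content, title, directory='songs'):
    """Write raw bytes to directory/<title>.mp3 and return the resulting path."""
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / safe_filename(title)
    path.write_bytes(content)
    return path


# Demonstration in a temporary directory, using placeholder bytes.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_bytes(b'fake audio bytes', 'AC/DC - Demo', directory=tmp)
    saved_name = saved.name
    saved_size = saved.stat().st_size
    print(saved_name, saved_size)
```

In download, the same idea would amount to passing response.content and title to save_bytes instead of concatenating the path by hand.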

Run the code; the result looks like this:

[Figures: two screenshots of the scraper's output]

