Scraping All Book Information from dangdang - CFANZ Programming Community


Today we scrape product information from dangdang and turn it into a table, with fields such as price, comment count, and title.


Target site:

http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-recent7-0-0-1-1

First, request the page and get its HTML source:

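import requests

# The post does not show how headers is defined; a browser-like User-Agent is assumed here
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-recent7-0-0-1-1'
response = requests.get(url=url, headers=headers)
res = response.text  # page HTML as a string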
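The request step can also be wrapped in a small helper that fails loudly on HTTP errors and re-guesses the text encoding, which may help if the Chinese text comes back garbled. This is only a sketch: `fetch_html` and its `get` parameter are hypothetical names that are not in the original post, and the 10-second timeout is an arbitrary choice.

```python
import requests


def fetch_html(url, headers=None, timeout=10, get=requests.get):
    """Return the decoded HTML of url, raising on HTTP errors.

    get is a hypothetical hook so the function can be exercised
    without touching the network; by default it is requests.get.
    """
    response = get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    # Let requests guess the encoding from the body instead of trusting the header
    response.encoding = response.apparent_encoding
    return response.text
```

Calling `fetch_html(url, headers=headers)` would then stand in for the two lines that build `response` and `res`.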

Next, use BeautifulSoup (BS4) to parse the HTML and extract the title, comment count, author, price, and so on:

from bs4 import BeautifulSoup

soup = BeautifulSoup(res, 'html.parser')

# Run each query once, not once per loop iteration
names = soup.find_all('div', class_='name')
stars = soup.find_all('div', class_='star')
infos = soup.find_all('div', class_='publisher_info')  # two per book: author, then date and publisher
prices = soup.find_all('div', class_='price')
ebook_prices = soup.find_all('p', class_='price_e')

for i in range(0, 20):
    dit = {}  # fields for book i
    dit['name'] = names[i].text
    dit['comments'] = stars[i].text
    dit['writer'] = infos[i * 2].text
    dit['Date of publication'] = infos[i * 2 + 1].find_next('span').text
    dit['publishing house'] = infos[i * 2 + 1].find_next('a').text
    dit['price'] = prices[i].find_next('p').find_next('span', class_='price_n').text
    dit['E-books_price'] = ebook_prices[i].find_next('span').text
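The post says the goal is a table but does not show how the per-book dictionaries are saved. One possible approach, assuming CSV output, is sketched below. `write_table`, `FIELDS`, and the sample record are illustrative names and placeholder values, not part of the original code or real scraped data.

```python
import csv
import io

# Same keys as the dit dictionary built in the loop
FIELDS = ['name', 'comments', 'writer', 'Date of publication',
          'publishing house', 'price', 'E-books_price']


def write_table(rows, out):
    """Write a list of per-book dicts to the file-like object out as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        # Scraped text often carries stray spaces and newlines, so strip each value
        writer.writerow({key: str(row.get(key, '')).strip() for key in FIELDS})


# Placeholder record for demonstration only
sample = [{'name': ' Example Book\n', 'comments': '100', 'writer': 'Someone',
           'Date of publication': '2020-01-01', 'publishing house': 'Example Press',
           'price': '10.00', 'E-books_price': '5.00'}]
buffer = io.StringIO()
write_table(sample, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, something like `open('books.csv', 'w', newline='', encoding='utf-8-sig')` could be passed as `out`; the file name here is arbitrary.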

Finally, scrape multiple pages with a for loop.
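The multi-page loop is not shown in the post. A minimal sketch follows, assuming the trailing number in the target URL is the page index; that is an inference from the URL shape and is not confirmed by the original. `BASE` and `page_urls` are hypothetical names, and the page range 1 to 3 is arbitrary.

```python
# Target URL from the post with the last number replaced by a placeholder (assumed to be the page index)
BASE = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-recent7-0-0-1-{page}'


def page_urls(first=1, last=3):
    """Build the list-page URLs for pages first..last, inclusive."""
    return [BASE.format(page=page) for page in range(first, last + 1)]


for page_url in page_urls(1, 3):
    # Each URL would be requested and parsed the same way as the first page
    print(page_url)
```

Pausing briefly between requests, for example with `time.sleep`, is a common courtesy when looping over many pages.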
