Crawler with Scrapy: A Python Scrapy spider that crawls two URLs and downloads the page content

泠之屋


2022-02-10




Contents

Output

Implementation code


Output

To be updated later……


Implementation code

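import scrapy


class DmozSpider(scrapy.Spider):
    # Unique spider name, used on the command line as "scrapy crawl dmoz"
    name = "dmoz"
    # Restrict the crawl to the host used in start_urls
    allowed_domains = ["dmoztools.net"]
    # The comma between the entries matters: without it Python joins the
    # two adjacent string literals into a single invalid URL
    start_urls = [
        "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
        "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
    ]

    def parse(self, response):
        # Name the file after the last path segment: "Resources" or "Books"
        filename = response.url.split("/")[-2]
        # response.body is bytes, so the file is opened in binary mode
        with open(filename, "wb") as f:
            f.write(response.body)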
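
Two details of the spider can be checked without Scrapy or a network connection: how parse() derives the output file name from the URL, and why the entries in start_urls must be separated by a comma. The snippet below is only a minimal sketch using plain Python. The helper page_filename and the two list names are hypothetical, introduced here for illustration, and are not part of the spider.

```python
# Hypothetical helper (not in the original spider): mirrors the
# filename logic used inside parse().
def page_filename(url: str) -> str:
    # A URL ending in "/" splits into [..., "Books", ""], so the
    # second-to-last piece is the final directory name.
    return url.split("/")[-2]


# Two separate URLs: note the comma after the first string.
with_comma = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/",
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/",
]

# No comma: Python concatenates adjacent string literals, so this
# list holds ONE long, invalid URL instead of two.
without_comma = [
    "https://dmoztools.net/Computers/Programming/Languages/Python/Resources/"
    "https://dmoztools.net/Computers/Programming/Languages/Python/Books/"
]

filenames = [page_filename(u) for u in with_comma]
print(filenames)           # ['Resources', 'Books']
print(len(without_comma))  # 1
```

With the comma in place the spider writes two files, Resources and Books, into the directory it is started from. Assuming Scrapy is installed, a standalone spider file like this one can typically be started with the scrapy runspider command followed by the file name it was saved under.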


