[Scrapy Notes]

爱动漫建模 2022-03-31 Views: 31
Python web scraping

Running the spider

1. Create run.py and run that file each time

from scrapy import cmdline
cmdline.execute('scrapy crawl google'.split())

2. From the command line

scrapy crawl google

yield scrapy.Request() gets no response

1. allowed_domains = ["xxxxx"] is written incorrectly

allowed_domains = ['play.google.com']  # correct: bare domain name only
allowed_domains = ['https://play.google.com/store/apps']  # wrong: includes a scheme and a path
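Scrapy's offsite filtering compares the hostname of each request against the entries in allowed_domains, so an entry that carries a scheme or a path can never match. The sketch below is a simplified stand-in for that check, not Scrapy's actual code: the function name is_allowed and the example app URL are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Simplified model of an offsite check (hypothetical helper).

    The URL's hostname must equal one of the allowed domains
    or be a subdomain of one of them.
    """
    if not allowed_domains:
        # With no restriction configured, every host is accepted.
        return True
    host = urlparse(url).hostname or ""
    alternatives = "|".join(re.escape(d) for d in allowed_domains)
    pattern = r"^(.*\.)?(" + alternatives + r")$"
    return re.match(pattern, host) is not None


# Hypothetical URL used only for illustration.
url = "https://play.google.com/store/apps/details?id=example"

print(is_allowed(url, ["play.google.com"]))                     # bare domain
print(is_allowed(url, ["https://play.google.com/store/apps"]))  # scheme + path
print(is_allowed(url, ["play.google.com/store/apps"]))          # path only
```

Only the bare domain matches here, because the hostname extracted from the URL is just play.google.com. Requests sent with dont_filter=True are not subject to this kind of check, which may hide a badly written allowed_domains entry.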

2. Add dont_filter=True (the URL passed in may be getting filtered out; dont_filter=True tells Scrapy not to filter this request)

yield scrapy.Request(
    url=item['GURL'],
    callback=self.parse_addr_list,
    meta={"item": item},
    dont_filter=True,
)
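To make the effect of dont_filter concrete, here is a minimal sketch of a scheduler with a duplicate filter. It is a toy model written under assumptions, not Scrapy's implementation: the class SimpleScheduler and its enqueue method are hypothetical, and plain URL strings stand in for request fingerprints.

```python
class SimpleScheduler:
    """Toy scheduler (hypothetical) that drops URLs it has already seen."""

    def __init__(self):
        self.seen = set()   # URLs already accepted through the filter
        self.queue = []     # URLs waiting to be downloaded

    def enqueue(self, url, dont_filter=False):
        """Return True if the URL was queued, False if it was filtered out."""
        if not dont_filter:
            if url in self.seen:
                return False  # duplicate: silently dropped
            self.seen.add(url)
        # dont_filter=True skips the duplicate check entirely.
        self.queue.append(url)
        return True


scheduler = SimpleScheduler()
first = scheduler.enqueue("https://play.google.com/store/apps")
second = scheduler.enqueue("https://play.google.com/store/apps")
forced = scheduler.enqueue("https://play.google.com/store/apps", dont_filter=True)
print(first, second, forced)
```

In this model the second request is dropped without any error, which matches the symptom of a callback that never fires. Passing dont_filter=True forces the request through, at the cost of possibly fetching the same page more than once.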