Python Web Scraping: Installing Splash and a Simple Example


Installing Splash

1. Install Docker (see: Installing Docker on Mac)

2. Install Splash

docker pull scrapinghub/splash  # pull the image

docker run -p 8050:8050 scrapinghub/splash  # run the container

Verify it is running by opening: http://localhost:8050/
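The same check can be done from Python instead of a browser. The sketch below is only an illustration: the helper name `splash_is_up` and its default timeout are assumptions, not part of Splash, and it simply treats an HTTP 200 response from the address above as "running".

```python
import urllib.request


def splash_is_up(base_url="http://localhost:8050/", timeout=3):
    """Return True if a GET request to base_url answers with HTTP 200.

    Hypothetical helper for illustration; base_url assumes the port
    mapping used in the docker run command above.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeouts and HTTP error statuses all end up here.
        return False
```

Calling `splash_is_up()` right after `docker run` may return False for a moment while the container is still starting, so retrying after a short pause is a reasonable choice.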


Code example

import requests
import time
from scrapy import Selector


def timer(func):
    # Decorator: print how long the wrapped call took, then return its result
    def inner(*args):
        start = time.time()
        response = func(*args)
        print("time: %s" % (time.time() - start))
        return response
    return inner


@timer
def use_request(url):
    # Fetch the page directly, without rendering JavaScript
    return requests.get(url)


@timer
def use_splash(url):
    # Ask Splash's render.html endpoint to render the page and return the HTML
    splash_url = "http://localhost:8050/render.html"

    args = {
        "url": url,
        "timeout": 5,
        "images": 0  # do not download images
    }

    return requests.get(splash_url, params=args)


if __name__ == '__main__':

    url = "http://quotes.toscrape.com/js/"

    r1 = use_request(url)
    sel1 = Selector(text=r1.text)
    text = sel1.css(".quote .text::text").extract_first()
    print(text)

    r2 = use_splash(url)
    sel2 = Selector(text=r2.text)
    text = sel2.css(".quote .text::text").extract_first()
    print(text)

"""
time: 0.632809877396
None

time: 0.685022830963
“The world as we have created it is a process of our thinking.
It cannot be changed without changing our thinking.”
"""

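The test shows that the plain request prints None, because the quotes on this page are filled in by JavaScript, whereas Splash renders the page first and so returns the data. In this run it was also fast, taking about as long as the plain request.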

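Description of the args parameters:

url: address of the page to render

timeout: rendering timeout, in seconds

proxy: proxy to use for the request

wait: time to wait after the page loads, in seconds, so that scripts can finish

images: whether to download images; default 1 (download)

js_source: JavaScript code executed in the page context before the result is rendered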
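To show how these parameters fit together, here is a minimal sketch that assembles a render.html request URL from them. The function name `build_render_url` and the default values chosen here are assumptions for illustration only; the parameter names are the ones listed above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumes Splash is published on localhost:8050, as in the docker run command.
SPLASH_RENDER_URL = "http://localhost:8050/render.html"


def build_render_url(page_url, timeout=10, wait=0.5, images=0,
                     proxy=None, js_source=None):
    """Build a render.html URL; hypothetical helper, defaults are illustrative."""
    params = {
        "url": page_url,     # page to render
        "timeout": timeout,  # rendering timeout in seconds
        "wait": wait,        # seconds to wait after the page loads
        "images": images,    # 0 = skip images, 1 = download them
    }
    # Optional parameters are only sent when explicitly provided.
    if proxy is not None:
        params["proxy"] = proxy
    if js_source is not None:
        params["js_source"] = js_source
    return SPLASH_RENDER_URL + "?" + urlencode(params)
```

The resulting string can be passed straight to `requests.get`. Passing the same dictionary through `params=`, as the example earlier does, is equivalent and avoids building the URL by hand.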


Reference
Scrapy-Splash: Introduction, Installation, and Examples



