cookie的使用
网络部分信息或APP的信息,若是想获取数据时,需要提前做一些操作,往往是需要登录,或者提前访问过某些页面才可以获取到!!
其实底层就是在网页里面增加了Cookie信息
代码
from urllib.request import Request,build_opener
from fake_useragent import UserAgent
url ='https://www.kuaidaili.com/usercenter/overview'
headers = {
'User-Agent':UserAgent().chrome,
'Cookie':'channelid=0; sid=1621786217815170; _ga=GA1.2.301996636.1621786363; _gid=GA1.2.699625050.1621786363; Hm_lvt_7ed65b1cc4b810e9fd37959c9bb51b31=1621786363,1621823311; _gat=1; Hm_lpvt_7ed65b1cc4b810e9fd37959c9bb51b31=1621823382; sessionid=48cc80a5da3a451c2fa3ce682d29fde7'
}
req = Request(url,headers= headers)
opener = build_opener()
resp = opener.open(req)
print(resp.read().decode())
登录后保持cookie
问题
不想手动复制cookie,太繁琐了!
解决方案
在再代码中执行登录操作,并保持Cookie不丢失
为了保持Cookie不丢失可以urllib.request.HTTPCookieProcessor
来扩展opener的功能
代码
from urllib.request import Request,build_opener
from fake_useragent import UserAgent
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor
login_url ='https://www.kuaidaili.com/login/'
args = {
'username':'398707160@qq.com',
'passwd':'123456abc'
}
headers = {
'User-Agent':UserAgent().chrome
}
req = Request(login_url,headers= headers,data = urlencode(args).encode())
# 创建一个可以保存cookie的控制器对象
handler = HTTPCookieProcessor()
# 构造发送请求的对象
opener = build_opener(handler)
# 登录
resp = opener.open(req)
'''
-------------------------上面已经登录好----------------------------------
'''
index_url ='https://www.kuaidaili.com/usercenter/overview'
index_req = Request(index_url,headers =headers)
index_resp = opener.open(index_req)
print(index_resp.read().decode())
cookie的保存与加载
1 原理
2 CookieJar
我们可以利用本模块的http.cookiejar.CookieJar
类的对象来捕获cookie并在后续连接请求时重新发送,比如可以实现模拟登录功能。该模块主要的对象有CookieJar、FileCookieJar、MozillaCookieJar、LWPCookieJar
from urllib.request import Request,build_opener,HTTPCookieProcessor
from fake_useragent import UserAgent
from urllib.parse import urlencode
from http.cookiejar import MozillaCookieJar
def get_cookie():
url = 'https://www.kuaidaili.com/login/'
args = {
'username':'398707160@qq.com',
'passwd':'123456abc'
}
headers = {'User-Agent':UserAgent().chrome}
req = Request(url,headers = headers, data = urlencode(args).encode())
cookie_jar = MozillaCookieJar()
handler = HTTPCookieProcessor(cookie_jar)
opener = build_opener(handler)
resp = opener.open(req)
# print(resp.read().decode())
cookie_jar.save('cookie.txt',ignore_discard=True,ignore_expires=True)
def use_cookie():
url = 'https://www.kuaidaili.com/usercenter/'
headers = {'User-Agent':UserAgent().chrome}
req = Request(url,headers = headers)
cookie_jar = MozillaCookieJar()
cookie_jar.load('cookie.txt',ignore_discard=True,ignore_expires=True)
handler = HTTPCookieProcessor(cookie_jar)
opener = build_opener(handler)
resp = opener.open(req)
print(resp.read().decode())
if __name__ == '__main__':
# get_cookie()
use_cookie()