python - Added a User-Agent to fix a 403, but urlretrieve then reports a regex matching error

Views: 131 Date: 2022-07-23 17:50:25

Problem description

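I want to write a small program that automatically downloads the file behind the download link http://query.sse.com.cn/secur... on the page http://www.sse.com.cn/assortm... With plain urllib the request returned 403, so I added a User-Agent header and got 200. But when I then call urlretrieve, it reports a regex matching error. I could not find an answer online. How can I solve this?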

The code is as follows:

# -*- coding: utf-8 -*-
from urllib import request
from datetime import datetime

url = 'http://query.sse.com.cn/secur...'
user_agent = 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Mobile Safari/537.36'
myheaders = {'User-Agent': user_agent}
req = request.Request(url, headers=myheaders)
local = '/Users/Mty/Downloads/s_data/' + str(datetime.now().date()) + ' .xls'
# This call raises the TypeError shown below: req is a Request object, not a URL string.
request.urlretrieve(req, local)

Error:

Traceback (most recent call last):
  File "/Users/Mty/PycharmProjects/get_data/date.py", line 20, in <module>
    request.urlretrieve(req, local)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 186, in urlretrieve
    url_type, path = splittype(url)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/parse.py", line 861, in splittype
    match = _typeprog.match(url)
TypeError: expected string or bytes-like object
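
Although the last frame mentions a regex match, the exception is a TypeError about the argument's type, not a pattern that failed to match. The following is a minimal sketch that reproduces it without touching the network. The URL is a placeholder (an assumption, not the real download link), and the exact wording of the message may differ between Python versions.

```python
from urllib import request

# Placeholder URL (assumption). The call below fails while parsing its
# argument, before any connection is attempted.
req = request.Request(
    "http://example.invalid/file.xls",
    headers={"User-Agent": "Mozilla/5.0"},
)

error_name = None
try:
    # urlretrieve expects a URL string as its first argument.
    request.urlretrieve(req, "out.xls")
except TypeError as exc:
    error_name = type(exc).__name__
    print("urlretrieve rejected the Request object:", exc)

# The plain URL string is still available on the Request object,
# but passing only this string to urlretrieve would drop the headers.
print(req.full_url)
```

The headers therefore have to reach urlretrieve some other way than through a Request object.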

Answers

Answer 1:

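urlretrieve accepts only a URL string, so the header has to be attached another way. Adding the header through request.build_opener and installing that opener solves the problem: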

myheaders = [
    ('User-Agent',
     'Mozilla/5.0 (Windows; U; Windows NT 5.2) AppleWebKit/525.17'
     ' (KHTML, like Gecko) Version/3.1 Safari/525.17'),
]
opener = request.build_opener()
opener.addheaders = myheaders
request.install_opener(opener)
request.urlretrieve(url, local)
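
Installing an opener changes the default for every later urlopen call in the process. If that is not wanted, one alternative (not the approach given in the answer above) is to skip urlretrieve and stream the response to disk yourself. This is a minimal sketch; download_with_headers is a hypothetical helper name, not part of urllib.

```python
import shutil
from urllib import request


def download_with_headers(url, dest, headers):
    """Fetch url with custom request headers and write the body to dest.

    Hypothetical helper for illustration; not a urllib function.
    """
    req = request.Request(url, headers=headers)
    # Unlike urlretrieve, urlopen accepts a Request object directly.
    with request.urlopen(req) as response, open(dest, "wb") as out_file:
        # Copy the response body to the file in chunks.
        shutil.copyfileobj(response, out_file)
    return dest


# Usage with the variables from the question (left commented out because
# the URL in the question is truncated):
# download_with_headers(url, local, {"User-Agent": user_agent})
```

The headers here apply to this one request only, so no global state is modified.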

Tags: Python, Programming