[Python Web Scraping] A Summary of Anti-Anti-Scraping Techniques
1. Retry the request when it fails

import requests

def get_url(url, proxies=None):
    try:
        # First attempt: plain request with a 10-second timeout
        return requests.get(url, timeout=10)
    except requests.RequestException:
        pass
    for _ in range(10):  # retry through a proxy up to 10 times
        try:
            response = requests.get(url, proxies=proxies, timeout=20)
            if response.status_code == 200:
                return response
        except requests.RequestException:
            continue
    raise RuntimeError(f"all retries failed for {url}")
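The retry idea above can be generalized into a small helper that retries any callable, with an exponentially growing, jittered pause between attempts so repeated failures do not hammer the target site. This is a minimal sketch; the function name `retry` and its parameters are illustrative, not part of the original article.

```python
import random
import time

def retry(func, attempts=10, base_delay=1.0):
    """Call func() until it succeeds, retrying up to `attempts` times.

    Sleeps a jittered, exponentially growing delay between failed
    attempts; re-raises the last exception if every attempt fails.
    """
    last_exc = None
    for i in range(attempts):
        try:
            return func()
        except Exception as exc:  # in practice, catch requests.RequestException
            last_exc = exc
            time.sleep(base_delay * (2 ** i) * random.uniform(0.5, 1.5))
    raise last_exc

# usage (hypothetical): retry(lambda: requests.get(url, timeout=10))
```

Separating the retry policy from the request itself also makes the policy easy to unit-test without touching the network.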
2. Sleep for a suitably randomized interval between requests

import time
from scipy.stats import norm

mu, sigma = 5, 0.2  # mean and standard deviation of a normal distribution
time.sleep(norm.rvs(mu, sigma))  # sleep roughly 5 seconds, with jitter
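If SciPy is not available, the same normally distributed jitter can come from the standard library's `random.gauss`. The sketch below (the helper name `jittered_delay` is illustrative) also clamps the sample at zero, since a normal distribution can in principle produce a negative value, which `time.sleep` rejects:

```python
import random

def jittered_delay(mu=5.0, sigma=0.2):
    """Return a normally distributed delay in seconds, clamped at zero."""
    return max(0.0, random.gauss(mu, sigma))

# usage: time.sleep(jittered_delay())
```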
3. Use a proxy IP

import requests

proxies = {
    'http': 'http://127.0.0.1:1212',   # replace with your proxy address
    'https': 'http://127.0.0.1:1212',
}
response = requests.get(url, proxies=proxies, timeout=20)
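Rather than passing `proxies` to every call, tip 3 can be wrapped in a reusable `requests.Session` so that every request made through it is routed via the proxy. A minimal sketch, assuming a hypothetical local proxy at `127.0.0.1:1212` (the address and helper name are placeholders):

```python
import requests

def make_proxied_session(proxy="http://127.0.0.1:1212"):
    """Return a requests.Session whose traffic is routed through `proxy`.

    The default address is a placeholder for a local proxy; substitute a
    real proxy endpoint (or rotate through a pool of them) in practice.
    """
    session = requests.Session()
    session.proxies = {"http": proxy, "https": proxy}
    return session

# usage (assuming the proxy is actually running):
# response = make_proxied_session().get("https://example.com", timeout=20)
```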
Original article: https://blog.csdn.net/qq_42251120/article/details/140108043