自学内容网 自学内容网

用不同的url头利用Python访问一个网站,把返回的东西保存为txt文件

这个需要调用requests模块(相当于c++的头文件)

import requests 

 还需要一个User-Agent头(这个意思就是告诉python用的什么系统和浏览器)

Google Chrome(Windows):

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36

Mozilla Firefox(Windows):

Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.

Microsoft Edge(Windows):

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.774.63 Safari/537.36 Edg/89.0.774.63

这仨是常用的,谷歌 火狐  Edge, 我这里使用的是edge 

headers_list =  {'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.774.63 Safari/537.36 Edg/89.0.774.63'}

 找到网页后就可以扒内容了

with open('response.txt', 'w', encoding='utf-8') as file:  

    for headers in headers:  

        # 发送请求  

        response = requests.get(url, headers=headers)  

        # 打印状态码  

        print(f'Sent request with header: {headers["User-Agent"]}, Status code: {response.status_code}')  

        # 如果请求成功,保存返回内容  

        if response.status_code == 200:  

            file.write(f'Response with header: {headers["User-Agent"]}\n')  

            file.write(response.text )  

        else:  

            file.write(f'Failed request with header: {headers["User-Agent"]}, Status code: {response.status_code}')  

print('请求成功!')

完整代码如下

import requests  

# 定义要访问的URL  
url = 'http://baidu.com'  # 请替换为你要访问的网站  

# 定义User-Agent头  
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 Edg/126.0.0.0',
}
# 创建一个TXT文件来保存返回的内容  
with open('response.txt', 'w', encoding='utf-8') as file:  
    for headers in headers:  
        # 发送请求  
        response = requests.get(url, headers=headers)  
        
        # 打印状态码  
        print(f'Sent request with header: {headers["User-Agent"]}, Status code: {response.status_code}')  
        
        # 如果请求成功,保存返回内容  
        if response.status_code == 200:  
            file.write(f'Response with header: {headers["User-Agent"]}\n')  
            file.write(response.text)  
        else:  
            file.write(f'Failed request with header: {headers["User-Agent"]}, Status code: {response.status_code}')  

print('请求成功!')  

 结果如下

 

 文本如下


原文地址:https://blog.csdn.net/2302_80097039/article/details/140577602

免责声明:本站文章内容转载自网络资源,如本站内容侵犯了原著者的合法权益,可联系本站删除。更多内容请关注自学内容网(zxcms.com)!