Learning Python: fake_useragent, a handy tool for custom user-agents

When writing a crawler, we sometimes need to set a custom user-agent, or to pick a random user-agent for each request.
Updated: 2022-03-03 00:38:02

Install fake_useragent

pip install fake_useragent -U

Check the installed fake_useragent version

import fake_useragent

print(fake_useragent.VERSION)
# 0.1.11

Generate a random user-agent

from fake_useragent import UserAgent

# Create a UserAgent object
ua = UserAgent()

for i in range(10):
    print(ua.random)
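
UserAgent() can fail before it returns anything, for example when the package is missing or its browser data cannot be loaded. Here is a minimal sketch of a defensive wrapper. The names `FALLBACK_AGENTS` and `random_user_agent` are hypothetical and introduced only for illustration, and the fallback strings are simply sample outputs quoted in this article.

```python
import random

# Hypothetical fallback pool; these strings are sample outputs quoted in this article.
FALLBACK_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/29.0",
]


def random_user_agent():
    """Return a random user-agent string, falling back to the local pool."""
    try:
        from fake_useragent import UserAgent
        return UserAgent().random
    except Exception:
        # Broad catch on purpose: it covers a missing package as well as
        # any error raised while the library loads its data.
        return random.choice(FALLBACK_AGENTS)


print(random_user_agent())
```

The broad `except` is a deliberate trade-off for a crawler that should keep running; narrow it if you would rather see the real error.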

Generate random user-agents that all come from Google Chrome

Keys you can use
from fake_useragent import UserAgent

ua = UserAgent()

for i in range(10):
    # the full key
    print(ua['google chrome'])
    # or the shorthand
    print(ua.chrome)

More browsers

import fake_useragent

# Create a UserAgent object
ua = fake_useragent.UserAgent()

# ua.ie
print(ua.ie)  # Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; chromeframe/13.0.782.215)

# ua.msie
print(ua['Internet Explorer'])  # Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; SLCC1; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)

# ua.opera
print(ua.opera)  # Opera/9.80 (Windows NT 6.1; U; en-US) Presto/2.7.62 Version/11.01

# ua.chrome
print(ua.chrome)  # Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.16 Safari/537.36

# ua.google
print(ua['google chrome'])  # Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36

# ua.firefox
print(ua.firefox)  # Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:16.0.1) Gecko/20121011 Firefox/21.0.1

# ua.ff
print(ua.ff)  # Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/29.0

# ua.safari
print(ua.safari)  # Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-TW) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5

The user-agent cache file

import fake_useragent
print(fake_useragent.settings.CACHE_SERVER)
# The URL, which is actually a JSON file:
# https://fake-useragent.herokuapp.com/browsers/0.1.11

This JSON is what gets stored in the cache file.
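
To make the idea of the cached JSON more concrete, here is a rough sketch of how such a file could be used to pick a user-agent. The layout shown (a `browsers` map plus a `randomize` table that repeats popular browsers) reflects my understanding of the 0.1.x data and is an assumption, not something taken from the library's documentation. `SAMPLE_CACHE` and `pick` are hypothetical names, and the sample is hand-written rather than downloaded.

```python
import json
import random

# Tiny hand-written sample; the layout is an assumption about the 0.1.x cache.
SAMPLE_CACHE = json.loads("""
{
  "browsers": {
    "chrome": ["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36"],
    "firefox": ["Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/29.0"]
  },
  "randomize": {"0": "chrome", "1": "chrome", "2": "firefox"}
}
""")


def pick(cache, browser=None):
    """Pick a user-agent; with no browser given, draw one from the randomize table."""
    if browser is None:
        # Browsers listed more often in the table are drawn more often.
        browser = random.choice(list(cache["randomize"].values()))
    return random.choice(cache["browsers"][browser])


print(pick(SAMPLE_CACHE, "firefox"))
print(pick(SAMPLE_CACHE))
```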

Finally, a piece of test code

Use it to check whether your user-agent was set successfully.

import requests
import fake_useragent

# request with headers
ua = fake_useragent.UserAgent()
headers = {'User-Agent': ua.chrome}
res = requests.get('http://httpbin.org/get', headers=headers)
print(res.text)

Result without setting a user-agent

{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.27.1", 
    "X-Amzn-Trace-Id": "Root=1-62200944-78d9c23935454cc5110cbb5e"
  }, 
  "origin": "112.120.33.252", 
  "url": "http://httpbin.org/get"
}

Result with a user-agent set

{
  "args": {},
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1866.237 Safari/537.36", 
    "X-Amzn-Trace-Id": "Root=1-622008fa-5856ab4331fb35835a5ec689"
  }, 
  "origin": "112.120.33.252", 
  "url": "http://httpbin.org/get"
}
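
If httpbin.org is unreachable, the same check can be done offline. The sketch below swaps requests and httpbin for the standard library: it starts a throwaway local server that echoes the User-Agent header it receives, then sends one request with urllib. `EchoHandler` and `echoed_user_agent` are hypothetical names used only here, and the server is a loose imitation of httpbin's /get, not a copy of it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Reply with the received User-Agent as JSON, loosely like httpbin's /get."""

    def do_GET(self):
        # Header lookup on self.headers is case-insensitive.
        payload = {"headers": {"User-Agent": self.headers.get("User-Agent")}}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def echoed_user_agent(user_agent):
    """Send one request to a local echo server and return the echoed User-Agent."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: let the OS choose
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/get" % server.server_address[1]
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=5) as resp:
            return json.load(resp)["headers"]["User-Agent"]
    finally:
        server.shutdown()
        server.server_close()


ua = "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1866.237 Safari/537.36"
print(echoed_user_agent(ua) == ua)
```

In a real crawler you would pass `ua.chrome` or `ua.random` from fake_useragent instead of the fixed string.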
