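When writing HTTP request code in Python, the requests library is the recommended choice. It is more concise than urllib: requests builds and sends a GET or POST request in a single call, whereas with urllib.request you first construct the request and then send it in a separate step. The requests module also offers friendlier methods and attributes for parsing the response.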
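Learning objectives:

- Build general-purpose request messages
- Customize the headers, query string, and message body of a request
- Inspect request and response data
- Send files
- Use JSON as the payload

Install the library first:

    pip install requests
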
HTTP message types

Overview of the HTTP request format

Client request message

The request message that a client sends to a server consists of four parts:

- Request line
- Request headers
- Blank line
- Request data (body)

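The figure below shows the general format of a request message.

The server's response message also consists of four parts: status line, response headers, blank line, and response body.
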
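As a rough illustration of the four-part layout described above, the sketch below assembles a request message by hand and splits a response message back into its parts. The helper names `build_request_message` and `split_response_message` and the host `example.com` are made up for this example; in practice requests does all of this for you.

```python
def build_request_message(method, path, headers, body=b""):
    """Assemble request line + headers + blank line + body into raw bytes."""
    request_line = f"{method} {path} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join([request_line, *header_lines])
    # The empty line (CRLF CRLF) separates the headers from the body.
    return head.encode("ascii") + b"\r\n\r\n" + body


def split_response_message(raw):
    """Split a raw response into status line, header dict and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return status_line, headers, body


body = b'{"Id": 12345}'
message = build_request_message(
    "POST",
    "/echo/post/json",
    {"Host": "example.com",            # hypothetical host
     "Content-Type": "application/json",
     "Content-Length": str(len(body))},
    body,
)
print(message.decode("ascii"))

raw_response = (b"HTTP/1.1 200 OK\r\n"
                b"Content-Type: application/json\r\n"
                b"Content-Length: 18\r\n"
                b"\r\n"
                b'{"success":"true"}')
status_line, response_headers, response_body = split_response_message(raw_response)
print(status_line, response_headers, response_body)
```

The blank line is what lets the receiver tell where the headers end and the body begins, which is why both helpers are built around `\r\n\r\n`.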
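requests.get() prepares an HTTP GET request, sends it to the given URL, and returns a Response object.

    requests.get(url, params=None, **kwargs)

- url: the URL of the page to fetch
- params: extra parameters appended to the URL, given as a dict or bytes; optional
- **kwargs: optional; 12 further arguments that control the request

URL format: http://host_ip:port/path/add?key1=value1&key2=value2

Pass the parameters as a dict: params={ key1: value1, key2: value2 }

    response = requests.get(
        'https://api.github.com/search/repositories',
        params={'q': 'requests+language:python'},
    )
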
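To see how the params dict ends up in the URL, a request can be prepared without being sent. This is a minimal sketch; the host, port and path are placeholders taken from the URL format above, not a real service.

```python
import requests

# Build the request locally; nothing is sent, so no network access is needed.
request = requests.Request(
    "GET",
    "http://127.0.0.1:8080/path/add",   # placeholder address
    params={"key1": "value1", "key2": "value2"},
)
prepared = request.prepare()

# The params dict has been appended to the URL as the query string.
print(prepared.method)
print(prepared.url)
```

The same `prepared` object is what requests would hand to the network layer, so printing it is a convenient way to check a query string before sending anything.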
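One caveat: a URL may only contain ASCII characters, so parameters that include Chinese characters have to be percent-encoded. requests does this automatically for anything passed through params. When you assemble the URL yourself (for example for urllib.request), encode the parameters first, which turns the UTF-8 text into ASCII escapes:

    urllib.parse.urlencode(dic)

To decode:

    urllib.parse.unquote(string)
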
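A short round trip with the standard library shows what the encoding step produces. The keyword used here is only sample input.

```python
from urllib.parse import parse_qs, unquote, urlencode

# Percent-encode the UTF-8 bytes of the value so the result is pure ASCII.
query = urlencode({"wd": "中文"})
print(query)

# Decode the escapes back into readable text.
print(unquote(query))

# Or parse the query string back into a dict of lists.
print(parse_qs(query))
```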
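Examples:

    import requests

    # Example 1: search for a keyword typed in by the user
    keyword = input('Enter a search keyword: ')
    header = {'User-Agent': 'haha'}
    url = 'http://www.baidu.com/s/'
    # requests percent-encodes the params dict itself, so no manual urlencode is needed
    response = requests.get(url, params={'wd': keyword}, headers=header)

    # Example 2: send a browser-like User-Agent
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36"}
    response = requests.get("http://www.baidu.com", headers=headers)

    # Example 3: send cookies (the keyword argument is cookies, not cookie)
    cookies = {"name": "haha"}
    response = requests.get("http://www.baidu.com", cookies=cookies)

2. Common attributes and methods of the Response object

![Response object attributes and methods](https://img-blog.csdnimg.cn/d3119f252db04aa0841bb3049303c181.jpeg#pic_center)
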
After receiving a response, check its status_code; where needed, handle error responses such as 404 or 500 separately.

    import requests
    from requests.exceptions import HTTPError

    for url in ['https://api.github.com', 'https://api.github.com/invalid']:
        try:
            response = requests.get(url)
            # If the response was successful, no Exception will be raised
            response.raise_for_status()
        except HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')  # Python 3.6
        except Exception as err:
            print(f'Other error occurred: {err}')  # Python 3.6
        else:
            print('Success!')

Calling .raise_for_status() raises an HTTPError exception when the status_code indicates an error (4xx or 5xx).
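One possible way to give 404 and 5xx responses their own handling is to catch the HTTPError and look at the status code of the response attached to it. The function name `check_response` and the labels it returns are made up for this sketch.

```python
import requests
from requests.exceptions import HTTPError


def check_response(response):
    """Return a short label describing the outcome of a response."""
    try:
        response.raise_for_status()   # raises HTTPError for 4xx and 5xx
    except HTTPError as err:
        status = err.response.status_code
        if status == 404:
            return "not found"
        if status >= 500:
            return "server error"
        return f"client error {status}"
    return "ok"


# Usage (needs network access):
# print(check_response(requests.get("https://api.github.com/invalid")))
```

Returning a label instead of printing keeps the decision about what to do next (retry, skip, abort) with the caller.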
Status codes

No. | Class | Description |
---|---|---|
1 | 1xx: Informational | The request was received and the process is continuing. |
2 | 2xx: Success | The action was successfully received, understood, and accepted. |
3 | 3xx: Redirection | Further action must be taken in order to complete the request. |
4 | 4xx: Client Error | The request contains incorrect syntax or cannot be fulfilled. |
5 | 5xx: Server Error | The server failed to fulfill an apparently valid request. |

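The classes in the table follow from the first digit of the code, which can be sketched with the standard library alone. `describe_status` and `STATUS_CLASSES` are names invented for this example.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return (class name, reason phrase) for a numeric status code."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:          # the code is not a registered status
        phrase = ""
    return status_class, phrase


for code in (200, 301, 404, 500):
    print(code, describe_status(code))
```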
Common methods

If the response body is in JSON format, Response.json() converts it into a Python object, typically a dict.
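When the body might not be JSON at all (an HTML error page, for instance), Response.json() raises an exception. A small wrapper along these lines avoids a crash; `json_or_none` is a hypothetical helper name.

```python
import requests


def json_or_none(response):
    """Return the decoded JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        # The decoding error raised by requests is a ValueError subclass.
        return None


# Usage (needs network access):
# data = json_or_none(requests.get("https://api.github.com"))
# print(type(data))
```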
Abnormal responses

If a request comes back with an unexpected response, you can inspect what was actually sent through response.request:

    >>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
    >>> response.request.headers['Content-Type']
    'application/json'
    >>> response.request.url
    'https://httpbin.org/post'
    >>> response.request.body
    b'{"key": "value"}'

Request parameters for GET: the query string

GET request parameters are passed through params={ }.

    response = requests.get(
        'https://api.github.com/search/repositories',
        params={'name': 'Jack', 'type': 'display'},
    )

3. POST requests

3.1 POST parameters

Unlike GET, POST parameters travel in the request body, also called the payload. They can be supplied as a dict, a tuple of key/value pairs, or a JSON string.

    import json
    import requests

    # Send a dict
    post_dict = {'key1': 'value1', 'key2': 'value2'}
    # Send a tuple of key/value pairs
    post_tuple = (('key1', 'value1'), ('key1', 'value2'))
    # Send JSON
    post_json = json.dumps({'some': 'data'})

    r1 = requests.post("http://httpbin.org/post", data=post_dict)
    r2 = requests.post("http://httpbin.org/post", data=post_tuple)
    r3 = requests.post("http://httpbin.org/post", data=post_json)

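The three payload styles produce different bodies and headers. The sketch below prepares each request locally (nothing is sent) and also includes the json= keyword for comparison; `prepare_post` is a helper name made up for this example.

```python
import json

import requests

URL = "http://httpbin.org/post"   # nothing is sent; the request is only prepared


def prepare_post(**kwargs):
    """Build a POST request locally and return (Content-Type, body)."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type"), prepared.body


form_dict = prepare_post(data={"key1": "value1", "key2": "value2"})
form_tuple = prepare_post(data=(("key1", "value1"), ("key1", "value2")))
raw_string = prepare_post(data=json.dumps({"some": "data"}))
as_json = prepare_post(json={"some": "data"})

for name, result in [("dict", form_dict), ("tuple", form_tuple),
                     ("JSON string via data=", raw_string), ("json=", as_json)]:
    print(name, result)
```

Note that a JSON string passed through data= goes out without a Content-Type header, while json= serializes the dict and sets Content-Type: application/json. For JSON APIs, json= is usually the safer choice.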
JSON format: the request as actually sent

    POST /echo/post/json HTTP/1.1
    Authorization: Bearer mt0dgHmLJMVQhvjpNXDyA83vA_Pxh33Y
    Accept: application/json
    Content-Type: application/json
    Content-Length: 85
    Host: reqbin.com

    {
      "Id": 12345,
      "Customer": "John Smith",
      "Quantity": 1,
      "Price": 10.00
    }

Server Response to HTTP POST Request

    HTTP/1.1 200 OK
    Content-Length: 19
    Content-Type: application/json

    {"success":"true"}

3.2 Setting cookies and headers on a POST request

The example below sets a cookie and custom headers on a POST request:

    import requests

    # Request data
    url = 'http://api.shein.com/v2/member/login'
    cookie = "token=code_space;"
    header = {
        "cookie": cookie,
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Connection": "keep-alive",
        "Content-Type": "application/json",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
    }
    data = {
        'user_id': '123456',
        'email': '123456@163.com'
    }
    timeout = 0.5
    # Content-Type is application/json, so send the dict with json= rather than data=
    resp = requests.post(url, headers=header, json=data, timeout=timeout)
    print(resp.text)
    print(type(resp.json()))

3.3 Sending files with POST

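Files are passed through the files argument, in one of three forms:

    import requests

    url = 'http://httpbin.org/post'

    # Form 1: pass an open file object
    with open('test.xls', 'rb') as f:
        response = requests.post(url, files={"files": f})
    print(response.text)

    # Form 2: pass (filename, file object, content type, extra headers)
    with open('t.xls', 'rb') as f:
        files = {'file': ('t.xls', f, 'application/vnd.ms-excel', {'Expires': '0'})}
        r = requests.post(url, files=files)
    print(r.text)

    # Form 3: send several files as a list of (field name, file tuple) pairs
    with open('t.csv', 'rb') as f1, open('bb.csv', 'rb') as f2:
        files = [('file', ('t.csv', f1, 'text/csv')),
                 ('file', ('bb.csv', f2, 'text/csv'))]
        response = requests.post(url, files=files)
    print(response.text)

Notes on Content-Type in POST requests
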
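The Content-Type header tells the receiver the media type of the message body: in a response it describes the returned content, and in a POST request it describes the data being sent.

Syntax:

    Content-Type: text/html; charset=utf-8
    Content-Type: multipart/form-data; boundary=something

Common values:

- application/x-www-form-urlencoded: form data is encoded as key/value pairs and sent to the server (the default format for form submissions)
- multipart/form-data: used to send a form together with files
- application/json: JSON data
- text/csv: CSV data
- text/html: web pages
- text/plain, text/xml: text

With application/x-www-form-urlencoded, the requests module encodes a dict passed through data= automatically, so no manual encoding is needed on the sending side.
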
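Which Content-Type requests picks depends on the keyword argument used. The following sketch prepares three requests locally and prints the header each would carry; `content_type_for` and the in-memory file name `t.csv` are assumptions made for the example.

```python
import requests

URL = "http://httpbin.org/post"   # placeholder; the requests are prepared, not sent


def content_type_for(**kwargs):
    """Return the Content-Type header requests would send for these arguments."""
    return requests.Request("POST", URL, **kwargs).prepare().headers["Content-Type"]


print(content_type_for(data={"key": "value"}))
print(content_type_for(json={"key": "value"}))
# An in-memory file given as (filename, content, content type).
print(content_type_for(files={"file": ("t.csv", b"a,b\n1,2\n", "text/csv")}))
```

The multipart header also carries a boundary value generated per request, which is why it is best left to requests rather than set by hand.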
Type | Values |
---|---|
Application | application/javascript, application/pdf, application/xhtml+xml, application/json, application/ld+json, application/xml, application/zip, application/x-www-form-urlencoded, application/octet-stream (binary stream data, such as ordinary file downloads) |
Audio | audio/mpeg, audio/x-ms-wma, audio/x-wav |
Image | image/gif, image/jpeg, image/png, image/tiff, image/vnd.microsoft.icon, image/x-icon, image/vnd.djvu, image/svg+xml |
Multipart | multipart/mixed, multipart/alternative, multipart/related (used by MHTML, i.e. HTML mail), multipart/form-data |
Text | text/css, text/csv, text/html, text/plain, text/xml |
Video | video/mpeg, video/mp4, video/quicktime, video/x-ms-wmv, video/x-msvideo, video/x-flv, video/webm |
VND | application/vnd.oasis.opendocument.text, application/vnd.oasis.opendocument.spreadsheet, application/vnd.oasis.opendocument.presentation, application/vnd.oasis.opendocument.graphics, application/vnd.ms-excel, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.ms-powerpoint, application/vnd.openxmlformats-officedocument.presentationml.presentation, application/msword, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/vnd.mozilla.xul+xml |

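When uploading a file, the media type does not have to be looked up in the table by hand. As one option, the standard library's mimetypes module can guess it from the file extension; the file names below are invented, and the result may vary with the platform's type registry.

```python
import mimetypes

for filename in ("page.html", "photo.png", "data.json"):
    # guess_type returns (type, encoding); only the type is needed here.
    content_type, _encoding = mimetypes.guess_type(filename)
    print(filename, "->", content_type)
```

The guessed value can then be used as the third element of a files tuple such as ('t.xls', file_object, content_type).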
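The payload is the data sent over HTTP with POST or GET, including request parameters, files, images, and so on. The sender can prepare this data as JSON, and the receiver decodes it as JSON.

The two sides agree on this through the headers, as shown below. Note that the headers themselves cannot be in JSON format.

    POST /echo/post/json HTTP/1.1
    Host: reqbin.com
    Accept: application/json
    Content-Type: application/json
    Content-Length: 52

    { "Id": 12345 }
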
The response body can also be sent as JSON:

    { "success": "true", "data": { "TotalOrders": 100 } }

5. Other requests methods

    >>> requests.post('https://httpbin.org/post', data={'key':'value'})
    >>> requests.put('https://httpbin.org/put', data={'key':'value'})
    >>> requests.delete('https://httpbin.org/delete')
    >>> requests.head('https://httpbin.org/get')
    >>> requests.patch('https://httpbin.org/patch', data={'key':'value'})
    >>> requests.options('https://httpbin.org/get')

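Each of these helpers is a thin wrapper around requests.request(method, url, ...), so the HTTP method can also be chosen at run time. The sketch below prepares the requests locally instead of sending them, reusing the httpbin.org paths from the calls above.

```python
import requests

BASE = "https://httpbin.org"   # same test service as above; nothing is sent here

# Preparing the requests locally shows the method and URL
# that would go on the wire for each call.
prepared = [
    requests.Request(method, f"{BASE}/{path}").prepare()
    for method, path in [("PUT", "put"), ("DELETE", "delete"),
                         ("HEAD", "get"), ("PATCH", "patch"), ("OPTIONS", "get")]
]
for p in prepared:
    print(p.method, p.url)
```

To actually send one of them, `requests.request("PUT", f"{BASE}/put", data={"key": "value"})` is equivalent to the `requests.put(...)` call shown earlier.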