Content of a GET request gets messed up when converted to BeautifulSoup

2024-05-09 • Q&A

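When I try to scrape a website (Amazon in this case, but the same happens with many other sites), the content of the GET request looks fine when I inspect it directly.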

print(response.content)

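But once it is converted to a BeautifulSoup object, the </body> and </html> closing tags jump up: they appear too early, and the rest of the page ends up after them.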

<html>
    <head>
        ...
    </head>
    <body>
        ...
    </body>
</html>
...                # more content that needs to go in the body

Edit: here is the code:

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent header with the request.
myUserAgent = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36"
    )
}

URL = 'https://www.amazon.com/AOC-I1659FWUX-USB-Powered-Portable-1920x1080/dp/B06Y8SSQG5'

# Fetch the page, then parse the raw bytes with Python's built-in HTML parser.
response = requests.get(URL, headers=myUserAgent)
soup = BeautifulSoup(response.content, 'html.parser')
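One possible explanation, which the post itself does not confirm, is that the page contains markup (for example a stray closing tag) that 'html.parser' handles differently from a browser. The sketch below reproduces the symptom on a small, made-up snippet and compares the parsers Beautiful Soup can use. SAMPLE, late_content_in_body and compare_parsers are hypothetical names introduced only for illustration, and 'lxml' and 'html5lib' are optional third-party parsers that are reported as None when they are not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical, deliberately malformed markup: content continues after </html>.
SAMPLE = (
    "<html><head><title>t</title></head>"
    "<body><p id='early'>first</p></body></html>"
    "<p id='late'>second</p>"
)


def late_content_in_body(markup, parser):
    """Return True if the trailing <p id='late'> ended up inside <body>."""
    soup = BeautifulSoup(markup, parser)
    return soup.body is not None and soup.body.find(id="late") is not None


def compare_parsers(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Map each parser name to the check result, or None if it is not installed."""
    results = {}
    for name in parsers:
        try:
            results[name] = late_content_in_body(markup, name)
        except FeatureNotFound:
            # Raised by Beautiful Soup when the requested parser is unavailable.
            results[name] = None
    return results


if __name__ == "__main__":
    for name, inside in compare_parsers(SAMPLE).items():
        print(name, "->", inside)
```

If another parser keeps the trailing content inside the body for the real page, swapping the second argument of BeautifulSoup(response.content, ...) to that parser name would be the thing to try.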


baobeisl521 answered: Content of a GET request gets messed up when converted to BeautifulSoup
