Generating Yahoo News and Bing News URLs with Python and BeautifulSoup

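I want to scrape data from the Yahoo News and Bing News pages. The data I want is the headline and/or the text below the headline (whatever can be scraped), plus the date (time) of publication.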

I have written some code, but it does not return anything useful. The problem seems to be the URL I am using, since I get a 404 response.

Can you help me?

Here is the code for Bing:

from bs4 import BeautifulSoup
import requests

term = 'usa'
url = 'http://www.bing.com/news/q?s={}'.format(term)

response = requests.get(url)
print(response)

soup = BeautifulSoup(response.text,'html.parser')
print(soup)

And here is the one for Yahoo:

term = 'usa'

url = 'http://news.search.yahoo.com/q?s={}'.format(term)

response = requests.get(url)
print(response)

soup = BeautifulSoup(response.text,'html.parser')
print(soup)

Please help me generate these URLs and explain the logic behind them. I am still a beginner :)

wdd16617723 answered: Generating Yahoo News and Bing News URLs with Python and BeautifulSoup

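Basically, your URLs are wrong. The URL you need is the same one you see in the address bar when you run the search in a regular browser. Most search engines and aggregators use the q parameter for the search term. Most of the other parameters are usually optional, although some are occasionally needed, for example to specify the results page number.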
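As a rough illustration of that logic, here is a minimal sketch that builds the search URL from a base address plus a q parameter, using only the standard library. The helper name build_search_url and the two constants are made up for this example. The base URLs are the ones you see in the browser address bar, and both sites may change them at any time.

```python
from urllib.parse import urlencode

# Base search addresses as seen in the browser address bar (may change over time).
BING_NEWS = "https://www.bing.com/news/search"
YAHOO_NEWS = "https://news.search.yahoo.com/search"


def build_search_url(base_url, term, **extra_params):
    """Hypothetical helper: append the search term as the q parameter."""
    params = {"q": term}
    # Any additional query parameters are appended after q.
    params.update(extra_params)
    # urlencode escapes spaces and special characters in the term.
    return "{}?{}".format(base_url, urlencode(params))


print(build_search_url(BING_NEWS, "usa"))
print(build_search_url(YAHOO_NEWS, "new york"))
```

Encoding the term this way matters once it contains spaces or non-ASCII characters. With requests you could instead pass params={'q': term} to requests.get and let it build the query string for you.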

Bing

from bs4 import BeautifulSoup
import requests
import re
term = 'usa'
url = 'https://www.bing.com/news/search?q={}'.format(term)
response = requests.get(url)
soup = BeautifulSoup(response.text,'html.parser')
for news_card in soup.find_all('div',class_="news-card-body"):
    title = news_card.find('a',class_="title").text
    time = news_card.find(
        'span',attrs={'aria-label': re.compile(".*ago$")}
    ).text
    print("{} ({})".format(title,time))

Output

Jason Mohammed blitzkrieg sinks USA (17h)
USA Swimming held not liable by California jury in sexual abuse case (1d)
United States 4-1 Canada: USA secure payback in Nations League (1d)
USA always plays the Dalai Lama card in dealing with China,says Chinese Professor (1d)
...

Yahoo

from bs4 import BeautifulSoup
import requests
term = 'usa'
url = 'https://news.search.yahoo.com/search?q={}'.format(term)
response = requests.get(url)
soup = BeautifulSoup(response.text,'html.parser')
for news_item in soup.find_all('div',class_='NewsArticle'):
    title = news_item.find('h4').text
    time = news_item.find('span',class_='fc-2nd').text
    # Clean time text
    time = time.replace('·','').strip()
    print("{} ({})".format(title,time))

Output

USA Baseball will return to Arizona for second Olympic qualifying chance (52 minutes ago)
Prized White Sox prospect Andrew Vaughn wraps up stint with USA Baseball (28 minutes ago)
Mexico defeats USA in extras for Olympic berth (13 hours ago)
...
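
One caveat: both snippets call .text directly on the result of find(), which raises AttributeError if a card is missing the expected element. The class names above reflect the page markup at the time of writing and can change without notice. Here is a minimal sketch of a more defensive version, run against a made-up HTML fragment rather than the live page. SAMPLE_HTML, text_or_default and parse_cards are illustrative names, not part of any library.

```python
import re

from bs4 import BeautifulSoup

# Made-up fragment that imitates the Bing card structure used above.
# The second card deliberately has no time element.
SAMPLE_HTML = """
<div class="news-card-body">
  <a class="title" href="#">First headline</a>
  <span aria-label="2 hours ago">2h</span>
</div>
<div class="news-card-body">
  <a class="title" href="#">Second headline</a>
</div>
"""


def text_or_default(tag, default="unknown"):
    """Return the stripped text of a tag, or a default if find() returned None."""
    return tag.get_text(strip=True) if tag is not None else default


def parse_cards(html):
    """Collect (title, age) pairs, tolerating cards with missing elements."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for card in soup.find_all("div", class_="news-card-body"):
        title = text_or_default(card.find("a", class_="title"))
        age = text_or_default(
            card.find("span", attrs={"aria-label": re.compile(r"ago$")})
        )
        results.append((title, age))
    return results


for title, age in parse_cards(SAMPLE_HTML):
    print("{} ({})".format(title, age))
```

If the real page returns no cards at all, it is worth printing response.status_code first to confirm that the request itself succeeded.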