JSTOR web scraping returns empty

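I am trying to collect article titles from JSTOR, but I always get an empty result. Here is an example of the HTML I want to scrape. My goal is to return the highlighted line: "Lower Court Reactions to Supreme Court Decisions: A Quantitative Examination". However, every time I run my code I get nothing back. What is wrong?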

import requests
from bs4 import BeautifulSoup

main1= "https://www.jstor.org/stable/i310437"
page = requests.get(main1)
soup = BeautifulSoup(page.content,'html.parser')
results = soup.findAll('span',class_="show-for-sr")

for result in results:
    print(result.text)

Answers:

You must add a User-Agent header to get the required information:

.......
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
}

......
page = requests.get(main1, headers=headers)
......

Sample output:

Volume Information
Front Matter
Note: Presidential Coattails Revisted
Oiling the Tax Committees in Congress, 1900-1974: Subgovernment Theory, the Overrepresentation Hypothesis, and the Oil Depletion Allowance
A Test of the Revolving Door Hypothesis at the FCC
Mao's Concept of Representation
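
To see what changes when the header is added, the following sketch prints the User-Agent that requests would send with and without a custom header. It only prepares the request and never contacts the server. The helper name user_agent_sent is hypothetical, and the idea that JSTOR rejects the library's default User-Agent is an assumption based on the behaviour described in the question, not something the site documents here.

```python
import requests

URL = "https://www.jstor.org/stable/i310437"
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0"}


def user_agent_sent(headers=None):
    """Return the User-Agent requests would send (hypothetical helper, no network access)."""
    session = requests.Session()
    request = requests.Request("GET", URL, headers=headers)
    # prepare_request merges the session's default headers with the per-request ones.
    prepared = session.prepare_request(request)
    return prepared.headers["User-Agent"]


default_ua = user_agent_sent()
custom_ua = user_agent_sent(BROWSER_HEADERS)
print(default_ua)  # the library default, which identifies the client as python-requests
print(custom_ua)   # the browser-like value supplied in BROWSER_HEADERS
```

If the first line starts with "python-requests/", the request identifies itself as a script, which would be consistent with the empty result in the question.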

You need to send a User-Agent in the request headers, and I would suggest doing it within a session:

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0'}
s = requests.Session()
# Request the home page first; the session keeps any cookies the site sets.
s.get('https://www.jstor.org', headers=headers)
r = s.get('https://www.jstor.org/stable/i310437', headers=headers)
soup = BeautifulSoup(r.text, 'lxml')
# Every element with the class "show-for-sr" holds one title.
titles = soup.select('.show-for-sr')
for title in titles:
    print(title.get_text(strip=True))

Output:

Volume Information
Front Matter
Note: Presidential Coattails Revisted
Oiling the Tax Committees in Congress, 1900-1974: Subgovernment Theory, the Overrepresentation Hypothesis, and the Oil Depletion Allowance
A Test of the Revolving Door Hypothesis at the FCC
Mao's Concept of Representation
The Mass Public and Macroeconomic Performance: The Dynamics of Public Opinion Toward Unemployment and Inflation
Assessing the Candidate Preference Function
Another Look at the Life Cycle and Political Participation
Stratification and the Dimensions of American Political Orientations
Lower Court Reactions to Supreme Court Decisions: A Quantitative Examination
What They Don't Know Can Hurt You
Interpreting Heteroscedasticity
Back Matter
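
The parsing step can also be checked offline, which helps separate "the selector is wrong" from "the server sent back a page without the content". The sketch below runs the same .show-for-sr selector on a small HTML fragment. The fragment is made up for illustration and is not JSTOR's actual markup, and extract_titles is a hypothetical helper name. It uses the built-in html.parser so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Made-up fragment for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<ul>
  <li><span class="show-for-sr"> Front Matter </span></li>
  <li><span class="show-for-sr">Back Matter</span></li>
  <li><span class="other">not a title</span></li>
</ul>
"""


def extract_titles(html):
    """Return the stripped text of every element with the class show-for-sr."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select(".show-for-sr")]


titles = extract_titles(SAMPLE_HTML)
blocked = extract_titles("<html><body>Access check</body></html>")
print(titles)
print(blocked)
```

An empty list from the second call mirrors the symptom in the question: the selector itself works, but the page that was returned does not contain the expected elements.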

Note that this code requires lxml. If you have not installed it yet, run:

pip install lxml
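
If installing lxml is not an option, one possible workaround is to fall back to the parser that ships with Python. This is a minimal sketch, not part of the original answer; pick_parser is a hypothetical helper, and the two parsers can differ in how they repair malformed HTML.

```python
import importlib.util

from bs4 import BeautifulSoup


def pick_parser():
    """Return 'lxml' when it is installed, otherwise the built-in 'html.parser'."""
    return "lxml" if importlib.util.find_spec("lxml") else "html.parser"


parser = pick_parser()
soup = BeautifulSoup("<span class='show-for-sr'>Front Matter</span>", parser)
title = soup.select_one(".show-for-sr").get_text(strip=True)
print(parser, title)
```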
