Scraping Quora questions and answers with BeautifulSoup

The code I use to scrape a Quora question is as follows:

import requests
from bs4 import BeautifulSoup
import pandas as pd

URL = "https://www.quora.com/What-is-the-best-workout-1"

page = requests.get(URL)

soup = BeautifulSoup(page.text,"html.parser")

print(soup.find_all("span",{"class": "q-box qu-userSelect--text"}))

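The result is an empty list.

The problem is that page.text does not contain the same source code that I see when I inspect the element on Quora.

It contains the following text, which does not include any <span> elements

This is the code I get when using Inspect Element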

Answer:

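The content is most likely rendered by JavaScript after the page loads, which requests does not execute, so page.text only holds the initial HTML. Try driving a real browser with Selenium instead: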

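from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.service import Service
import time

# Selenium 4 style: the geckodriver path is passed through a Service object
driver = webdriver.Firefox(service=Service('c:/program/geckodriver.exe'))

URL = "https://www.quora.com/What-is-the-best-workout-1"
driver.get(URL)

PAUSE_TIME = 2  # seconds to wait for new content after each scroll

# Height of the page before scrolling
lh = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll to the bottom so the page loads more content
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(PAUSE_TIME)
    nh = driver.execute_script("return document.body.scrollHeight")
    if nh == lh:
        break  # height unchanged, so nothing more was loaded
    lh = nh

# find_elements_by_css_selector was removed in Selenium 4; use By.CSS_SELECTOR
spans = driver.find_elements(By.CSS_SELECTOR, 'span.q-box.qu-userSelect--text')
for span in spans:
    print(span.text)
    print('-' * 80)

driver.quit()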

Prints:

What is the best workout?
--------------------------------------------------------------------------------
The best workout is the one you don't skip.
Look, you can discuss sets and reps, crossfit and powerlifting, diet and supplements endlessly. And there is some value in it, if only just for entertainment sometimes (especially on the internet). But let's just get one thing straight here - if you are doing any kind of workout then it's going to have a greater impact than if you weren't. Simple as that.
Of course there are caveats. You don't want to get hurt, so they can pretty much all be summed up into one commandment: Thou shalt not be an idiot. Getting under a bar loaded with 495 lbs and squattin
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
What are some at-home workouts?
--------------------------------------------------------------------------------
Gyms are closed here because of the Coronavirus. What are your top 3 bodyweight exercises for building muscle?
--------------------------------------------------------------------------------
What is the best body weight workout routine?
--------------------------------------------------------------------------------

And so on...

I'm not sure that q-box qu-userSelect--text is really what you want to select, but it is what you asked for.
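On the selector itself: in CSS, span.q-box.qu-userSelect--text matches any span that carries both classes, in any order and possibly alongside other classes. The sketch below illustrates that rule on a small hand-written HTML string, using only the standard library's html.parser so it runs without a browser. The class name SpanClassFinder and the sample markup are made up for illustration; this is not Quora's real markup.

```python
from html.parser import HTMLParser


class SpanClassFinder(HTMLParser):
    """Collect the class attribute of every <span> that has all required classes."""

    def __init__(self, required):
        super().__init__()
        self.required = set(required)
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        class_value = dict(attrs).get("class") or ""
        # A span matches when its class tokens include every required class
        if self.required <= set(class_value.split()):
            self.matches.append(class_value)


# Hand-written sample markup (an assumption, not Quora's real HTML)
SAMPLE = (
    '<span class="q-box qu-userSelect--text">a</span>'
    '<span class="qu-userSelect--text q-box extra">b</span>'
    '<span class="q-box">c</span>'
    '<div class="q-box qu-userSelect--text">d</div>'
)

finder = SpanClassFinder(["q-box", "qu-userSelect--text"])
finder.feed(SAMPLE)
print(finder.matches)
```

By contrast, BeautifulSoup's find_all with {"class": "q-box qu-userSelect--text"} compares against the exact class string, so a span whose classes appear in a different order would not match; soup.select('span.q-box.qu-userSelect--text') follows the CSS rule.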

A note on Selenium: you need the selenium package and geckodriver, and in this code geckodriver is loaded from c:/program/geckodriver.exe
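The scroll loop in the code above keeps going until the page height stops growing, which on a long feed could take a very long time. Below is a minimal sketch of the same loop wrapped in a function with an upper bound on the number of scrolls. The function name scroll_to_bottom and the max_rounds parameter are assumptions made for illustration and are not part of Selenium; the driver argument only needs an execute_script method, which Selenium WebDriver objects provide.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    Returns the number of scrolls performed. max_rounds is a hypothetical
    safety cap, not a Selenium setting.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the page time to load more content
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # height unchanged, so nothing new was loaded
        last_height = new_height
    return rounds
```

With a real driver, one would call scroll_to_bottom(driver) after driver.get(URL) and before collecting the spans.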

