Unable to scrape the text of all UL tags when web scraping with Python

I am new to Python web scraping, and for practice I am trying to scrape one of the Wikiquote quote pages.

Link to the Wikiquote page

The code I tried:

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
import re


req = Request('https://en.wikiquote.org/wiki/India',
              headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
html = BeautifulSoup(webpage, 'html.parser')
quotes = html.find('ul').findAll("b")
print(quotes)

I get the first quote, but I want all the quotes on the page.

Can anyone suggest a solution? TIA!

seraph59 answered: Unable to scrape the text of all UL tags when web scraping with Python

find returns only the first matching element, so you have to use findAll to get every ul and then extract the text from each one:

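from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Send a browser-like User-Agent header with the request.
req = Request('https://en.wikiquote.org/wiki/India',
              headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
html = BeautifulSoup(webpage, 'html.parser')

# find_all (older alias: findAll) returns every <ul> on the page,
# whereas find returns only the first one.
quotes = html.find_all('ul')
for quote in quotes:
    print(quote.get_text())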

Result:

In India I found a race of mortals living upon the Earth, but not adhering to it. Inhabiting cities, but not being fixed to them, possessing everything but possessed by nothing.
Apollonius of Tyana, quoted in The Transition to a Global Society (1991) by Kishor Gandhi, p. 17, and in The Age of Elephants (2006) by Peter Moss, p. v
Apollonius of Tyana, p. v
This also is remarkable in India, that all Indians are free, and no Indian at all is a slave. In this the Indians agree with the Lacedaemonians. Yet the Lacedaemonians have Helots for slaves, who perform the duties of slaves; but the Indians have no slaves at all, much less is any Indian a slave.
Arrian, Anabasis Alexandri, Book VII : Indica, as translated by Edgar Iliff Robson (1929), p. 335
Arrian, p. 335
No Indian ever went outside his own country on a warlike expedition, so righteous were they.
Arrian, p. 18
Arrian, p. 18
India of the ages is not dead nor has She spoken her last creative word; She lives and has still something to do for herself and the human peoples. And that which must seek now to awake is not an Anglicized oriental people, docile pupil of the West and doomed to repeat the cycle of the Occident's success and failure, but still the ancient immemorial Shakti recovering Her deepest self, lifting Her head higher toward the supreme source of light and strength and turning to discover the complete meaning and a vaster form of her Dharma.
Sri Aurobindo, in the last issue of Arya: A Philosophical Review (January 1921), as quoted in The Modern Review, Vol. 29 (1921), p. 626.
Sri Aurobindo, p. 626.
For what is a nation? What is our mother-country? It is not a piece of earth, nor a figure of speech, nor a fiction of the mind. It is a mighty Shakti, composed of the Shaktis of all the millions of units that make up the nation, just as Bhawani Mahisha Mardini sprang into being from the Shaktis of all the millions of gods assembled in one mass of force and welded into unity. The Shakti we call India, Bhawani Bharati, is the living unity of the Shaktis of three hundred million people …
Sri Aurobindo (Bhawāni Mandir) quoted in Issues of Identity in Indian English Fiction: A Close Reading of Canonical Indian English Novels by H. S. Komalesha
Sri Aurobindo (Bhawāni Mandir) quoted in Issues of Identity in Indian English Fiction: A Close Reading of Canonical Indian English Novels by H. S. Komalesha
India is the guru of the nations, the physician of the human soul in its profounder maladies; she is destined once more to remould the life of the world and restore the peace of the human spirit. But Swaraj is the necessary condition of her work and before she can do the work, she must fulfil the condition.
Sri Aurobindo, Sri Aurobindo Mandir Annual (1947), p. 196
Sri Aurobindo, p. 196
...
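
The result above repeats the attribution lines because findAll also returns the ul elements nested inside each quote. As a rough sketch of one way around that, the snippet below keeps only top-level list items and pairs each bold quote with its nested attribution. It runs on a small inline sample rather than the live page: SAMPLE_HTML and the extract_quotes helper are hypothetical, and the assumed layout (quote in a b tag, attribution in a nested ul) is a simplification that may not match the real page exactly.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the page markup (an assumption,
# not copied from the live site).
SAMPLE_HTML = """
<div id="content">
  <ul>
    <li><b>First quote.</b>
      <ul><li>Author A, Source A</li></ul>
    </li>
  </ul>
  <ul>
    <li><b>Second quote.</b>
      <ul><li>Author B, Source B</li></ul>
    </li>
  </ul>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching <ul>, so only one quote is seen.
first_only = [b.get_text(strip=True) for b in soup.find("ul").find_all("b")]

# find_all() returns every <ul>, nested ones included, so each attribution
# list is visited a second time on its own.
all_lists = soup.find_all("ul")


def extract_quotes(soup):
    """Return (quote, attribution) pairs from top-level list items only."""
    pairs = []
    for ul in soup.find_all("ul"):
        # Skip lists nested inside a list item; they hold attributions.
        if ul.find_parent("li") is not None:
            continue
        # recursive=False restricts the search to direct <li> children.
        for li in ul.find_all("li", recursive=False):
            bold = li.find("b")
            if bold is None:
                continue  # not a quote entry under this assumed layout
            nested = li.find("ul")
            attribution = nested.get_text(" ", strip=True) if nested else ""
            pairs.append((bold.get_text(strip=True), attribution))
    return pairs


quotes = extract_quotes(soup)
for text, source in quotes:
    print(text, "|", source)
```

On the real page the same idea would need checking against the actual markup, since navigation menus and other lists are also ul elements and some quotes may not be wrapped in b tags.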