Regular expressions are not the best tool for this. I would use an HTML parser instead, for example BeautifulSoup: pip install beautifulsoup4
Then:
from bs4 import BeautifulSoup
raw_1 = '''
<div class="textbkStyle">Renewal/Expiration Date:
<div class="responseText">
01/01/2019
</div>
</div>
'''
raw_2 = '''
<div class="textbkStyle">Renewal/Expiration Date:
<div class="responseText">
NOT AVAILABLE
</div>
</div>
'''
soup = BeautifulSoup(raw_1,'html.parser')
print(soup.find('div',{'class':'responseText'}).getText(strip=True))
soup_2 = BeautifulSoup(raw_2,'html.parser')
print(soup_2.find('div',{'class':'responseText'}).getText(strip=True))
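Or as a function:
def get_response_text(raw):
    # Parse the fragment and look up the inner div by its class
    soup = BeautifulSoup(raw, 'html.parser')
    tag = soup.find('div', {'class': 'responseText'})
    # find() returns None when no matching tag exists
    return tag.getText(strip=True) if tag is not None else None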
print(get_response_text(raw_1))
print(get_response_text(raw_2))
Although you shouldn't do it this way, it can be done with the following regular expression:
<div class=\"textbkStyle\">Renewal/Expiration Date:\s*<div class=\"responseText\">\s*(\d{2}/\d{2}/\d{4})\s*</div>\s*</div>
Your date will be in capture group \1.
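As a rough sketch of how that pattern might be applied in Python with the standard `re` module: the names `DATE_RE`, `extract_date`, `sample_with_date` and `sample_without_date` are made up for illustration, and the sample markup is assumed to look exactly like the snippet in the question. The backslash-escaped quotes from the pattern are dropped here because they are not needed inside a single-quoted raw string.

```python
import re

# The pattern from the answer, split over two raw strings for readability.
# Group 1 captures a date in dd/dd/dddd form.
DATE_RE = re.compile(
    r'<div class="textbkStyle">Renewal/Expiration Date:\s*'
    r'<div class="responseText">\s*(\d{2}/\d{2}/\d{4})\s*</div>\s*</div>'
)

def extract_date(raw):
    """Return the captured date string, or None when the pattern does not match."""
    match = DATE_RE.search(raw)
    return match.group(1) if match else None

# Hypothetical sample inputs modelled on the markup in the question.
sample_with_date = '''
<div class="textbkStyle">Renewal/Expiration Date:
<div class="responseText">
01/01/2019
</div>
</div>
'''
sample_without_date = sample_with_date.replace("01/01/2019", "NOT AVAILABLE")

print(extract_date(sample_with_date))
print(extract_date(sample_without_date))
```

Note that this only matches when the markup has exactly this shape; an extra attribute or a different nesting would make the search return None, which is one reason a parser is usually the safer choice.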