此脚本的目的是在电子邮件中查找超链接并自动打开它们。我目前停留在搜索部分。
该脚本似乎无法从电子邮件正文中提取链接。超链接应类似于
https://something.com/verify/c4b7668ad547922226426896f
我的正则表达式有问题吗?
def process_mailbox(M):
rv,data = M.search(None,specific_email_addy)
if rv != 'OK':
print "No messages found!"
return
for num in data[0].split():
rv,data = M.fetch(num,'(RFC822)')
if rv != 'OK':
print "ERROR getting message",num
return
msg = email.message_from_string(data[0][1])
raw_email = data[0][1] # here's the body,which is raw headers and html and body of the whole email including headers and alternate payloads
msg = email.message_from_string(raw_email)
for part in msg.walk():
# each part is a either non-multipart,or another multipart message
# that contains further parts... Message is organized like a tree
if part.get_content_type() == 'text/html':
plain_text = part.get_payload()
link_pattern = re.compile('<a[^>]+href=\'(.*?)\'[^>]*>(.*)?</a>')
search = link_pattern.search(plain_text)
if search is not None:
print("Link found! -> " + search)
break
else:
print("No links were found.")