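I have a DataFrame that contains the full chat between users and customer agents. I want to extract only the users' messages and create a new row for each one, keeping the same ticket ID: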
import re
import pandas as pd

ticket_id = pd.DataFrame(["1", "2"]).rename(columns={0: "Ticket-ID"})
full_chat = pd.DataFrame([
    "User foo foo foo 12:12 PM,Agent bar bar bar 12:12 PM,User foo foo 12:13 PM,Agent bar bar 12:13 PM,User foo 12:14 PM,Agent bar 12:14 PM",
    "User bar bar bar 12:12 PM,Agent foo foo foo 12:12 PM,User bar bar 12:13 PM",
]).rename(columns={0: "Full-Chat"})
merge_chat = pd.merge(ticket_id, full_chat, left_index=True, right_index=True, how='outer')

def _split_row(text):
    cleaned_text = text.lower()
    # Capture the text between "user " and the following hh:mm timestamp.
    lines = re.findall(r"\b\w*user\b\ (.*?)\ *\d\d:\d\d*", cleaned_text)
    for line in lines:
        # This only prints each message as a list of words; nothing is returned,
        # so apply() below produces a column of None.
        print(line.split())

print(merge_chat["Full-Chat"].apply(_split_row))
This is the result I would like:
Ticket-ID  Full-Chat
1          foo foo foo
1          foo foo
1          foo
2          bar bar bar
2          bar bar
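One possible way to get that shape is to return the list of matches from the function instead of printing it, then let `DataFrame.explode` turn each list element into its own row. The following is only a sketch under a few assumptions: `USER_MESSAGE`, `extract_user_messages`, and `result` are illustrative names that are not in the code above, every user message is assumed to start with the word "User" and end at an `hh:mm` timestamp, and the regex is a simplified variant of the one above (case-insensitive rather than lowercasing the text).

```python
import re
import pandas as pd

# Same sample data as above, built directly as one frame.
merge_chat = pd.DataFrame({
    "Ticket-ID": ["1", "2"],
    "Full-Chat": [
        "User foo foo foo 12:12 PM,Agent bar bar bar 12:12 PM,User foo foo 12:13 PM,"
        "Agent bar bar 12:13 PM,User foo 12:14 PM,Agent bar 12:14 PM",
        "User bar bar bar 12:12 PM,Agent foo foo foo 12:12 PM,User bar bar 12:13 PM",
    ],
})

# Hypothetical pattern: the text after "User" up to the next hh:mm timestamp.
USER_MESSAGE = re.compile(r"\buser\s+(.*?)\s*\d{1,2}:\d{2}", flags=re.IGNORECASE)

def extract_user_messages(text):
    # Return a list of user messages rather than printing them.
    return USER_MESSAGE.findall(text)

result = (
    merge_chat
    .assign(**{"Full-Chat": merge_chat["Full-Chat"].apply(extract_user_messages)})
    .explode("Full-Chat")      # one row per list element; Ticket-ID is repeated
    .reset_index(drop=True)
)
print(result)
```

Note that a chat with no user messages yields an empty list, which `explode` turns into a single row with NaN in `Full-Chat`, and an agent message that itself contains the word "user" would also be picked up by this pattern.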