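I am trying to add a new column by passing multiple substring conditions to str.contains() and np.where(). This gives me the final result I want, but the code is very long. Is there a good way to reimplement it with pandas functions?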
df5['new_column'] = np.where(df5['sr_description'].str.contains('gross to net', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('gross up', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('net to gross', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('gross-to-net', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('gross-up', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('net-to-gross', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('gross 2 net', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('net 2 gross', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('gross net', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('net gross', case=False).fillna(False), 1,
    np.where(df5['sr_description'].str.contains('memo code', case=False).fillna(False), 1, 0)))))))))))
The expected output: if any of these strings is contained in 'sr_description', assign 1 to new_column; otherwise assign 0.
Perhaps the string conditions could be stored in a list, then read from it and applied through a single function call.
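One possible way to realize the list idea is sketched below. It assumes the keywords should be matched as literal, case-insensitive substrings, so each one is escaped and joined into a single regex alternation that is passed to str.contains() once. The names keywords and pattern are illustrative, and the small df5 built here is only a stand-in for the real frame, with one missing value added to show how na=False behaves.

```python
import re

import pandas as pd

# Keywords from the question, kept in one list so they are easy to maintain.
keywords = [
    'gross to net', 'gross up', 'net to gross',
    'gross-to-net', 'gross-up', 'net-to-gross',
    'gross 2 net', 'net 2 gross',
    'gross net', 'net gross', 'memo code',
]

# Escape each keyword and join them with "|" so a single regex
# matches if any one of them occurs in the text.
pattern = '|'.join(re.escape(k) for k in keywords)

# Stand-in data (an assumption for illustration); the None row
# represents a missing description.
df5 = pd.DataFrame({'sr_description': [
    'something with gross up.',
    'without those words.',
    'or with Net to gross',
    "if not then we give a '0'",
    None,
]})

# na=False makes missing values count as "no match", which replaces
# the separate .fillna(False) call; astype(int) turns True/False into 1/0.
df5['new_column'] = (
    df5['sr_description']
    .str.contains(pattern, case=False, na=False)
    .astype(int)
)

print(df5)
```

With this layout, adding or removing a condition only means editing the list, and np.where() is no longer needed because the boolean result is cast directly to integers.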
Edit:
Sample data:
sr_description               new_column
something with gross up.     1
without those words.         0
or with Net to gross         1
if not then we give a '0'    0