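I'm new to pandas, and I'm trying to add a column with groupby() based on a Spotfire calculated-column formula.
Suppose I have a table (df1) containing the following data: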
'Well ID','Assay','Source','Treat','BkgrdSub Fluorescence','Calced'
'A1',4,'Source 1','OPA',-215.75,0.035583351
'A2',-160.75,0.130472288
'A3',343.25,1
'H10',6,'OPP',9896,1
'H11',9892,0.999605226
'H12','CN',-1,1
'A1','Source 2',-170,0.03682641
'A2',-86,0.083431583
'A3',1566,'ZI',4885,0.809271732
'H11',6092,1
'H12',78,'Source 3',-114.5,0.037329147
'A2',0.037329147
'A3',3028.5,'ZIII',4245.375,0.85305734
'H11',5017.375,20.375,'Source 4',-183.375,0.017731683
'A2',-102.375,0.044831047
'A3',2752.625,'ZIIII',2635.75,0.697943562
'H11',3878.75,-10.25,'Source 5',-236.375,0
'A2',-199.375,0.028094153
'A3',1080.625,'ZV',3489,0.952202946
'H11',3676,31,'Source 6',-221.375,0.008870491
'A2',-150.375,0.050857481
'A3',1454.625,'ZVI',2224.375,1418.375,0.672457584
'H12',716.375,1
I'd like to add a calculated column defined by this Spotfire formula:
([BkgrdSub Fluorescence] - Min([BkgrdSub Fluorescence])) / Max([BkgrdSub Fluorescence] - Min([BkgrdSub Fluorescence])) OVER ([Treat],[Source],[Assay])
I built the script one step at a time, and now I'm trying to run it through groupby():
import pandas as pd
df1.insert(5,"Scaled BckgrdSub Fluorescence min","")
df1['Scaled BckgrdSub Fluorescence min'] = df1.groupby(['Treat','Assay'])['BkgrdSub Fluorescence'].transform('min')
df1.insert(6,"Scaled BckgrdSub Fluorescence eq","")
df1['Scaled BckgrdSub Fluorescence eq'] = df1[['BkgrdSub Fluorescence'] - ['Scaled BckgrdSub Fluorescence min']].groupby(df1['Treat'],df1['Source'],df1['Assay']).transform('max')
But I get this error:
TypeError: unsupported operand type(s) for -: 'list' and 'list'
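As far as I understand, this means I can't subtract a list from a list. So apparently this syntax doesn't support an expression inside the groupby() call.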
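To narrow down where the error comes from, here is a minimal sketch. The two-row frame `demo` and its values are assumptions for illustration only, not my real data. It suggests the TypeError is raised by plain Python when it evaluates the two bracketed lists of column names, before pandas or groupby() is involved, whereas subtracting the two selected columns works:

```python
import pandas as pd

# Toy values, assumed for illustration only
demo = pd.DataFrame({
    "BkgrdSub Fluorescence": [5.0, 7.0],
    "Scaled BckgrdSub Fluorescence min": [1.0, 1.0],
})

# Two Python lists of strings: this fails before pandas sees anything
try:
    ["BkgrdSub Fluorescence"] - ["Scaled BckgrdSub Fluorescence min"]
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# Two Series selected from the frame: element-wise subtraction works
diff = demo["BkgrdSub Fluorescence"] - demo["Scaled BckgrdSub Fluorescence min"]
print(diff.tolist())
```

So the subtraction itself seems fine as long as it is done on `df1[...]` columns rather than on bare lists of names.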
I also tried to sidestep this syntax error by keeping the subtraction out of groupby(), where "Scaled BckgrdSub Fluorescence" is the desired result column:
df1.insert(5,"")
df1['Scaled BckgrdSub Fluorescence eq'] = df1['BkgrdSub Fluorescence'] - df1['Scaled BckgrdSub Fluorescence min']
df1.insert(7,"Scaled BckgrdSub Fluorescence max","")
df1['Scaled BckgrdSub Fluorescence max'] = df1.groupby(['Treat','Assay'])['Scaled BckgrdSub Fluorescence eq'].transform('max')
df1.insert(8,"Scaled BckgrdSub Fluorescence","")
df1['Scaled BckgrdSub Fluorescence'] = df1['Scaled BckgrdSub Fluorescence eq'] / df1['Scaled BckgrdSub Fluorescence max']
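However, this does not match the calculated column that Spotfire produces.
The expected output from the Spotfire calculated column is already shown in the 'Calced' column.
So my question is: is there a simple way to add the column I want in a few lines with groupby() while still getting accurate results?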
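One possible reading, offered as a sketch rather than a confirmed answer: in the Spotfire expression the OVER clause may apply only to the outer Max(), so Min([BkgrdSub Fluorescence]) would be the minimum over the whole column and only the maximum would be taken per ([Treat],[Source],[Assay]) group. The frame `toy` below is hypothetical. Its fluorescence values are copied from the Source 1 and Source 5 rows above, but the Treat and Assay labels for the Source 5 rows are assumed.

```python
import pandas as pd

# Hypothetical frame: fluorescence values copied from the table above,
# Treat/Assay labels for the "Source 5" rows are assumed
toy = pd.DataFrame({
    "Treat": ["OPA"] * 6,
    "Source": ["Source 1"] * 3 + ["Source 5"] * 3,
    "Assay": [4] * 6,
    "BkgrdSub Fluorescence": [-215.75, -160.75, 343.25,
                              -236.375, -199.375, 1080.625],
})

col = toy["BkgrdSub Fluorescence"]

# Subtract the minimum of the whole column (no grouping)
shifted = col - col.min()

# Take the maximum of the shifted values within each Treat/Source/Assay group
group_max = shifted.groupby(
    [toy["Treat"], toy["Source"], toy["Assay"]]
).transform("max")

toy["Scaled BckgrdSub Fluorescence"] = shifted / group_max
print(toy["Scaled BckgrdSub Fluorescence"].round(9).tolist())
```

Under that assumption the first two Source 1 values come out as roughly 0.035583351 and 0.130472288, which agrees with the 'Calced' column for those rows. A group whose shifted maximum is 0 would produce a division by zero (NaN) and would need separate handling.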