Empty pandas DataFrame when trying to apply thresholds

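I'm trying to threshold a pandas DataFrame that contains gene IDs and statistics. The input to my Python program is a config.yaml file that holds the initial threshold values and the path to a CSV file (which ends up as the DataFrame). The problem seems to arise when I pass the threshold variables into the filter that cuts down the DataFrame. With hard-coded numeric values (my deprecated approach) the thresholding works, but when I use variables that refer to the values in the config file, I get an empty DataFrame.

Here is my current implementation: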

    config = yaml.full_load(file)
    # for item,doc in config.items():
    # print (item,":",doc)
    input_path = config['DESeq_input']['path']
    # print(input_path)
    baseMean = config['baseMean']
    log2FoldChange = config['log2FoldChange']
    lfcSE = config['lfcSE']
    pvalue = config['pvalue']
    padj = config['padj']
    df = pd.read_csv(input_path)
    # print if 0 < than padj for test
    # convert to #,most likely being read as string
    # now use threshold value to cut down CSV
    # only columns defined in config.yaml file
    df_select = df[['genes','baseMean','log2FoldChange','lfcSE','pvalue','padj']]
    # print(df_select)
    # print(df_select['genes'])
    df_threshold = df_select.loc[(df_select['baseMean'] < baseMean)
                                     & (df_select['log2FoldChange'] < log2FoldChange)
                                     & (df_select['lfcSE'] < lfcSE)
                                     & (df_select['pvalue'] < pvalue)
                                     & (df_select['padj'] < padj)]
    print(df_threshold)

Here is my (deprecated) implementation, which works:

    df = pd.read_csv('/Users/nmaki/Documents/GitHub/IDEA/tests/eDESeq2.csv')
    # 'pvalue' must be selected here, otherwise the filter below raises a KeyError
    df_select = df[['genes','pvalue','padj','log2FoldChange']]
    df_threshold = df_select.loc[(df_select['pvalue'] < 0.05)
                               & (df_select['padj'] < 0.1)
                               & (df_select['log2FoldChange'] < 0.5)]
    print(df_threshold)

Running my current code produces:

    Empty DataFrame
    Columns: [genes, baseMean, log2FoldChange, lfcSE, pvalue, padj]
    Index: []

Sample contents of the CSV file that I load as a DataFrame:

"genes","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"ENSDARG00000000001",98.1095154977918,-0.134947665995593,0.306793322887575,-0.439865068527078,0.660034837008121,0.93904992415549
"ENSDARG00000000002",731.125841719954,0.666095249996351,0.161764851506172,4.11767602043598,3.82712199388831e-05,0.00235539468663284
"ENSDARG00000000018",367.699187187462,-0.170546910862128,0.147128047078344,-1.1591733476304,0.246385533026112,0.756573630543937
"ENSDARG00000000019",1133.08821430092,-0.131148919306121,0.104742185100469,-1.25211173683576,0.210529151546469,0.718240791187956
"ENSDARG00000000068",397.13408030651,-0.111332941901299,0.161417383863387,-0.689720891496564,0.49036972534723,0.8864754582597
"ENSDARG00000000069",1886.21783387126,-0.107901197025113,0.113522109960702,-0.950486183374019,0.341865271089735,0.82295928359482
"ENSDARG00000000086",246.197553048504,0.390421091410488,0.215725761369183,1.80980282063921,0.0703263703690051,0.466064880589034
"ENSDARG00000000103",797.782152145232,0.236382332789599,0.145111727277908,1.62896781138092,0.103319833277229,0.550658656731341
"ENSDARG00000000142",26.1411622212853,0.248419645848534,0.495298350652519,0.501555568519983,0.615980180267141,0.927327861190167
"ENSDARG00000000151",121.397701922367,0.276123125224845,0.244276041791451,1.13037333993066,0.25831894300396,0.766841249972654
"ENSDARG00000000161",22.2863001989718,0.837640942615127,0.542200061816621,1.54489274643135,0.122372208261173,0.587106227452529
"ENSDARG00000000183",215.47910609869,0.567221763062732,0.188807351259458,3.00423558340829,0.00266249076445763,0.0615311290935424
"ENSDARG00000000189",620.819069705942,0.0525797819665496,0.142171888686286,0.369832478504743,0.711507313969775,0.950479626809728
"ENSDARG00000000212",54472.1417532637,0.344813324409911,0.130070467015575,2.65097321722249,0.00802602056136946,0.132041563800088
"ENSDARG00000000229",172.985864037855,-0.0814838221355631,0.22200915791162,-0.367029103222856,0.713597309421024,0.95157821096128
"ENSDARG00000000241",511.449190233542,-0.431854805500191,0.157764756166574,-2.73733383801019,0.0061939401710654,0.114238610824236
"ENSDARG00000000324",179.189751392247,0.0141623609187069,0.206197755704643,0.0686833902256096,0.945241639658214,0.992706066946251
"ENSDARG00000000349",13.6578995386995,0.86981405362392,0.716688718472183,1.21365668414338,0.224878851627296,0.731932542953245
"ENSDARG00000000369",9.43959070533812,-0.042383076946964,0.868977019485631,-0.0487735302506061,0.961099776861288,NA
"ENSDARG00000000370",129.006520833067,0.619490133053518,0.250960632807829,2.46847533863165,0.0135690001510168,0.184768676917612
"ENSDARG00000000380",17.695581482726,-0.638493654324115,0.597289695632778,-1.06898488119351,0.285076482019819,0.786103920659844
"ENSDARG00000000394",2200.41651475378,-0.00605761754099435,0.0915611724486909,-0.0661592395443486,0.947251047773153,0.992978480118812
"ENSDARG00000000423",195.477813443242,-0.18634265895713,0.188820984694016,-0.986874733542448,0.323704052061987,0.810439992736898
"ENSDARG00000000442",1102.47980192551,0.0589654622770368,0.112333519273845,0.524914225586502,0.599642819781172,0.920807266898811
"ENSDARG00000000460",8.52822266110357,0.229130838495461,0.957763036484278,0.239235416034165,0.810923041830713,NA
"ENSDARG00000000472",0.840917787550721,-0.4234502342491,3.1634759582284,-0.133855998857105,0.893516444899853,NA
"ENSDARG00000000474",5.12612778660879,0.394871266508097,1.07671345623418,0.366737560696199,0.713814786364707,NA
"ENSDARG00000000476",75.8417047936895,0.242006157627571,0.349451220882324,0.692532013528336,0.488603288756242,0.885874315527816
"ENSDARG00000000489",1233.33364888202,0.0676458807753533,0.131846296650645,0.513066217965876,0.607905001380741,0.924392802283811

Answer from ief111111:

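It turns out my thresholds were too strict: I had added two extra variables (baseMean and lfcSE) that were not in my original implementation, and no row satisfied all five conditions at once. After loosening the thresholds I now get a populated DataFrame.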
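
One way to spot this kind of problem is to count how many rows each condition keeps on its own before combining them. The following is only a sketch: the threshold values are made up (the real config.yaml is not shown), the dictionary `thresholds` is a hypothetical stand-in for the values read from the config, and the rows are a small excerpt of the sample CSV above.

```python
import io

import pandas as pd

# A few rows copied from the sample CSV in the question.
CSV_TEXT = '''"genes","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj"
"ENSDARG00000000002",731.125841719954,0.666095249996351,0.161764851506172,4.11767602043598,3.82712199388831e-05,0.00235539468663284
"ENSDARG00000000183",215.47910609869,0.567221763062732,0.188807351259458,3.00423558340829,0.00266249076445763,0.0615311290935424
"ENSDARG00000000241",511.449190233542,-0.431854805500191,0.157764756166574,-2.73733383801019,0.0061939401710654,0.114238610824236
"ENSDARG00000000369",9.43959070533812,-0.042383076946964,0.868977019485631,-0.0487735302506061,0.961099776861288,NA
'''

# Hypothetical thresholds standing in for the values read from config.yaml.
thresholds = {
    "baseMean": 1000.0,
    "log2FoldChange": 0.5,
    "lfcSE": 0.2,
    "pvalue": 0.05,
    "padj": 0.1,
}

df = pd.read_csv(io.StringIO(CSV_TEXT))

# One boolean Series per condition. float() guards against a threshold that
# was parsed as a string; NaN (the NA entries in padj) compares as False.
masks = {col: df[col] < float(limit) for col, limit in thresholds.items()}

# Number of rows each condition keeps when applied alone.
passing = {col: int(mask.sum()) for col, mask in masks.items()}

# Rows that satisfy every condition at once.
combined = pd.concat(masks, axis=1).all(axis=1)
df_threshold = df.loc[combined, ["genes", *thresholds]]

print(passing)
print(len(df_threshold))
```

With these made-up thresholds every condition keeps at least two of the four rows on its own, yet no row passes all five, so the combined filter is empty. Rows whose padj is NA can never pass a `<` comparison, which is worth keeping in mind when adding conditions.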
