Python Pandas:我需要转换pandas数据框

数据框如下所示。我已经尝试过枢轴,堆叠,拆堆。有什么方法可以实现输出

   key  attribute   text_value  numeric_value   date_value
0   1   order       NaN         NaN             10/02/19
1   1   size        NaN         43.0            NaN
2   1   weight      NaN         22.0            NaN
3   1   price       NaN         33.0            NaN
4   1   segment     product     NaN             NaN
5   2   order       NaN         NaN             11/02/19
6   2   size        NaN         34.0            NaN
7   2   weight      NaN         32.0            NaN
8   2   price       NaN         89.0            NaN
9   2   segment     customer    NaN             NaN


I need the following output

key  order      size    weight    price    segment
1    10/2/2019  43.0    22.0      33.0     product
2    11/2/2019  34.0    32.0      89.0     customer

预先感谢

myzhan 回答:Python Pandas:我需要转换pandas数据框

我相信您不想更改输出数据中的dtypes,因此可能的解决方案是分别用DataFrame.dropnaDataFrame.pivot处理每一列,然后通过concat合并在一起:

df['date_value'] = pd.to_datetime(df['date_value'])

df1 = df.dropna(subset=['text_value']).pivot('key','attribute','text_value')
df2 = df.dropna(subset=['numeric_value']).pivot('key','numeric_value')
df3 = df.dropna(subset=['date_value']).pivot('key','date_value')

df = pd.concat([df1,df2,df3],axis=1).reindex(df['attribute'].unique(),axis=1)
print (df)
attribute      order  size  weight  price   segment
key                                                
1         2019-10-02  43.0    22.0   33.0   product
2         2019-11-02  34.0    32.0   89.0  customer

print (df.dtypes)
order      datetime64[ns]
size              float64
weight            float64
price             float64
segment            object
dtype: object

旧答案-所有值都强制转换为字符串:

df['date_value'] = pd.to_datetime(df['date_value'])

df['text_value'] = df['text_value'].fillna(df['numeric_value']).fillna(df['date_value'])

df = df.pivot('key','text_value')
print (df)
attribute                order price   segment size weight
key                                                       
1          1569974400000000000    33   product   43     22
2          1572652800000000000    89  customer   34     32

print (df.dtypes)
order      object
price      object
segment    object
size       object
weight     object
dtype: object
,

这是解决方案,我想通了

attr_dict = {'order':'date_value','size':'numeric_value','weight':'numeric_value','price':'numeric_value','segment':'text_value'}

output_table = pd.DataFrame()
for attr in attr_dict.keys():
    temp = input_table[input_table['attribute'] == attr][['key',attr_dict[attr]]]
    temp.rename(columns={attr_dict[attr]:attr},inplace=True)
    output_table[attr] = list(temp.values[:,1])
output_table
本文链接:https://www.f2er.com/3166628.html

大家都在问