Is there a way to carry a previous value forward and change it when a new condition is met?

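I have a dataset in which every ID has a date-time column and a value column. I have done some calculations on it, but I ran into trouble when using a recursive function.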

  

This is what the dataset looks like:

Date-Time     Volume      ID    Load  
10/22/2019     3862       10        
10/23/2019     3800       10        
10/24/2019     3700       10        
10/25/2019     5000       10     Yes   
10/26/2019     4900       10        
10/27/2019     4800       10        
10/22/2019     3862       11        
10/23/2019     3800       11        
10/24/2019     3700       11        
10/25/2019     5000       11     Yes        
10/26/2019     4900       11        
10/27/2019     4800       11           

I loop over the IDs in another function and call the calculation from there.

  

This is what I tried:

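import pandas as pd

curr_load = 0  # running count of loads seen so far for the current ID

def Load_number(row):
    global curr_load
    # a row starts a new load when its Load column is 'Yes'
    if row['Load'] == 'Yes':
        curr_load = curr_load + 1
    return curr_load

def calculations(data):
    global curr_load
    curr_load = 0  # restart the count for each ID
    data = data.copy()
    data['Load_number'] = data.apply(Load_number, axis=1)
    return data

newdata = pd.DataFrame()
for id_ in data['ID'].unique():
    subset = data.loc[data['ID'] == id_]  # rows for this ID only
    newdata = pd.concat([newdata, calculations(subset)])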

The required output is:

Date-Time     Volume      ID    Load    Load_number
10/22/2019     3862       100                0
10/23/2019     3800       100                0
10/24/2019     3700       100                0
10/25/2019     5000       100     Yes        1
10/26/2019     4900       100                1
10/27/2019     4800       100                1
10/28/2019     4700       100                1
10/22/2019     3862       111                0
10/23/2019     3800       111                0
10/24/2019     3700       111                0
10/25/2019     5000       111     Yes        1
10/26/2019     4900       111                1
10/27/2019     5800       111     Yes        2   
10/28/2019     5500       111                2     
10/29/2019     50000      111                2     

For the dates, the required output is:

Date-Time  Volume  ID  Load        LoadDate
10/22/2019    3862  10  None          0
10/23/2019    3800  10  None          0
10/24/2019    3700  10  None          0
10/25/2019    5000  10   Yes        10/25/2019
10/26/2019    4900  10  None        10/25/2019
10/27/2019    4800  10  None        10/25/2019
10/22/2019    3862  11  None           0
10/23/2019    3800  11  None           0
10/24/2019    3700  11  None           0
10/25/2019    5000  11   Yes        10/25/2019
10/26/2019    4900  11  None        10/25/2019
10/27/2019    4800  11  None        10/25/2019
Answer from qilinz:

This should do it:

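import numpy as np
# flag rows where a load occurs: 1 if Load is 'Yes', else 0
df['Load Number'] = np.where(df.Load == 'Yes', 1, 0)

# running total of the flags, restarted for each ID
df['Load Number'] = df.groupby('ID')['Load Number'].cumsum()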
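
To check the flag-then-cumsum idea on a small case, here is a minimal self-contained sketch. The sample rows are a subset of the required output in the question, and the helper name `add_load_number` is an assumption made for illustration; it does not appear in the original thread.

```python
import numpy as np
import pandas as pd

def add_load_number(df):
    """Return a copy of df with a per-ID running count of 'Yes' loads."""
    out = df.copy()
    # 1 where a load occurs, 0 elsewhere (missing values compare as False)
    out['Load Number'] = np.where(out['Load'] == 'Yes', 1, 0)
    # cumulative sum of the flags; groupby makes it restart for every ID
    out['Load Number'] = out.groupby('ID')['Load Number'].cumsum()
    return out

# sample rows (hypothetical subset of the question's data)
df = pd.DataFrame({
    'Date-Time': ['10/25/2019', '10/26/2019', '10/27/2019', '10/28/2019',
                  '10/24/2019', '10/25/2019'],
    'Volume': [5000, 4900, 5800, 5500, 3700, 5000],
    'ID': [111, 111, 111, 111, 100, 100],
    'Load': ['Yes', None, 'Yes', None, None, 'Yes'],
})

result = add_load_number(df)
print(result)
```

Because the count comes from a grouped cumulative sum, no global counter or per-ID loop is needed, and a second 'Yes' within the same ID simply moves the count from 1 to 2.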

(Edit) For the second question, you can use a similar approach:

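# keep the date on rows where a load occurs, NaN elsewhere
df['LoadDate'] = np.where(df.Load == 'Yes', df['Date-Time'], np.nan)

# carry the last load date forward within each ID; rows before the first load become 0
df['LoadDate'] = df.groupby('ID')['LoadDate'].ffill().fillna(0)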

Output:

     Date-Time  Volume  ID  Load  Load Number    LoadDate
0   10/22/2019    3862  10  None            0           0
1   10/23/2019    3800  10  None            0           0
2   10/24/2019    3700  10  None            0           0
3   10/25/2019    5000  10   Yes            1  10/25/2019
4   10/26/2019    4900  10  None            1  10/25/2019
5   10/27/2019    4800  10  None            1  10/25/2019
6   10/22/2019    3862  11  None            0           0
7   10/23/2019    3800  11  None            0           0
8   10/24/2019    3700  11  None            0           0
9   10/25/2019    5000  11   Yes            1  10/25/2019
10  10/26/2019    4900  11  None            1  10/25/2019
11  10/27/2019    4800  11  None            1  10/25/2019
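
The load-date step can be tried on its own with the sketch below. It is one possible variant rather than the answerer's exact code: it uses `Series.where` instead of `np.where`, and it casts the date column to `object` first so that mixing date strings with the integer 0 is unambiguous. The helper name `add_load_date` and the sample rows are assumptions for illustration.

```python
import pandas as pd

def add_load_date(df):
    """Return a copy of df with the most recent load date carried forward per ID."""
    out = df.copy()
    # keep the date only on rows where a load occurs; other rows become NaN
    # (cast to object so strings and the filler value 0 can share a column)
    out['LoadDate'] = out['Date-Time'].astype(object).where(out['Load'] == 'Yes')
    # forward-fill within each ID, then mark rows before the first load with 0
    out['LoadDate'] = out.groupby('ID')['LoadDate'].ffill().fillna(0)
    return out

# sample rows (hypothetical subset of the question's data)
df = pd.DataFrame({
    'Date-Time': ['10/24/2019', '10/25/2019', '10/26/2019',
                  '10/24/2019', '10/25/2019', '10/26/2019'],
    'Volume': [3700, 5000, 4900, 3700, 5000, 4900],
    'ID': [10, 10, 10, 11, 11, 11],
    'Load': [None, 'Yes', None, None, 'Yes', None],
})

result = add_load_date(df)
print(result)
```

The grouped `ffill` matters here: the first row of ID 11 does not inherit the load date from the last row of ID 10, so it stays at 0 until ID 11 has a load of its own.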