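I am using pandas' USFederalHolidayCalendar together with CustomBusinessDay from pandas.tseries.offsets to get the number of days between two dates stored in datetime columns of a DataFrame. However, the difference between the dates is sometimes off, and I am not sure whether the function is failing to account for holidays and/or weekends. Please see the code and output below and let me know whether I am doing something wrong or whether there is something I can change to avoid this problem. In this example, the only row where the function does not give the correct answer is row 0: between 2020-01-03 and the end of the month there are 20 working days, but forecast['redepdays'] shows 25. Is there something I should change in my function?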
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Rows whose Month equals conus_mth: days before and after the Start Date
forecast['conus_days'] = np.where(forecast['Month'] == forecast['conus_mth'], forecast['Start Date'] - forecast['start_month'].apply(us_bd), 0)
forecast['conus_days'] = forecast['conus_days'].dt.days
forecast['conus_days1'] = np.where(forecast['Month'] == forecast['conus_mth'], forecast['end_month'] - forecast['Start Date'], 0)
forecast['conus_days1'] = forecast['conus_days1'].dt.days

# Rows whose Month equals oconus_mth: days before and after the End Date
forecast['oconus_days'] = np.where(forecast['Month'] == forecast['oconus_mth'], forecast['End Date'] - forecast['start_month'], 0)
forecast['oconus_days'] = forecast['oconus_days'].dt.days
forecast['redepdays'] = np.where(forecast['Month'] == forecast['oconus_mth'], forecast['end_month'] - forecast['End Date'].apply(us_bd), 0)
forecast['redepdays'] = forecast['redepdays'].dt.days

print(forecast)
  Name      EID Start Date   End Date      Country  year  Month  \
0   GP   123456 2019-08-01 2020-01-03  Afghanistan  2020      1
1   MW  3456789 2019-09-22 2020-02-16        Conus  2020      1
2   MH   456789 2019-12-05 2020-03-12        Conus  2020      1
3   DR   789456 2019-09-11 2020-03-04         Iraq  2020      1
4   JR   985756 2020-01-03 2020-05-06      Germany  2020      1

   days_in_month start_month  end_month  working_days  conus_mth  oconus_mth  \
0             31  2020-01-01 2020-01-31            21          8           1
1             31  2020-01-01 2020-01-31            21          9           2
2             31  2020-01-01 2020-01-31            21         12           3
3             31  2020-01-01 2020-01-31            21          9           3
4             31  2020-01-01 2020-01-31            21          1           5

   conus_days  conus_days1  oconus_days  redepdays
0           0            0            2         25
1           0            0            0          0
2           0            0            0          0
3           0            0            0          0
4           1           28            0          0
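
For reference, here is a minimal sketch of what seems to be happening in row 0, followed by one possible way to count working days instead. It assumes that `.apply(us_bd)` simply moves each date forward by one business day (which is consistent with the 25 and the 1 in the output above) and that both the start and end dates should be counted. `business_days_between` and its `include_end` parameter are hypothetical names used for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Adding the offset shifts a date by one business day; it does not count days.
shifted = pd.Timestamp("2020-01-03") + us_bd  # Friday -> next business day
# The subtraction that follows is an ordinary calendar-day difference.
calendar_gap = (pd.Timestamp("2020-01-31") - shifted).days


def business_days_between(start, end, include_end=True):
    """Hypothetical helper: count US federal business days from start to end."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    holidays = USFederalHolidayCalendar().holidays(start=start, end=end)
    # np.busday_count works on the half-open interval [start, stop),
    # so move the stop one day forward when the end date should be counted.
    stop = end + pd.Timedelta(days=1) if include_end else end
    return int(np.busday_count(
        start.to_datetime64().astype("datetime64[D]"),
        stop.to_datetime64().astype("datetime64[D]"),
        holidays=holidays.values.astype("datetime64[D]"),
    ))


print(shifted.date(), calendar_gap)
print(business_days_between("2020-01-03", "2020-01-31"))
```

Under these assumptions the first print shows 2020-01-06 and 25, the same value as redepdays in row 0, while the helper returns 20 for 2020-01-03 through 2020-01-31, since weekends and the 2020-01-20 federal holiday are skipped. `np.busday_count` also accepts arrays, so whole columns could probably be passed after converting them with `.values.astype("datetime64[D]")`.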