In SARIMA

I have a question about the time series forecasting I am doing for a project.

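Because the time series has a seasonal component, I am using SARIMA (seasonal ARIMA) for forecasting. SARIMA takes the parameters p, d, q, P, D, Q and m. Most of these can be selected automatically with auto-ARIMA, but m, the number of observations per seasonal cycle (52 for weekly data with a yearly cycle, 12 for monthly data, and so on), is a term we have to supply manually.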

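How exactly do we determine this term? In my time series the cycle looks somewhat irregular: sometimes the pattern repeats every two weeks, sometimes every two months. We collect the data weekly (363 data points covering 2 years and 3 months).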
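One common heuristic for proposing a value of m is to look for the lag at which the sample autocorrelation peaks. The sketch below is only an illustration under stated assumptions: the helper names `autocorr` and `candidate_period` are hypothetical, the series is assumed to be roughly stationary (detrend or difference it first if it is not), and the synthetic sine wave stands in for real data.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def candidate_period(x, min_lag=2, max_lag=None):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    This is a heuristic starting point for m, not a definitive estimate.
    """
    n = len(x)
    if max_lag is None:
        max_lag = n // 2  # longer lags leave too few pairs to be reliable
    acf = [autocorr(x, k) for k in range(min_lag, max_lag + 1)]
    return min_lag + int(np.argmax(acf))

# Synthetic example (assumption): a clean cycle that repeats every 12 steps.
t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)
period = candidate_period(series)
```

If the peak is weak, or it moves around when the window changes, the series may not have a single stable period, which would be consistent with the irregular cycle described above.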

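Because we were unsure how to choose m, we left it at 52 (since we have one data point per week) and got a decent MAPE (about 9). When I increased m to 80, however, the MAPE dropped further to 3 and the forecast tracked the actual series more closely. When I increased it beyond 80, the code raised a ValueError.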
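For reference, MAPE (mean absolute percentage error) is the average of |actual - predicted| / |actual|, expressed as a percentage. A minimal sketch follows; the function name `mape` and the sample numbers are illustrative and are not taken from the project.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and that no actual value
    is zero (the ratio is undefined there).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one observation")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Illustrative values only: each forecast is off by 10% of the actual value.
score = mape([100.0, 200.0], [90.0, 220.0])
```

Because the score is an average over the test points, a MAPE computed on a short test set can swing noticeably from one value of m to the next.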

Does anyone know why this happens?

Edit

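I enlarged the test set and then increased m. That helped, but a higher m does not necessarily mean a lower MAPE: m = 80 gives a lower MAPE, while 81 and 85 give higher values. It looks random, but I am convinced that m matters a great deal for the shape of the forecast. I am attaching the plots here to make this clearer.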

SARIMA with m value 80

SARIMA with m value 81

SARIMA with m value 85

Error at high m values

ValueError                                Traceback (most recent call last)
<ipython-input-90-8517b4596f58> in <module>
----> 1 model_fit = train_auto_arima(train_df1.DAT_RATE)
      2 model_fit.fit(train_df1.values)
      3 print(f'Params - > \n aic-{model_fit.aic()},\n get_params-{model_fit.get_params()}')
      4 # forecasting
      5 test_predictions = forecast_over_test_set(model_fit,test_df1.DAT_RATE,train_df1.DAT_RATE)

<ipython-input-89-218545403b7c> in train_auto_arima(df)
     81                stepwise=False,# We are going with Parallel execution rather than step-wise approach
     82                information_criterion='bic',
---> 83                trace=True,error_action='ignore')
     84 
     85     return arima

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\auto.py in auto_arima(y,exogenous,start_p,d,start_q,max_p,max_d,max_q,start_P,D,start_Q,max_P,max_D,max_Q,max_order,m,seasonal,stationary,information_criterion,alpha,test,seasonal_test,stepwise,n_jobs,start_params,trend,method,transparams,solver,maxiter,disp,callback,offset_test_args,seasonal_test_args,suppress_warnings,error_action,trace,random,random_state,n_fits,return_valid_fits,out_of_sample_size,scoring,scoring_args,with_intercept,sarimax_kwargs,**fit_args)
    320             if seasonal_test_args is not None else dict()
    321         D = nsdiffs(xx,m=m,test=seasonal_test,max_D=max_D,
--> 322                     **seasonal_test_args)
    323 
    324         if D > 0 and exogenous is not None:

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\utils.py in nsdiffs(x,**kwargs)
    105 
    106     D = 0
--> 107     dodiff = testfunc(x)
    108     while dodiff == 1 and D < max_D:
    109         D += 1

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\seasonality.py in estimate_seasonal_differencing_term(self,x)
    456 
    457         # Get the critical value for m
--> 458         stat = self._compute_test_statistic(x)
    459         crit_val = self._calc_ocsb_crit_val(self.m)
    460         return int(stat > crit_val)

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\seasonality.py in _compute_test_statistic(self,x)
    417         # Compute the actual linear model used for determining the test stat
    418         try:
--> 419             regression = self._fit_ocsb(x,maxlag,maxlag)
    420         except np.linalg.LinAlgError:  # Singular matrix
    421             if crit_regression is not None:

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\seasonality.py in _fit_ocsb(x,lag,max_lag)
    341         y_first_order_diff = diff(x,m)
    342         y = diff(y_first_order_diff)
--> 343         ylag = OCSBTest._gen_lags(y,lag)
    344 
    345         if max_lag > 0:

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\seasonality.py in _gen_lags(y,max_lag,omit_na)
    334 
    335         # delegate down
--> 336         return OCSBTest._do_lag(y,omit_na)
    337 
    338     @staticmethod

~\AppData\Roaming\Python\Python37\site-packages\pmdarima\arima\seasonality.py in _do_lag(y,omit_na)
    319         # Create a 2d array of dims (n + (lag - 1),lag). This looks cryptic..
    320         # If there are tons of lags,this may not be super efficient...
--> 321         out = np.ones((n + (lag - 1),lag)) * np.nan
    322         for i in range(lag):
    323             out[i:i + n,i] = y

C:\programdata\Anaconda3\lib\site-packages\numpy\core\numeric.py in ones(shape,dtype,order)
    221 
    222     """
--> 223     a = empty(shape,order)
    224     multiarray.copyto(a,1,casting='unsafe')
    225     return a

ValueError: negative dimensions are not allowed
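One plausible reading of this traceback, which the post itself does not confirm, is that the series passed to the seasonal test is too short for the chosen m. The `_fit_ocsb` frame shows a lag-m difference followed by a first difference, and together these consume m + 1 observations. The sketch below only illustrates that arithmetic with NumPy; the helper names and the training length of 81 are hypothetical, since the post does not state the size of the training set.

```python
import numpy as np

def seasonal_diff(x, m):
    """Lag-m difference x[t] - x[t - m]; the result has max(len(x) - m, 0) values."""
    x = np.asarray(x, dtype=float)
    return x[m:] - x[:-m]

def remaining_after_differencing(n_obs, m):
    """Observations left after one lag-m difference and one first difference."""
    return max(n_obs - m - 1, 0)

# Hypothetical training series of 81 weekly observations.
train = np.arange(81, dtype=float)

left_m52 = len(np.diff(seasonal_diff(train, 52)))  # still has data to regress on
left_m80 = len(np.diff(seasonal_diff(train, 80)))  # nothing left to regress on
```

If this reading is right, checking `remaining_after_differencing(len(train), m)` before calling `auto_arima` would flag values of m that are too large for the training window.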