Recurrent neural network for time series with multiple variables - TensorFlow

2024-05-18 • Q&A

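I am using 3 variables, built from past demand, to predict future demand, but whenever I run my code, the reshape of my Y axis raises an error.

If I use only one variable for the Y axis, there is no error.

Example: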

demandaY = bike_data[['cnt']]
n_steps = 20

for time_step in range(1,n_steps+1):
    demandaY['cnt'+str(time_step)] = demandaY[['cnt']].shift(-time_step).values

y = demandaY.iloc[:,1:].values
y = np.reshape(y,(y.shape[0],n_steps,1))

Dataset

Script

features = ['cnt','temp','hum']
demanda = bike_data[features]
n_steps = 20

for var_col in features:
    for time_step in range(1,n_steps+1):
        demanda[var_col+str(time_step)] = demanda[[var_col]].shift(-time_step).values

demanda.dropna(inplace=True)
demanda.head()

n_var = len(features)
columns = list(filter(lambda col: not(col.endswith("%d" % n_steps)),demanda.columns))

X = demanda[columns].iloc[:,:(n_steps*n_var)].values
X = np.reshape(X,(X.shape[0],n_steps,n_var))

y = demanda.iloc[:,0].values
y = np.reshape(y,(y.shape[0],n_steps,1))

Output

ValueError: cannot reshape array of size 17379 into shape (17379,20,1)
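
The message follows from NumPy's rule that a reshape must keep the total number of elements: a 1-D array of N values cannot become an array of shape (N, n_steps, 1), which needs N * n_steps values. A minimal sketch with a small stand-in array (the sizes below are made up for illustration and are not the real dataset):

```python
import numpy as np

n_rows, n_steps = 6, 3            # illustrative sizes only
y = np.arange(n_rows)             # 1-D, like demanda.iloc[:, 0].values

try:
    np.reshape(y, (n_rows, n_steps, 1))   # would need n_rows * n_steps elements
    reshape_ok = True
except ValueError as err:
    reshape_ok = False
    print(err)

# A 2-D block with n_steps columns does have the right number of elements.
y_wide = np.arange(n_rows * n_steps).reshape(n_rows, n_steps)
y_3d = np.reshape(y_wide, (n_rows, n_steps, 1))
print(y_3d.shape)  # (6, 3, 1)
```

In the single-variable example the reshape works because y is taken from n_steps shifted columns, not from a single column.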

GitHub: repository

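It is not clear whether the OP still wants an answer, but I will post the answer I linked in the comments, with some modifications.

Time series datasets come in different types. Let's consider a dataset that has X as the features and Y as the labels. Depending on the problem, Y may be a sample from X shifted in time, or it may be another target variable that you want to predict.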

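import numpy as np

def create_dataset(X,Y,look_back=10,label_lag = -1,stride = 1):
    # X, Y: arrays indexed by time along axis 0
    dataX,dataY = [],[]

    for i in range(0,(len(X)-look_back + 1),stride):
        label_idx = i + look_back + label_lag
        if label_idx >= len(Y):
            break  # no label exists for this window (possible when label_lag >= 0)
        a = X[i:(i+look_back)]  # window of look_back consecutive rows
        dataX.append(a)
        b = Y[label_idx]  # label paired with this window
        dataY.append(b)
    return np.array(dataX),np.array(dataY)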

print(features.values.shape,labels.shape)
#(619,4),(619,1)

x,y = create_dataset(X=features.values,Y=labels.values,stride=1)
(x.shape,y.shape)
#(610,10,4),(610,1)
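
Applied to the three variables from the question, the same windowing could look like the sketch below. The data frame here is random stand-in data that only borrows the column names cnt, temp and hum; the function is restated so the snippet runs on its own, with a bounds check added because label_lag=0 asks for the row after each window.

```python
import numpy as np
import pandas as pd

def create_dataset(X, Y, look_back=10, label_lag=-1, stride=1):
    dataX, dataY = [], []
    for i in range(0, len(X) - look_back + 1, stride):
        label_idx = i + look_back + label_lag
        if label_idx >= len(Y):      # added guard: stop when no label row exists
            break
        dataX.append(X[i:i + look_back])
        dataY.append(Y[label_idx])
    return np.array(dataX), np.array(dataY)

# Stand-in for bike_data: 100 random rows (an assumption, not the real dataset).
rng = np.random.default_rng(0)
bike_like = pd.DataFrame({
    "cnt": rng.integers(0, 500, size=100),
    "temp": rng.random(100),
    "hum": rng.random(100),
})

feature_cols = ["cnt", "temp", "hum"]
n_steps = 20

# Windows of 20 rows of all three variables; label = cnt of the following row.
x, y = create_dataset(
    X=bike_like[feature_cols].values,
    Y=bike_like[["cnt"]].values,
    look_back=n_steps,
    label_lag=0,
)
print(x.shape, y.shape)  # (80, 20, 3) (80, 1)
```

The resulting x already has the (samples, time steps, variables) layout, so no manual reshape is needed.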

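Usage of the other parameters:

label_lag: if the last row of an X sample is at time t, the Y sample is at time t+label_lag+1. The default value (-1) places X and Y at the same index t.

Row indices in the source arrays of the last time step of x[1] and of y[1]:

# label_lag = -1 (default)
np.where(x[1,-1]==features.values)[0],np.where(y[1] == labels.values)[0]
# X row 10, Y row 10

# label_lag = 0
np.where(x[1,-1]==features.values)[0],np.where(y[1] == labels.values)[0]
# X row 10, Y row 11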

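look_back: the number of past rows of the dataset included in the sample at the current time step t. A look_back of 10 means one sample contains the 10 rows from t-9 to t.
stride: the index gap between two consecutive samples. With stride=2, if the first sample of X covers the slice 0:10 (rows 0 to 9), the second sample covers the slice 2:12 (rows 2 to 11).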
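
To see which rows each setting selects, the index arithmetic of create_dataset can be isolated in a small helper. The name window_rows is hypothetical and exists only for this illustration; it reports, for each window, the first row, the last row and the label row.

```python
def window_rows(n_rows, look_back=10, label_lag=-1, stride=1):
    """Return (first_row, last_row, label_row) for each window."""
    rows = []
    for i in range(0, n_rows - look_back + 1, stride):
        label_row = i + look_back + label_lag
        if label_row >= n_rows:   # no label row left for this window
            break
        rows.append((i, i + look_back - 1, label_row))
    return rows

default_rows = window_rows(30)
strided_rows = window_rows(30, stride=2)
lagged_rows = window_rows(30, label_lag=0)

print(default_rows[:2])  # [(0, 9, 9), (1, 10, 10)]
print(strided_rows[:2])  # [(0, 9, 9), (2, 11, 11)]
print(lagged_rows[:2])   # [(0, 9, 10), (1, 10, 11)]
```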

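In addition, depending on the problem at hand, you can also look back over Y, and Y can be multidimensional too. In that case the only change is b=Y[i:(i+look_back+label_lag)].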
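
A minimal sketch of that variant, assuming Y is a 2-D array with one row per time step. The function name create_dataset_seq_labels is made up for this example, and the bounds check is an addition so that every label window has the same length.

```python
import numpy as np

def create_dataset_seq_labels(X, Y, look_back=10, label_lag=-1, stride=1):
    dataX, dataY = [], []
    for i in range(0, len(X) - look_back + 1, stride):
        end = i + look_back + label_lag
        if end > len(Y):          # added guard: avoid a truncated label window
            break
        dataX.append(X[i:i + look_back])
        dataY.append(Y[i:end])    # the one-line change: a slice of Y, not one row
    return np.array(dataX), np.array(dataY)

X = np.arange(50 * 3).reshape(50, 3)   # 50 time steps, 3 features
Y = np.arange(50 * 2).reshape(50, 2)   # 50 time steps, 2 targets

x_seq, y_seq = create_dataset_seq_labels(X, Y, look_back=10, label_lag=0)
print(x_seq.shape, y_seq.shape)  # (41, 10, 3) (41, 10, 2)

x_def, y_def = create_dataset_seq_labels(X, Y, look_back=10)
print(y_def.shape)               # (41, 9, 2)
```

With the default label_lag of -1 the label window is one row shorter than the feature window; label_lag=0 makes both cover the same rows.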

The same can be achieved with TimeseriesGenerator from keras.

TimeseriesGenerator(features.values,labels.values,length=10,batch_size=64,stride=1)

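where length is the same as look_back. By default there is a gap of 1 between features and labels: a sample in X covers the length rows ending at t (t-9 to t here), and the corresponding sample in Y is at index t+1. If you want both at the same index, just shift the labels by 1 before passing them to the generator.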
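
The effect of shifting the labels can be checked without keras. The sketch below uses a NumPy stand-in, generator_pairs (a hypothetical helper, not part of keras), which assumes the default pairing of a window data[i-length:i] with targets[i].

```python
import numpy as np

def generator_pairs(data, targets, length):
    """NumPy stand-in: pair the window data[i-length:i] with targets[i]."""
    idx = range(length, len(data))
    xs = np.array([data[i - length:i] for i in idx])
    ys = np.array([targets[i] for i in idx])
    return xs, ys

data = np.arange(20).reshape(20, 1)           # row r holds the value r
labels = np.arange(20).reshape(20, 1) * 10    # row r holds the value 10 * r

# Default pairing: the window ends at row 4, the label comes from row 5.
xs, ys = generator_pairs(data, labels, length=5)
print(xs[0, -1, 0], ys[0, 0])     # 4 50

# Shift labels forward by one row so that shifted[i] == labels[i - 1].
# The wrapped-around first row is never used, since targets start at index length.
shifted = np.roll(labels, 1, axis=0)
xs2, ys2 = generator_pairs(data, shifted, length=5)
print(xs2[0, -1, 0], ys2[0, 0])   # 4 40
```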
