Order of hidden state tensors differs from the order of the returned tensor

2024-05-16 • Q&A

As part of training a GRU, I want to retrieve the hidden state tensors.

I defined a GRU with two layers:

self.lstm = nn.GRU(params.vid_embedding_dim,params.hidden_dim,2)

The forward function is defined as follows (only part of the implementation is shown):

    def forward(self, s, order, batch_size, where, anchor_is_phrase=False):
        """
        Forward prop.
        """
        # s is of shape [128, 1, 300]; 128 is the batch size
        output, (a, b) = self.lstm(s.cuda())
        output.data.contiguous()

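The shape is [128, 400] (128 is the number of samples, each embedded as a 400-dimensional vector).

I understand that out is the output of the last hidden state, so I expected it to be equal to b. After checking the values, I found that they are indeed equal, but b holds the tensors in a different order; for example, output[0] is b[49]. Am I missing something here?

Thanks.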

I understand your confusion. Take a look at the example and comments below:

import numpy as np
import torch
from torch import nn

# [Batch size, Sequence length, Embedding size]
inputs = torch.rand(128, 5, 300)
gru = nn.GRU(input_size=300, hidden_size=400, num_layers=2, batch_first=True)

with torch.no_grad():
    # output holds the hidden states of the last layer at every time step, for each element in the batch
    # a is the last hidden state of the first layer
    # b is the last hidden state of the second (last) layer
    output, (a, b) = gru(inputs)

If we print the shapes, they confirm our understanding:

print(output.shape)  # torch.Size([128, 5, 400])
print(a.shape)  # torch.Size([128, 400])
print(b.shape)  # torch.Size([128, 400])
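The shapes above depend on batch_first=True. The constructor shown in the question does not pass that flag, so the sketch below compares the two layouts on the same input. The tensor sizes and the names gru_bf and gru_tf are arbitrary choices for illustration, not taken from the question. With the default layout, the first dimension is read as the sequence length, and the hidden state is always [num_layers, batch, hidden] regardless of batch_first.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Small, arbitrary sizes chosen only for illustration.
batch, seq_len, emb, hidden, layers = 4, 5, 8, 16, 2
x = torch.rand(batch, seq_len, emb)

gru_bf = nn.GRU(input_size=emb, hidden_size=hidden, num_layers=layers, batch_first=True)
gru_tf = nn.GRU(input_size=emb, hidden_size=hidden, num_layers=layers)  # batch_first=False (default)

with torch.no_grad():
    out_bf, h_bf = gru_bf(x)  # x is read as [batch=4, seq_len=5, emb]
    out_tf, h_tf = gru_tf(x)  # the same x is read as [seq_len=4, batch=5, emb]

print(out_bf.shape)  # torch.Size([4, 5, 16])
print(h_bf.shape)    # torch.Size([2, 4, 16])
print(out_tf.shape)  # torch.Size([4, 5, 16])
print(h_tf.shape)    # torch.Size([2, 5, 16])
```

By the same rule, an input of shape [128, 1, 300] passed to a GRU built without batch_first=True would be treated as 128 time steps with a batch of 1. Whether that explains the mismatch in the question depends on the rest of the asker's code, which is not shown.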

Furthermore, we can test that the last hidden state of the last layer, taken from output for each element in the batch, is equal to b:

np.testing.assert_almost_equal(b.numpy(), output[:, -1, :].numpy())
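For comparison, here is a minimal sketch of the same check when the GRU uses the default time-major layout (no batch_first=True). The sizes are assumptions picked for a quick run. In this layout the last time step is output[-1], while output[:, -1] would select the last batch element instead.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Time-major input: [seq_len, batch, embedding]; sizes are illustrative only.
seq_len, batch = 5, 6
inputs_tm = torch.rand(seq_len, batch, 300)
gru_tm = nn.GRU(input_size=300, hidden_size=400, num_layers=2)  # batch_first=False

with torch.no_grad():
    output_tm, h_n = gru_tm(inputs_tm)  # output_tm: [5, 6, 400], h_n: [2, 6, 400]

last_step = output_tm[-1]       # last time step of the top layer, [6, 400]
wrong_slice = output_tm[:, -1]  # last batch element at every step, [5, 400]

matches_top_layer = torch.allclose(last_step, h_n[-1], atol=1e-6)
print(matches_top_layer)
print(last_step.shape, wrong_slice.shape)
```

Indexing h_n[-1] instead of unpacking the hidden state into one name per layer keeps the check valid for any number of layers.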

Finally, we can create a 3-layer RNN and run the same test:

gru = nn.GRU(input_size=300, hidden_size=400, num_layers=3, batch_first=True)
with torch.no_grad():
    output, (a, b, c) = gru(inputs)

np.testing.assert_almost_equal(c.numpy(), output[:, -1, :].numpy())

Again, the assertion passes, but only when we run it against c, which is now the last layer of the RNN. Otherwise:

np.testing.assert_almost_equal(b.numpy(), output[:, -1, :].numpy())

raises an error:

AssertionError: Arrays are not almost equal to 7 decimals
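The layer-by-layer comparison can also be written as a loop, which avoids unpacking one variable per layer. This is a sketch under assumed sizes (a batch of 8 rather than 128, to keep the run quick); the names last_step and matches are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)

inputs = torch.rand(8, 5, 300)  # [batch, seq_len, embedding]
gru = nn.GRU(input_size=300, hidden_size=400, num_layers=3, batch_first=True)

with torch.no_grad():
    output, h_n = gru(inputs)  # h_n: [num_layers=3, batch=8, hidden=400]

last_step = output[:, -1, :]  # top-layer hidden state at the final time step

# Compare each layer's final hidden state with the last step of output.
matches = [
    torch.allclose(h_n[layer], last_step, atol=1e-6)
    for layer in range(h_n.shape[0])
]
print(matches)
```

Only the final entry should be True, since output exposes the hidden states of the top layer only.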

I hope this makes things clear.

for511ever answered: Order of hidden state tensors differs from the order of the returned tensor
