PyTorch BiLSTM POS tagging problem: RuntimeError: input.size(-1) must be equal to input_size. Expected 6, got 12

I have an NLP dataset and, following the official PyTorch tutorial, I converted it into word_to_idx and tag_to_idx mappings, for example:

word_to_idx = {'I': 0, 'have': 1, 'used': 2, 'transfers': 3, 'on': 4, 'three': 5, 'occasions': 6, 'now': 7, 'and': 8, 'each': 9, 'time': 10}
tag_to_idx = {'PRON': 0, 'VERB': 1, 'NOUN': 2, 'ADP': 3, 'NUM': 4, 'ADV': 5, 'CONJ': 6, 'DET': 7, 'ADJ': 8, 'PRT': 9, '.': 10, 'X': 11}
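
The prepare_sequence helper used in the code below is basically the index-lookup function from the tutorial:

import torch

# Look up each word (or tag) in the given mapping and return a LongTensor of indices
# (as in the official PyTorch sequence-models tutorial)
def prepare_sequence(seq, to_ix):
    idxs = [to_ix[w] for w in seq]
    return torch.tensor(idxs, dtype=torch.long)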

I want to do a POS tagging task with a BiLSTM. Here is my BiLSTM code:

class LSTMTagger(nn.Module):

    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim
        self.word_embeddings = nn.Embedding(vocab_size, tagset_size)

        # The LSTM takes word embeddings as inputs, and outputs hidden states
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True)

        # The linear layer that maps from hidden state space to tag space
        self.hidden2tag = nn.Linear(in_features=hidden_dim * 2, out_features=tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
        tag_space = self.hidden2tag(lstm_out.view(len(sentence), -1))

        # tag_scores = F.softmax(tag_space, dim=1)
        tag_scores = F.log_softmax(tag_space, dim=1)
        return tag_scores

Then I run the training code in PyCharm, for example:

EMBEDDING_DIM = 6
HIDDEN_DIM = 6
NUM_EPOCHS = 3

model = LSTMTagger(embedding_dim=EMBEDDING_DIM, hidden_dim=HIDDEN_DIM, vocab_size=len(word_to_idx), tagset_size=len(tag_to_idx))

loss_function = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# See what the scores are before training
with torch.no_grad():
    inputs = prepare_sequence(training_data[0][0], word_to_idx)
    tag_scores = model(inputs)
    print(tag_scores)
    print(tag_scores.size())
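
The training loop that comes after this is basically the one from the tutorial:

for epoch in range(NUM_EPOCHS):
    for sentence, tags in training_data:
        # Clear accumulated gradients from the previous step
        model.zero_grad()

        # Turn the words and the gold tags into tensors of indices
        sentence_in = prepare_sequence(sentence, word_to_idx)
        targets = prepare_sequence(tags, tag_to_idx)

        # Forward pass, NLL loss against the gold tags, backprop, parameter update
        tag_scores = model(sentence_in)
        loss = loss_function(tag_scores, targets)
        loss.backward()
        optimizer.step()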

However, it raises an error at the line tag_scores = model(inputs) and at the line lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1)). The error is:

Traceback (most recent call last):
  line 140, in <module>
    tag_scores = model(inputs)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  line 115, in forward
    lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 559, in forward
    return self.forward_tensor(input, hx)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 539, in forward_tensor
    output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 519, in forward_impl
    self.check_forward_args(input, batch_sizes)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 490, in check_forward_args
    self.check_input(input, batch_sizes)
  File "/library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 153, in check_input
    self.input_size, input.size(-1)))
RuntimeError: input.size(-1) must be equal to input_size. Expected 6, got 12

I do not know how to debug this. Could someone help me figure it out? Thanks in advance!

Answer from saigemarket:

The error is in this line:

self.word_embeddings = nn.Embedding(vocab_size, tagset_size)

You are passing the number of tags (12) as the embedding dimension instead of embedding_dim, so each word embedding has size 12 while the LSTM layer expects an input of size 6.
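
The second argument of nn.Embedding is the embedding dimension, so it should be embedding_dim rather than tagset_size:

self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)

With that change each embedded word has size 6, which matches the input_size the LSTM was constructed with, and the size check no longer fails.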
