"IndexError: index out of range in self" when training an LSTM on top of BERT embeddings

I'm new to NLP, so apologies if I'm missing something obvious. I'm trying to use an LSTM on top of BERT embeddings for multi-class text classification, but during training I get an "IndexError: index out of range in self" error.

Here is my LSTM class:

class LSTM(nn.Module):

    def __init__(self, output_size, hidden_dim1, hidden_dim2, vocab_size, embedding_length, dropout, lstm_layers, bidirectional):
        super(LSTM, self).__init__()

        self.output_size = output_size             # 3
        self.hidden_dim1 = hidden_dim1             # 128
        self.hidden_dim2 = hidden_dim2             # 64
        self.vocab_size = vocab_size               # 512
        self.embedding_length = embedding_length   # 768

        self.embeddings = nn.Embedding(vocab_size + 1, embedding_length)
        # bidirectional LSTM with hidden_dim1 units per direction
        self.lstm = nn.LSTM(embedding_length, hidden_dim1, num_layers=lstm_layers,
                            bidirectional=bidirectional, batch_first=True)
        self.fc1 = nn.Linear(hidden_dim1 * 2, hidden_dim2)  # * 2 for the two directions
        self.dropout = nn.Dropout(dropout)
        self.fc2 = nn.Linear(hidden_dim2, output_size)
        self.relu = nn.ReLU()

    def forward(self, input_sentence):

        input = self.embeddings(input_sentence)
        # input = input.permute(1, 2)
        output, (hidden, final_cell_state) = self.lstm(input)
        # concatenate the last hidden state of each direction
        cat = torch.cat((hidden[-2, :, :], hidden[-1, :, :]), dim=1)
        rel = self.relu(cat)
        dense1 = self.fc1(rel)
        drop = self.dropout(dense1)
        preds = self.fc2(drop)

        return preds
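
For completeness, this is roughly how I create the model. The values match the comments in __init__ above; the dropout and lstm_layers values here are just placeholders, not necessarily what I actually use:

# hyperparameters mirror the __init__ comments; dropout/lstm_layers are placeholders
LSTMmodel = LSTM(output_size=3, hidden_dim1=128, hidden_dim2=64,
                 vocab_size=512, embedding_length=768,
                 dropout=0.2, lstm_layers=1, bidirectional=True).to(device)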

The part of my training code where the index error occurs:

for epoch in range(n_epochs):

  print("Epoch {} of {} ------ ".format(epoch + 1, n_epochs))
  print("Training has started...")
  start_time_training = time.time()
  train_loss = 0

  LSTMmodel.train(True)

  for iteration, batch in enumerate(train_dataloader):
    x_batch = batch[0].to(device)
    y_batch = batch[1].to(device)

    y_pred = LSTMmodel(x_batch)
    optimizer.zero_grad()
    loss = loss_fn(y_pred, y_batch)
    loss.backward()
    optimizer.step()
    train_loss += loss.item()

This is the error I get:

    Epoch 1 of 5 ------ 
    Training has started...
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    <ipython-input-81-dd13af1f93cf> in <module>()
         17     y_batch = batch[1].to(device)
         18 
    ---> 19     y_pred = LSTMmodel(x_batch)
         20     optimizer.zero_grad()
         21     loss = loss_fn(y_pred, y_batch)

    4 frames
    /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       1722         # remove once script supports set_grad_enabled
       1723         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
    -> 1724     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       1725 
       1726 

    IndexError: index out of range in self

I'm using a DataLoader, and the shape of the batches it yields (batch_size = 8) is torch.Size([8, 512, 768]).
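
To narrow things down, I tested nn.Embedding on its own. As far as I understand, it only accepts integer (Long) token indices below num_embeddings, and an index that is too large reproduces exactly my error. The sizes here just mirror the comments in my class:

import torch
import torch.nn as nn

emb = nn.Embedding(512 + 1, 768)        # vocab_size + 1 rows, like in my class

ok = emb(torch.tensor([[0, 5, 512]]))   # valid Long indices -> works
print(ok.shape)                         # torch.Size([1, 3, 768])

emb(torch.tensor([[513]]))              # index >= num_embeddings
# IndexError: index out of range in self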

Maybe I just don't understand why we use nn.Embedding in the LSTM class at all. If it really isn't needed, my guess (untested) is that the forward pass should feed the BERT vectors straight into the LSTM, something like:
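
def forward(self, input_sentence):
    # input_sentence is already a [batch, seq_len, 768] tensor of BERT
    # vectors, so there is nothing left to look up with nn.Embedding
    output, (hidden, final_cell_state) = self.lstm(input_sentence)
    cat = torch.cat((hidden[-2, :, :], hidden[-1, :, :]), dim=1)
    rel = self.relu(cat)
    dense1 = self.fc1(rel)
    drop = self.dropout(dense1)
    return self.fc2(drop)

Thanks for your help :)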
