I'm working on a long-document classification task; the documents in my dataset exceed 10,000 words each. My plan is to use BERT as a paragraph encoder and then feed the sequence of paragraph embeddings into a BiLSTM. The network looks like this:

Input: (batch_size, max_paragraph_len, max_tokens_per_para, embedding_size)
BERT layer: (max_paragraph_len, paragraph_embedding_size)
LSTM layer: ???
Output layer: (batch_size, classification_size)

How can I implement this in Keras? I'm loading the BERT model with keras-bert's load_trained_model_from_checkpoint:
from keras_bert import load_trained_model_from_checkpoint

bert_model = load_trained_model_from_checkpoint(
    config_path,
    model_path,
    training=False,
    use_adapter=True,
    trainable=(
        ['Encoder-{}-MultiHeadSelfAttention-Adapter'.format(i + 1) for i in range(layer_num)] +
        ['Encoder-{}-FeedForward-Adapter'.format(i + 1) for i in range(layer_num)] +
        ['Encoder-{}-MultiHeadSelfAttention-Norm'.format(i + 1) for i in range(layer_num)] +
        ['Encoder-{}-FeedForward-Norm'.format(i + 1) for i in range(layer_num)]
    ),
)
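For context, the hierarchical wiring I have in mind is roughly the following sketch: wrap a per-paragraph encoder in `TimeDistributed` so it runs once per paragraph, then feed the resulting sequence of paragraph vectors into a `Bidirectional(LSTM(...))`. Since the actual BERT checkpoint paths aren't available here, the sketch uses a hypothetical stand-in encoder (embedding + average pooling); in the real model that sub-model would be replaced by the loaded `bert_model`, reduced to one vector per paragraph (e.g. its [CLS] output). All dimensions (`max_paragraphs`, `max_tokens`, `vocab_size`, etc.) are placeholder assumptions.

```python
import numpy as np
from tensorflow.keras import layers, Model

max_paragraphs = 8    # max_paragraph_len in the question
max_tokens = 32       # max_tokens_per_para
vocab_size = 1000
para_emb_size = 64    # paragraph_embedding_size
num_classes = 3       # classification_size

# Stand-in paragraph encoder: maps one paragraph's token ids to a single
# fixed-size vector. In the real model this would be the loaded BERT
# sub-model instead of embedding + pooling.
para_input = layers.Input(shape=(max_tokens,), dtype="int32")
x = layers.Embedding(vocab_size, para_emb_size)(para_input)
x = layers.GlobalAveragePooling1D()(x)
paragraph_encoder = Model(para_input, x, name="paragraph_encoder")

# Document model: TimeDistributed applies the encoder to each paragraph,
# producing (batch, max_paragraphs, para_emb_size); a BiLSTM then reads
# the paragraph sequence and a Dense layer classifies the document.
doc_input = layers.Input(shape=(max_paragraphs, max_tokens), dtype="int32")
para_embeddings = layers.TimeDistributed(paragraph_encoder)(doc_input)
h = layers.Bidirectional(layers.LSTM(64))(para_embeddings)
output = layers.Dense(num_classes, activation="softmax")(doc_input_out := h)
model = Model(doc_input, output)

# Sanity check on random data: two documents in, one class distribution
# per document out.
dummy = np.random.randint(0, vocab_size, size=(2, max_paragraphs, max_tokens))
print(model.predict(dummy).shape)  # (2, 3)
```

One practical caveat: running a full BERT inside `TimeDistributed` is memory-heavy, so it is common to freeze BERT (as `training=False` above suggests) or to pre-compute paragraph embeddings offline and train only the BiLSTM head on them.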