Code question
I'm struggling to convert a TensorFlow Keras layer built for the Sequential API so that it can be used with the functional API. I've run into a lot of errors today, so I'll try to keep the description short and clear:
- I'm trying to use the DynamicMetaEmbedding layer described in this Towards Data Science article with the functional Keras API. With the Sequential API the code works. Here is the GitHub repository, which also contains the corresponding code.
- At the end of this question you will find source code that reproduces the problem.
- The first error you will encounter is:
'Tensor' object has no attribute 'input_dim'
- However, after fixing this error, many more errors follow. Next it complains about output_dim, and after fixing that I noticed the layer is not being executed properly; only __init__() runs. Adding a build() or a compute_output_shape() method does not help either. I believe the root problem is that the DynamicMetaEmbedding layer is not actually given the embeddings as layers but as tensors (see the sketch below).
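A minimal sketch of the suspected mismatch (the variable names here are hypothetical): the layer's constructor reads input_dim from every element of embedding_matrices, an attribute that exists on a tf.keras.layers.Embedding object but not on the tensor such a layer produces once it has been called, which is exactly what the error above reports:

import tensorflow as tf

emb_layer = tf.keras.layers.Embedding(input_dim=5000, output_dim=300)
print(emb_layer.input_dim)   # 5000 -- the Embedding *layer* carries this attribute

ids = tf.keras.layers.Input(shape=(20,), dtype='int32')
emb_tensor = emb_layer(ids)  # calling the layer returns a tensor
# emb_tensor.input_dim       # raises: AttributeError: 'Tensor' object has no attribute 'input_dim'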
Thank you for your time and help! :-)
Best, Tobias
Tested on Google Colab. Keras version: 2.2.5, TensorFlow version: 1.15.0.
Here is the Python code for the DynamicMetaEmbedding Keras layer:
from typing import Optional, List
import numpy as np
import tensorflow as tf
class DynamicMetaEmbedding(tf.keras.layers.Layer):
    """
    Applies learned attention to different sets of embedding matrices per token, to mix separate token
    representations into a joined one. Self attention is word-dependent, meaning each word's representation in the output
    is only dependent on the word's original embeddings in the given matrices, and the attention vector.

    Arguments
    ---------
    - `embedding_matrices` (``List[tf.keras.layers.Embedding]``): List of embedding layers
    - `output_dim` (``int``): Dimension of the output embedding
    - `name` (``str``): Layer name

    Input shape
    -----------
    (batch_size, time_steps)

    Output shape
    ------------
    (batch_size, time_steps, output_dim)

    Examples
    --------
    Create Dynamic Meta Embeddings using 2 separate embedding matrices. Notice it is the user's responsibility to make sure
    all the arguments needed in the embedding lookup are passed to the ``tf.keras.layers.Embedding`` constructors (like ``trainable=False``).

    .. code-block:: python3

        import tensorflow as tf
        import tavolo as tvl

        w2v_embedding = tf.keras.layers.Embedding(num_words, EMBEDDING_DIM,
                                                  embeddings_initializer=tf.keras.initializers.Constant(w2v_matrix),
                                                  input_length=MAX_SEQUENCE_LENGTH,
                                                  trainable=False)

        glove_embedding = tf.keras.layers.Embedding(num_words, EMBEDDING_DIM,
                                                    embeddings_initializer=tf.keras.initializers.Constant(glove_matrix),
                                                    input_length=MAX_SEQUENCE_LENGTH,
                                                    trainable=False)

        model = tf.keras.Sequential([tf.keras.layers.Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32'),
                                     tvl.embeddings.DynamicMetaEmbedding([w2v_embedding, glove_embedding])])  # Use DME embeddings

    Using the same example as above, it is possible to define the output's channel size:

    .. code-block:: python3

        model = tf.keras.Sequential([tf.keras.layers.Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32'),
                                     tvl.embeddings.DynamicMetaEmbedding([w2v_embedding, glove_embedding],
                                                                         output_dim=200)])

    References
    ----------
    `Dynamic Meta-Embeddings for Improved Sentence Representations`_

    .. _`Dynamic Meta-Embeddings for Improved Sentence Representations`:
        https://arxiv.org/abs/1804.07983
    """
    def __init__(self,
                 embedding_matrices: List[tf.keras.layers.Embedding],
                 output_dim: Optional[int] = None,
                 name: str = 'dynamic_meta_embedding',
                 **kwargs):
        """
        :param embedding_matrices: List of embedding layers
        :param output_dim: Dimension of the output embedding
        :param name: Layer name
        """
        super().__init__(name=name, **kwargs)

        # Validate all the embedding matrices have the same vocabulary size
        if not len(set(e.input_dim for e in embedding_matrices)) == 1:
            raise ValueError('Vocabulary sizes (first dimension) of all embedding matrices must match')

        # If no output_dim is supplied, use the smallest dimension of the given matrices
        self.output_dim = output_dim or min([e.output_dim for e in embedding_matrices])

        self.embedding_matrices = embedding_matrices
        self.n_embeddings = len(self.embedding_matrices)

        self.projections = [tf.keras.layers.Dense(units=self.output_dim,
                                                  activation=None,
                                                  name='projection_{}'.format(i),
                                                  dtype=self.dtype) for i, e in enumerate(self.embedding_matrices)]

        self.attention = tf.keras.layers.Dense(units=1,
                                               name='attention',
                                               dtype=self.dtype)
    def compute_mask(self, inputs, mask=None):
        # Propagate the mask of the first embedding layer through the first projection
        return self.projections[0].compute_mask(
            inputs, mask=self.embedding_matrices[0].compute_mask(inputs, mask=mask))
    def call(self, inputs, **kwargs) -> tf.Tensor:
        batch_size, time_steps = inputs.shape[:2]
        batch_size = 64  # NOTE: hardcoded batch size (debugging workaround; batch_size is None at graph-construction time)

        # Embedding lookup
        embedded = [e(inputs) for e in self.embedding_matrices]  # List of shape=(batch_size, time_steps, channels_i)

        # Projection
        projected = tf.reshape(tf.concat([p(e) for p, e in zip(self.projections, embedded)], axis=-1),  # Project embeddings
                               shape=(batch_size, time_steps, self.n_embeddings, self.output_dim),
                               name='projected')  # shape=(batch_size, time_steps, n_embeddings, output_dim)
        print(projected.shape)  # debug output

        # Calculate attention coefficients
        alphas = self.attention(projected)  # shape=(batch_size, time_steps, n_embeddings, 1)
        alphas = tf.nn.softmax(alphas, axis=-2)  # shape=(batch_size, time_steps, n_embeddings, 1)

        # Attend
        output = tf.squeeze(tf.matmul(
            tf.transpose(projected, perm=[0, 1, 3, 2]), alphas),  # Attending
            name='output')  # shape=(batch_size, time_steps, output_dim)

        return output
    def get_config(self):
        base_config = super().get_config()
        base_config['embedding_matrices'] = [e.get_config() for e in self.embedding_matrices]
        base_config['output_dim'] = self.output_dim
        return base_config

    @classmethod
    def from_config(cls, config: dict):
        embedding_matrices = [tf.keras.layers.Embedding.from_config(e_conf) for e_conf in
                              config.pop('embedding_matrices')]
        return cls(embedding_matrices=embedding_matrices, **config)
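For reference, the Sequential-API usage that the question says does work looks roughly like this (a sketch modeled on the layer's own docstring example, with hypothetical hyperparameter values):

import tensorflow as tf

MAX_WORDS, vocab_size, EMBEDDING_DIM = 20, 5000, 300

emb1 = tf.keras.layers.Embedding(vocab_size, EMBEDDING_DIM, mask_zero=True)
emb2 = tf.keras.layers.Embedding(vocab_size, EMBEDDING_DIM, mask_zero=True)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(MAX_WORDS,), dtype='int32'),
    DynamicMetaEmbedding([emb1, emb2]),  # Embedding *layers*, as the constructor expects
])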
Here is the example code that later invokes the Keras layer in a model:
from keras.layers import Input, Embedding
from keras.models import Model

MAX_WORDS = 20
vocab_size = 5000
EMBEDDING_DIM = 300

def build_MetaEmbeddings_model():
    sent_inputs = Input(shape=(MAX_WORDS,), dtype="int32")
    embeddings1 = Embedding(input_dim=vocab_size, output_dim=EMBEDDING_DIM, mask_zero=True)(sent_inputs)
    embeddings2 = Embedding(input_dim=vocab_size, output_dim=EMBEDDING_DIM, mask_zero=True)(sent_inputs)
    dme = DynamicMetaEmbedding([embeddings1, embeddings2])  # embedding output *tensors* are passed here -- this is the call that raises the reported error
    model = Model(inputs=sent_inputs, outputs=dme)
    return model

model = build_MetaEmbeddings_model()
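The layer's documented contract (an integer-ID input of shape (batch_size, time_steps), and a list of tf.keras.layers.Embedding layers passed to the constructor) suggests that a functional-API version would hand over the Embedding layer objects themselves rather than the tensors they produce, and then call the DME layer on the integer inputs. A hedged sketch, not a verified fix; it uses tf.keras throughout, since DynamicMetaEmbedding subclasses tf.keras.layers.Layer:

import tensorflow as tf

def build_MetaEmbeddings_model():
    sent_inputs = tf.keras.layers.Input(shape=(MAX_WORDS,), dtype='int32')
    emb_layer1 = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=EMBEDDING_DIM, mask_zero=True)
    emb_layer2 = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=EMBEDDING_DIM, mask_zero=True)
    dme_outputs = DynamicMetaEmbedding([emb_layer1, emb_layer2])(sent_inputs)  # pass layers, then call on the int inputs
    return tf.keras.Model(inputs=sent_inputs, outputs=dme_outputs)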
Answer
No working solution to this problem has been found yet.