2GB limit error when training a Keras Sequential model with a TensorFlow dataset

2024-05-07 • Q&A

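I am using tf.data.experimental.make_csv_dataset to create the input for a Keras Sequential model. My first layer is a DenseFeatures layer that takes a list of tf.feature_column objects (indicator, bucketized, numeric, etc.). The layers after it are Dense layers with relu activation. When I call fit, I get the error: "Cannot create a tensor proto whose content is larger than 2GB." What do I need to change to get this model to train?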

Here is the main part of the code:

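import tensorflow as tf
from tensorflow.keras import layers

train_input = tf.data.experimental.make_csv_dataset(["df_train.csv"], batch_size=64, label_name="loss_rate", num_epochs=1)
# batch_size is a required argument, and the validation set needs the same label column
eval_input = tf.data.experimental.make_csv_dataset(["df_val.csv"], batch_size=64, label_name="loss_rate", shuffle=False, num_epochs=1)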

#all_features is generated by a function (it has 87 tf.feature_column objects)
feature_layer = layers.DenseFeatures(all_features)

def deep_sequential_model():
    model = tf.keras.Sequential([
        feature_layer,
        layers.Dense(64, activation='relu'),
        layers.Dense(32, activation='relu'),
        layers.Dense(1, activation='sigmoid')
    ])

    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

    model.compile(loss='mse',optimizer=optimizer,metrics=['mae','mse'])

    return model

model = deep_sequential_model()
model.fit(train_input,validation_data=eval_input,epochs=10)

I get this error:

/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py in __init__(self,node_def,g,inputs,output_types,control_inputs,input_types,original_op,op_def)
   1696             "Cannot create a tensor proto whose content is larger than 2GB.")
   1697       if not _VALID_OP_NAME_REGEX.match(node_def.name):
-> 1698         raise ValueError("'%s' is not a valid node name" % node_def.name)
   1699       c_op = None
   1700     elif type(node_def).__name__ == "SwigPyObject":

ValueError: '_5' is not a valid node name
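
Note that the arrow in the traceback points at the node-name check on line 1698, not at the 2GB message on the line above it, and the exception actually raised is `'_5' is not a valid node name`. A minimal sketch of how one might scan names for this problem before building the model follows. It assumes the name `_5` comes from a CSV header or a feature column key, which the post does not confirm. The pattern is modeled on the rule that op names must start with a letter, digit, or dot; the exact pattern in a given TensorFlow version may differ. The helper `find_invalid_names` and the sample column names are hypothetical.

```python
import re

# Assumption: modeled on TensorFlow's op-name rule (first character must be
# a letter, digit, or dot). The exact pattern may differ between versions.
VALID_NAME = re.compile(r"^[A-Za-z0-9.][A-Za-z0-9_.\-/]*$")


def find_invalid_names(names):
    """Return the names that would fail the node-name check."""
    return [name for name in names if not VALID_NAME.match(name)]


# Hypothetical column names, for illustration only
columns = ["loss_rate", "_5", "age", "zip.code"]
print(find_invalid_names(columns))  # prints ['_5']
```

If a check like this flags a column, renaming it in the CSV header (for example `_5` to `col_5`) and in the matching feature column would be one thing to try.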


Answer from sqli182:
