Using a separate device for a layer during the forward pass

2024-05-16 • Q&A

I am using a set of tf.custom_ops called PyroNN. I wrapped this operator in a tf.keras.layers.Layer so that I can use it in a model built with the functional API.

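The problem is that this operator does not use TensorFlow's GPU memory management and consumes a lot of memory. This leads to errors such as GPUassert: Out of memory and terminates training.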

Is it possible to delegate the computation of a layer to a separate device? I have something like this in mind:

import tensorflow as tf
import tensorflow.keras as K

def crossd_unet_distributed():
    inputs = K.layers.Input((200, 488, 488), dtype=tf.float32, name='line_integrals')

    with tf.device('/GPU:0'):
        x = K.layers.Conv3D(8, 3, padding="same", activation='relu')(inputs)

    with tf.device('/GPU:1'):
        # this custom layer messes with GPU memory, so we place it on its own GPU
        # (the second detector dimension is assumed from the input shape)
        x = SpinBackProjector(detector_shape=(488, 488), volume_shape=(256, 256, 256))(x)

    with tf.device('/GPU:0'):
        # kernel size 3 is assumed; the original call omitted it
        outputs = K.layers.Conv3D(1, 3, activation='relu')(x)

    return K.Model(inputs=inputs, outputs=outputs)

Is Keras able to track gradients across multiple GPUs? I tried reading through the distributed training documentation, but I am not sure whether that is what I need.
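
To make the gradient question concrete, here is a minimal sketch (not a confirmed solution for PyroNN) that places three stages of a computation under different tf.device scopes and asks tf.GradientTape for gradients across them. The helper pick_device is hypothetical and only falls back to the CPU when the requested GPU is missing, and the squaring step is merely a stand-in for the memory-hungry custom op.

```python
import tensorflow as tf


def pick_device(index):
    """Hypothetical helper: '/GPU:<index>' if that GPU exists, else '/CPU:0'."""
    gpus = tf.config.list_logical_devices("GPU")
    return f"/GPU:{index}" if index < len(gpus) else "/CPU:0"


w1 = tf.Variable(2.0)
w2 = tf.Variable(3.0)
x = tf.constant(4.0)

with tf.GradientTape() as tape:
    with tf.device(pick_device(0)):
        h = w1 * x           # first stage on the first device
    with tf.device(pick_device(1)):
        h = h * h            # stand-in for the custom op, on its own device
    with tf.device(pick_device(0)):
        y = w2 * h           # last stage back on the first device

# The tape records ops regardless of where they ran, so gradients
# flow back through all three stages.
grad_w1, grad_w2 = tape.gradient(y, [w1, w2])
print(float(y), float(grad_w1), float(grad_w2))
```

This sketch runs eagerly. Whether tf.device scopes opened while building a functional Keras model are honored at training time may depend on the TensorFlow/Keras version, so opening the scope inside the custom layer's call method may be the more predictable variant.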


Answer by appyle: Using a separate device for a layer during the forward pass
