Task
I am trying to implement the Gated-GAN architecture using Swift for TensorFlow. In the Gated-GAN generator there are k convolutional blocks (the "gated transformer"), each of which receives a copy of the encoded image. The layer's output is a single weighted sum of the outputs of the blocks.
In the future, I would like to be able to increase k after the model has been trained (e.g., to fine-tune a (k+1)-th transformer block). For that reason, I do not want to hard-code k into the architecture.
Gated-Transformer diagram (I don't have enough reputation to post the picture directly)
Minimal working example
It would be convenient to be able to write something like this:
struct GatedTransformer: Layer {
    // Normally the blocks in the transformer are residual blocks, but
    // for simplicity I'll just use Conv2D here.
    var convBlocks: [Conv2D<Float>]

    /// Custom differentiable input (needed since the layer has two inputs).
    struct GatedTransformerInput: Differentiable {
        var image: Tensor<Float>              // shape = [batch_size, height, width, channels]
        var classDistributions: Tensor<Float> // shape = [class_count]

        @differentiable
        public init(image: Tensor<Float>, classDistributions: Tensor<Float>) {
            self.image = image
            self.classDistributions = classDistributions
        }
    }

    public init(_ classCount: Int) {
        precondition(classCount > 0)
        // Some example parameters for Conv2D.
        convBlocks = [Conv2D](repeating: Conv2D(filterShape: (3, 3, 128, 128), strides: (1, 1)),
                              count: classCount)
    }

    var classCount: Int { return convBlocks.count }

    @differentiable
    func callAsFunction(_ input: GatedTransformerInput) -> Tensor<Float> {
        precondition(input.classDistributions.shape.dimensions.last! == self.classCount)

        // <problematic_code id=0>
        var imageArray = [Tensor<Float>](repeating: input.image, count: self.classCount)
        for i in 0..<self.classCount {
            imageArray[i] = convBlocks[i](input.image).expandingShape(at: 1)
        }
        let result = Tensor<Float>(concatenating: imageArray, alongAxis: 1)
        // </problematic_code>

        // Concatenate the tensors, multiply by the class distributions,
        // then sum along the 'class' axis.
        let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
        let broadcastedFactors = highRankFactors.broadcasted(to: result.shape)
        return (broadcastedFactors * result).sum(squeezingAxes: 1)
    }
}
However, this fails with a compiler error:

cannot differentiate through a non-differentiable result; do you want to use 'withoutDerivative(at:)'?
var imageArray = [Tensor<Float>](repeating: input.image, count: self.classCount)
^
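For reference, here is a sketch of the direction I suspect a fix would take. Swift for TensorFlow's standard library provides `Array.differentiableMap(_:)`, which is differentiable with respect to the array elements; whether `Tensor(concatenating:alongAxis:)` has a registered derivative in a given toolchain is my assumption and would need checking:

```swift
// Hypothetical sketch: replace the mutating loop with `differentiableMap`.
// Assumptions: the toolchain registers derivatives for both
// `Array.differentiableMap(_:)` and `Tensor(concatenating:alongAxis:)`.
// Caveat: `differentiableMap` differentiates with respect to the array
// elements (the conv blocks), so gradients may not flow to the captured
// `input.image` through this path.
@differentiable
func callAsFunction(_ input: GatedTransformerInput) -> Tensor<Float> {
    // One branch output per convolutional block, each given a new 'class' axis.
    let branchOutputs = convBlocks.differentiableMap { block in
        block(input.image).expandingShape(at: 1)
    }
    let result = Tensor<Float>(concatenating: branchOutputs, alongAxis: 1)
    let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
    let broadcastedFactors = highRankFactors.broadcasted(to: result.shape)
    return (broadcastedFactors * result).sum(squeezingAxes: 1)
}
```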
Alternative approaches
Build up the concatenated tensor as I go.
This fails because reassigning to result is not differentiable. A similar approach would be to use the Tensor += operator, although that does not compile either (the official API does not implement Tensor.+= as a @differentiable function).
// <problematic_code id=1>
var result = convBlocks[0](input.image).expandingShape(at: 1)
for i in 1..<self.classCount {
    let nextResult = convBlocks[i](input.image).expandingShape(at: 1)
    result = result.concatenated(with: nextResult, alongAxis: 1)
}
// </problematic_code>
Append to a new array.
This fails because Array.append is not differentiable.
// <problematic_code id=2>
var imageArray: [Tensor<Float>] = []
for i in 0..<self.classCount {
    imageArray.append(convBlocks[i](input.image).expandingShape(at: 1))
}
let result = Tensor<Float>(concatenating: imageArray, alongAxis: 1)
// </problematic_code>
Make GatedTransformerInput a differentiable vector type.
I assume there is some way to make this work. However, it would involve conforming GatedTransformerInput to VectorProtocol, which seems like more work than should be necessary.
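One thing I would check before writing a VectorProtocol conformance by hand: my understanding (an assumption about the toolchain, not something I have verified) is that Array conditionally conforms to Differentiable when its Element does, with tangent type `Array<Element.TangentVector>.DifferentiableView`. If that holds, an input struct storing an array of tensors could synthesize its conformance:

```swift
// Hypothetical sketch, assuming `[Tensor<Float>]` conforms to `Differentiable`
// via `Array`'s conditional conformance. If so, the compiler can synthesize
// the `Differentiable` conformance for this struct with no manual
// `VectorProtocol` work.
struct MultiImageInput: Differentiable {
    var images: [Tensor<Float>]           // one copy of the encoded image per block
    var classDistributions: Tensor<Float> // shape = [class_count]
}
```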
Define a custom derivative for callAsFunction(...).
This could be another possible approach. However, to compute the derivative I need the intermediate array of Tensors, and those values are not visible to outside code, which only sees callAsFunction(...).
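To make the visibility point concrete, here is a sketch of the shape I believe a custom derivative would take, using the `@derivative(of:)` attribute from Swift's differentiable-programming support. The pullback signature is my assumption based on the attribute's documented form, and the pullback body is deliberately left unimplemented; writing it is exactly the part that needs the intermediate per-block tensors:

```swift
// Hypothetical sketch of a custom derivative for the layer's call.
// Assumption: for an instance method, the pullback maps the upstream
// cotangent to a pair (Self.TangentVector, Input.TangentVector).
extension GatedTransformer {
    @derivative(of: callAsFunction)
    func vjpCallAsFunction(_ input: GatedTransformerInput)
        -> (value: Tensor<Float>,
            pullback: (Tensor<Float>) -> (TangentVector, GatedTransformerInput.TangentVector)) {
        // Forward pass: keep the per-block outputs alive so the pullback
        // closure can capture them.
        var branchOutputs: [Tensor<Float>] = []
        for block in convBlocks {
            branchOutputs.append(block(input.image).expandingShape(at: 1))
        }
        let result = Tensor<Float>(concatenating: branchOutputs, alongAxis: 1)
        let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
        let value = (highRankFactors.broadcasted(to: result.shape) * result)
            .sum(squeezingAxes: 1)
        return (value, { upstream in
            // `branchOutputs` is visible here via capture, which would address
            // the visibility concern above; the gradient computation itself
            // still has to be written by hand.
            fatalError("pullback not implemented")
        })
    }
}
```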
Summary questions
Is there a way to implement the gated transformer for arbitrary k using Swift's existing Differentiable types?
If not, how should I design a new Differentiable type that lets me accomplish the above?