Task
I am trying to implement the Gated-GAN architecture using Swift for TensorFlow. In the Gated-GAN generator there are k convolutional blocks (the "gated transformer"), each of which receives a copy of the encoded image. The layer's output is a single weighted sum of the outputs of the blocks.
In the future, I would like to be able to increase k after the model has been trained (e.g., to fine-tune a (k+1)-th transformer block). For that reason, I do not want to hard-code k into the architecture.
Gated-Transformer diagram (I don't have enough reputation to post the picture directly)
Minimal working example
It would be convenient to be able to write something like this:
struct GatedTransformer: Layer {
    // Normally the blocks in the transformer are residual blocks, but
    // for simplicity I'll just use Conv2D here.
    var convBlocks: [Conv2D<Float>]

    /// Custom differentiable input (needed since the layer has two inputs).
    struct GatedTransformerInput: Differentiable {
        var image: Tensor<Float>              // shape = [batch_size, height, width, channels]
        var classDistributions: Tensor<Float> // shape = [class_count]

        @differentiable
        public init(image: Tensor<Float>, classDistributions: Tensor<Float>) {
            self.image = image
            self.classDistributions = classDistributions
        }
    }

    public init(_ classCount: Int) {
        precondition(classCount > 0)
        // Some example parameters for Conv2D.
        convBlocks = [Conv2D](repeating: Conv2D(filterShape: (3, 3, 128, 128), strides: (1, 1)),
                              count: classCount)
    }

    var classCount: Int { return convBlocks.count }

    @differentiable
    func callAsFunction(_ input: GatedTransformerInput) -> Tensor<Float> {
        precondition(input.classDistributions.shape.dimensions.last! == self.classCount)

        // <problematic_code id=0>
        var imageArray = [Tensor<Float>](repeating: input.image, count: self.classCount)
        for i in 0..<self.classCount {
            imageArray[i] = convBlocks[i](input.image).expandingShape(at: 1)
        }
        let result = Tensor<Float>(concatenating: imageArray, alongAxis: 1)
        // </problematic_code>

        // Concatenate the tensors, multiply by the class distributions,
        // then sum along the 'class' axis.
        let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
        let broadcastedFactors = highRankFactors.broadcasted(to: result.shape)
        return (broadcastedFactors * result).sum(squeezingAxes: 1)
    }
}
However, this fails with a compiler error:

cannot differentiate through a non-differentiable result; do you want to use 'withoutDerivative(at:)'?
var imageArray = [Tensor<Float>](repeating: input.image, count: self.classCount)
^
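For reference, here is a sketch of the direction I suspect a fix would take. Swift for TensorFlow's standard library provides `Array.differentiableMap(_:)`, which is differentiable with respect to the array elements; whether `Tensor(concatenating:alongAxis:)` has a registered derivative in a given toolchain is my assumption and would need checking:

```swift
// Hypothetical sketch: replace the mutating loop with `differentiableMap`.
// Assumptions: the toolchain registers derivatives for both
// `Array.differentiableMap(_:)` and `Tensor(concatenating:alongAxis:)`.
// Caveat: `differentiableMap` differentiates with respect to the array
// elements (the conv blocks), so gradients may not flow to the captured
// `input.image` through this path.
@differentiable
func callAsFunction(_ input: GatedTransformerInput) -> Tensor<Float> {
    // One branch output per convolutional block, each given a new 'class' axis.
    let branchOutputs = convBlocks.differentiableMap { block in
        block(input.image).expandingShape(at: 1)
    }
    let result = Tensor<Float>(concatenating: branchOutputs, alongAxis: 1)
    let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
    let broadcastedFactors = highRankFactors.broadcasted(to: result.shape)
    return (broadcastedFactors * result).sum(squeezingAxes: 1)
}
```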
Alternative approaches
Build up the concatenated tensor as I go.
This fails because reassigning to result is not differentiable. A similar approach would be to use the Tensor += operator, although that does not compile either (the official API does not implement Tensor.+= as a @differentiable function).
// <problematic_code id=1>
var result = convBlocks[0](input.image).expandingShape(at: 1)
for i in 1..<self.classCount {
    let nextResult = convBlocks[i](input.image).expandingShape(at: 1)
    result = result.concatenated(with: nextResult, alongAxis: 1)
}
// </problematic_code>
Append to a new array.
This fails because Array.append is not differentiable.
// <problematic_code id=2>
var imageArray: [Tensor<Float>] = []
for i in 0..<self.classCount {
    imageArray.append(convBlocks[i](input.image).expandingShape(at: 1))
}
let result = Tensor<Float>(concatenating: imageArray, alongAxis: 1)
// </problematic_code>
Make GatedTransformerInput a differentiable vector type.
I assume there is some way to make this work. However, it would involve conforming GatedTransformerInput to VectorProtocol, which seems like more work than should be necessary.
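One thing I would check before writing a VectorProtocol conformance by hand: my understanding (an assumption about the toolchain, not something I have verified) is that Array conditionally conforms to Differentiable when its Element does, with tangent type `Array<Element.TangentVector>.DifferentiableView`. If that holds, an input struct storing an array of tensors could synthesize its conformance:

```swift
// Hypothetical sketch, assuming `[Tensor<Float>]` conforms to `Differentiable`
// via `Array`'s conditional conformance. If so, the compiler can synthesize
// the `Differentiable` conformance for this struct with no manual
// `VectorProtocol` work.
struct MultiImageInput: Differentiable {
    var images: [Tensor<Float>]           // one copy of the encoded image per block
    var classDistributions: Tensor<Float> // shape = [class_count]
}
```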
Define a custom derivative for callAsFunction(...).
This could be another possible approach. However, to compute the derivative I need the intermediate array of Tensors, and those values are not visible to outside code, which only sees callAsFunction(...).
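To make the visibility point concrete, here is a sketch of the shape I believe a custom derivative would take, using the `@derivative(of:)` attribute from Swift's differentiable-programming support. The pullback signature is my assumption based on the attribute's documented form, and the pullback body is deliberately left unimplemented; writing it is exactly the part that needs the intermediate per-block tensors:

```swift
// Hypothetical sketch of a custom derivative for the layer's call.
// Assumption: for an instance method, the pullback maps the upstream
// cotangent to a pair (Self.TangentVector, Input.TangentVector).
extension GatedTransformer {
    @derivative(of: callAsFunction)
    func vjpCallAsFunction(_ input: GatedTransformerInput)
        -> (value: Tensor<Float>,
            pullback: (Tensor<Float>) -> (TangentVector, GatedTransformerInput.TangentVector)) {
        // Forward pass: keep the per-block outputs alive so the pullback
        // closure can capture them.
        var branchOutputs: [Tensor<Float>] = []
        for block in convBlocks {
            branchOutputs.append(block(input.image).expandingShape(at: 1))
        }
        let result = Tensor<Float>(concatenating: branchOutputs, alongAxis: 1)
        let highRankFactors = input.classDistributions.expandingShape(at: [2, 4])
        let value = (highRankFactors.broadcasted(to: result.shape) * result)
            .sum(squeezingAxes: 1)
        return (value, { upstream in
            // `branchOutputs` is visible here via capture, which would address
            // the visibility concern above; the gradient computation itself
            // still has to be written by hand.
            fatalError("pullback not implemented")
        })
    }
}
```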
Summary questions
Is there a way to implement the gated transformer for arbitrary k using Swift's existing Differentiable types?
If not, how should I design a new Differentiable type that lets me accomplish the above?