I’m implementing a U-Net style residual block in Keras:

    class UnetResBlock(layers.Layer):
        def __init__(self, spatial_dims, in_channels, out_channels, ...):
            super().__init__()
            # Main conv layers
            self.conv1 = Convolution(...)
            self.conv2 = Convolution(...)
            self.norm1 = ...
            self.norm2 = ...
            # Residual path layers will be created in build()
            self.res_conv = None
            self.res_norm = None

        def build(self, input_shape):
            in_channels = input_shape[-1]
            if in_channels != self.out_channels:
                self.res_conv = Convolution(
                    in_channels=in_channels,
                    out_channels=self.out_channels,
                    kernel_size=1,
                    strides=self.stride,
                )
                self.res_norm = layers.BatchNormalization()
            super().build(input_shape)

        def call(self, x, training=False):
            out = self.conv1(x, training=training)
            ...
            if self.res_conv is not None:
                x = self.res_conv(x, training=training)

My question is:
I’m using build() to defer the creation of res_conv and res_norm until I know the input shape, since the residual path may need a 1x1 convolution to match channels.
Is this an appropriate use of build()?
Or is build() meant only for deferring weight creation for the current layer, and creating sub-layers here could cause unexpected behavior?
Are there any pitfalls of creating new Layers in build() vs in __init__?
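
For anyone who wants to reproduce the pattern, here is a minimal, self-contained sketch of it. `TinyResBlock` is a hypothetical, stripped-down stand-in for my block: it assumes 2D channels-last inputs and uses plain `layers.Conv2D` in place of the `Convolution` wrapper, with normalization omitted. It only shows the residual projection being created inside `build()`; it does not claim this is the recommended approach.

```python
import numpy as np
import keras
from keras import layers


class TinyResBlock(layers.Layer):
    """Hypothetical, simplified stand-in for UnetResBlock (2D, channels-last)."""

    def __init__(self, out_channels, **kwargs):
        super().__init__(**kwargs)
        self.out_channels = out_channels
        # Main path: its configuration does not depend on the input shape.
        self.conv = layers.Conv2D(out_channels, 3, padding="same")
        # Residual projection: decided in build(), once the input shape is known.
        self.res_conv = None

    def build(self, input_shape):
        in_channels = input_shape[-1]
        if in_channels != self.out_channels:
            # Sub-layer created lazily; a 1x1 conv to match the channel count.
            self.res_conv = layers.Conv2D(self.out_channels, 1)
        super().build(input_shape)

    def call(self, x):
        out = self.conv(x)
        if self.res_conv is not None:
            x = self.res_conv(x)
        return out + x


# Channel mismatch (4 -> 8): the projection is created on the first call.
block = TinyResBlock(out_channels=8)
y = block(np.zeros((1, 16, 16, 4), dtype="float32"))

# Matching channels (8 -> 8): no projection is created.
same = TinyResBlock(out_channels=8)
z = same(np.zeros((1, 16, 16, 8), dtype="float32"))
```

Note that `res_conv` stays `None` until the first call, so anything that inspects the block before it has seen an input will not see the projection layer.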
