Deliverable 4: Face-swap GAN in PyTorch
Data Collection:
GAN model architecture
- The reference implementation of the DeepPrivacy paper (https://arxiv.org/abs/1909.04538) is open-sourced by the authors at https://github.com/hukkelas/DeepPrivacy.
We use this code as our starting point and make further modifications to it.
- The generator of the original DeepPrivacy GAN is shown in Figure 1. The architecture contains all the layers shown in the figure, arranged as
an encoder-decoder pair. We simplified the generator architecture by removing the layers marked with orange crosses. The model was trained
on a high-end machine for approximately 50 hours. The discriminator is unchanged.
- Specification of the high-end machine: 28-core Intel CPU, 64 GB DRAM, NVIDIA Titan RTX GPU (24 GB VRAM).
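The simplification described above (removing conv sub-blocks from the U-Net-style blocks) can be sketched in PyTorch. The class and block structure below is modeled on the architecture dumps in the table, not taken from the actual DeepPrivacy code, so treat it as an illustrative sketch:

```python
import torch
import torch.nn as nn

class PixelwiseNormalization(nn.Module):
    """Normalize each pixel's feature vector to unit average magnitude
    (the pixel-norm layer seen in the architecture dumps)."""
    def forward(self, x):
        return x / (x.pow(2).mean(dim=1, keepdim=True) + 1e-8).sqrt()

def conv_block(in_ch, out_ch):
    # One sub-block matching the dumps: Conv2d -> LeakyReLU(0.2) -> PixelwiseNormalization
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.LeakyReLU(negative_slope=0.2),
        PixelwiseNormalization(),
    )

# Original down-sampling block: two stacked conv sub-blocks.
original_block = nn.Sequential(conv_block(256, 256), conv_block(256, 256))

# Simplified block: the second conv sub-block is removed, halving the depth
# of each block while leaving the tensor shapes unchanged.
simplified_block = nn.Sequential(conv_block(256, 256))
```

Because each sub-block preserves the channel count and spatial size, removing one leaves the surrounding skip connections intact, which is what makes this pruning safe without retraining the rest of the topology from scratch.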
Figure 1. The architecture of the Generator of the DeepPrivacy GAN.
The architectures of the models are shown in the table below.
Model | Original architecture | New architecture
Generator |
NetworkWrapper(
(network): Generator(
(to_rgb_new): WSConv2dConv2d-[3, 256, 1, 1]
(to_rgb_old): WSConv2dConv2d-[3, 256, 1, 1]
(core_blocks_down): ModuleList(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(core_blocks_up): ModuleList(
(0): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 295, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(1): UpSamplingBlock()
)
)
(new_up): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 519, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(old_up): Sequential()
(new_down): Sequential(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
(upsampling): UpSamplingBlock()
(downsampling): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
(forward_block): Generator(
(to_rgb_new): WSConv2dConv2d-[3, 256, 1, 1]
(to_rgb_old): WSConv2dConv2d-[3, 256, 1, 1]
(core_blocks_down): ModuleList(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(core_blocks_up): ModuleList(
(0): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 295, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(1): UpSamplingBlock()
)
)
(new_up): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 519, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(old_up): Sequential()
(new_down): Sequential(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
(upsampling): UpSamplingBlock()
(downsampling): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
)
|
NetworkWrapper(
(network): Generator(
(to_rgb_new): WSConv2dConv2d-[3, 256, 1, 1]
(to_rgb_old): WSConv2dConv2d-[3, 256, 1, 1]
(core_blocks_down): ModuleList(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(core_blocks_up): ModuleList(
(0): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 295, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(1): UpSamplingBlock()
)
)
(new_up): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 519, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(old_up): Sequential()
(new_down): Sequential(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
(upsampling): UpSamplingBlock()
(downsampling): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
(forward_block): Generator(
(to_rgb_new): WSConv2dConv2d-[3, 256, 1, 1]
(to_rgb_old): WSConv2dConv2d-[3, 256, 1, 1]
(core_blocks_down): ModuleList(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(core_blocks_up): ModuleList(
(0): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 295, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(1): UpSamplingBlock()
)
)
(new_up): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 519, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(1): UnetUpsamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(old_up): Sequential()
(new_down): Sequential(
(0): UnetDownSamplingBlock(
(model): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
)
)
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 3, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
(2): PixelwiseNormalization()
)
)
(upsampling): UpSamplingBlock()
(downsampling): AvgPool2d(kernel_size=2, stride=2, padding=0)
)
)
|
Discriminator |
NetworkWrapper(
(network): DeepDiscriminator(
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 6, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 6, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
)
(new_block): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 263, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(1): ResNetBlock(
(conv): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(3): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(3): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
)
(core_model): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 263, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(1): ResNetBlock(
(conv): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 4, 4]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(output_layer): WSLinear(
(linear): Linear(in_features=256, out_features=1, bias=False)
)
)
(forward_block): DeepDiscriminator(
(from_rgb_new): Sequential(
(0): WSConv2dConv2d-[256, 6, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(from_rgb_old): Sequential(
(0): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
(1): Sequential(
(0): WSConv2dConv2d-[256, 6, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
)
(new_block): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 263, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(1): ResNetBlock(
(conv): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(3): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(3): AvgPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0)
)
(core_model): Sequential(
(0): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 263, 1, 1]
(1): LeakyReLU(negative_slope=0.2)
)
(1): ResNetBlock(
(conv): Sequential(
(0): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(1): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 3, 3]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(2): Sequential(
(0): WSConv2dConv2d-[256, 256, 4, 4]
(1): LeakyReLU(negative_slope=0.2)
)
)
)
(output_layer): WSLinear(
(linear): Linear(in_features=256, out_features=1, bias=False)
)
)
)
|
No change; the original discriminator is used as-is.
|
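The WSConv2d layers that appear throughout the dumps likely correspond to weight-scaled convolutions (the "equalized learning rate" trick popularized by progressive GANs, which DeepPrivacy builds on). A minimal sketch, assuming that standard formulation; the exact details in the DeepPrivacy codebase may differ:

```python
import math
import torch
import torch.nn as nn

class WSConv2d(nn.Module):
    """Weight-scaled conv sketch: weights are initialized from N(0, 1) and
    rescaled at runtime by a He-style constant, so every layer has the same
    effective learning rate. An assumption about what WSConv2d denotes."""
    def __init__(self, in_ch, out_ch, kernel_size, padding=0):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        fan_in = in_ch * kernel_size * kernel_size
        self.scale = math.sqrt(2.0 / fan_in)
        nn.init.normal_(self.conv.weight)
        nn.init.zeros_(self.conv.bias)

    def forward(self, x):
        # Scaling the input before a linear op is equivalent to scaling the weights.
        return self.conv(x * self.scale)

layer = WSConv2d(3, 256, kernel_size=1)
out = layer(torch.randn(1, 3, 4, 4))
```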
Code
The PyTorch implementation can be found in deep_privacy.tgz, which contains the changes we made to the deep_privacy Python module from the original paper's codebase.
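Checkpoints from the ~50-hour training run follow the standard PyTorch save/load round trip. A hedged sketch with a stand-in layer for the modified generator; the filename and layer are illustrative, not the tarball's actual layout:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Stand-in for the modified generator; the real model is the one dumped above.
model = nn.Conv2d(3, 256, kernel_size=1)

# Save only the state dict, the usual convention for sharing trained weights.
path = os.path.join(tempfile.gettempdir(), "generator_demo.pt")
torch.save(model.state_dict(), path)

# Reload into a freshly constructed model with the same architecture.
restored = nn.Conv2d(3, 256, kernel_size=1)
restored.load_state_dict(torch.load(path))
```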
Sample output
Conclusion
The modified DeepPrivacy model also works to a certain extent. Some of the images in the sample output show fairly human-like faces with features that differ from the originals.
Training for longer might produce even more realistic faces.