Chris Pollett > Students >
Pratikkumar

Deliverable 3: Generative adversarial networks (GANs) to generate
Devanagari characters.

Data Collection:

  • The dataset is available at: https://www.kaggle.com/rishianand/devanagari-character-set
  • It comprises 92,000 images (32x32 px) covering 46 characters: the consonants "ka" to "gya" and the digits 0 to 9.
  • 33 original Devanagari letters + 3 Nepali letters + 10 digits = 46 characters.
  • Vowels are not included in the dataset.
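
As a rough illustration of how 32x32 grayscale images like these are commonly prepared for a GAN, the sketch below scales 8-bit pixels to the range [-1, 1] and returns both a flattened 1024-value layout and a 1x32x32 layout. The random batch stands in for real data, and the function name `prepare_batch` is hypothetical; the preprocessing actually used in the notebooks may differ.

```python
import torch


def prepare_batch(images_uint8: torch.Tensor):
    """Scale 8-bit grayscale images to [-1, 1] and return two layouts.

    images_uint8: tensor of shape (N, 32, 32) with values in 0..255.
    Returns (flat, conv) with shapes (N, 1024) and (N, 1, 32, 32).
    """
    x = images_uint8.float() / 255.0   # map 0..255 to 0..1
    x = x * 2.0 - 1.0                  # map 0..1 to -1..1
    flat = x.reshape(x.shape[0], -1)   # one 1024-value vector per image
    conv = x.unsqueeze(1)              # add a single channel dimension
    return flat, conv


# Stand-in data (assumption): 8 random images instead of real dataset files.
fake_images = torch.randint(0, 256, (8, 32, 32), dtype=torch.uint8)
flat, conv = prepare_batch(fake_images)
print(flat.shape, conv.shape)
```

The [-1, 1] range is chosen here because it matches the output range of a Tanh layer; that pairing is a common convention, not something this page states.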

GAN models

We tried two GAN variants: the first uses a plain fully connected network (ANN), and the second uses a convolutional network (CNN).
The PyTorch implementation can be found in devnagari-gan.zip.
Experiment-specific sample images are listed under Notes for each model below.

Sr#: Model 1
Model description:
  Generator(
    (fc1): Linear(in_features=100, out_features=256, bias=True)
    (fc2): Linear(in_features=256, out_features=512, bias=True)
    (fc3): Linear(in_features=512, out_features=1024, bias=True)
    (fc4): Linear(in_features=1024, out_features=1024, bias=True)
  )
  Discriminator(
    (fc1): Linear(in_features=1024, out_features=1024, bias=True)
    (fc2): Linear(in_features=1024, out_features=512, bias=True)
    (fc3): Linear(in_features=512, out_features=256, bias=True)
    (fc4): Linear(in_features=256, out_features=1, bias=True)
  )
Notes: Code at devnagari_gan_1.ipynb; sample images can be found in the devnagari_gan_1_samples directory.

Sr#: Model 2
Model description:
  Generator(
    (main): Sequential(
      (0): ConvTranspose2d(100, 512, kernel_size=(4, 4), stride=(1, 1), bias=False)
      (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU(inplace=True)
      (3): ConvTranspose2d(512, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (4): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (5): ReLU(inplace=True)
      (6): ConvTranspose2d(256, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (7): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (8): ReLU(inplace=True)
      (9): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (10): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (11): ReLU(inplace=True)
      (12): ConvTranspose2d(64, 1, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (13): Tanh()
    )
  )
  Discriminator(
    (main): Sequential(
      (0): Conv2d(1, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (1): LeakyReLU(negative_slope=0.2, inplace=True)
      (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (3): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (4): LeakyReLU(negative_slope=0.2, inplace=True)
      (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (6): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (7): LeakyReLU(negative_slope=0.2, inplace=True)
      (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
      (9): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (10): LeakyReLU(negative_slope=0.2, inplace=True)
      (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), bias=False)
      (12): Sigmoid()
    )
  )
Notes: Code at devnagari_gan_5.ipynb; sample images can be found in the devnagari_gan_5_samples directory.
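
The printed layers for Model 1 show only the Linear modules, so the activations are not visible in the listing. The sketch below wires the same layer sizes into a runnable pair of modules, assuming LeakyReLU between hidden layers, Tanh on the generator output, and Sigmoid on the discriminator output. Those activation choices are assumptions made for illustration; the notebook devnagari_gan_1.ipynb is the authority on what was actually used.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Generator(nn.Module):
    """Fully connected generator: 100-d noise -> 1024 values (32 * 32 pixels)."""

    def __init__(self, z_dim=100, img_dim=32 * 32):
        super().__init__()
        self.fc1 = nn.Linear(z_dim, 256)
        self.fc2 = nn.Linear(256, 512)
        self.fc3 = nn.Linear(512, 1024)
        self.fc4 = nn.Linear(1024, img_dim)

    def forward(self, z):
        # Assumed activations: LeakyReLU on hidden layers, Tanh on the output.
        h = F.leaky_relu(self.fc1(z), 0.2)
        h = F.leaky_relu(self.fc2(h), 0.2)
        h = F.leaky_relu(self.fc3(h), 0.2)
        return torch.tanh(self.fc4(h))


class Discriminator(nn.Module):
    """Fully connected discriminator: 1024 pixel values -> one real/fake score."""

    def __init__(self, img_dim=32 * 32):
        super().__init__()
        self.fc1 = nn.Linear(img_dim, 1024)
        self.fc2 = nn.Linear(1024, 512)
        self.fc3 = nn.Linear(512, 256)
        self.fc4 = nn.Linear(256, 1)

    def forward(self, x):
        # Assumed activations: LeakyReLU on hidden layers, Sigmoid on the output.
        h = F.leaky_relu(self.fc1(x), 0.2)
        h = F.leaky_relu(self.fc2(h), 0.2)
        h = F.leaky_relu(self.fc3(h), 0.2)
        return torch.sigmoid(self.fc4(h))


G, D = Generator(), Discriminator()
z = torch.randn(16, 100)   # a batch of 16 noise vectors
fake = G(z)                # shape (16, 1024): flattened 32x32 images
score = D(fake)            # shape (16, 1): probability-like scores
print(fake.shape, score.shape)
```

The generator's 1024 outputs correspond to one flattened 32x32 image, which is why the discriminator's first layer also takes 1024 inputs.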

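The kernel, stride, and padding values listed for Model 2 determine the spatial size at every layer. The sketch below applies the standard output-size formulas for Conv2d and ConvTranspose2d (with the default dilation of 1 and no output padding) to those values. The helper names are hypothetical and the code only does arithmetic; it does not build the networks.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output size of Conv2d along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding=0):
    """Output size of ConvTranspose2d along one dimension (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


# (kernel, stride, padding) for each layer, read off the Model 2 listing.
GEN_LAYERS = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]
DISC_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]


def trace(size, layers, fn):
    """Return the spatial size before the first layer and after each layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = fn(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


gen_sizes = trace(1, GEN_LAYERS, conv_transpose_out)
disc_sizes = trace(gen_sizes[-1], DISC_LAYERS, conv_out)
print(gen_sizes)   # [1, 4, 8, 16, 32, 64]
print(disc_sizes)  # [64, 32, 16, 8, 4, 1]
```

Under these formulas the listed generator emits 64x64 images and the listed discriminator reduces a 64x64 input to a single score. Since the dataset images are 32x32, this suggests they were upscaled to 64x64 before training Model 2; this page does not describe that step, so treat it as an inference to check against devnagari_gan_5.ipynb.
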
Conclusion

GAN Model 2 generated more realistic-looking characters than GAN Model 1, as the CNN-based approach learns the features of the input images.
GAN Model 2 also produced better-looking fake Devanagari characters than VAE model #3.