Deliverable 1: Implement a CNN to classify Devanagari characters.

Data Collection:

  • The dataset is available at: https://www.kaggle.com/rishianand/devanagari-character-set
  • It comprises 92,000 images (32x32 px) covering 46 characters: the consonants "ka" to "gya" and the digits 0 to 9.
  • 33 original Devanagari letters + 3 Nepali letters + 10 digits = 46 characters.
  • Vowels are not included in the dataset.

Experiments with CNN models:

The experiments listed in the following table were performed with CNN models in an effort to reach high test accuracy.
The PyTorch implementation can be found at:
devnagari-classification-.zip
Sr#    Model description    Test accuracy achieved
Model 1
CNNDevnagari(
  (conv1): Conv2d(1, 12, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
  (conv2): Conv2d(12, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
  (fc1): Linear(in_features=1296, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=90, bias=True)
  (fc3): Linear(in_features=90, out_features=46, bias=True)
)
96.2101%
Model 2
CNNDevnagari(
  (features): Sequential(
    (0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(15, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=576, out_features=460, bias=True)
    (1): ReLU(inplace=True)
    (2): Linear(in_features=460, out_features=276, bias=True)
    (3): ReLU(inplace=True)
    (4): Linear(in_features=276, out_features=46, bias=True)
  )
)
95.3261%
Model 3
CNNDevnagari(
  (features): Sequential(
    (0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): Conv2d(15, 16, kernel_size=(20, 20), stride=(1, 1), padding=(2, 2))
    (3): ReLU(inplace=True)
  )
  (classifier): Sequential(
    (0): Linear(in_features=784, out_features=627, bias=True)
    (1): ReLU(inplace=True)
    (2): Linear(in_features=627, out_features=376, bias=True)
    (3): ReLU(inplace=True)
    (4): Linear(in_features=376, out_features=46, bias=True)
  )
)
94.7029%
Model 4
CNNDevnagari(
  (features): Sequential(
    (0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): Conv2d(15, 16, kernel_size=(20, 20), stride=(1, 1), padding=(2, 2))
    (3): ReLU(inplace=True)
  )
  (classifier): Sequential(
    (0): Linear(in_features=784, out_features=627, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.2, inplace=False)
    (3): Linear(in_features=627, out_features=376, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.2, inplace=False)
    (6): Linear(in_features=376, out_features=46, bias=True)
  )
)
94.4783%
Model 5
CNNDevnagari(
  (features): Sequential(
    (0): Conv2d(1, 12, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(12, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=1296, out_features=120, bias=True)
    (1): ReLU(inplace=True)
    (2): Linear(in_features=120, out_features=90, bias=True)
    (3): ReLU(inplace=True)
    (4): Linear(in_features=90, out_features=46, bias=True)
  )
)
95.5797%
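
The in_features value of the first Linear layer in each model follows from the 32x32 input size and the convolution and pooling settings listed above. The sketch below recomputes it with the standard output-size formula floor((size + 2*padding - kernel) / stride) + 1. The helper names (conv_out, flattened_features) are made up for illustration and are not part of the project code. The printout for the first model lists no pooling layers; its in_features of 1296 is consistent with a 2x2 max pool after each convolution applied inside forward(), which is an inference from the numbers rather than something the printout states.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size, layers, out_channels):
    """Apply (kernel, stride, padding) layers in order, then flatten C x H x W."""
    size = input_size
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
    return out_channels * size * size


# Layer settings as (kernel, stride, padding), taken from the printouts above.
CONV3 = (3, 1, 2)    # 3x3 convolution, stride 1, padding 2
CONV15 = (15, 1, 2)  # 15x15 convolution, stride 1, padding 2
CONV20 = (20, 1, 2)  # 20x20 convolution, stride 1, padding 2
POOL = (2, 2, 0)     # 2x2 max pooling, stride 2

# 3x3 convolutions with pooling after each one (16 output channels).
small_kernels = flattened_features(32, [CONV3, POOL, CONV3, POOL], 16)
# 15x15 then 3x3 convolution, pooling after each one.
mixed_kernels = flattened_features(32, [CONV15, POOL, CONV3, POOL], 16)
# 15x15 then 20x20 convolution, no pooling.
large_kernels = flattened_features(32, [CONV15, CONV20], 16)

print(small_kernels, mixed_kernels, large_kernels)  # 1296 576 784
```

The three results match the in_features values 1296, 576, and 784 shown in the printouts, so the listed layer settings are self-consistent for 32x32 inputs.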

Conclusion:

We achieved the highest test accuracy, 96.21%, with the model described in Experiment #1 (Model 1). The confusion matrix for Experiment #1
can be found at: CNNDevnagari_1_cm.png.
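
For readers unfamiliar with confusion matrices, the following is a minimal sketch of how one can be built from true and predicted class indices. It is not the code that produced CNNDevnagari_1_cm.png; the function names and the toy labels are invented for illustration, and a real run would use the 46 character classes and the test-set predictions.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy example with 3 classes (made-up labels, not from the dataset).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)            # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(accuracy(cm))  # 4 of 6 predictions are correct
```

Off-diagonal entries show which pairs of characters the classifier confuses, which the overall accuracy figure alone does not reveal.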