Deliverable 1: Implement a CNN to classify Devanagari characters.
Data Collection:
- The dataset is available at: https://www.kaggle.com/rishianand/devanagari-character-set
- It comprises 92,000 images (32x32 px) covering 46 characters: the consonants "ka" to "gya" and the digits 0 to 9.
- 33 original Devanagari letters + 3 Nepali letters + 10 digits = 46 characters
- Vowels are not included in the dataset.
Experiments with CNN models:
The experiments in the following table were run with several CNN configurations to find the one with the highest test accuracy.
The PyTorch implementation can be found at: devnagari-classification-.zip
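As a rough illustration of how the layer listings in the table map to code, Model 5 could be declared along the following lines. This is a minimal sketch, not the code from the archive: the class name `CNNDevnagariSketch`, the `num_classes` parameter, and the flatten step in `forward` are assumptions, since the printed listing shows only the layers and not the forward pass.

```python
import torch
from torch import nn


class CNNDevnagariSketch(nn.Module):
    """Hypothetical re-creation of the Model 5 layer listing."""

    def __init__(self, num_classes=46):
        super().__init__()
        # Two conv/ReLU/max-pool stages, matching the (features) listing.
        self.features = nn.Sequential(
            nn.Conv2d(1, 12, kernel_size=3, stride=1, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(12, 16, kernel_size=3, stride=1, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        # Three fully connected layers, matching the (classifier) listing.
        # 1296 = 16 channels * 9 * 9 for a 32x32 grayscale input.
        self.classifier = nn.Sequential(
            nn.Linear(1296, 120),
            nn.ReLU(inplace=True),
            nn.Linear(120, 90),
            nn.ReLU(inplace=True),
            nn.Linear(90, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # assumed: flatten everything except the batch axis
        return self.classifier(x)


model = CNNDevnagariSketch()
batch = torch.zeros(4, 1, 32, 32)  # four blank 32x32 single-channel images
logits = model(batch)
print(tuple(logits.shape))
```

The forward pass on a dummy batch is only a shape check: one score per class for each image.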
Sr# | Model description | Test Accuracy achieved |
Model 1 |
CNNDevnagari(
(conv1): Conv2d(1, 12, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
(conv2): Conv2d(12, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
(fc1): Linear(in_features=1296, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=90, bias=True)
(fc3): Linear(in_features=90, out_features=46, bias=True)
)
| 96.2101% |
Model 2 |
CNNDevnagari(
(features): Sequential(
(0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(15, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=576, out_features=460, bias=True)
(1): ReLU(inplace=True)
(2): Linear(in_features=460, out_features=276, bias=True)
(3): ReLU(inplace=True)
(4): Linear(in_features=276, out_features=46, bias=True)
)
)
| 95.3261% |
Model 3 |
CNNDevnagari(
(features): Sequential(
(0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
(1): ReLU(inplace=True)
(2): Conv2d(15, 16, kernel_size=(20, 20), stride=(1, 1), padding=(2, 2))
(3): ReLU(inplace=True)
)
(classifier): Sequential(
(0): Linear(in_features=784, out_features=627, bias=True)
(1): ReLU(inplace=True)
(2): Linear(in_features=627, out_features=376, bias=True)
(3): ReLU(inplace=True)
(4): Linear(in_features=376, out_features=46, bias=True)
)
)
| 94.7029% |
Model 4 |
CNNDevnagari(
(features): Sequential(
(0): Conv2d(1, 15, kernel_size=(15, 15), stride=(1, 1), padding=(2, 2))
(1): ReLU(inplace=True)
(2): Conv2d(15, 16, kernel_size=(20, 20), stride=(1, 1), padding=(2, 2))
(3): ReLU(inplace=True)
)
(classifier): Sequential(
(0): Linear(in_features=784, out_features=627, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=627, out_features=376, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.2, inplace=False)
(6): Linear(in_features=376, out_features=46, bias=True)
)
)
| 94.4783% |
Model 5 |
CNNDevnagari(
(features): Sequential(
(0): Conv2d(1, 12, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(12, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): Linear(in_features=1296, out_features=120, bias=True)
(1): ReLU(inplace=True)
(2): Linear(in_features=120, out_features=90, bias=True)
(3): ReLU(inplace=True)
(4): Linear(in_features=90, out_features=46, bias=True)
)
)
| 95.5797% |
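The `in_features` value of the first `Linear` layer in each model follows from the standard output-size formula for convolution and pooling layers, out = floor((in + 2*padding - kernel) / stride) + 1, applied to the 32x32 input. The sketch below reproduces the three values that appear in the table; the helper names are hypothetical. Model 1's listing shows no pooling layers, but its 1296 input features are consistent with 2x2 pooling being applied in the forward pass; that is an inference from the arithmetic, not something the listing states.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial size after a Conv2d or MaxPool2d layer (dilation 1, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(size, layers, channels):
    """Apply (kernel, padding, stride) layers in order; return channels * H * W."""
    for kernel, padding, stride in layers:
        size = conv_out(size, kernel, padding, stride)
    return channels * size * size


POOL = (2, 0, 2)  # MaxPool2d(kernel_size=2, stride=2, padding=0)

# Models 1 and 5: two 3x3 convolutions (padding 2), each followed by pooling.
small_kernels = [(3, 2, 1), POOL, (3, 2, 1), POOL]
# Model 2: a 15x15 convolution, pooling, a 3x3 convolution, pooling.
mixed_kernels = [(15, 2, 1), POOL, (3, 2, 1), POOL]
# Models 3 and 4: a 15x15 convolution followed by a 20x20 convolution, no pooling.
large_kernels = [(15, 2, 1), (20, 2, 1)]

# Every model ends its convolutional stack with 16 output channels.
print(flattened_features(32, small_kernels, 16))  # 1296
print(flattened_features(32, mixed_kernels, 16))  # 576
print(flattened_features(32, large_kernels, 16))  # 784
```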
Conclusion
The highest test accuracy, 96.21%, was achieved with the model from Experiment 1 (Model 1 in the table). The confusion matrix for Experiment 1
can be found at: CNNDevnagari_1_cm.png.
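The source does not say how the confusion matrix was produced. One plain way to build such a matrix from true and predicted class indices is sketched below. The function names and the toy three-class labels are hypothetical; the real matrix would have 46 rows and columns.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Return a matrix where entry [t][p] counts samples of class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy data for illustration only: 6 samples, 3 classes.
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(true_labels, predicted_labels, num_classes=3)
for row in cm:
    print(row)
print(f"accuracy = {accuracy(cm):.4f}")
```

Off-diagonal entries show which pairs of characters the model confuses, which overall accuracy alone does not reveal.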