Deliverable 4: Implement Full Resolution Image Compression using Recurrent Neural Networks

Description:

The objective of this deliverable was to implement the architecture proposed in [1]. A single trained network provides variable compression rates, since each additional iteration adds bits to the encoding. Two schemes are possible for the reconstruction network: one-shot and additive reconstruction. According to [1], this is the first neural network architecture to outperform JPEG across most bitrates.
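The additive scheme can be illustrated with a toy sketch. It assumes the formulation in [1], where each iteration encodes the current residual and the decoder output is added to the previous reconstruction. The function `toy_codec` (a one-bit sign quantizer with a halving step size) is a hypothetical stand-in for the learned encoder, binarizer, and decoder, not the networks implemented in this deliverable.

```python
def toy_codec(residual, step):
    """Hypothetical stand-in for encoder + binarizer + decoder.

    Each value is coded with one bit (its sign) and decoded as +/- step.
    """
    bits = [1 if r >= 0 else 0 for r in residual]
    decoded = [step if b else -step for b in bits]
    return bits, decoded


def additive_reconstruction(pixels, iterations):
    """Refine a reconstruction by coding the residual at every iteration."""
    recon = [0.0] * len(pixels)   # initial reconstruction is all zeros
    residual = list(pixels)       # initial residual is the image itself
    step = 0.5                    # assumes pixel values lie in [-1, 1]
    errors = []
    for _ in range(iterations):
        _, decoded = toy_codec(residual, step)
        # Additive scheme: new reconstruction = previous one + decoded residual.
        recon = [p + d for p, d in zip(recon, decoded)]
        residual = [x - p for x, p in zip(pixels, recon)]
        errors.append(sum(r * r for r in residual) / len(residual))
        step /= 2
    return recon, errors


pixels = [0.9, -0.3, 0.1, -0.8]   # made-up toy "image"
recon, errors = additive_reconstruction(pixels, iterations=6)
print(errors)
```

In this toy setting every extra iteration spends one more bit per value and halves the worst-case error, which mirrors how the number of iterations controls the bitrate.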
Although this architecture uses mean squared error (MSE) as the loss between the original image and its reconstruction, other image quality measures such as the Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) were also evaluated.
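As a minimal sketch of how MSE and PSNR relate, the following computes both for flattened pixel lists. The function names, the made-up pixel values, and the default peak value of 255 (8-bit images) are assumptions for illustration. SSIM is left out because it requires windowed local statistics.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


original = [52, 55, 61, 59]        # flattened 8-bit pixel values (made up)
reconstructed = [54, 55, 60, 58]
print(mse(original, reconstructed), psnr(original, reconstructed))
```

Because PSNR is a monotonic function of MSE, a lower training loss always means a higher PSNR, whereas SSIM can rank reconstructions differently.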
Because the recurrent layers are invoked multiple times, the GPU resources required to run this network far exceeded what Google Colab provides. Hence, Google Cloud Platform was used to run the compression network.
The RNN and CNN are implemented together as a convolutional LSTM cell, whose forward step is shown below:
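    # Forward step of the convolutional LSTM cell (requires: import torch).
    hx, cx = hidden  # previous hidden state and cell state
    # One convolution over the input and one over the hidden state yield all four gates.
    gates = self.conv_ih(input) + self.conv_hh(hx)
    # Split along the channel dimension into the four gate pre-activations.
    ingate, forgetgate, cellgate, outgate = gates.chunk(4, 1)
    # torch.sigmoid / torch.tanh replace the deprecated F.sigmoid / F.tanh.
    ingate = torch.sigmoid(ingate)
    forgetgate = torch.sigmoid(forgetgate)
    cellgate = torch.tanh(cellgate)
    outgate = torch.sigmoid(outgate)
    cy = (forgetgate * cx) + (ingate * cellgate)  # new cell state
    hy = outgate * torch.tanh(cy)                 # new hidden state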
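In equation form, the step above can be written as follows. The notation is introduced here for illustration: W_ih and W_hh stand for the weights of conv_ih and conv_hh, x_t is the input, h_{t-1} and c_{t-1} are the previous hidden and cell states, * denotes convolution, and \odot denotes element-wise multiplication. Bias terms are omitted.

```latex
\begin{aligned}
[i_t, f_t, g_t, o_t] &= W_{ih} * x_t + W_{hh} * h_{t-1} \\
c_t &= \sigma(f_t) \odot c_{t-1} + \sigma(i_t) \odot \tanh(g_t) \\
h_t &= \sigma(o_t) \odot \tanh(c_t)
\end{aligned}
```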
This architecture does not take into account human-vision-centric image quality metrics such as PSNR-HVS.

Example:

[Figure: original image; reconstructed image after the first iteration; reconstructed image after the second iteration]

References:

  1. G. Toderici et al., "Full Resolution Image Compression with Recurrent Neural Networks," arXiv:1608.05148, 2016.