CS298 Proposal
Image-based Localization of User-Interfaces
Riti Gupta (riti.gupta@sjsu.edu)
Advisor: Dr. Chris Pollett
Committee Members: Dr. Fabio Di Troia, Dr. Robert Chun
Abstract:
There is an increasing need to make web data available in all languages so that people all over the world can understand it, yet most web data is still available only in English. Web data comes in various formats: text, images, books, and sound. The aim of this research project is to study the translation of screenshots of web interfaces from English to Hindi. A lot of work has been done on translating web data from one language to another to make it globally available, but the context of the image is usually not taken into consideration. In this research, the context of the image will also be taken into account during text translation, to check whether it further improves translation accuracy. In the future, this approach can be extended to languages other than Hindi and English.
Deliverables:
- Design
- Design a neural network that takes screenshots of UIs with English text as input and translates them into Hindi. The model will be a Convolutional Neural Network (CNN) with layers to avoid overfitting and to extract relevant features using various filters (a sketch of such a model appears after this list).
- Software
- Python-based machine learning model to localize UIs with English text and translate the text into Hindi.
- Implement an algorithm to evaluate the accuracy of the developed model by computing the pixel mismatch between the translated UI screenshot and the corresponding ground-truth Hindi screenshot (a sketch of this check appears after this list).
- Report
- CS 298 report.
- CS 298 presentation.
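As a rough illustration of the Design deliverable, the following is a minimal sketch of how the proposed CNN could be laid out as a convolutional encoder-decoder in Keras. The input resolution, filter counts, dropout rate, and the function name build_translation_cnn are placeholder assumptions for illustration, not the final architecture.

# Minimal sketch (not the final design): a convolutional encoder-decoder that
# maps a UI screenshot with English text to a UI screenshot with Hindi text.
# Input size, filter counts, and layer choices are placeholder assumptions.
from tensorflow.keras import layers, models

def build_translation_cnn(height=256, width=256, channels=3):
    inputs = layers.Input(shape=(height, width, channels))

    # Encoder: convolutional filters extract text and layout features.
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)

    # Dropout to reduce overfitting on a small screenshot dataset.
    x = layers.Dropout(0.3)(x)

    # Decoder: upsample back to the original screenshot resolution.
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(channels, 3, padding="same", activation="sigmoid")(x)

    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mae")
    return model

model = build_translation_cnn()
model.summary()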
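The pixel-mismatch accuracy check from the Software deliverable could look roughly like the sketch below, assuming the translated screenshot and the reference Hindi screenshot have the same dimensions. The function name pixel_match_accuracy, the file names, and the tolerance value are assumptions used only for illustration.

# Minimal sketch of the pixel-mismatch accuracy check described above.
import numpy as np
from PIL import Image

def pixel_match_accuracy(translated_path, reference_path, tolerance=10):
    """Fraction of pixels whose RGB values differ by at most `tolerance`."""
    translated = np.asarray(Image.open(translated_path).convert("RGB"), dtype=np.int16)
    reference = np.asarray(Image.open(reference_path).convert("RGB"), dtype=np.int16)
    if translated.shape != reference.shape:
        raise ValueError("Screenshots must have the same dimensions")
    # A pixel "matches" when every RGB channel is within the tolerance.
    matches = np.all(np.abs(translated - reference) <= tolerance, axis=-1)
    return matches.mean()

# Example usage (hypothetical file names):
# accuracy = pixel_match_accuracy("translated_ui.png", "hindi_ground_truth.png")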
Innovations and Challenges:
- Generating the dataset is a challenge because corresponding screenshots in Hindi and English are not easily available on websites, and some pages are not crawlable, which compounds the problem.
- Developing an architecture that achieves at least the accuracy of current translation software.
- Taking the context of the image into consideration while translating.
Schedule:
Aug 28-Sept 10 | Collect a dataset of UI screenshots with English text and the corresponding screenshots with Hindi text |
Sept 11-Sept 17 | Propose a CNN model for translating UI screenshots from English to Hindi |
Sept 18-Oct 1 | Implement the proposed model and validate it on the collected dataset by comparing the pixels of the translated screenshots |
Oct 2-Oct 15 | Improve the accuracy by performing the necessary transformations and tuning the model parameters |
Oct 16-Nov 12 | Write the CS298 report and prepare presentation slides |
Literature References:
[1] S. Saini and V. Sahula, "A Survey of Machine Translation Techniques and Systems for Indian Languages," in Proc. IEEE Int. Conf. on Computational Intelligence & Communication Technology, 2015.
[2] H. A. Driss, S. Elfkihi, and A. Jilbab, "Features Extraction for Text Detection and Localization," in Proc. 5th Int. Symp. on I/V Communications and Mobile Networks, 2010.
[3] C. M. Thillou and B. Gosselin, "Natural Scene Understanding," 2007. [Online]. Available: https://www.tcts.fpms.ac.be/publications/regpapers/2007/VS_cmtbg2007.pdf
[4] X. Zhou et al., "EAST: An Efficient and Accurate Scene Text Detector," arXiv:1704.03155v2 [cs.CV], Jul. 2017.
[5] E. Charniak, Introduction to Deep Learning. MIT Press, Jan. 2019, 192 pp. ISBN: 9780262039512.
[6] O. Rippel and L. Bourdev, "Real-Time Adaptive Image Compression," in Proc. 34th Int. Conf. on Machine Learning, 2017. arXiv:1705.05823v1.
[7] G. Toderici et al., "Full Resolution Image Compression with Recurrent Neural Networks," arXiv:1608.05148, 2016.
[8] T. Law, H. Itoh, and H. Seki, "A Neural-Network Assisted Japanese-English Machine Translation System," in Proc. 1993 Int. Conf. on Neural Networks, 1993.
[9] Md. M. Hossain, K. E. U. Ahmed, and A. R. Uddin, "English to Bangla Translation in Structural Way Using Neural Networks," in Proc. 2009 Int. Conf. on Information and Multimedia Technology, 2009.