CS297 Proposal

Image-based Localization of User-Interfaces

Riti Gupta (riti.gupta@sjsu.edu)

Advisor: Dr. Chris Pollett

Description:

The aim of this project is to translate the text in images of user interfaces from other languages into English. The project has two parts: first, identifying the text in the image; second, translating that text into English. Both parts will be implemented using AI and machine learning techniques. We will explore neural networks, which are widely used for image classification and detection problems.

Schedule:

Week 1: Jan.29-Feb.4	Kickoff meeting and topic discussion
Week 2: Feb.5-Feb.11	Draft the proposal and decide on subtasks for the project
Week 3: Feb.12-Feb.18	Work on developing a neural network to detect the suit and value in playing card images
Week 4: Feb.19-Feb.25	Improve the accuracy of the CNN model developed
Week 5: Feb.26-Mar.4	Finish the implementation [Deliverable 1]
Week 6: Mar.5-Mar.11	Develop a project to determine text surrounded by a boundary in various images
Week 7: Mar.12-Mar.18	Finish the implementation [Deliverable 2]
Week 8: Mar.19-Mar.25	Develop a text extraction project from images
Week 9: Mar.26-Apr.1	Continue implementation
Week 10: Apr.2-Apr.8	Finish the implementation [Deliverable 3]
Week 11: Apr.9-Apr.15	Study dataset generation ideas for the project
Week 12: Apr.16-Apr.22	Continue exploring dataset generation ideas [Deliverable 4]
Week 13: Apr.23-Apr.29	Read papers on text translation
Week 14: Apr.30-May.6	Continue reading and present relevant papers for the project
Week 15: May.7-May.13	Start writing the proposal report
Week 16: May.14-May.20	Finish writing the proposal report

Deliverables:

The full project will be done when CS298 is completed. The following will be done by the end of CS297:

1. Develop a project to detect the suit and value of playing cards from images. The aim of the project is to become familiar with neural networks.

2. Develop a project to determine text surrounded by a boundary in various images. The aim of the project is to become familiar with OpenCV.

3. Develop a text extraction project from various images.

4. Explore ideas for generating a dataset suitable for this project using a headless browser and Python.

5. Write up the CS297 proposal report.

References:

[1] S. Saini and V. Sahula, "A Survey of Machine Translation Techniques and Systems for Indian Languages," in IEEE Int. Conf. on Comp. Int. & Comm. Tech., 2015.

[2] H.A. Driss, S. Elfkihi and A. Jilbab, "Features Extraction for Text Detection and Localization," in 5th Int. Symp. on I/V Comm. and Mobile Network, 2010.

[3] C.M. Thillou and B. Gosselin, Natural Scene Understanding, https://www.tcts.fpms.ac.be/publications/regpapers/2007/VS_cmtbg2007.pdf

[4] X. Zhou et al., "EAST: An Efficient and Accurate Scene Text Detector," arXiv:1704.03155v2 [cs.CV], 10 Jul 2017.

[5] E. Charniak, Introduction to Deep Learning, ISBN: 9780262039512, 192 pp., 7 in x 9 in, 75 b&w illus., January 2019.

[6] O. Rippel and L. Bourdev, "Real-Time Adaptive Image Compression," in The 34th Int. Conf. on Mach. Learn., 2017. arXiv:1705.05823v1.

[7] G. Toderici et al., "Full Resolution Image Compression with Recurrent Neural Networks," arXiv e-prints, 2016. arXiv:1608.05148.