Advisor: Dr. Chris Pollett
Committee Members: Kevin Smith (kevin.smith@sjsu.edu), Robert Chun (robert.chun@sjsu.edu)
Abstract:
Converting a designer's visual user interface design into code is a typical task for a user interface engineer building web and mobile applications. This conversion is often tedious, slow, and prone to human error. In the coming years, technologies such as deep learning will enable the design of expressive and intuitive products while removing hurdles from the product development process. This project surveys the artificial-intelligence-assisted techniques that researchers have explored in this field. I present an end-to-end system that streamlines and automates the product development routine by generating platform-ready prototypes directly from design sketches.
CS297 Results:
- TensorFlow implementation of Drawing Classification
This deliverable consisted of studying and running Google's TensorFlow implementation
of Recurrent Neural Networks for Drawing Classification. The classifier takes the user
input, given as a sequence of strokes made up of x and y points, and recognizes the
object category that the user tried to draw.
- Hand-drawn shape classification using DoodleClassifier
DoodleClassifier is an openFrameworks application, part of the ml4a-ofx collection, which lets
you train a classifier to recognize drawings (“doodles”) from a camera. It was first used
in a project called DoodleTunes by Andreas Refsgaard and Gene Kogan, which used the app to
recognize doodles of musical instruments and turn them into music in Ableton Live.
In this deliverable, we train DoodleClassifier to recognize and classify simple hand-drawn
shapes such as triangles, circles, and stars.
- UI component classification using DoodleClassifier
Here we train DoodleClassifier to classify UI components. We first define the set of components to train against, taking inspiration from the Instagram mobile app: ‘top-navigation’, ‘stories’, ‘image card’, and ‘bottom-nav’. We draw each component on a separate sheet of paper with a thick black marker, then train the classifier on each class, following the same process as in the previous deliverable.
- Web server to receive and render detections
We build a web server that receives information about the detections made by the classifier.
The server is built with Node.js, a JavaScript runtime built on Chrome's V8 JavaScript engine,
and uses the OSC (Open Sound Control) protocol to receive data from DoodleClassifier.
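To make the OSC hand-off in the last deliverable concrete, the following is a minimal sketch of the OSC 1.0 message layout: a null-padded address string, a type tag string beginning with a comma, then big-endian arguments. It is written in Python for illustration only; the project's server is Node.js. The address `/classification`, the label argument, and the confidence argument are assumptions, not the actual messages DoodleClassifier sends.

```python
import struct


def _pad4(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_message(address: str, *args) -> bytes:
    """Encode an OSC message with string, int32, and float32 arguments."""
    tags, payload = ",", b""
    for arg in args:
        if isinstance(arg, str):
            tags += "s"
            payload += _pad4(arg.encode("ascii"))
        elif isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _pad4(address.encode("ascii")) + _pad4(tags.encode("ascii")) + payload


def _read_string(packet: bytes, offset: int):
    """Read a padded OSC string and return it with the next aligned offset."""
    end = packet.index(b"\x00", offset)
    text = packet[offset:end].decode("ascii")
    # Skip the terminator and any padding up to the next 4-byte boundary.
    return text, (end + 4) & ~3


def decode_message(packet: bytes):
    """Decode an OSC message into (address, [arguments])."""
    address, offset = _read_string(packet, 0)
    tags, offset = _read_string(packet, offset)
    if not tags.startswith(","):
        raise ValueError("missing OSC type tag string")
    args = []
    for tag in tags[1:]:
        if tag == "s":
            value, offset = _read_string(packet, offset)
        elif tag == "i":
            (value,) = struct.unpack_from(">i", packet, offset)
            offset += 4
        elif tag == "f":
            (value,) = struct.unpack_from(">f", packet, offset)
            offset += 4
        else:
            raise ValueError(f"unsupported OSC type tag: {tag}")
        args.append(value)
    return address, args


# Hypothetical detection message: a class label and a confidence score.
packet = encode_message("/classification", "stories", 0.5)
address, arguments = decode_message(packet)
```

In practice the packet would arrive over UDP, and a server would route on the decoded address before forwarding the label to the front end.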
Proposed Schedule:
| Week | Dates | Task |
|------|-------|------|
| Week 1 | 1/27-2/2 | Group meeting |
| Week 2 | 2/3-2/9 | First meeting - discuss proposal/schedule... |
| Week 3 | 2/10-2/16 | Finalize CS298 proposal with key deliverables and schedule |
| Week 4 | 2/17-2/23 | Detailed metric analysis + review of CS297 deliverable 4 |
| Week 5 | 2/24-3/2 | Prototype design language and framework |
| Week 6 | 3/3-3/9 | - |
| Week 7 | 3/10-3/16 | [Deliverable 1] - Prototype design language |
| Week 8 | 3/17-3/23 | Navigation action prototype |
| Week 9 | 3/24-3/30 | Navigation action prototype |
| Week 10 | 3/31-4/6 | Work on Tesseract OCR library |
| Week 11 | 4/7-4/13 | [Deliverable 2] - Tesseract demo |
| Week 12 | 4/14-4/20 | [Deliverable 3] - Complete end-to-end web prototype + preparation for CS298 report |
| Week 13 | 4/21-4/27 | Preparation for CS298 report and presentation with working demo |
| Week 14 | 4/28-5/4 | Preparation for CS298 report and presentation with working demo |
| Week 15 | 5/5-5/11 | Preparation for CS298 report and presentation with working demo |
| Week 16 | 5/12-5/18 | [Deliverable 4] - Summarize research and present to committee |
Key Deliverables:
- Deliverable 1 - Directive-based prototype design language.
    - A new sketch design language that can be used to attach stylistic and event properties, or "directives", to existing components.
    - The goal of this deliverable is for our openFrameworks module to recognize these directives in input images and associate each one with the component to which it is attached.
- Deliverable 2 - Execute and render detected directives from input sketch images.
    - Our web server will receive component detections and their corresponding directive detections.
    - In this deliverable we render the detected components from input sketch images with their corresponding directives applied.
- Deliverable 3 - Complete end-to-end object detection architecture.
    - The goal of this deliverable is to present a polished end-to-end object detection system with high accuracy and ease of use.
- Deliverable 4 - CS298 Report and Presentation
    - Software
        - openFrameworks (C++) based component detection of UI design sketches, with add-ons
        - Node.js (JavaScript) based web server to receive detected web component metadata
        - Web front end (JavaScript) to render detections in real time and browse previously detected user interface designs
        - Database (MongoDB) to save UI sketch prototypes and keep a history of designers' detections
    - Report
        - CS298 Report
        - CS298 Presentation
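One way to picture the association step in Deliverable 1 is sketched below in Python: each detected directive is joined to the component whose bounding-box center is closest to it. This is only an illustration under stated assumptions. The `Box`, `Component`, and `Directive` types, the nearest-center rule, and the directive names `bold` and `on-tap:navigate` are all hypothetical; the proposal does not yet fix the directive syntax or the matching rule.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Box:
    """Axis-aligned bounding box in image pixels (hypothetical format)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


@dataclass
class Component:
    label: str
    box: Box
    directives: list = field(default_factory=list)


@dataclass
class Directive:
    name: str
    box: Box


def attach_directives(components, directives):
    """Join each directive to the component with the nearest box center."""
    for directive in directives:
        dx, dy = directive.box.center
        nearest = min(
            components,
            key=lambda c: hypot(c.box.center[0] - dx, c.box.center[1] - dy),
        )
        nearest.directives.append(directive.name)
    return components


# Two components from the CS297 class set, plus two assumed directives
# drawn in the margin next to them.
components = [
    Component("top-navigation", Box(0, 0, 300, 50)),
    Component("image card", Box(0, 100, 300, 300)),
]
directives = [
    Directive("bold", Box(310, 10, 40, 40)),
    Directive("on-tap:navigate", Box(310, 200, 40, 40)),
]
attach_directives(components, directives)
```

A nearest-center rule is simple but can mis-assign directives in dense sketches; an explicit drawn connector between directive and component would be a stricter alternative.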
Innovations and Challenges:
- An end-to-end component detection system that is accurate and easy to use could become an industry-standard prototyping tool, which makes this work an innovation worthy of a master's degree (Deliverable 3).
- A new prototype design language and framework that lets designers enhance their components with styling and event-driven properties (Deliverable 1).
- Running multiple computer vision models as add-ons on top of basic component detection, within either the openFrameworks or the Node.js module (Deliverable 2). Since the system will be open source, developers can build their own add-ons for it.
References:
[2018] pix2code: Generating Code from a Graphical User Interface Screenshot. Tony Beltramelli. In Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS '18). 2018.
[2018] A Neural Representation of Sketch Drawings. David Ha and Douglas Eck. International Conference on Learning Representations. 2018.
[2015] An Introduction to Convolutional Neural Networks. Keiron O'Shea and Ryan Nash. ArXiv e-prints. 2015.
[1997] Long Short-Term Memory. S. Hochreiter and J. Schmidhuber. Neural Computation. 1997.
[2015] Long-term Recurrent Convolutional Networks for Visual Recognition and Description. J. Donahue, L. Anne Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.