CS298 Proposal

AI Quantification of Language Puzzle to Language Learning Generalization

Harita Shroff (harita.shroff@sjsu.edu)

Advisor: Dr. Chris Pollett

Committee Members: Kevin Smith, Younghee Park

Abstract:

Online language learning applications offer users several kinds of exercises and games for learning a new language, such as rearranging the words of a foreign-language sentence, filling in blanks, and reviewing flashcards. The primary goal of this research is to quantify how effective these games are for learning a new language. The secondary goal is to measure how effective the exercises are for transfer learning in machine translation. As part of CS297, I worked on projects covering several artificial intelligence topics, focusing mainly on different types of neural networks and their implementations. Language modeling and sequence-to-sequence learning were the topics that offered possible insight into the end goal of the project.

CS297 Results

  • Implemented a feed-forward neural network in TensorFlow, following Introduction to Deep Learning by Eugene Charniak.
  • Implemented a convolutional neural network in TensorFlow to identify Gujarati digits from their images.
  • Collected a related dataset for the project from the Duolingo platform.
  • Implemented a word2vec program to learn word embeddings for Gujarati words.
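
As an illustration of the word2vec item above, embeddings of this kind are typically trained on (target, context) word pairs drawn from a sliding window. The following is a minimal sketch of generating skip-gram pairs only, not the CS297 implementation. The function name `skipgram_pairs`, the window size, and the placeholder tokens are assumptions made for the example.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


# Placeholder tokens standing in for a tokenized Gujarati sentence (hypothetical).
tokens = ["w1", "w2", "w3", "w4"]
pairs = skipgram_pairs(tokens, window=1)
print(pairs)
```

Each pair would then serve as one training example for the embedding model, with the target word as input and the context word as the label.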

Proposed Schedule

Week 1 (Jan 23 - Jan 28): Kickoff meeting with Dr. Pollett.
Week 2 - Week 3 (Jan 29 - Feb 11): Propose a model for finding patterns in the available dataset.
Week 4 - Week 6 (Feb 12 - Mar 03): Implement the accepted model for measuring the effectiveness of the online platform.
Week 7 - Week 11 (Mar 04 - Apr 07): Improve the model's efficiency by tuning its parameters.
Week 12 - Week 16 (Apr 08 - May 05): Prepare the project report and presentation.

Key Deliverables:

  • Design
    • Develop a neural network that takes in a dataset of user experiences on online learning platforms and quantifies the platforms' effectiveness in an understandable format. The model will use a convolutional neural network (CNN).
  • Software
    • Implementation of the CNN in Python using TensorFlow to quantify the user experience on online language learning platforms.
  • Report
    • CS 298 Report.
    • CS 298 Presentation.
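
The CNN named in the Design and Software deliverables is built from convolution and pooling operations. As a minimal sketch of that building block, written in plain Python rather than TensorFlow and not the project's actual model, the following applies one 1-D filter to a sequence, then ReLU and global max pooling. The input encoding (a user's per-exercise outcomes as 0/1 values) and the filter weights are assumptions chosen only for illustration.

```python
from typing import List


def conv1d_valid(signal: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Slide `kernel` over `signal` with stride 1 and no padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values: List[float]) -> List[float]:
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]


def global_max_pool(values: List[float]) -> float:
    """Reduce a feature map to its strongest response."""
    return max(values)


# Hypothetical input: 1.0 = exercise answered correctly, 0.0 = incorrectly.
outcomes = [0.0, 1.0, 1.0, 1.0, 0.0, 1.0]
# Hypothetical hand-set filter that fires on three correct answers in a row.
streak_filter = [1.0, 1.0, 1.0]

feature_map = relu(conv1d_valid(outcomes, streak_filter, bias=-2.0))
feature = global_max_pool(feature_map)
print(feature_map, feature)
```

In a trained CNN the filter weights would be learned rather than set by hand; in TensorFlow the corresponding layers are `tf.keras.layers.Conv1D` and `tf.keras.layers.GlobalMaxPooling1D`.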

Innovations and Challenges

  • Current market research on these online platforms relies on pools of users. To my knowledge, artificial intelligence has not yet been applied to this question.
  • The datasets obtained as part of CS 297 contain a large amount of information, so cleaning them will be a challenge.
  • Because artificial intelligence has not previously been used to measure this effectiveness, finding patterns in the data will be difficult.
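
To make the dataset-cleaning challenge above concrete, here is a minimal sketch of one possible cleaning pass using only the Python standard library. The column names (`user_id`, `session_seen`, `session_correct`), the validity rules, and the sample rows are all assumptions for illustration. The sample rows are made up and are not taken from the Duolingo data.

```python
import csv
import io

# Made-up sample rows for illustration only.
RAW = """user_id,session_seen,session_correct
u1,4,3
u2,,2
u3,5,7
u4,0,0
u5,2,2
"""


def clean_rows(text):
    """Keep rows with integer counts where seen > 0 and 0 <= correct <= seen."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            seen = int(row["session_seen"])
            correct = int(row["session_correct"])
        except (TypeError, ValueError):
            continue  # missing or non-numeric field
        if seen <= 0 or not 0 <= correct <= seen:
            continue  # impossible counts
        cleaned.append({"user_id": row["user_id"], "accuracy": correct / seen})
    return cleaned


rows = clean_rows(RAW)
print(rows)
```

Only the rows for `u1` and `u5` survive here; the others are dropped for a missing value, more correct answers than attempts, and zero attempts, respectively.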

References:

[2019] Introduction to Deep Learning. E. Charniak. Publisher. January 2019.

[2015] A Survey of Machine Translation Techniques and Systems for Indian Languages. S. Saini and V. Sahula. IEEE International Conference on Computational Intelligence and Communication Technology. 2015.

[2017] Replication Data for: A Trainable Spaced Repetition Model for Language Learning. B. Settles. Harvard Dataverse. 2017.

[2018] Data for the 2018 Duolingo Shared Task on Second Language Acquisition Modeling (SLAM). B. Settles. Harvard Dataverse. 2018.

[2015] Efficacy of New Language App. R. Vesselinov and J. Grego. NA. 2015.