CS298 Proposal

AI for Classic Video Games Using Reinforcement Learning

Shivika Sodhi (shivika.sodhi@sjsu.edu)

Advisor: Dr. Chris Pollett

Committee Members: Your_Committee.

Abstract:

Deep reinforcement learning is a promising route to building artificially intelligent machines that can perform tasks much as human beings do, without any task-specific supervision. Following DeepMind's recent research in deep learning [3], computers can now learn to play Atari games directly from raw pixels. This project follows similar lines: we want to build an artificially intelligent agent using a deep learning model called a convolutional neural network. This image-processing model will be used to approximate a Q function, which estimates the reward the agent can expect from each action it takes in a specific environment, i.e., while the agent plays a classic game. Initially the agent will make random decisions while playing the game, and screenshots of the game will be taken every one tenth of a second. Those screenshots will then be fed into the neural network, which will be trained on the decisions the agent made and on the increase in game score that followed each decision. Once the network has been updated, those images will be discarded to save memory. Each subsequent iteration of the game will use what was learned in previous iterations to make the best possible decision while playing. Our primary motivation for choosing this topic was to understand the impact of deep learning on artificial intelligence. Hence, through this project we aim to demonstrate the basic ability of a neural network to train an agent to automatically learn to play, at human level, a classic video game that has not previously been learned by such an agent.
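As a rough illustration of the Q function idea described in the abstract, the following is a minimal tabular sketch, not the proposed convolutional network. The action names, the state labels, and the values of ALPHA and GAMMA are assumptions made only for this example; in the project itself the table would be replaced by a neural network that takes screenshots as input.

```python
import random
from collections import defaultdict

# Hypothetical action set and hyperparameters, chosen only for illustration.
ACTIONS = ["up", "down", "left", "right", "fire"]
ALPHA = 0.1   # learning rate
GAMMA = 0.9   # discount factor

# Q-table mapping (state, action) -> estimated value; unseen pairs default to 0.
q_table = defaultdict(float)


def choose_action(state, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward r + GAMMA * max_a' Q(s', a')."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
    return q_table[(state, action)]
```

With epsilon set to 1.0 the agent acts purely at random, which corresponds to the initial phase described above; lowering epsilon over time lets it rely more on what it has learned.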

CS297 Results

Designed a Python program that can play a classic video game. While the game is being played, the program takes screenshots of it, downscales them, and supplies them to the neural network model

Python program to control the game

Simple multilayer neural network using Keras (built on top of TensorFlow or Theano)

Designed a Python program that predicts a digit in real time, while it is being drawn by the user

Final project report
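The screenshot preprocessing mentioned in the first result above could be sketched roughly as follows. This is a dependency-free illustration, not the program that was actually written: it assumes a frame is a list of rows of (r, g, b) tuples, and the grayscale conversion and block-averaging scheme are assumptions added for the example (the proposal only states that screenshots are downscaled).

```python
def to_grayscale(frame):
    """Convert a frame of (r, g, b) tuples to luminance values in 0..255."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def downscale(gray, factor):
    """Shrink a grayscale frame by averaging non-overlapping factor x factor blocks."""
    height = len(gray) // factor
    width = len(gray[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [gray[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def preprocess(frame, factor=2):
    """Grayscale then downscale a frame before passing it to the network."""
    return downscale(to_grayscale(frame), factor)
```

Shrinking the input this way reduces the number of values the network has to process per frame, which matters when frames arrive every one tenth of a second.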

Proposed Schedule

Week 1: 08 Feb 2017 - 13 Feb 2017

Discuss the deliverables for CS298

Week 2,3: 14 Feb 2017 - 27 Feb 2017

Automate playing Archon: design a program that plays Archon until two pawns enter fight mode, quits the game once that mode is reached, and then starts again

Week 4: 28 Feb 2017 - 06 Mar 2017

Implement the Q-learning algorithm

Week 5,6: 07 Mar 2017 - 20 Mar 2017

Implement the Q-learning algorithm, with policies derived from neural networks

Week 7: 21 Mar 2017 - 27 Mar 2017

Work further on the algorithm

Week 8: 28 Mar 2017 - 03 Apr 2017

Reiterate training and testing with improvements and updates

Week 9: 04 Apr 2017 - 10 Apr 2017

Final code cleanup and restructuring

Week 10: 11 Apr 2017 - 17 Apr 2017

Complete first draft of the report

Week 11: 18 Apr 2017 - 24 Apr 2017

Complete next draft of the report

Week 12: 25 Apr 2017 - 01 May 2017

Finalize the report

Week 13: 02 May 2017 - 07 May 2017

Submit report to committee for feedback

Innovations and Challenges

There has not yet been a successful attempt to have a program learn to play Archon.

Training the system using reinforcement learning is difficult because there is a time lag between an action and its reward.

Reinforcement learning provides only occasional feedback for training the agent to make correct decisions, and that feedback is challenging to provide in real time.
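One common way to handle the time lag between an action and its reward is to credit earlier actions with a discounted share of later rewards. The sketch below illustrates that idea only; the function name and the default discount factor are assumptions for this example, not part of the proposed design.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute G_t = r_t + gamma * G_{t+1} for every step, working backwards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

For example, discounted_returns([0, 0, 1], gamma=0.5) gives [0.25, 0.5, 1.0]: the only reward arrives at the last step, yet the two earlier actions still receive some credit for it.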