CS297 Proposal

AI for Classic Video Games Using Reinforcement Learning

Shivika Sodhi (shivika.sodhi@sjsu.edu)

Advisor: Dr. Chris Pollett

Description:

Reinforcement learning is essential for training an agent to make smart decisions under uncertainty and to take small actions in order to achieve a higher overarching goal. In this project, we will combine reinforcement learning and deep learning techniques to train an agent to play the game Archon. The challenge is that the agent sees only the pixels and the rewards, much as a human player does. Using just this information, the aim is for the agent to play the game at a human, or sometimes super-human, level. The input (training set) will be screenshots of the game, taken every one-tenth of a second while the computer is playing it. These screenshots will be discarded once they have been processed. Based on this model, we plan to have the computer make smarter critical decisions so that it plays the game well.
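The proposal does not yet fix a specific learning rule. As a minimal sketch of how an agent can learn from rewards alone, the example below assumes the standard tabular Q-learning update; a deep variant would replace the table with a neural network that reads the screen pixels. The state names, action names, learning rate (alpha) and discount factor (gamma) are illustrative assumptions, not values taken from this project.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical states and actions, used only for illustration.
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ["left", "right", "fire"]
new_value = q_update(q, "s0", "fire", reward=1.0,
                     next_state="s1", actions=actions)
```

Starting from an empty table, a reward of 1.0 moves the estimate for ("s0", "fire") one tenth of the way towards the target, because alpha is 0.1 and the next state has no learned value yet.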

Schedule:

Week 1: 06 Sep 2016 - 12 Sep 2016	Read and understand the research papers
Week 2: 13 Sep 2016 - 19 Sep 2016	Design a Python program that takes screenshots and stores each image in scaled-down form
Week 3: 20 Sep 2016 - 26 Sep 2016	Deliverable 1: Add features to the program, such as image downscaling.
Week 4: 27 Sep 2016 - 03 Oct 2016	Read Chapters 17 and 21 of AIMA
Week 5: 04 Oct 2016 - 10 Oct 2016	Presentation on "Reinforcement Learning: An Introduction"
Week 6: 11 Oct 2016 - 17 Oct 2016	Deliverable 2: Figure out a way to control the running game from a computer program
Week 7: 18 Oct 2016 - 24 Oct 2016	Presentation on "Temporal difference learning and td-gammon" and "Q-learning. Machine learning"
Week 8: 25 Oct 2016 - 31 Oct 2016	Presentation on "Q-learning. Machine learning"
Week 9, 10: 01 Nov 2016 - 14 Nov 2016	Deliverable 3: Design a simple multi-layer AI using Theano.
Week 11: 15 Nov 2016 - 21 Nov 2016	Presentation on "Playing Atari with Deep Reinforcement Learning"
Week 12, 13: 22 Nov 2016 - 05 Dec 2016	Deliverable 4: Design a program that labels, in real time, the current game context/position: for example, splash screen, player selection, player turn, or combat mode.
Week 14: 06 Dec 2016 - 12 Dec 2016	Deliverable 5: CS 297 Report.

Deliverables:

The full project will be done when CS298 is completed. The following will be done by the end of CS297:

1. Design a program that periodically (every 5 seconds) takes a screen capture of a region of the screen.

2. Figure out a way to control the running game from a computer program using neural networks, so that the program acts as if a person were using a joystick. An existing Python API can be used to simulate keystrokes and mouse movements.

3. Design a simple multi-layer AI using Theano.

4. Design a program that labels, in real time, the current game context/position: for example, splash screen, player selection, player turn, or combat mode.

5. CS 297 Report.
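As a rough sketch of the capture-and-downscale step in deliverable 1, the example below grabs frames at a fixed interval, converts them to grayscale, and shrinks them by block averaging. It assumes frames arrive as nested lists of (r, g, b) tuples and takes the screen-grabbing function as a parameter, so any capture library could be plugged in. The names capture_loop, downscale, to_grayscale and fake_grab are hypothetical, and the fake frame source stands in for a real screen grab.

```python
import time


def to_grayscale(frame):
    """Convert rows of (r, g, b) tuples to rows of gray intensities."""
    return [[(r + g + b) / 3.0 for (r, g, b) in row] for row in frame]


def downscale(gray, factor):
    """Shrink a grayscale image by averaging factor x factor pixel blocks."""
    height = len(gray) // factor
    width = len(gray[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            block = [gray[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def capture_loop(grab, n_frames, interval_s, factor):
    """Call grab() n_frames times, interval_s apart, keeping small frames."""
    frames = []
    for _ in range(n_frames):
        frames.append(downscale(to_grayscale(grab()), factor))
        time.sleep(interval_s)
    return frames


def fake_grab():
    """Stand-in for a real screen grab: a 4x4 horizontal gradient."""
    return [[(x * 10, x * 10, x * 10) for x in range(4)] for _ in range(4)]


# interval_s would be 5 for the capture period named in deliverable 1;
# 0.0 is used here only so that the example finishes immediately.
frames = capture_loop(fake_grab, n_frames=2, interval_s=0.0, factor=2)
```

Passing the grab function in as an argument keeps the loop independent of any particular screenshot API, which also makes it easy to exercise without a running game.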

References:

Playing Atari with Deep Reinforcement Learning: https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf

Temporal difference learning and td-gammon: