CS298 Proposal
Bluff with AI
Tina Philip (talk2tinaphilip@gmail.com)
Advisor: Dr. Chris Pollett
Committee Members: Dr. Robert Chun, Dr. Philip Heller

Abstract:
The goal of this project is to build an AI that learns how to play Bluff. Bluff is a multi-player card game in which each player makes a sequence of decisions based on a partially observed game state that evolves under uncertainty. The main challenge is that Bluff is a game of imperfect information: each player knows nothing about the other players' hands. The backbone of the implementation is a neural network trained with back-propagation. Reinforcement learning is a powerful paradigm for training such an agent. It allows software agents to automatically determine the ideal behavior within a specific context in order to maximize performance. Simple reward feedback, known as the reinforcement signal, is provided so that the agent can learn its behavior. We will approximate a Q function, and rewards will be calculated based on the actions taken by the agent during the game. Each iteration of the game will take the results of the previous iteration as input in order to make the best decision possible. Through this project we try to understand the impact of deep learning on games and aim to demonstrate the basic ability of a neural network to train an agent to automatically learn to play a card game.

CS297 Results
Proposed Schedule
Key Deliverables:
Innovations and Challenges
References:
[1] Hurwitz, Evan, and Tshilidzi Marwala. "Learning to bluff." Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on. IEEE, 2007. Available: https://pdfs.semanticscholar.org/ff49/bcf422168c6bfe4f115f02d098ad7bf49065.pdf
[2] Billings, Darse. "Algorithms and Assessment in Computer Poker." Ph.D. Dissertation, 2006, University of Alberta, Edmonton, Alta., Canada. AAINR22991.
[3] Russell, Stuart, and Peter Norvig. "Artificial Intelligence: A Modern Approach." Prentice-Hall, Englewood Cliffs, 1995. Available: https://pdfs.semanticscholar.org/137b/8be0b10c645bb4ec56a4eac3958d4a60ac6a.pdf
[4] Pollett, C. (2015, Feb 2). Random Permutations, the Birthday Problem, Ball and Bins Arguments [PowerPoint slides]. Retrieved from: http://www.cs.sjsu.edu/faculty/pollett/255.1.15s/Lec02022015.html#(1)
[5] Eastaugh, B. (2014, Feb 8). The Mathematics of Bluffing [Blog post]. Retrieved from https://ibmathsresources.com/2014/02/08/the-mathematics-of-bluffing/
[6] Moravcik, M., Schmid, M., Burch, N., Lisy, V., Morrill, D., Bard, N., and Bowling, M. (2017). DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker. arXiv preprint arXiv:1701.01724.