Chris Pollett > Students > Philip

# Deliverable 4: A Study on Strategies used for Poker AI players.

Description:

Heuristic approach for Bluff:

Bluff and Poker are similar in that, at every stage, the players have only partial knowledge of the current state of the game. Incomplete or unreliable knowledge and risk management are the factors that make Poker an interesting study for our project. The Computer Poker Research Group (CPRG) at the University of Alberta has contributed greatly to the academic study of poker; among other things, the group implemented a neural network for predicting the opponent's next move [2].

Koller and Pfeffer, in their paper "Generating and solving imperfect information games", state that just as every perfect information game has an optimal move, the existence of optimal strategies for imperfect information games can be proved, provided randomized strategies are allowed. An optimal strategy ensures that the opponent gains no advantage over the player even if the player's strategy is revealed. It can be shown that the theoretical maximum guaranteed profit from a given poker situation can be attained by bluffing, or calling a possible bluff, with a predetermined probability. The relative frequency of bluffing or calling a bluff depends only on the size of the bet in relation to the size of the pot. To secure this guaranteed profit, each individual move must be unpredictable.
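The dependence on bet size alone can be illustrated with a short sketch. It uses the standard indifference argument for a simplified game with a single bet, which is an assumption of this sketch rather than a result quoted from the paper: the bettor bluffs just often enough that calling breaks even, and the caller calls just often enough that a pure bluff breaks even. The function names are hypothetical.

```python
from fractions import Fraction

def bluff_fraction(pot, bet):
    """Fraction of all bets that should be bluffs so that a caller is indifferent.

    A caller risks `bet` to win `pot + bet`, so calling breaks even when
    bluffs make up bet / (pot + 2 * bet) of the bettor's bets.
    """
    return Fraction(bet, pot + 2 * bet)

def call_fraction(pot, bet):
    """How often to call so that a pure bluff breaks even.

    A bluffer risks `bet` to win `pot`, so bluffing breaks even when the
    opponent calls pot / (pot + bet) of the time.
    """
    return Fraction(pot, pot + bet)

# Pot-limit case: the bet equals the pot (integer chip counts assumed).
print(bluff_fraction(100, 100), call_fraction(100, 100))  # 1/3 1/2
```

Only the ratio of bet to pot enters either formula, and the frequencies say nothing about which particular hand to bluff with, so the individual decision can be left to a random draw.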

Consider an example with two players in a game of pot-limit Draw poker, where player 2 is drawing one card to a flush against player 1, who currently has the best hand. Player 2 wins the showdown (the revealing of the cards) if she makes the flush, and loses otherwise. This resembles the situation in Bluff when an opponent calls a bluff and the player has to reveal the cards just played: if the cards match the claim, the player is safe; otherwise he loses and has to take the pile. The example further assumes, for simplicity, that the probability of completing the flush is exactly 0.2, or one in five (the true figure for a one-card flush draw is close to this: 9 of the 47 unseen cards complete it, about 0.19). Game-theoretic analysis can be used to calculate the frequency of bluffs and the frequency of calls that together make up an optimal strategy. For each pair of frequencies, the overall expectation can be calculated. This is very different from maximal strategies, which aim to exploit weaknesses in an opponent's strategy.
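The expectation for each pair of frequencies can be tabulated directly. The sketch below is an illustration under stated assumptions, not the analysis from the cited work: player 2 always bets a made flush, bets a missed flush as a bluff with probability `bluff_freq`, and otherwise gives up; player 1 calls any bet with probability `call_freq`; the bet equals the pot. The function name and parameters are hypothetical.

```python
def drawer_expectation(bluff_freq, call_freq, p_flush=0.2, pot=1.0, bet=1.0):
    """Expected winnings of the drawing player (player 2), in units of the pot.

    Assumed model: a made flush is always bet; a missed flush is bet as a
    bluff with probability `bluff_freq`; player 1 calls with `call_freq`.
    """
    # Made flush: she wins the pot, plus the bet whenever she is called.
    made = pot + call_freq * bet
    # Missed flush, bluffing: wins the pot on a fold, loses the bet on a call.
    bluff = (1 - call_freq) * pot - call_freq * bet
    return p_flush * made + (1 - p_flush) * bluff_freq * bluff

# Tabulate the expectation for a few pairs of frequencies.
for b in (0.0, 0.125, 0.5):
    row = [round(drawer_expectation(b, c), 3) for c in (0.0, 0.5, 1.0)]
    print(f"bluff={b:<5} calls 0, 0.5, 1 -> {row}")
```

Under these assumptions, bluffing with one in eight missed draws gives player 2 an expectation of 0.3 of the pot whatever player 1 does, and calling half the time holds her to 0.3 whatever she does. Any other frequency on either side can be exploited by the opponent.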

Currently we pursue the optimal strategy, which can be used in Bluff as well: we estimate the probability of each card being in a particular hand. We know with certainty which cards are in our own hand and which cards have been played, and we estimate the probabilities for all other cards from the cards that have been played. To start with, we could simply use a naive, equal probability for every card we cannot see. With four players and a 52-card deck dealt evenly, 39 cards are outside our own hand, so each unseen card starts with the same weight of 1/39 (about 0.026), while the cards in our own hand are known with probability 1. A player can then call a bluff when the estimated probability that the opponent does not hold the claimed card exceeds a threshold.
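A minimal sketch of such a table and threshold test follows. It makes assumptions beyond the text: the even split is taken per card across the opponents (one possible reading of the naive starting point), cards are treated as independent, a claim has the form "at least `count` cards of a rank", and the threshold of 0.9 is arbitrary. All names are hypothetical.

```python
RANKS = "A23456789TJQK"
SUITS = "SHDC"
DECK = [r + s for r in RANKS for s in SUITS]

def naive_table(my_hand, opponents):
    """Map each card to {player: probability of holding it}.

    Cards in our hand are certain; every other card is spread evenly
    over the opponents (the naive starting assumption).
    """
    table = {}
    for card in DECK:
        if card in my_hand:
            table[card] = {"me": 1.0}
        else:
            table[card] = {p: 1.0 / len(opponents) for p in opponents}
    return table

def prob_holds_at_least(table, player, rank, count):
    """Probability that `player` holds at least `count` cards of `rank`,
    treating the four cards of the rank as independent (a simplification)."""
    dist = [1.0]  # dist[k] = probability of holding exactly k cards so far
    for suit in SUITS:
        p = table[rank + suit].get(player, 0.0)
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # player does not hold this card
            new[k + 1] += q * p        # player holds this card
        dist = new
    return sum(dist[count:])

def should_call_bluff(table, player, rank, count, threshold=0.9):
    """Call when the claim is false with probability above `threshold`."""
    return 1.0 - prob_holds_at_least(table, player, rank, count) > threshold

# We hold three kings, so an opponent claiming two kings must be bluffing.
table = naive_table({"KS", "KH", "KD"}, ["p2", "p3", "p4"])
print(should_call_bluff(table, "p2", "K", 2))  # True
```

Keeping a per-player probability for every card, rather than a single number per card, lets later observations sharpen the estimate for one opponent without disturbing the others.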

A table can be maintained to track these probabilities. Whenever a player is caught bluffing, we gain insight into the cards he picks up: we know that the loser now holds every card in the discard pile with a probability of one. The discard pile contains the cards that we ourselves played in previous turns, so we know their identities, and we can also keep a count of the total number of cards in the pile.
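The update after a player is caught might look like the sketch below. The data layout (a dictionary from card to per-player probabilities, plus a dictionary of hand sizes) and the function name are assumptions made for illustration, not part of the original design.

```python
def record_pickup(table, hand_sizes, loser, known_pile, pile_size):
    """Update beliefs after `loser` is caught and takes the discard pile.

    table: {card: {player: probability}}; hand_sizes: {player: card count}.
    known_pile: pile cards whose identity we know (for example, ones we played).
    pile_size: total number of cards in the pile, known or not.
    Returns the new size of the discard pile.
    """
    for card in known_pile:
        table[card] = {loser: 1.0}   # the loser now holds this card for certain
    hand_sizes[loser] += pile_size    # unknown pile cards still enlarge the hand
    return 0                          # the discard pile is empty again

# We played the 7 of hearts earlier; p3 is caught with four cards in the pile.
table = {"7H": {"pile": 1.0}, "9C": {"p2": 0.5, "p3": 0.5}}
hand_sizes = {"p2": 10, "p3": 12}
pile = record_pickup(table, hand_sizes, "p3", ["7H"], 4)
print(table["7H"], hand_sizes["p3"], pile)  # {'p3': 1.0} 16 0
```

Pile cards played by other players stay uncertain in this sketch; only the count of cards in the loser's hand changes for them.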