Solution to Heads-Up Limit Hold'em Poker


1 Solution to Heads-Up Limit Hold'em Poker
A.J. Bates, Antonio Vargas
Math 287, Boise State University
April 9, 2015

2 Outline
- Introduction
- Solving Imperfect-Information Games
- Normal-Form Linear Programming
- Sequence-Form Linear Programming
- Counterfactual Regret Minimization
- Solving heads-up limit hold'em
- The Solution
- Conclusion

3 Intro to Heads-Up Limit Hold'em Poker (HULHE)
- Imperfect-information game
- Two players
- Fixed bet size
- Fixed number of raises (limit)

4 Solving Imperfect-Information Games
- Extensive-form game: game trees depict the possible moves
- Zero-sum
- Nash equilibrium: a pair of strategies in which each player's strategy is a best response to the opponent's, so neither player can gain by deviating alone

5 Normal-Form Linear Programming
- The earliest solution method converted the extensive-form game to normal form
- Normal form: a matrix of values for every pair of possible strategies
- The number of possible deterministic strategies is exponential in the number of information sets
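To make the exponential growth concrete, here is a minimal sketch that counts deterministic strategies. The function name `count_pure_strategies` and the toy action counts are illustrative assumptions, not taken from the slides. A deterministic strategy picks one action at every information set, so the count is the product of the per-information-set action counts.

```python
from math import prod

def count_pure_strategies(actions_per_infoset):
    """Number of deterministic strategies: one action chosen at each information set."""
    return prod(actions_per_infoset)

# Hypothetical toy player with n information sets and 2 actions at each one.
for n in (4, 12, 100):
    print(n, count_pure_strategies([2] * n))
```

Even at 100 two-action information sets the count is 2^100, which is why a payoff matrix indexed by pairs of strategies cannot be written down for large games.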

6 Sequence-Form Linear Programming
- First algorithm to solve imperfect-information extensive-form games in time polynomial in the size of the game tree
- Represents strategies in sequence form instead of listing every deterministic strategy
- This technique was used to create the first poker-playing program

7 Counterfactual Regret Minimization (CFR)
- Iterative method for approximating a Nash equilibrium
- Repeated self-play between two regret-minimizing algorithms
- Stores and minimizes a modified regret for each information set and each action available there
- Averaging each player's strategy over all iterations gives a strategy pair that converges to a Nash equilibrium
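The self-play and averaging steps can be illustrated on a much smaller game. The sketch below is not full CFR over a game tree: it is plain regret matching on rock-paper-scissors, a one-shot game with a single decision point per player. The payoff matrix, the function names, the iteration count, and the arbitrary starting regrets are all assumptions made for illustration.

```python
# Row player's payoff in rock-paper-scissors; the column player receives the negative.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(cum_regret)] * len(cum_regret)

def self_play(iterations=20000):
    # Arbitrary asymmetric starting regrets so play does not simply sit at uniform.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        row = regret_matching(regrets[0])
        col = regret_matching(regrets[1])
        # Expected value of each pure action against the opponent's current mix.
        row_values = [sum(PAYOFF[i][j] * col[j] for j in range(3)) for i in range(3)]
        col_values = [-sum(row[i] * PAYOFF[i][j] for i in range(3)) for j in range(3)]
        row_ev = sum(row[i] * row_values[i] for i in range(3))
        col_ev = sum(col[j] * col_values[j] for j in range(3))
        for a in range(3):
            # Regret: how much better action a would have done than the current mix.
            regrets[0][a] += row_values[a] - row_ev
            regrets[1][a] += col_values[a] - col_ev
            strategy_sums[0][a] += row[a]
            strategy_sums[1][a] += col[a]
    # The average strategy, not the final one, is the equilibrium approximation.
    return [[s / iterations for s in sums] for sums in strategy_sums]

row_avg, col_avg = self_play()
print(row_avg, col_avg)
```

Both averages should land near the uniform strategy (1/3, 1/3, 1/3), which is the equilibrium of rock-paper-scissors. CFR applies the same kind of update at every information set of the game tree, using counterfactual values in place of the plain expected values used here.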

8 Solving heads-up limit hold'em
- The full game of HULHE has 3.19 × 10^14 information sets.
- With CFR, this requires 262 TB of storage and an impractical amount of computation!

9 Solving heads-up limit hold'em: CFR+
- Performs exhaustive iterations over the entire game tree
- Actions that turn out to be favorable are chosen again immediately
- The exploitability of the players' current strategies approaches zero (no need to average the strategies)
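The "chosen again immediately" behavior comes from how CFR+ accumulates regret: negative totals are clipped to zero after every update, so an action that starts doing well does not have to pay off an old deficit first. The sketch below shows only that clipping step, not a full CFR+ implementation; the function names and the regret stream are hypothetical.

```python
def regret_update(cumulative, instantaneous):
    """Vanilla accumulation: cumulative regret may become very negative."""
    return [c + r for c, r in zip(cumulative, instantaneous)]

def regret_plus_update(cumulative, instantaneous):
    """CFR+-style accumulation: clip at zero after every update."""
    return [max(c + r, 0.0) for c, r in zip(cumulative, instantaneous)]

# Hypothetical regret stream: action 1 looks bad for 5 steps, then good for 1.
stream = [[0.0, -1.0]] * 5 + [[0.0, 1.0]]
vanilla, plus = [0.0, 0.0], [0.0, 0.0]
for inst in stream:
    vanilla = regret_update(vanilla, inst)
    plus = regret_plus_update(plus, inst)

print(vanilla)  # [0.0, -4.0]
print(plus)     # [0.0, 1.0]
```

With the vanilla update, action 1 still carries a regret of -4 and would not be played; with the clipped update it already has positive regret after a single good step.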

10 Solving heads-up limit hold'em: Exploitability
Exploitability: the amount by which a strategy's expected payoff against the worst-case opponent strategy falls short of the game value.
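As a small worked instance of this definition, the sketch below computes exploitability in rock-paper-scissors, whose game value is zero. The payoff matrix and the function name `exploitability` are illustrative assumptions; for HULHE the same quantity requires a best-response computation over the whole game tree.

```python
# Row player's payoff in rock-paper-scissors (rows and columns: rock, paper, scissors).
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]
GAME_VALUE = 0.0  # the game is symmetric, so its value is zero

def exploitability(strategy):
    """Game value minus the row strategy's expected payoff against a best-responding opponent."""
    # A best response can always be found among the opponent's pure actions.
    worst_case = min(
        sum(strategy[i] * PAYOFF[i][j] for i in range(3)) for j in range(3)
    )
    return GAME_VALUE - worst_case

print(exploitability([1.0, 0.0, 0.0]))   # always rock: 1.0
print(exploitability([1/3, 1/3, 1/3]))   # uniform play cannot be exploited
```

Always playing rock loses one unit per game to an opponent who always plays paper, so its exploitability is 1, while the uniform strategy has exploitability 0.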

11 Solving heads-up limit hold'em: Exploitability

12 Solving heads-up limit hold'em: Essentially solved
- We define a game to be essentially solved if a lifetime of play is unable to statistically differentiate it from being solved at 95 percent confidence.
- A lifetime of play is defined as someone playing 200 games of poker an hour for 12 hours a day, without missing a day, for 70 years.
- The threshold is an exploitability of 1 milli-big-blind per game.
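The arithmetic behind these figures can be checked directly. The game count follows from the numbers on the slide, using 365-day years. The per-game standard deviation of 5 big blinds and the one-sided z-score of 1.64 are assumptions introduced here to show how a threshold near 1 milli-big-blind per game can arise; neither value appears on the slide.

```python
import math

# Lifetime of play: games/hour * hours/day * days/year * years (365-day years).
games = 200 * 12 * 365 * 70
print(games)  # 61320000

STD_DEV_BB = 5.0  # ASSUMPTION: standard deviation of one game's outcome, in big blinds
Z_95 = 1.64       # ASSUMPTION: one-sided z-score for 95 percent confidence

# Smallest average edge (in milli-big-blinds per game) that a lifetime of play
# could distinguish from zero at that confidence level.
threshold_mbb = 1000 * Z_95 * STD_DEV_BB / math.sqrt(games)
print(round(threshold_mbb, 2))  # 1.05
```

Under these assumptions, a strategy whose exploitability is below roughly 1 milli-big-blind per game cannot be told apart from an exact solution within one lifetime of play.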

13 The solution: Computation
- CFR+ was executed on a cluster of 200 computation nodes, each with GHz AMD cores, 32 GB of RAM, and a 1-TB local disk.
- The computation ran for 1579 iterations, taking 68.5 days!
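A quick back-of-the-envelope calculation puts these figures in perspective. It uses only the iteration count, duration, and node count from the slide; the variable names are illustrative.

```python
iterations = 1579
days = 68.5
nodes = 200

hours_per_iteration = days * 24 / iterations  # wall-clock time for one pass
node_days = nodes * days                      # total machine time across the cluster

print(round(hours_per_iteration, 2))  # 1.04
print(node_days)                      # 13700.0
```

Each exhaustive pass over the game tree therefore took about an hour of wall-clock time on the full cluster.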

14 The solution: Action probabilities

15 The solution: Tips from the strategy
- The dealer has the advantage
- Limping (calling rather than raising as the very first action) is discouraged
- The strategy almost never caps (makes the final allowed raise) in the first round
- Most importantly, the non-dealer plays (does not fold) a much broader range of hands

16 Conclusion
- Game theory has been used to analyze Cold War politics, and CFR has potential applications in security and in the medical field
- "It would be disingenuous of us to disguise the fact that the principal motive which prompted the work was the sheer fun of the thing." (Turing)

17 The paper referenced
Bowling, M., Burch, N., Johanson, M., & Tammelin, O. (2015, January 8). Heads-up limit hold'em poker is solved. Science, 347(6218).
