Comp 3211 Final Project - Poker AI


Introduction

Poker is a game played with a standard 52-card deck, usually with 4 to 8 players per game. During each hand of poker, players are dealt two cards and must choose whether to bet on their hand, or fold and exit the round at a small loss. Due to the nature of the game and the interaction between players, poker strategy revolves largely around outwitting the opponent. Unlike other betting games such as blackjack, there is no single strategy that is optimal at all times and against all opponents. The information available to each poker player is incomplete, and sometimes misleading. For example, an opponent may place a large bet; this would suggest that the opponent has a good hand, but in reality that player could be bluffing with a poor hand. For these reasons, an application of artificial intelligence to the game of poker could prove useful. The goal of this project is to create an AI system that is capable of beating a poker bot that implements a simple formula over an extended number of rounds.

Approach

The first step in the process was to find a poker framework in Java to modify. Several modifications were made to the game in order to simplify the task:

- The game was reduced to 2 players: an AI system and a basic bot.
- The number of betting phases was reduced from 4 (regular poker) to 1.
- The betting options for each player were reduced from an analog system (players can bet however much they want) to a binary system (players can only bet a low value or a high value).
- The AI was always given the responding turn, so it always knew the bot's bet before making its decision.

Given the nature of the problem, Q-learning was chosen as the most suitable method for the AI system: it can adapt to the strategies of different opponents, and it can overcome the noise introduced by the randomness of each player's card draws.
Action Algorithm

In each round, the AI system must take an action based on the information available to it. Every round, it knows the cards in its own hand, and it knows whether the opponent bot placed a high bet or a low bet. For the purpose of the algorithm, this information was defined in the program as a state.
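The state described above might be represented as a small Java class along the following lines. This is a hypothetical sketch: the class and field names are assumptions, not taken from the project's code.

```java
// Hypothetical sketch of the state described above: the AI's two card
// ranks (suit ignored) plus the opponent's observed bet.
public final class State {
    final int rank1;          // higher card rank (2-14, where 14 = Ace)
    final int rank2;          // lower card rank
    final boolean oppBetHigh; // did the opponent place the high bet?

    State(int r1, int r2, boolean oppBetHigh) {
        this.rank1 = Math.max(r1, r2);
        this.rank2 = Math.min(r1, r2);
        this.oppBetHigh = oppBetHigh;
    }

    // equals/hashCode let the state serve as a key in the AI's database.
    @Override public boolean equals(Object o) {
        if (!(o instanceof State)) return false;
        State s = (State) o;
        return rank1 == s.rank1 && rank2 == s.rank2 && oppBetHigh == s.oppBetHigh;
    }

    @Override public int hashCode() {
        return (rank1 * 13 + rank2) * 2 + (oppBetHigh ? 1 : 0);
    }
}
```

Sorting the two ranks in the constructor means a hand is the same state regardless of the order the cards were dealt in.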

The information contained in a state is then passed into the action algorithm. The action algorithm creates 2 alternate possibilities based on the state: one where it folds on the state, and one where it bets on the state. These two alternate possibilities are called state-actions. The AI checks its database to see whether either state-action has occurred before, and finds the associated value of each. Generally, it returns the action with the higher value: if the learned value of betting is greater than the value of folding, the AI will bet. However, even when the learned value of betting is lower than the value of folding, the AI will still bet 10% of the time. The reason for this is that it takes several data points to accurately determine the value of a state-action. Each state-action has a probability of winning, and must be tried several times before an action can be eliminated as a viable possibility from that state. If the AI never retried betting on states where it had previously lost, it would fold excessively. One important note is that only the rank of each card in the hand was considered, not the suit. This was done to reduce the total number of possible states: there are 2652 combinations of 2 cards when suit is considered, but only 169 when only card rank is considered.
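The action rule above can be sketched as follows. This is an illustrative reconstruction, not the project's code; the class name, map layout, and string keys are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of the action rule described above: pick the
// state-action with the higher learned value, but still bet 10% of
// the time even when folding looks better.
public class ActionChooser {
    static final double EXPLORE_RATE = 0.10; // chance to bet anyway

    final Map<String, Double> qValues = new HashMap<>();
    final Random rng = new Random();

    // stateKey encodes the two card ranks and the opponent's bet,
    // e.g. "A-K:oppHigh". Unseen state-actions default to 0.
    public String choose(String stateKey) {
        double betValue  = qValues.getOrDefault(stateKey + ":BET", 0.0);
        double foldValue = qValues.getOrDefault(stateKey + ":FOLD", 0.0);
        if (betValue >= foldValue) return "BET";
        // Occasionally retry betting so a few early losses do not
        // permanently rule the action out for this state.
        return rng.nextDouble() < EXPLORE_RATE ? "BET" : "FOLD";
    }
}
```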

Learning Algorithm

The AI must learn from the results of each round. To do this, it stores a key-value pair in a HashMap after every round, where the key is a state-action and the value is the result of the round. A HashMap is used so that the lookup time for the learned value of a state-action remains constant. The alpha value of 0.05 was chosen after experimenting with a range of values; larger alpha values cause the AI to eliminate the option of betting from certain states too quickly.

Opponent Bot

In order to test the effectiveness of the AI, a very basic bot was created as an opponent. The bot uses the Chen formula to evaluate the strength of its hand; the Chen formula is regularly used by professional poker players as a basic indicator of hand strength. The bot has 2 variables that can be modified: the Chen formula score required to bet, and the randomness. These parameters were designed for easy modification so that the AI could be tested against a wider variety of opponent types.
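The per-round update described in the Learning Algorithm section can be sketched as below. This is an assumed form of the update: since each round is a single decision with no successor state, the round's result serves as the whole learning target. Names are illustrative, not the project's.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the learning step described above: after each
// round, nudge the stored value of the played state-action toward the
// round's result using a small learning rate.
public class Learner {
    static final double ALPHA = 0.05; // learning rate from the report

    final Map<String, Double> qValues = new HashMap<>();

    // stateAction: e.g. "A-K:oppHigh:BET"; result: chips won or lost
    public void update(String stateAction, double result) {
        double old = qValues.getOrDefault(stateAction, 0.0);
        // One-step update with no successor state: move a fraction
        // ALPHA of the way from the old estimate toward the result.
        qValues.put(stateAction, old + ALPHA * (result - old));
    }
}
```

With alpha = 0.05, a single loss only slightly lowers a state-action's value, which matches the report's observation that larger alphas eliminate betting from states too quickly.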

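The Chen formula used by the opponent bot can be sketched as follows. This follows the commonly published version of the formula; the project's exact implementation may differ in details.

```java
// Hedged sketch of the Chen formula for scoring a two-card starting hand.
public class ChenFormula {
    // ranks run 2-14, where 11 = J, 12 = Q, 13 = K, 14 = A
    public static double score(int rankA, int rankB, boolean suited) {
        int high = Math.max(rankA, rankB);
        int low  = Math.min(rankA, rankB);

        // Base points for the highest card.
        double points;
        switch (high) {
            case 14: points = 10; break; // Ace
            case 13: points = 8;  break; // King
            case 12: points = 7;  break; // Queen
            case 11: points = 6;  break; // Jack
            default: points = high / 2.0;
        }

        if (high == low) {                 // pair: double, minimum of 5
            points = Math.max(points * 2, 5);
        } else {
            if (suited) points += 2;       // suited bonus
            int gap = high - low - 1;      // ranks between the two cards
            if (gap == 1) points -= 1;
            else if (gap == 2) points -= 2;
            else if (gap == 3) points -= 4;
            else if (gap >= 4) points -= 5;
            // Straight-draw bonus for close connectors below Queen.
            if (gap <= 1 && high < 12) points += 1;
        }
        return Math.ceil(points);          // round half-points up
    }
}
```

For example, a pair of aces scores 20, ace-king suited scores 12, and 7-5 offsuit scores 4, so a bot with a minimum Chen score of 8 bets only on genuinely strong hands while a minimum of 3 bets on most playable ones.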
Results

In order to test the AI's learning speed as well as its adaptability against various opponents, it played 100 games of poker, each consisting of between 100 and 10000 rounds. For each combination, the average score per round was recorded; a range for the average score is indicated. The results are as follows:

Opponent                           100 games of     100 games of     100 games of
                                   100 rounds       1000 rounds      10000 rounds
100% random bot                    -0.10 to 0.10    0.04 to 0.16     0.08 to 0.10
40% random, min Chen score = 3      0.08 to 0.30    0.16 to 0.27     0.20 to 0.22
40% random, min Chen score = 8      0.03 to 0.20    0.15 to 0.20     0.16 to 0.19
0% random, min Chen score = 3       0.05 to 0.42    0.23 to 0.33     0.30 to 0.32
0% random, min Chen score = 8       0.06 to 0.31    0.17 to 0.28     0.24 to 0.26

Analysis

The first thing to note is that the lower limit of the AI's performance always increases as the number of rounds per game increases. This is expected, because the value of each state-action can be derived more accurately when there are more data points for it.

Another observable correlation is the effect of the bot's minimum Chen score on the AI's performance. When the bot has a minimum Chen score of 3, it bets more frequently and on weaker hands. Once the AI has gathered enough information, it realizes that the bot bets high even on weaker hands. A human player might be intimidated by a high bet, but in this scenario the AI learns the trends of the bot and responds by betting, which leads to a greater average score per round. On the other hand, when the bot has a minimum Chen score of 8, the AI's performance actually decreases. Although the AI is likely winning a similar number of rounds as in the previous scenario, the bot bets high far less frequently, so the potential winnings for the AI are also smaller.

The last and most interesting trend in the results is the effect of the bot's randomness on the AI's performance. Although one would describe the Chen formula bot with 0% randomness as smarter than the 100% randomness bot, the AI actually performs worst against the most random bot and best against the least random bots. This is because the AI cannot learn anything meaningful from an opponent that acts randomly. However, when an opponent acts in a predictable manner, the AI can use this to its advantage to predict the strength of the opponent's hand and decide whether or not to bet.

Conclusion

The AI system designed for the simplified poker game was able to achieve positive scores against all the different bots it played against. However, it required a large number of rounds to reach optimal performance. One improvement would be to increase the alpha value for the learning algorithm while also forcing the AI to retry state-actions with a learned negative value. Another area of improvement is playing against seemingly random opponents. Professional poker players change their strategies frequently to ensure that their opponents cannot decipher their actions; the AI must be able to overcome this. The AI must also become somewhat random itself in order to be an effective opponent against human players: the current state of this project's AI is very predictable. A further step would be to make its actions seem more random without compromising the value of each move.