Robust Algorithms For Game Play Against Unknown Opponents
Nathan Sturtevant, University of Alberta
May 11, 2006


Introduction
A lot of work has gone into two-player zero-sum games.
What happens in non-zero-sum games and multi-player games?
Actual games
Robotic teams
Perfect-information extensive-form games

Multi-Player Games
Maxn algorithm (Luckhardt and Irani, 1986)
Backs up an n-tuple of scores/utilities: one value for each player, e.g. (3, 5, 7)

Maxn Decision Rule
[Figure: example three-player game tree. Each node holds an n-tuple of utilities, one per player; the player to move backs up the child tuple that maximizes its own component. At the root, player 1 chooses between player 2 subtrees whose backed-up values are (3, 5, 2) and (2, 6, 2), and backs up (3, 5, 2).]

Maxn Computation
Maxn computes an equilibrium strategy: if all players were given the strategy, nobody would have an incentive to change.
Assumes:
All utilities are known exactly
The tree is analyzed completely
Players choose a common strategy
Strategies cannot be changed
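
To make the decision rule concrete, here is a minimal maxn backup sketch. The Node layout, 0-based player indices, and the example leaf values are assumptions for illustration, not code or data from the talk.

```python
# Minimal maxn backup sketch (illustrative; not the talk's implementation).
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    player: int                                    # 0-based index of the player to move here
    children: List["Node"] = field(default_factory=list)
    utilities: Optional[Tuple[float, ...]] = None  # set only at leaves: one value per player

def maxn(node: Node) -> Tuple[float, ...]:
    """Back up the child tuple that maximizes the moving player's own component."""
    if node.utilities is not None:                 # leaf: return its n-tuple of utilities
        return node.utilities
    values = [maxn(child) for child in node.children]
    # Note: max() breaks ties arbitrarily (first best wins), which is exactly the
    # issue soft-maxn addresses later in the talk.
    return max(values, key=lambda v: v[node.player])

# Player 1 (index 0) chooses between two player 2 (index 1) subtrees.
tree = Node(player=0, children=[
    Node(player=1, children=[Node(player=2, utilities=(3, 5, 2)),
                             Node(player=2, utilities=(4, 3, 3))]),
    Node(player=1, children=[Node(player=2, utilities=(2, 6, 2))]),
])
print(maxn(tree))   # (3, 5, 2): player 2 prefers 5 over 3, then player 1 prefers 3 over 2
```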

Sample Domain: Spades
Spades is a trick-based card game; we use a 3-player variation. Many similar card games exist.
Structure: tricks make up hands, hands make up a game.

Spades Rules - 1 Hand
Cards are dealt to the players, and players bid how many tricks they will take.
After playing the hand:
-10 x bid if the bid is missed (e.g. bid 5, take 4)
+10 x bid if the bid is made (e.g. bid 5, take 5 or 6)
-100 for accumulating 10 overtricks
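
As a rough illustration of this scoring, here is a hedged sketch of a hand-scoring function. It assumes the overtrick ("bag") counter carries across hands and ignores any per-overtrick bonus points, since the slide does not mention them; it is not the talk's implementation.

```python
def score_hand(bid: int, tricks_taken: int, prior_overtricks: int = 0):
    """Return (points for this hand, updated overtrick count)."""
    if tricks_taken < bid:                         # bid missed, e.g. bid 5, take 4
        return -10 * bid, prior_overtricks
    points = 10 * bid                              # bid made, e.g. bid 5, take 5 or 6
    overtricks = prior_overtricks + (tricks_taken - bid)
    if overtricks >= 10:                           # -100 for accumulating 10 overtricks
        points -= 100
        overtricks -= 10
    return points, overtricks

print(score_hand(5, 4))    # (-50, 0)
print(score_hand(5, 6))    # (50, 1)
```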

Spades Strategies
Players may play with different strategies:
Minimize overtricks (mot)
Maximize tricks (MT)
Players must model their opponents' strategies.

Experimental Setup
100 games, played to 300 points
7 cards per player
Perfect information

Experimental Results
Player A  Player B  A Score  A %Win  B Score
mot       MT        178.2    44.0    207.
mot       MT        198.2    5.5     191.4
mot       MT        25.4     59.0    199.2
mot       MT        248.6    74.7    16.8

Results - Discussion
We must use some opponent model: we don't know our opponents' utilities, even in perfect-information games (payoffs are not the same as utilities).
The model has a large effect on the quality of play.

Spades Example
[Figure: game tree from a spades hand. At one player 2 node, the leaves (0, 10, 10) and (-30, 10, 11) give player 2 the same value (10), so maxn breaks the tie arbitrarily; the other branch leads to (0, 10, 10). Which value player 2 is assumed to back up determines whether player 1 risks the -30 outcome.]

Maxn Deficiencies
Maxn only calculates one of many equilibria and keeps no information about the alternatives.
Some alternatives may be less risky in the face of uncertain opponents.

Soft-Maxn
Back up sets of maxn values: each time there is a tie, return both values.
This calculates a superset of all equilibria.
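
A minimal sketch of the set-valued backup follows. The tree encoding is an assumption for this example, and the sketch only captures the tie-handling half of soft-maxn; the dominance-based comparison from the next slides is shown separately below. The leaf values mirror the (reconstructed) spades example.

```python
# Soft-maxn backup sketch (illustrative encoding, not the talk's code):
# a leaf is ("leaf", utility_tuple); an internal node is ("move", player_index, children).
def soft_maxn(node):
    if node[0] == "leaf":
        return {node[1]}                           # singleton set of maxn value tuples
    _, player, children = node
    child_sets = [soft_maxn(child) for child in children]
    best = max(value[player] for s in child_sets for value in s)
    # Every child whose set reaches the moving player's best value is tied;
    # return the union of all tied children's sets instead of picking one.
    result = set()
    for s in child_sets:
        if any(value[player] == best for value in s):
            result |= s
    return result

# The tied player 2 (index 1) node from the spades example: both leaves give
# player 2 the value 10, so both tuples are kept.
node = ("move", 1, [("leaf", (0, 10, 10)), ("leaf", (-30, 10, 11))])
print(soft_maxn(node))    # {(0, 10, 10), (-30, 10, 11)}
```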

Spades Example
[Figure: the same tree under soft-maxn. The tied player 2 node backs up the set {(0, 10, 10), (-30, 10, 11)}, the other player 2 node backs up {(0, 10, 10)}, and the root backs up {(0, 10, 10)}.]

Soft-Maxn - Dominance
A dominance relationship compares maxn sets with respect to a given player. For player 1, the set {(10, 2, 7), (8, 7, 4)}:
strictly dominates {(5, 10, 4)}
weakly dominates {(8, 4, 7)}
has no domination relationship with {(9, 1, 9)}
At each node, union all child sets that are not dominated.
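
A sketch of the dominance test between two maxn-value sets with respect to one player. The strict/weak definitions below are inferred from the slide's examples (compared on player 1's component), not quoted from the talk.

```python
# Dominance between two sets of maxn value tuples with respect to player index p.
def dominance(set_a, set_b, p):
    """Return 'strict', 'weak', or None for how set_a relates to set_b for player p."""
    pairs = [(a[p], b[p]) for a in set_a for b in set_b]
    if all(x > y for x, y in pairs):
        return "strict"                            # every value in A beats every value in B
    if all(x >= y for x, y in pairs) and any(x > y for x, y in pairs):
        return "weak"                              # never worse, strictly better somewhere
    return None                                    # neither set dominates the other

A = {(10, 2, 7), (8, 7, 4)}
print(dominance(A, {(5, 10, 4)}, 0))   # strict
print(dominance(A, {(8, 4, 7)}, 0))    # weak
print(dominance(A, {(9, 1, 9)}, 0))    # None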

Soft-Maxn - Outcomes
How large can soft-maxn sets grow?
In trick-based card games with n players and c cards per player, there are O(c^(n-1)) possible game outcomes.
In other domains we can reduce the number of outcomes.
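
With c cards per player there are c tricks to split among the n players, so the number of outcomes is C(c + n - 1, n - 1) = O(c^(n-1)). A quick check (assumed formulation, consistent with the six outcomes listed on the opponent-model slide for 3 players and 2 tricks):

```python
from math import comb

def num_outcomes(n_players: int, tricks: int) -> int:
    # Ways to split `tricks` tricks among `n_players` players:
    # C(tricks + n - 1, n - 1), which grows as O(c^(n-1)) for c tricks.
    return comb(tricks + n_players - 1, n_players - 1)

print(num_outcomes(3, 2))   # 6, matching the six outcomes on the opponent-model slide
print(num_outcomes(3, 7))   # 36, for the 7-card, 3-player experimental setup
```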

Opponent Modeling
Represent opponent models as a graph: nodes are outcomes in the game, and directed edges represent preferences.
This gives a partial order over game outcomes.

Opponent Models
[Figure: preference graphs for the "maximize tricks" and "minimize overtricks" models over the possible outcomes of a 2-trick, 3-player hand.]
Possible outcomes (tricks taken by each player):
1: (0, 0, 2)
2: (0, 1, 1)
3: (0, 2, 0)
4: (1, 0, 1)
5: (1, 1, 0)
6: (2, 0, 0)
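
One way to hold such a model is as a set of directed preference edges over the outcome indices above. The sketch below is hedged: the edges are derived only from the name of the "maximize tricks" model, not read off the figure, and the 0-based player index is an assumption.

```python
# Possible outcomes of the 2-trick, 3-player hand (tricks taken by players 1-3).
outcomes = {1: (0, 0, 2), 2: (0, 1, 1), 3: (0, 2, 0),
            4: (1, 0, 1), 5: (1, 1, 0), 6: (2, 0, 0)}

def maximize_tricks_model(player: int) -> set:
    """Directed preference edges (better, worse): prefer any outcome where
    `player` (0-based index) takes more tricks."""
    return {(i, j) for i in outcomes for j in outcomes
            if outcomes[i][player] > outcomes[j][player]}

edges = maximize_tricks_model(0)          # player 1's "maximize tricks" preferences
print((6, 1) in edges, (2, 5) in edges)   # True False
```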

Opponent Modeling
We do not want to assume too much about our opponents: eliminating all ties would remove all ambiguity from the maxn analysis, and the analysis will be incorrect unless we have a perfect opponent model.
Should we prefer a more or a less accurate model?

Opponent Models
Combine opponent models to form more generic opponent models: take the intersection of the edges of each opponent model.
This builds a generic opponent model.
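
Combining models then amounts to intersecting their edge sets. A minimal sketch; the toy edge sets are assumptions for illustration, not the full graphs from the figure.

```python
# Build a generic model as the intersection of the candidate models' preference edges.
def generic_model(models):
    """models: iterable of sets of (better, worse) outcome-index pairs."""
    models = list(models)
    common = set(models[0])
    for m in models[1:]:
        common &= m                        # keep only preferences every model agrees on
    return common

# Toy edge sets: both models agree that making the bid (4, 5, 6) beats missing it
# (1, 2), but disagree on the ordering within the "bid made" group.
maximize_tricks     = {(6, 5), (6, 4), (4, 1), (4, 2), (5, 1), (5, 2), (6, 1), (6, 2)}
minimize_overtricks = {(4, 6), (5, 6), (4, 1), (4, 2), (5, 1), (5, 2), (6, 1), (6, 2)}
print(generic_model([maximize_tricks, minimize_overtricks]))
# -> only the "bid made over bid missed" edges remain
```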


Generic Opponent Model
[Figure: generic model. Every "bid made" outcome (4, 5, 6) is preferred to every "bid missed" outcome (1, 2, 3).]

Soft-Maxn Performance
Run the same experiments as before, using soft-maxn with generic opponent models.

Experimental Results
Player A  Player B  A Score  A %Win  %Gain  %Loss
mot       MT        241.7    68.6    15.0   6.8
mot       MT        218.2    5.5     9.5    5.5
mot       mot       242.2    54.8    4.8    8.0
mot       mot       20.6     46.0    8.8    4.0

Learning in Soft-Maxn
We observe players' actions during the game; sometimes we can distinguish between models based on their moves.
Similar to version space learning: we used the player models and did inference.
In 900 hands, 42 (correct) inferences; player types were identified in 1/6 of hands.
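
The inference can be sketched as version-space-style elimination: keep every candidate model consistent with the moves seen so far. This is a hypothetical sketch; `predicted_moves` stands in for running soft-maxn under each candidate model and is not an API from the talk.

```python
# Version-space-style elimination of opponent models (illustrative sketch).
def update_candidates(candidates, observed_move, predicted_moves):
    """Keep only the models whose predicted move set contains the observed move."""
    return {m for m in candidates
            if observed_move in predicted_moves.get(m, set())}

candidates = {"maximize tricks", "minimize overtricks", "generic"}
candidates = update_candidates(
    candidates, "duck the trick",
    {"maximize tricks": {"win the trick"},
     "minimize overtricks": {"duck the trick"},
     "generic": {"win the trick", "duck the trick"}})
print(candidates)   # {'minimize overtricks', 'generic'}
```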

Soft-Maxn Summary
It is better to under-assume than to over-assume about our opponents; we need a bigger picture of what is happening in the game.
We can observe players to learn their models.
Only a partial ordering of outcomes is used; no utilities are actually needed.

Thanks
Joint work with Michael Bowling.
See also: Prob-Maxn: Playing N-Player Games with Opponent Models, Nathan Sturtevant, Michael Bowling, and Martin Zinkevich, AAAI-06.