Topics in Computer Mathematics

Choosing a Strategy

Games have the following characteristics: two or more players, and uncertainty (regarding the other players' resources and strategies).

Strategy: a sequence of plays, usually chosen to optimize the player's payoff.

Payoff: the gain achieved by one of the players with the use of some strategy against the other players, depending on their strategies.

A zero-sum game is one in which the sum of the payoffs for any single choice of strategies by each player is zero. In a two-player zero-sum game, the payoff for one of the players is the negative of the payoff for the other player, for each choice of strategy by both players.

Each player usually has several strategies from which they may choose; the players do not necessarily have the same number of strategies. For instance, if the game is one in which at any point in time one of the players is in an offensive position and the other is in a defensive position, they may have entirely different numbers and types of strategies.

The situation for a particular game with a certain set of strategies and payoffs is summarized in a game matrix. In the following examples we generally use square matrices. However, game matrices need not be square; they are square only when both players have the same number of strategies available to choose from. An example of a game matrix for a game where each player has three strategies would look like this (the letters a through i are the payoffs):

                               Player 2
                  Strategy 1  Strategy 2  Strategy 3
    Player 1
      Strategy 1      a           b           c
      Strategy 2      d           e           f
      Strategy 3      g           h           i

By convention, in a two-player zero-sum game matrix a positive payoff represents a gain for Player 1 and a corresponding, equal loss for Player 2; a negative value indicates a gain for Player 2 and a loss for Player 1.

NTC 4/24/05

We can simplify drawing the matrix by eliminating the column and row labels, such as:

    20   -5   12
     4  -17  -10
    10   17    1

    Game 1

Quite often such game matrices are symmetric about the diagonal, but this is not necessary. The above matrix is a zero-sum game despite the lack of symmetry since, for instance, when Player 1 uses strategy 1 and Player 2 uses strategy 3, Player 1 will gain 12 (points, dollars, whatever) and Player 2 will gain -12 units.

By the way, the hardest part of establishing the game matrix is, in fact, the determination of the payoff for each cell. In a gambling game where there are fixed odds involved and the payoff is measured in dollars, for instance, this is quite straightforward, but in many game situations the measurement is in more intangible rewards, such as prestige or position. In these cases, the players must agree on a consistent measurement and value for each combination of strategies. [19]

[19] Game theory was first developed by John von Neumann, whose application was economics.

Once the game matrix has been established, each player can attempt to devise an optimum overall strategy. What constitutes optimum may vary from game to game and player to player. An overall strategy is one which chooses, rationally and based on some criteria, which of the strategies on the game matrix to apply in any given situation. Let's consider several overall strategies.

Conservative Strategy for Player 1:

In this strategy the goal is to minimize losses. Let's apply this to the matrix for Game 1 above (we will always analyze such situations from Player 1's point of view). Player 1 examines each row to find the worst payoff (the maximum loss) for each strategy. In the above example, the worst payoffs are

    Strategy 1: -5 (if Player 2 uses their Strategy 2)
    Strategy 2: -17 (if Player 2 uses their Strategy 2)
    Strategy 3: 1 (a gain, if Player 2 uses their Strategy 3)

Player 1 then chooses a strategy to minimize their loss; in this case Player 1 chooses strategy 3, since the worst that can happen with this strategy is to gain 1 point. This strategy finds the minimum payoff (maximum loss) in each row and chooses the maximum value among those minimums to decide which strategy to use. This is called a maximin procedure. Clearly this overall strategy is intended to minimize risk (but may also minimize gain as well as loss).

Conservative Strategy for Player 2:

From Player 2's point of view, the overall strategy is to examine the columns and choose the minimum of the maximums, a minimax procedure. This is because all the payoffs are listed in the matrix as gains from Player 1's point of view. Thus, Player 2 finds their maximum losses by looking down the columns:

    Strategy 1: 20 (if Player 1 uses their Strategy 1)
    Strategy 2: 17 (if Player 1 uses their Strategy 3)
    Strategy 3: 12 (if Player 1 uses their Strategy 1)

The minimum of these values is 12, so Player 2 should choose strategy 3, since the worst that can happen with this strategy is to lose 12 points, versus 17 or 20 with the other strategies.

When both players stick to their conservative strategies, the value of the payoff at the intersection of Player 1's chosen strategy and Player 2's chosen strategy is known as the value of the game. In the current example, both players use their Strategy 3, and these intersect at the payoff value of 1, so the value of the game is 1.
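The maximin and minimax procedures described above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function names `maximin` and `minimax` and the list-of-rows layout are assumptions, and the code numbers strategies from 0 where the text numbers them from 1.

```python
# Game 1 from the text; entries are payoffs to Player 1 (rows) against Player 2 (columns).
GAME_1 = [
    [20, -5, 12],
    [4, -17, -10],
    [10, 17, 1],
]


def maximin(matrix):
    """Player 1: return (row index, value) for the row whose minimum is largest."""
    row_minimums = [min(row) for row in matrix]
    best_row = max(range(len(matrix)), key=lambda i: row_minimums[i])
    return best_row, row_minimums[best_row]


def minimax(matrix):
    """Player 2: return (column index, value) for the column whose maximum is smallest."""
    column_maximums = [max(column) for column in zip(*matrix)]
    best_column = min(range(len(column_maximums)), key=lambda j: column_maximums[j])
    return best_column, column_maximums[best_column]


row, row_value = maximin(GAME_1)
column, column_value = minimax(GAME_1)

# Add 1 when printing so the output uses the text's strategy numbers.
print(f"Player 1 plays Strategy {row + 1} (worst payoff {row_value})")
print(f"Player 2 plays Strategy {column + 1} (worst loss {column_value})")
print(f"Payoff where the two choices meet: {GAME_1[row][column]}")
```

For Game 1 this reproduces the choices worked out by hand: both players pick their Strategy 3, and the payoff at that cell is 1.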

Practice Problems - minimax/maximin procedures

Assuming each player only plays their conservative strategy in each of the following game matrices, find the value of each game.

a.
    -9   40
    57   -8

    Value of the game:

b.
    -4    5    3
     1    3    2
    -3   -1    0

    Value of the game:

c.
    -9    5    0   -2
     1   -5    7    1
     8    6   -1    3

    Value of the game:

Mixed Strategies:

Consider the two conservative strategies found for Game 1. In actual play, if both players start by always using Strategy 3, Player 1 would eventually realize that if Player 2 was going to continue to use Strategy 3, Player 1 should switch to Strategy 1, since their gain will now be increased to 12 (the intersection of Player 1's strategy 1 and Player 2's strategy 3). Of course, as soon as Player 2 notices that Player 1 has changed to strategy 1, they should change their strategy to strategy 2, giving them a gain of 5 (giving Player 1 a gain of -5). In actual play, each player would be continually changing their strategy as a result of the other player's change in strategy. The net result is that each player plays each of their strategies a certain percentage of the time.
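The back-and-forth switching described for Game 1 can be traced mechanically. The sketch below is an illustration under stated assumptions: the helper names (`best_row`, `best_column`, `trace_replies`) are hypothetical, each player is assumed to reply only to the opponent's current strategy, and strategies are numbered from 0 in the code.

```python
# Game 1 from the text; entries are payoffs to Player 1.
GAME_1 = [
    [20, -5, 12],
    [4, -17, -10],
    [10, 17, 1],
]


def best_row(matrix, column):
    """Player 1's best reply: the row with the largest payoff in the given column."""
    return max(range(len(matrix)), key=lambda i: matrix[i][column])


def best_column(matrix, row):
    """Player 2's best reply: the column with the smallest payoff in the given row."""
    return min(range(len(matrix[row])), key=lambda j: matrix[row][j])


def trace_replies(matrix, row, column, rounds):
    """Alternate best replies, Player 1 first, recording every (row, column) visited."""
    visited = [(row, column)]
    for _ in range(rounds):
        row = best_row(matrix, column)
        visited.append((row, column))
        column = best_column(matrix, row)
        visited.append((row, column))
    return visited


# Start where both players use their conservative Strategy 3 (index 2).
path = trace_replies(GAME_1, row=2, column=2, rounds=2)
for r, c in path:
    print(f"Player 1: Strategy {r + 1}, Player 2: Strategy {c + 1}, payoff {GAME_1[r][c]}")
```

After two rounds of replies the players are back at the starting cell, which is why neither can settle on a single strategy in this game.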

When the players mix their strategies in this way, there is no fixed value for the game, so we must be content with an average value of the game. This is determined by averaging all the values in the matrix, weighted by the percentage of time each value is in use (or, equivalently, by the probability that a strategy will be used by that player).

Let's consider a two-person game in which each player has just 2 strategies (a 2 x 2 game matrix). The relative frequency with which Player 1 uses their strategy 1 can be expressed as the probability, $p_1$, that strategy 1 is in use at any given time. Then the relative frequency with which strategy 2 is used by Player 1 gives a probability of $1 - p_1$ that strategy 2 is in use. Similarly, the probabilities that Player 2 is using Strategy 1 or Strategy 2 are $p_2$ and $1 - p_2$, respectively. Let's draw and label the game matrix as follows:

                 p_2     1-p_2
    p_1           a        b
    1-p_1         c        d

The average value of the game is given by

$$AV = a\,p_1 p_2 + b\,p_1 (1 - p_2) + c\,(1 - p_1)\,p_2 + d\,(1 - p_1)(1 - p_2) \qquad \text{[G1]}$$

More generally, let's suppose the game matrix is an n x m matrix, G. Each value is given by G[i,j], which is the value in the location at the intersection of Player 1's strategy i ($1 \le i \le n$) and Player 2's strategy j ($1 \le j \le m$). Also, let $p1_i$ be the probability that Player 1 uses strategy i and $p2_j$ be the probability with which Player 2 uses strategy j. Then the average value of a game is

$$AV = \sum_{i=1}^{n} \sum_{j=1}^{m} G[i,j]\; p1_i\; p2_j \qquad \text{[G2]}$$
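Formula [G2] is a double sum, so it translates directly into two nested loops. The sketch below is illustrative only: the function name `average_value` is an assumption, and the equal probabilities of 1/3 are made up for the demonstration (they are not claimed to be optimal for either player). Exact fractions are used so the result is not blurred by rounding.

```python
from fractions import Fraction


def average_value(matrix, p1, p2):
    """Average value per [G2]: sum of G[i][j] * p1[i] * p2[j] over every cell."""
    if len(p1) != len(matrix) or any(len(row) != len(p2) for row in matrix):
        raise ValueError("need one probability per row and one per column")
    if abs(sum(p1) - 1) > 1e-9 or abs(sum(p2) - 1) > 1e-9:
        raise ValueError("each player's probabilities must sum to 1")
    total = 0
    for i, row in enumerate(matrix):
        for j, payoff in enumerate(row):
            total += payoff * p1[i] * p2[j]
    return total


# Game 1 from the text, with made-up equal probabilities for both players.
GAME_1 = [
    [20, -5, 12],
    [4, -17, -10],
    [10, 17, 1],
]
uniform = [Fraction(1, 3)] * 3

print(average_value(GAME_1, uniform, uniform))  # the nine payoffs sum to 32, so this is 32/9
```

For a 2 x 2 matrix with p1 = [p_1, 1 - p_1] and p2 = [p_2, 1 - p_2], the same loops produce exactly the four terms of [G1].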

Practice Problems - Mixed Strategies

Find the average value of each of the following games. (Note that the probabilities given in these examples are not necessarily the optimum choices for each player, as we will see later.)

a.
                   p_2 = .3   1-p_2 = .7
    p_1 = .5           0          -2
    1-p_1 = .5        -2          +2

    Average Value:

b.
                   p_2 = .3   1-p_2 = .7
    p1_1 = .5          5          -2
    p1_2 = .3         -2          +3
    p1_3 = .2          0          -4

    Average Value:

Determined Games.

Some matrices, not infrequently, have the property that the maximin procedure (by Player 1) and the minimax procedure (by Player 2) both give the value in the same cell in the game matrix. For example, in this matrix

     3    0    2
    -4   -1    3
     2   -2   -1

    Game 2

the maximin procedure identifies Player 1's strategy 1 as the strategy with the minimum loss (0 when Player 2 chooses strategy 2) and the minimax procedure identifies Player 2's strategy 2 as the strategy with the minimum loss (0 when Player 1 chooses strategy 1).
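Checking whether the two procedures land on the same cell can be automated. The sketch below is an assumption-laden illustration, not the notes' own method: it uses the usual equivalent test that a cell is both the smallest entry in its row and the largest entry in its column, and the name `same_cell_points` is hypothetical. Indices start at 0 in the code.

```python
GAME_1 = [
    [20, -5, 12],
    [4, -17, -10],
    [10, 17, 1],
]

GAME_2 = [
    [3, 0, 2],
    [-4, -1, 3],
    [2, -2, -1],
]


def same_cell_points(matrix):
    """Return every (row, column) whose entry is the minimum of its row and the maximum of its column."""
    column_maximums = [max(column) for column in zip(*matrix)]
    points = []
    for i, row in enumerate(matrix):
        row_minimum = min(row)
        for j, payoff in enumerate(row):
            if payoff == row_minimum and payoff == column_maximums[j]:
                points.append((i, j))
    return points


print(same_cell_points(GAME_2))  # one cell: Player 1's Strategy 1, Player 2's Strategy 2
print(same_cell_points(GAME_1))  # empty list: the procedures do not agree on a cell
```

An empty result means the maximin and minimax values differ, as they do for Game 1 (1 versus 12).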

When both procedures pick the same cell, neither player should look for an alternate strategy. In Game 2, as long as Player 2 is using strategy 2, Player 1's losses can only increase if they switch to strategy 2 or 3. Similarly, as long as Player 1 uses strategy 1, Player 2's losses only increase if they move to strategy 1 or 3. The value of the game (in this example only) is 0, and this value is called a saddle point. Any game matrix with a saddle point is called a determined game; in such a game the optimum overall strategy for both players is to stick with the single strategy determined by the saddle point. In other words, mixed strategies should not be used.

Practice Problems - Determined Games

Determine which, if any, of the games in the first Practice Problem Set (maximin/minimax procedures) are determined games.

Optimum Mixed Strategy

For two-player games that are not determined and in which each player has two strategies, we can derive formulas which give the optimum mixed strategy for each player. That is, based on the payoffs in the cells, we can determine what $p_1$ (the percentage of time Player 1 should use their Strategy 1) and $p_2$ (the percentage of time Player 2 should use their Strategy 1) should be. For Player 1

$$p_1 = \frac{c - d}{(b + c) - (a + d)} \qquad \text{[G3.1]}$$

while for Player 2

$$p_2 = \frac{b - d}{(b + c) - (a + d)} \qquad \text{[G3.2]}$$

[These can be derived by taking derivatives of [G1] with respect to $p_1$ and $p_2$ and setting them equal to zero.]

Example - given the game matrix below, establish overall strategies for both players.

     4   -2
    -1    2

    Game 3

First, we use the minimax and maximin procedures to ensure that there is no saddle point, i.e. that this is not a determined game. For Player 1 the maximum of the row minimums is -1; for Player 2 the minimum of the column maximums is 2. Since these are not in the same cell, this is not a determined game. We can now proceed to find the best overall mixed strategy for each player. From [G3.1] we get

$$p_1 = \frac{-1 - 2}{(-2 + (-1)) - (4 + 2)} = \frac{-3}{-9} = \frac{1}{3}$$

This means Player 1 should use Strategy 1 one-third of the time and Strategy 2 two-thirds of the time. Similarly, for Player 2, using [G3.2]

$$p_2 = \frac{-2 - 2}{-9} = \frac{-4}{-9} = \frac{4}{9}$$

Player 2 should use their Strategy 1 four-ninths of the time and Strategy 2 five-ninths of the time. Finally, we can calculate the average value of the game using [G1]:

$$AV = a\,p_1 p_2 + b\,p_1 (1 - p_2) + c\,(1 - p_1)\,p_2 + d\,(1 - p_1)(1 - p_2)$$
$$= 4 \cdot \tfrac{1}{3} \cdot \tfrac{4}{9} + (-2) \cdot \tfrac{1}{3} \cdot \tfrac{5}{9} + (-1) \cdot \tfrac{2}{3} \cdot \tfrac{4}{9} + 2 \cdot \tfrac{2}{3} \cdot \tfrac{5}{9}$$
$$= \tfrac{16}{27} - \tfrac{10}{27} - \tfrac{8}{27} + \tfrac{20}{27} = \tfrac{18}{27} = \tfrac{2}{3}$$

Thus, on the average, Player 1 will gain 2/3 (points, dollars, whatever) and Player 2 will lose 2/3.

These techniques are only applicable to situations where each player has just two strategies available to them. However, in some cases where there are more than two strategies for one or both of the players it may be possible to eliminate one or more strategies from the table. Consider the following matrix:

    -1    4   -2
     3   -1    2

    Game 4

The maximin procedure yields a value of -1 in strategy 2 for Player 1. The minimax procedure yields a value of 2 in strategy 3 for Player 2. Therefore, this is not a determined game and we would like to use the techniques just discussed to determine an optimum mixed strategy. But those techniques are only valid for a 2 x 2 matrix. However, let's consider Player 2's choices: every payoff in strategy 1 is greater than the corresponding payoff in strategy 3, so strategy 1 always costs Player 2 more than strategy 3 does. Clearly, regardless of which strategy Player 1 might use, Player 2 will never choose strategy 1. Since strategy 1 for Player 2 is never an option, it can be removed from the matrix without affecting the outcome. The problem is now reduced to a 2 x 2 game (the same one as in the previous example).

Practice Problems - Optimum Mixed Strategies

Determine the optimum probabilities and the average values for the following games.

a.
    -1    1
     1   -1

b.
    -3    6
     3   -8

c.
    -9   -2
    -1    0

d. Suppose, during the summer season, you operate a concession stand in a park with a large swimming pool, rain or shine. When it is sunny you do a good business renting beach umbrellas to patrons of the pool; when it is raining you sell quite a few rain umbrellas. You rent the beach umbrellas for $5.00 a day and on an average day you rent 50 of them. The cost of replacing and repairing beach umbrellas averages about $.60 per umbrella per day. When it rains, you manage to sell 20 rain umbrellas for $10.00 each and they cost you $7.00 (and you still need to maintain beach umbrellas). During the season, it rains about 40% of the time. In order to maximize your profit, what percentage of your stock should be beach umbrellas, and what percentage should be rain umbrellas? What is your average profit per day?
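The reduction used for Game 4 and the formulas [G3.1], [G3.2] and [G1] can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, the column test follows the rule stated for Game 4 (every payoff greater than in some other column), and the formulas are assumed to be applied only to 2 x 2 games that are not determined.

```python
from fractions import Fraction


def drop_dominated_columns(matrix):
    """Remove each Player 2 column whose payoffs all exceed those of some other column."""
    columns = list(zip(*matrix))
    keep = [
        j for j, column in enumerate(columns)
        if not any(
            all(x > y for x, y in zip(column, other))
            for k, other in enumerate(columns) if k != j
        )
    ]
    return [[row[j] for j in keep] for row in matrix]


def optimum_mixed_2x2(matrix):
    """Return (p1, p2) from [G3.1] and [G3.2] for a 2 x 2 game with no saddle point."""
    (a, b), (c, d) = matrix
    denominator = (b + c) - (a + d)
    if denominator == 0:
        raise ValueError("formulas [G3.1] and [G3.2] do not apply to this matrix")
    return Fraction(c - d, denominator), Fraction(b - d, denominator)


def average_value_2x2(matrix, p1, p2):
    """Average value of a 2 x 2 game per [G1]."""
    (a, b), (c, d) = matrix
    return a * p1 * p2 + b * p1 * (1 - p2) + c * (1 - p1) * p2 + d * (1 - p1) * (1 - p2)


GAME_4 = [
    [-1, 4, -2],
    [3, -1, 2],
]

reduced = drop_dominated_columns(GAME_4)   # Player 2's strategy 1 is removed
p1, p2 = optimum_mixed_2x2(reduced)
print(reduced, p1, p2, average_value_2x2(reduced, p1, p2))
```

The reduced matrix is Game 3, and the printed probabilities and average value match the hand calculation above (1/3, 4/9 and 2/3).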

Game Trees

The game matrix is not the only way to model a game, and is best used for games which proceed via independent plays, such as football or poker. Other games, such as chess or tic-tac-toe, are better described using trees since they involve a sequence of plays, each dependent on previous plays.

Answers to Practice Problems

Practice Problems - minimax/maximin procedures

Assuming each player only plays their conservative strategy in each of the following game matrices, find the value of each game.

a.
    -9   40
    57   -8

    Value of the game = -8

b.
    -4    5    3
     1    3    2
    -3   -1    0

    Value of the game = 1

c.
    -9    5    0   -2
     1   -5    7    1
     8    6   -1    3

    Value of the game = 3

Practice Problems - Mixed Strategies

Find the average value of each of the following games. (Note that the probabilities given in these examples are not necessarily the optimum choices for each player, as we will see later.)

a.
                   p_2 = .3   1-p_2 = .7
    p_1 = .5           0          -2
    1-p_1 = .5        -2          +2

    Average Value = -.3

b.
                   p_2 = .3   1-p_2 = .7
    p1_1 = .5          5          -2
    p1_2 = .3         -2          +3
    p1_3 = .2          0          -4

    Average Value = -.06

Practice Problems - Determined Games

Determine which, if any, of the games in the first Practice Problem Set (maximin/minimax procedures) are determined games.

Game b is a determined game.

Practice Problems - Optimum Mixed Strategies

Determine the optimum probabilities and the average values for the following games.

a.
    -1    1
     1   -1

    Probabilities: p_1 = .5, p_2 = .5
    Avg. Val = 0

b.
    -3    6
     3   -8

    Probabilities: p_1 = .55, p_2 = .7
    Avg. Val = -.3

c.
    -9   -2
    -1    0

    This is a determined game. Value = -1

d. Suppose, during the summer season, you operate a concession stand in a park with a large swimming pool, rain or shine. When it is sunny you do a good business renting beach umbrellas to patrons of the pool; when it is raining you sell quite a few rain umbrellas. You rent the beach umbrellas for $5.00 a day and on an average day you rent 50 of them. The cost of replacing and repairing beach umbrellas averages about $.60 per umbrella per day. When it rains, you manage to sell 20 rain umbrellas for $10.00 each and they cost you $7.00 (and you still need to maintain beach umbrellas). During the season, it rains about 40% of the time. In order to maximize your profit, what percentage of your stock should be beach umbrellas, and what percentage should be rain umbrellas? What is your average profit per day?

First create the game matrix.

               Sunny                    Raining
    Beach      50($5 - $.60) = $220     50(-$.60) = -$30
    Rain       0                        20($10 - $7) = $60

This is not a determined game since the maximin value for you is 0 and the minimax value for Mother Nature is $60. Using [G3.1] with a = 220, b = -30, c = 0 and d = 60, find

$$p_1 = \frac{c - d}{(b + c) - (a + d)} = \frac{0 - 60}{(-30 + 0) - (220 + 60)} = \frac{-60}{-310} = \frac{6}{31} \approx .19$$

Therefore, about 19% of your stock should be beach umbrellas and about 81% should be rain umbrellas. Your average daily profit is the average value of the game, taking the probability of sun as .6 and the probability of rain as .4:

$$AV = \tfrac{6}{31}(.6)(220) + \tfrac{6}{31}(.4)(-30) + \tfrac{25}{31}(.6)(0) + \tfrac{25}{31}(.4)(60) \approx 25.55 - 2.32 + 0 + 19.35 = \$42.58$$