Analyzing Games: Mixed Strategies CPSC 532A Lecture 5 September 26, 2006 Analyzing Games: Mixed Strategies CPSC 532A Lecture 5, Slide 1
Lecture Overview
- Recap
- Mixed Strategies
- Fun Game
Pareto Optimality
- Idea: sometimes, one outcome $o$ is at least as good for every agent as another outcome $o'$, and there is some agent who strictly prefers $o$ to $o'$
  - in this case, it seems reasonable to say that $o$ is better than $o'$
  - we say that $o$ Pareto-dominates $o'$.
- An outcome $o^*$ is Pareto-optimal if there is no other outcome that Pareto-dominates it.
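The definition can be checked mechanically. A minimal sketch in Python, assuming each outcome is represented as a tuple of utilities with one entry per agent; the function names and the sample outcomes are illustrative and do not come from the lecture.

```python
def pareto_dominates(o, o_prime):
    """True if outcome o Pareto-dominates outcome o_prime.

    Each outcome is a tuple of utilities, one entry per agent (an assumption
    of this sketch).
    """
    at_least_as_good = all(x >= y for x, y in zip(o, o_prime))
    strictly_better = any(x > y for x, y in zip(o, o_prime))
    return at_least_as_good and strictly_better


def pareto_optimal(outcomes):
    """Return the outcomes that no other outcome Pareto-dominates."""
    return [o for o in outcomes
            if not any(pareto_dominates(other, o) for other in outcomes)]


# Hypothetical two-agent outcomes: (2, 2) dominates (1, 1); the rest are incomparable.
outcomes = [(2, 2), (3, 0), (0, 3), (1, 1)]
print(pareto_optimal(outcomes))
```

Note that several outcomes can be Pareto-optimal at once: the definition only rules out outcomes that some other outcome improves on for everyone.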
Best Response
- If you knew what everyone else was going to do, it would be easy to pick your own action
- Let $a_{-i} = \langle a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n \rangle$.
  - now $a = (a_{-i}, a_i)$
- Best response: $a_i^* \in BR(a_{-i})$ iff $\forall a_i \in A_i,\ u_i(a_i^*, a_{-i}) \ge u_i(a_i, a_{-i})$
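As a sketch of how the best-response set can be computed, the Python below assumes a game stored as a dictionary from full action profiles to utility tuples; this representation and the name `best_responses` are assumptions of the example. The payoffs are those of the Battle of the Sexes game used later in the lecture.

```python
def best_responses(payoffs, actions, i, others):
    """Best responses of player i to the other players' actions.

    payoffs: dict mapping a full action profile (tuple) to a tuple of utilities.
    actions: list of action lists; actions[i] is player i's action set A_i.
    others:  tuple of the other players' actions in player order, player i omitted.
    """
    def profile(a_i):
        # Re-insert player i's action at position i to form the full profile.
        return others[:i] + (a_i,) + others[i:]

    best = max(payoffs[profile(a_i)][i] for a_i in actions[i])
    return [a_i for a_i in actions[i] if payoffs[profile(a_i)][i] == best]


# Battle of the Sexes: player 0 picks the row, player 1 the column.
actions = [["B", "F"], ["B", "F"]]
payoffs = {
    ("B", "B"): (2, 1), ("B", "F"): (0, 0),
    ("F", "B"): (0, 0), ("F", "F"): (1, 2),
}
print(best_responses(payoffs, actions, 0, ("F",)))
print(best_responses(payoffs, actions, 1, ("B",)))
```

The function returns a list because the best response need not be unique: ties are all included.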
Nash Equilibrium
- Now let's return to the setting where no agent knows anything about what the others will do
- What can we say about which actions will occur?
- Idea: look for stable action profiles.
- $a = \langle a_1, \ldots, a_n \rangle$ is a Nash equilibrium iff $\forall i,\ a_i \in BR(a_{-i})$.
Nash Equilibria of Example Games

Figure 3.1 The TCP user's (aka the Prisoner's) Dilemma.
            C          D
    C    -1, -1     -4, 0
    D     0, -4     -3, -3

Figure 3.4 Coordination game.
             Left    Right
    Left       1       0
    Right      0       1

Figure 3.5 Matching Pennies game.
             Heads   Tails
    Heads      1      -1
    Tails     -1       1

Figure 3.6 Rock, Paper, Scissors game.
                Rock   Paper   Scissors
    Rock          0     -1        1
    Paper         1      0       -1
    Scissors     -1      1        0

Figure 3.7 Battle of the Sexes game.
            B        F
    B     2, 1     0, 0
    F     0, 0     1, 2

In the single-entry matrices, the coordination game is a common-payoff game (both players receive the value shown), while Matching Pennies and Rock, Paper, Scissors are zero-sum games (the value shown is player 1's payoff, and player 2 receives its negative).

- The paradox of Prisoner's dilemma: the Nash equilibrium is the only non-Pareto-optimal outcome!

Figures © Shoham and Leyton-Brown, 2006.
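The pure-strategy equilibria of the two-player example games can be found by brute force. A minimal sketch in Python, assuming the payoffs of Figures 3.1, 3.4, 3.5 and 3.7 with negative payoffs written with explicit minus signs; the function names and the dictionary layout are choices of this example.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_action, col_action) to (u_row, u_col).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        # Neither player may gain by deviating unilaterally.
        row_ok = all(u_row >= payoffs[(r2, c)][0] for r2 in rows)
        col_ok = all(u_col >= payoffs[(r, c2)][1] for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def pareto_dominated(profile, payoffs):
    """True if another outcome is at least as good for both and better for one."""
    u = payoffs[profile]
    return any(
        all(x >= y for x, y in zip(v, u)) and any(x > y for x, y in zip(v, u))
        for v in payoffs.values()
    )


games = {
    "prisoners_dilemma": {
        ("C", "C"): (-1, -1), ("C", "D"): (-4, 0),
        ("D", "C"): (0, -4), ("D", "D"): (-3, -3),
    },
    "coordination": {  # common payoff: both players get the same value
        ("Left", "Left"): (1, 1), ("Left", "Right"): (0, 0),
        ("Right", "Left"): (0, 0), ("Right", "Right"): (1, 1),
    },
    "matching_pennies": {  # zero-sum: player 2 gets the negative
        ("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
        ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1),
    },
    "battle_of_the_sexes": {
        ("B", "B"): (2, 1), ("B", "F"): (0, 0),
        ("F", "B"): (0, 0), ("F", "F"): (1, 2),
    },
}

for name, payoffs in games.items():
    print(name, pure_nash_equilibria(payoffs))

pd = games["prisoners_dilemma"]
dominated = [profile for profile in pd if pareto_dominated(profile, pd)]
print("Pareto-dominated outcomes in the Prisoner's Dilemma:", dominated)
```

Matching Pennies yields an empty list: it has no equilibrium in pure strategies, which motivates the mixed strategies introduced next. In the Prisoner's Dilemma the single equilibrium (D, D) is also the single Pareto-dominated outcome.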
Mixed Strategies
- It would be a pretty bad idea to play any deterministic strategy in matching pennies
- Idea: confuse the opponent by playing randomly
- Define a strategy $s_i$ for agent $i$ as any probability distribution over the actions $A_i$.
  - pure strategy: only one action is played with positive probability
  - mixed strategy: more than one action is played with positive probability
  - these actions are called the support of the mixed strategy
- Let the set of all strategies for $i$ be $S_i$
- Let the set of all strategy profiles be $S = S_1 \times \cdots \times S_n$.
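One way to make these definitions concrete is to store a strategy as a mapping from actions to probabilities. The Python sketch below assumes that representation; the helper names `is_strategy`, `support` and `is_pure` are illustrative. Exact fractions are used so that the probabilities sum to exactly 1.

```python
from fractions import Fraction


def is_strategy(s, actions):
    """True if s is a probability distribution over the given action set."""
    return (set(s) <= set(actions)
            and all(p >= 0 for p in s.values())
            and sum(s.values()) == 1)


def support(s):
    """Actions played with positive probability."""
    return {a for a, p in s.items() if p > 0}


def is_pure(s):
    """A pure strategy puts positive probability on exactly one action."""
    return len(support(s)) == 1


actions = ["Heads", "Tails"]
mixed = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
pure = {"Heads": 1, "Tails": 0}

print(support(mixed), is_pure(mixed))
print(support(pure), is_pure(pure))
```

Under this representation a pure strategy is simply a special case of a strategy, which matches the slide's definition of $S_i$ as the set of all distributions over $A_i$.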
Utility under Mixed Strategies
- What is your payoff if all the players follow mixed strategy profile $s \in S$?
  - We can't just read this number from the game matrix anymore: we won't always end up in the same cell
- Instead, use the idea of expected utility from decision theory:
  $u_i(s) = \sum_{a \in A} u_i(a) \Pr(a \mid s)$
  $\Pr(a \mid s) = \prod_{j \in N} s_j(a_j)$
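The expected-utility formula translates directly into a sum over action profiles. A minimal Python sketch, assuming strategies are dictionaries from actions to probabilities and payoffs are keyed by action profile; the function name and the sample strategies are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import prod


def expected_utility(payoffs, strategies, i):
    """u_i(s): sum over profiles a of u_i(a) times the product of s_j(a_j)."""
    action_sets = [list(s) for s in strategies]
    total = 0
    for a in product(*action_sets):
        # Probability of profile a when players randomize independently.
        pr = prod(s[a_j] for s, a_j in zip(strategies, a))
        total += payoffs[a][i] * pr
    return total


# Matching Pennies: player 0 wins when the pennies match.
matching_pennies = {
    ("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
    ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1),
}
half = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
s1 = {"Heads": Fraction(3, 4), "Tails": Fraction(1, 4)}
s2 = {"Heads": Fraction(1, 4), "Tails": Fraction(3, 4)}

print(expected_utility(matching_pennies, [half, half], 0))
print(expected_utility(matching_pennies, [s1, s2], 0))
```

The loop visits every cell of the matrix, so its cost grows with the product of the action-set sizes; this is fine for the small games in this lecture.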
Best Response and Nash Equilibrium
Our definitions of best response and Nash equilibrium generalize from actions to strategies.
- Best response: $s_i^* \in BR(s_{-i})$ iff $\forall s_i \in S_i,\ u_i(s_i^*, s_{-i}) \ge u_i(s_i, s_{-i})$
- Nash equilibrium: $s = \langle s_1, \ldots, s_n \rangle$ is a Nash equilibrium iff $\forall i,\ s_i \in BR(s_{-i})$
- Every finite game has a Nash equilibrium! [Nash, 1950]
  - e.g., matching pennies: both players play heads/tails 50%/50%
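The matching pennies claim can be checked numerically. The sketch below, in Python, tests a two-player mixed profile against every pure-strategy deviation; this suffices because a player's expected utility is linear in that player's own mixing probabilities, so no mixed deviation can do better than the best pure one. The function names are illustrative.

```python
from fractions import Fraction


def u(payoffs, s1, s2, i):
    """Expected utility of player i under the mixed profile (s1, s2)."""
    return sum(payoffs[(a1, a2)][i] * p1 * p2
               for a1, p1 in s1.items()
               for a2, p2 in s2.items())


def is_mixed_nash(payoffs, s1, s2):
    """True if neither player has a profitable pure-strategy deviation."""
    u1, u2 = u(payoffs, s1, s2, 0), u(payoffs, s1, s2, 1)
    if any(u(payoffs, {a: 1}, s2, 0) > u1 for a in s1):
        return False
    if any(u(payoffs, s1, {a: 1}, 1) > u2 for a in s2):
        return False
    return True


matching_pennies = {
    ("Heads", "Heads"): (1, -1), ("Heads", "Tails"): (-1, 1),
    ("Tails", "Heads"): (-1, 1), ("Tails", "Tails"): (1, -1),
}
half = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
always_heads = {"Heads": 1, "Tails": 0}

print(is_mixed_nash(matching_pennies, half, half))
print(is_mixed_nash(matching_pennies, always_heads, always_heads))
```

The second call fails the check because player 2 would rather switch to Tails when player 1 always plays Heads.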
Computing Mixed Nash Equilibria: Battle of the Sexes

Figure 3.7 Battle of the Sexes game.
            B        F
    B     2, 1     0, 0
    F     0, 0     1, 2

- It's hard in general to compute Nash equilibria, but it's easy when you can guess the support
- For BoS, let's look for an equilibrium where all actions are part of the support
- Let player 2 play B with $p$, F with $1 - p$.
- If player 1 best-responds with a mixed strategy, player 2 must make him indifferent between F and B (why?)
  $u_1(B) = u_1(F)$
  $2p + 0(1 - p) = 0p + 1(1 - p)$
  $p = \frac{1}{3}$
- Likewise, player 1 must randomize to make player 2 indifferent.
  - Why is player 1 willing to randomize?
- Let player 1 play B with $q$, F with $1 - q$.
  $u_2(B) = u_2(F)$
  $q + 0(1 - q) = 0q + 2(1 - q)$
  $q = \frac{2}{3}$
- Thus the mixed strategies $(\frac{2}{3}, \frac{1}{3})$, $(\frac{1}{3}, \frac{2}{3})$ are a Nash equilibrium.
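The two indifference equations are linear, so for any 2x2 game they can be solved in closed form. A minimal Python sketch, assuming a fully mixed equilibrium exists; the function name and the matrix layout (`A[r][c]` for the row player, `B[r][c]` for the column player) are choices of this example.

```python
from fractions import Fraction


def full_support_equilibrium(A, B):
    """Fully mixed equilibrium of a 2x2 game, found via indifference.

    A[r][c] is the row player's payoff, B[r][c] the column player's.
    Returns (q, p): q is the probability that the row player plays the first
    row, p the probability that the column player plays the first column.
    Raises ValueError if the solution is not strictly inside (0, 1); a zero
    denominator (no unique solution) surfaces as ZeroDivisionError.
    """
    # Column player's p makes the row player indifferent between the rows:
    #   p*A[0][0] + (1-p)*A[0][1] = p*A[1][0] + (1-p)*A[1][1]
    p = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # Row player's q makes the column player indifferent between the columns:
    #   q*B[0][0] + (1-q)*B[1][0] = q*B[0][1] + (1-q)*B[1][1]
    q = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no equilibrium with full support")
    return q, p


# Battle of the Sexes, actions ordered (B, F) for both players.
A = [[2, 0], [0, 1]]  # player 1 (row)
B = [[1, 0], [0, 2]]  # player 2 (column)
q, p = full_support_equilibrium(A, B)
print(q, p)
```

Each player's mixing probability is determined entirely by the other player's payoffs, which is the point of the "why?" on the slide: a player is only willing to randomize if the opponent's mix leaves him indifferent.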
Interpreting Mixed Strategy Equilibria
- What does it mean to play a mixed strategy? Different interpretations:
  - Randomize to confuse your opponent
    - consider the matching pennies example
  - Players randomize when they are uncertain about the other's action
    - consider battle of the sexes
  - Mixed strategies are a concise description of what might happen in repeated play: count of pure strategies in the limit
  - Mixed strategies describe population dynamics: 2 agents chosen from a population, all having deterministic strategies. The mixed strategy is the probability of getting an agent who will play one pure strategy or another.
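The population interpretation can be illustrated with a few lines of Python. The sketch assumes a population given as a list of the pure strategies its members play; the population below is hypothetical and the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction


def induced_mixed_strategy(population):
    """Mixed strategy faced when drawing one agent uniformly at random.

    population is a list of the pure strategies played by its members.
    """
    counts = Counter(population)
    n = len(population)
    return {action: Fraction(count, n) for action, count in counts.items()}


# Hypothetical population: two agents always play B, one always plays F.
population = ["B", "B", "F"]
print(induced_mixed_strategy(population))
```

Here every agent is deterministic, yet an opponent matched with a random member of the population faces B with probability 2/3 and F with probability 1/3.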
Fun Game!

The game is played in three versions that differ only in the row player's payoff at (T, L): 80, then 320, then 44.

                      L                          R
    T     80, 40; 320, 40; 44, 40             40, 80
    B     40, 80                              80, 40

- Play once as each player, recording the strategy you follow.
- What does row player do in equilibrium of this game?
  - row player randomizes 50-50 all the time
  - that's what it takes to make column player indifferent
- What happens when people play this game?
  - with payoff of 320, row player goes up (T) essentially all the time
  - with payoff of 44, row player goes down (B) essentially all the time
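The 50-50 claim follows from the indifference conditions. In the worked derivation below, the symbols are introduced here and are not in the slides: $r$ is the probability that the row player plays T, $c$ is the probability that the column player plays L, and $x$ is the row player's payoff at (T, L).

```latex
\begin{align*}
u_{\text{col}}(L) = u_{\text{col}}(R):\quad
  & 40r + 80(1 - r) = 80r + 40(1 - r)
  \;\Rightarrow\; r = \tfrac{1}{2} \\
u_{\text{row}}(T) = u_{\text{row}}(B):\quad
  & xc + 40(1 - c) = 40c + 80(1 - c)
  \;\Rightarrow\; c = \tfrac{40}{x}
\end{align*}
```

The row player's mix $r$ does not depend on $x$ at all, because it is pinned down by the column player's payoffs. Changing $x$ only changes the column player's mix: $c = \frac{1}{2}$, $\frac{1}{8}$ and $\frac{10}{11}$ for $x = 80$, $320$ and $44$ respectively.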