Extensive Form Games: Backward Induction and Imperfect Information Games
CPSC 532A Lecture 10
October 12, 2006
Lecture Overview
  Recap
  Backward Induction
  Imperfect-Information Extensive-Form Games
Introduction
  The normal form game representation does not incorporate any notion of sequence, or time, of the actions of the players.
  The extensive form is an alternative representation that makes the temporal structure explicit.
  Two variants:
    perfect-information extensive-form games
    imperfect-information extensive-form games
Definition
A (finite) perfect-information game (in extensive form) is a tuple $G = (N, A, H, Z, \chi, \rho, \sigma, u)$, where
  $N$ is a set of $n$ players
  $A = (A_1, \ldots, A_n)$ is a set of actions for each player
  $H$ is a set of non-terminal choice nodes
  $Z$ is a set of terminal nodes, disjoint from $H$
  $\chi : H \to 2^A$ is the action function: it assigns to each choice node a set of possible actions
  $\rho : H \to N$ is the player function: it assigns to each non-terminal node a player $i \in N$ who chooses an action at that node
  $\sigma : H \times A \to H \cup Z$ is the successor function: it maps a choice node and an action to a new choice node or terminal node such that for all $h_1, h_2 \in H$ and $a_1, a_2 \in A$, if $\sigma(h_1, a_1) = \sigma(h_2, a_2)$ then $h_1 = h_2$ and $a_1 = a_2$
  $u = (u_1, \ldots, u_n)$, where $u_i : Z \to \mathbb{R}$ is a utility function for player $i$ on the terminal nodes $Z$
Note: the choice nodes form a tree, so we can identify a node with its history.
Pure Strategies
Overall, a pure strategy for a player in a perfect-information game is a complete specification of which deterministic action to take at every node belonging to that player.
Definition
Let $G = (N, A, H, Z, \chi, \rho, \sigma, u)$ be a perfect-information extensive-form game. Then the pure strategies of player $i$ consist of the cross product
  $\prod_{h \in H,\ \rho(h) = i} \chi(h)$
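The cross-product definition can be sketched in a few lines of Python. This is only an illustration: the node names and player 1's action labels below are hypothetical, and the shape of the game (one three-way choice for player 1, three yes/no nodes for player 2) is assumed to match the Sharing game that these slides refer to.

```python
from itertools import product

# chi: choice node -> available actions; rho: choice node -> player who moves.
# Node names and player 1's action labels are hypothetical placeholders.
chi = {
    "split": ["2-0", "1-1", "0-2"],
    "after_2-0": ["yes", "no"],
    "after_1-1": ["yes", "no"],
    "after_0-2": ["yes", "no"],
}
rho = {"split": 1, "after_2-0": 2, "after_1-1": 2, "after_0-2": 2}


def pure_strategies(player):
    """Cross product of chi(h) over all choice nodes h with rho(h) == player."""
    own_nodes = [h for h in chi if rho[h] == player]
    return [dict(zip(own_nodes, choice))
            for choice in product(*(chi[h] for h in own_nodes))]


for player in (1, 2):
    print(f"player {player}: {len(pure_strategies(player))} pure strategies")
```

Each pure strategy is a dictionary with one entry per node owned by the player, so a player with three binary nodes has 2 x 2 x 2 strategies regardless of which nodes are reachable.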
Nash Equilibria
Given our new definition of pure strategy, we are able to reuse our old definitions of:
  mixed strategies
  best response
  Nash equilibrium
Theorem
Every perfect-information game in extensive form has a pure-strategy Nash equilibrium (PSNE).
This is easy to see, since the players move sequentially.
Induced Normal Form
In fact, the connection to the normal form is even tighter: we can convert an extensive-form game into normal form.
Figure 5.2: A perfect-information game in extensive form. Player 1 moves first and chooses A or B. After A, player 2 chooses C, giving (3,8), or D, giving (8,3). After B, player 2 chooses E, giving (5,5), or F; after F, player 1 chooses G, giving (2,10), or H, giving (1,0).
To define a complete strategy for this game, each of the players must choose an action at each of his two choice nodes. Thus we can enumerate the pure strategies of the players as follows.
  $S_1 = \{(A,G), (A,H), (B,G), (B,H)\}$
  $S_2 = \{(C,E), (C,F), (D,E), (D,F)\}$
Induced normal form (rows: player 1, columns: player 2):
        CE      CF      DE      DF
  AG    3, 8    3, 8    8, 3    8, 3
  AH    3, 8    3, 8    8, 3    8, 3
  BG    5, 5    2, 10   5, 5    2, 10
  BH    5, 5    1, 0    5, 5    1, 0
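The conversion to normal form can be sketched as follows for the game of Figure 5.2. The node names (`root`, `n2a`, `n2b`, `n1b`) and the dictionary encoding are assumptions made for this illustration; only the players, actions and payoffs come from the figure.

```python
from itertools import product

# Choice node -> (player, {action: child}); a child is either another node
# name or a payoff tuple (u1, u2). Node names are hypothetical.
TREE = {
    "root": (1, {"A": "n2a", "B": "n2b"}),
    "n2a": (2, {"C": (3, 8), "D": (8, 3)}),
    "n2b": (2, {"E": (5, 5), "F": "n1b"}),
    "n1b": (1, {"G": (2, 10), "H": (1, 0)}),
}


def pure_strategies(player):
    """One action per node owned by the player (cross product of action sets)."""
    nodes = [n for n, (p, _) in TREE.items() if p == player]
    return [dict(zip(nodes, choice))
            for choice in product(*(TREE[n][1] for n in nodes))]


def play(profile):
    """Follow the tree from the root; profile maps player -> {node: action}."""
    node = "root"
    while not isinstance(node, tuple):
        player, actions = TREE[node]
        node = actions[profile[player][node]]
    return node


table = {}
for s1 in pure_strategies(1):
    for s2 in pure_strategies(2):
        key = ("".join(s1.values()), "".join(s2.values()))
        table[key] = play({1: s1, 2: s2})

for row in ("AG", "AH", "BG", "BH"):
    print(row, [table[(row, col)] for col in ("CE", "CF", "DE", "DF")])
```

Rows AG and AH come out identical because the choice between G and H is never reached once player 1 plays A.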
Subgame Perfection
Define the subgame of $G$ rooted at $h$: the restriction of $G$ to the descendants of $h$.
Define the set of subgames of $G$: the subgames of $G$ rooted at nodes in $G$.
$s$ is a subgame perfect equilibrium (SPE) of $G$ iff for any subgame $G'$ of $G$, the restriction of $s$ to $G'$ is a Nash equilibrium of $G'$.
Notes:
  since $G$ is its own subgame, every SPE is a NE.
  this definition rules out non-credible threats
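To make the definition concrete, here is a brute-force sketch that checks it directly on the game of Figure 5.2: a pure profile is kept as an SPE only if its restriction is a Nash equilibrium in the subgame rooted at every choice node. The encoding and node names are assumptions for this illustration, and the exhaustive search is only practical for very small trees.

```python
from itertools import product

# Choice node -> (player, {action: child}); children are node names or payoffs.
TREE = {
    "root": (1, {"A": "n2a", "B": "n2b"}),
    "n2a": (2, {"C": (3, 8), "D": (8, 3)}),
    "n2b": (2, {"E": (5, 5), "F": "n1b"}),
    "n1b": (1, {"G": (2, 10), "H": (1, 0)}),
}


def nodes_in_subgame(root):
    """All choice nodes in the subgame rooted at `root` (inclusive)."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            continue
        found.append(node)
        stack.extend(TREE[node][1].values())
    return found


def play(start, profile):
    """Payoff reached from `start` when every node follows `profile`."""
    node = start
    while not isinstance(node, tuple):
        node = TREE[node][1][profile[node]]
    return node


def is_nash_in_subgame(root, profile):
    """No player gains by changing his actions at his own nodes in the subgame."""
    base = play(root, profile)
    for player in (1, 2):
        own = [n for n in nodes_in_subgame(root) if TREE[n][0] == player]
        for choice in product(*(TREE[n][1] for n in own)):
            deviation = {**profile, **dict(zip(own, choice))}
            if play(root, deviation)[player - 1] > base[player - 1]:
                return False
    return True


names = list(TREE)
profiles = [dict(zip(names, c)) for c in product(*(TREE[n][1] for n in names))]
nash = [p for p in profiles if is_nash_in_subgame("root", p)]
spe = [p for p in nash if all(is_nash_in_subgame(n, p) for n in TREE)]

print("Nash equilibria:", nash)
print("Subgame perfect:", spe)
```

The equilibria that are discarded have player 1 choosing H at his last node, which is not a best response there; this is the kind of non-credible threat that the definition rules out.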
Lecture Overview
  Recap
  Backward Induction
  Imperfect-Information Extensive-Form Games
Centipede Game
Figure 5.9: The centipede game. Players 1, 2, 1, 2, 1 move in turn; at each node the mover either continues (A) or goes down (D). Going down at the five nodes ends the game with payoffs (1,0), (0,2), (3,1), (2,4), (4,3) respectively; continuing at the last node gives (3,5).
Play this as a fun game...
Computing Subgame Perfect Equilibria
Idea: identify the equilibria in the bottom-most trees, and adopt these as one moves up the tree.
  function BackwardInduction(node h) returns u(h)
    if h ∈ Z then
      return u(h)    {h is a terminal node}
    best_util ← −∞
    for all a ∈ χ(h) do    {all actions available at node h}
      util_at_child ← BackwardInduction(σ(h, a))
      if util_at_child[ρ(h)] > best_util[ρ(h)] then
        best_util ← util_at_child
      end if
    end for
    return best_util
Notes:
  util_at_child is a vector denoting the utility for each player
  the procedure doesn't return an equilibrium strategy, but rather labels each node with a vector of real numbers.
    This labeling can be seen as an extension of the game's utility function to the non-terminal nodes
  The equilibrium strategies: take the best action at each node.
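A runnable sketch of the procedure, applied to the game of Figure 5.2. The nested-tuple encoding is an assumption made for this example, and the `plan` dictionary is an addition to the pseudocode: it records the best action at each node (keyed by history) so that the equilibrium strategies can be read off.

```python
# Choice node: (player, {action: child}); terminal node: payoff tuple (u1, u2).
# Players are numbered from 1 as in the slides. Encoding is hypothetical.
GAME = (1, {
    "A": (2, {"C": (3, 8), "D": (8, 3)}),
    "B": (2, {"E": (5, 5),
              "F": (1, {"G": (2, 10), "H": (1, 0)})}),
})


def is_terminal(node):
    """A terminal node is a plain payoff tuple, not (player, children)."""
    return not isinstance(node[1], dict)


def backward_induction(node, history=(), plan=None):
    """Return (utility vector labelling `node`, best action at each history)."""
    if plan is None:
        plan = {}
    if is_terminal(node):
        return node, plan
    player, children = node
    best_util, best_action = None, None  # None plays the role of -infinity
    for action, child in children.items():
        util_at_child, _ = backward_induction(child, history + (action,), plan)
        if best_util is None or util_at_child[player - 1] > best_util[player - 1]:
            best_util, best_action = util_at_child, action
    plan[history] = best_action
    return best_util, plan


value, plan = backward_induction(GAME)
print("value at the root:", value)
for history, action in sorted(plan.items()):
    print("after", history or "the start", "play", action)
```

Ties are broken in favour of the action listed first, mirroring the strict inequality in the pseudocode; a different tie-breaking rule could select a different subgame perfect equilibrium.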
Backward Induction
Figure 5.9: The centipede game. Players 1, 2, 1, 2, 1 move in turn; at each node the mover either continues (A) or goes down (D). Going down at the five nodes ends the game with payoffs (1,0), (0,2), (3,1), (2,4), (4,3) respectively; continuing at the last node gives (3,5).
What happens when we use this procedure on Centipede?
  In the only subgame perfect equilibrium, player 1 goes down in the first move.
  However, this outcome is Pareto-dominated by all but one other outcome.
Two considerations:
  practical: human subjects don't go down right away
  theoretical: what should you do as player 2 if player 1 doesn't go down?
    SPE analysis says to go down. However, that same analysis says that P1 would already have gone down. How do you update your beliefs upon observation of a measure-zero event?
    but if player 1 knows that you'll do something else, it is rational for him not to go down anymore... a paradox
    there's a whole literature on this question: the different accounts depend on the probabilistic assumptions made, on what is common knowledge (in particular, whether there is common knowledge of rationality), and on exactly how one revises one's beliefs in the face of measure-zero events.
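The claims about Centipede can be checked with a short sketch. It assumes the payoff placement read off Figure 5.9 (going down at the five nodes gives (1,0), (0,2), (3,1), (2,4), (4,3); continuing at the end gives (3,5)); the list encoding is specific to this example and exploits the fact that the tree is a single chain.

```python
# Assumed reading of Figure 5.9; variable names are hypothetical.
MOVERS = [1, 2, 1, 2, 1]                          # mover at each node, in order
DOWN = [(1, 0), (0, 2), (3, 1), (2, 4), (4, 3)]   # payoffs if the mover goes down
ACROSS_END = (3, 5)                               # payoffs if nobody goes down


def solve_centipede():
    """Backward induction from the last node to the first."""
    value = ACROSS_END  # value of continuing past the current node
    choices = []
    for mover, down in zip(reversed(MOVERS), reversed(DOWN)):
        i = mover - 1
        if down[i] > value[i]:  # ties would be resolved in favour of continuing
            choices.append("D")
            value = down
        else:
            choices.append("A")
    choices.reverse()
    return choices, value


def pareto_dominates(x, y):
    """x is at least as good as y for everyone and strictly better for someone."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


choices, outcome = solve_centipede()
others = [z for z in DOWN + [ACROSS_END] if z != outcome]
dominating = [z for z in others if pareto_dominates(z, outcome)]

print("choices at nodes 1..5:", choices)
print("outcome:", outcome)
print(f"{len(dominating)} of {len(others)} other outcomes Pareto-dominate it")
```

Under these assumed payoffs the mover goes down at every node, and the only other outcome that does not Pareto-dominate the result is (0,2), which is worse for player 1.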
Lecture Overview
  Recap
  Backward Induction
  Imperfect-Information Extensive-Form Games
Intro
Up to this point, in our discussion of extensive-form games we have allowed players to specify the action that they would take at every choice node of the game.
  This implies that players know the node they are in and all the prior choices, including those of other agents.
  We may want to model agents needing to act with partial or no knowledge of the actions taken by others, or even themselves.
This is possible using imperfect-information extensive-form games.
  each player's choice nodes are partitioned into information sets
  if two choice nodes are in the same information set then the agent cannot distinguish between them.
Formal definition
Definition
An imperfect-information game (in extensive form) is a tuple $(N, A, H, Z, \chi, \rho, \sigma, u, I)$, where
  $(N, A, H, Z, \chi, \rho, \sigma, u)$ is a perfect-information extensive-form game, and
  $I = (I_1, \ldots, I_n)$, where $I_i = (I_{i,1}, \ldots, I_{i,k_i})$ is an equivalence relation on (that is, a partition of) $\{h \in H : \rho(h) = i\}$ with the property that $\chi(h) = \chi(h')$ and $\rho(h) = \rho(h')$ whenever there exists a $j$ for which $h \in I_{i,j}$ and $h' \in I_{i,j}$.
Example
Figure 5.10: An imperfect-information game. Player 1 moves first and chooses L, giving (1,1), or R. After R, player 2 chooses A or B. Player 1 then chooses l or r without observing player 2's move: after A, l gives (0,0) and r gives (2,4); after B, l gives (2,4) and r gives (0,0).
What are the equivalence classes for each player?
  In this game, player 1 has two information sets: the set including the top choice node, and the set including the bottom choice nodes. Note that the two bottom choice nodes in the second information set have the same set of possible actions. We can regard player 1 as not knowing whether player 2 chose A or B when she makes her choice.
What are the pure strategies for each player?
  choice of an action in each equivalence class.
  Formally, the pure strategies of player $i$ consist of the cross product $\prod_{I_{i,j} \in I_i} \chi(I_{i,j})$.
  If $I \in I_i$ is an equivalence class, we can unambiguously use the notation $\chi(I)$ to denote the set of actions available to player $i$ at any node in information set $I$.
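The two questions on this slide can be answered mechanically for Figure 5.10. In the sketch below the information-set names are hypothetical, and member nodes are identified with the histories that reach them; the point is only that strategies are enumerated per information set rather than per node.

```python
from itertools import product

# Information set name -> (player, member histories, available actions).
# Names such as "I1_bottom" are placeholders invented for this example.
INFO_SETS = {
    "I1_root": (1, [()], ["L", "R"]),
    "I1_bottom": (1, [("R", "A"), ("R", "B")], ["l", "r"]),
    "I2": (2, [("R",)], ["A", "B"]),
}


def equivalence_classes(player):
    """The partition of the player's choice nodes into information sets."""
    return [members for p, members, _ in INFO_SETS.values() if p == player]


def pure_strategies(player):
    """One action per information set: cross product of chi(I) over the player's sets."""
    own = [name for name, (p, _, _) in INFO_SETS.items() if p == player]
    return [dict(zip(own, choice))
            for choice in product(*(INFO_SETS[name][2] for name in own))]


for player in (1, 2):
    print(f"player {player}: classes {equivalence_classes(player)}, "
          f"{len(pure_strategies(player))} pure strategies")
```

Player 1 owns three choice nodes but only two information sets, so she has 2 x 2 pure strategies rather than 2 x 2 x 2.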
Normal-form games
We can represent any normal-form game.
Figure 5.11: The Prisoner's Dilemma game in extensive form. Player 1 chooses C or D; player 2, whose two choice nodes lie in a single information set, chooses c or d. Payoffs: (C,c) gives (-1,-1), (C,d) gives (-4,0), (D,c) gives (0,-4), (D,d) gives (-3,-3).
Note that it would also be the same game if we put player 2 at the root node.
Recall that perfect-information games were not expressive enough to capture the Prisoner's Dilemma game and many other ones.
Induced Normal Form
Same as before: enumerate pure strategies for all agents.
Mixed strategies are just mixtures over the pure strategies, as before.
Nash equilibria are also preserved.
Note that we've now defined both a mapping from normal-form (NF) games to imperfect-information extensive-form (IIEF) games and a mapping from IIEF to NF.
  what happens if we apply each mapping in turn?
  we might not end up with the same game, but we do get one with the same strategy spaces and equilibria.
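As a small illustration of the mapping back to normal form, the sketch below builds the induced normal form of the extensive-form Prisoner's Dilemma under two different partitions of player 2's nodes. The function name and the way information sets are passed in (as sets of player 1's preceding moves) are assumptions made for this example.

```python
from itertools import product

# Leaf payoffs of the extensive-form Prisoner's Dilemma.
LEAVES = {
    ("C", "c"): (-1, -1), ("C", "d"): (-4, 0),
    ("D", "c"): (0, -4), ("D", "d"): (-3, -3),
}


def induced_normal_form(p2_info_sets):
    """p2_info_sets partitions player 2's nodes, each named by player 1's move."""
    table = {}
    # A pure strategy of player 2 picks one action per information set.
    for s2 in product("cd", repeat=len(p2_info_sets)):
        for a1 in "CD":
            # Player 2 acts according to the information set containing her node.
            idx = next(i for i, cell in enumerate(p2_info_sets) if a1 in cell)
            table[(a1, "".join(s2))] = LEAVES[(a1, s2[idx])]
    return table


imperfect = induced_normal_form([{"C", "D"}])   # player 2 cannot observe the move
perfect = induced_normal_form([{"C"}, {"D"}])   # player 2 observes the move

print("one information set:", len(imperfect), "cells")
print("two information sets:", len(perfect), "cells")
```

With a single information set the induced normal form is the original 2 x 2 matrix; if player 2 could observe player 1's move, she would have four pure strategies and the induced game would be 2 x 4.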
Randomized Strategies
It turns out there are two meaningfully different kinds of randomized strategies in imperfect-information extensive-form games:
  mixed strategies
  behavioral strategies
Mixed strategy: randomize over pure strategies
Behavioral strategy: independent coin toss every time an information set is encountered
Randomized strategies example
Figure 5.2: A perfect-information game in extensive form. Player 1 chooses A or B. After A, player 2 chooses C, giving (3,8), or D, giving (8,3). After B, player 2 chooses E, giving (5,5), or F; after F, player 1 chooses G, giving (2,10), or H, giving (1,0). (Source: Multi Agent Systems, draft of September 19, 2006.)
Background from the textbook page shown with the figure:
  The definition of a pure strategy contains a subtlety. An agent's strategy requires a decision at each choice node, regardless of whether or not it is possible to reach that node given the other choice nodes. In the Sharing game (Figure 5.1) the situation is straightforward: player 1 has three pure strategies, and player 2 has eight (why?).
  In Figure 5.2, each of the players must choose an action at each of his two choice nodes, so the pure strategies are $S_1 = \{(A,G),(A,H),(B,G),(B,H)\}$ and $S_2 = \{(C,E),(C,F),(D,E),(D,F)\}$. We have to include the strategies (A,G) and (A,H), even though once A is chosen the G-versus-H choice is moot.
  The definitions of best response and Nash equilibrium in this game are exactly as they are for normal-form games. Indeed, this example illustrates how every perfect-information game can be converted to an equivalent normal-form game.
Give an example of a behavioral strategy:
  A with probability .5 and G with probability .3
Give an example of a mixed strategy that is not a behavioral strategy:
  (.6(A,G), .4(B,H)) (why not?)
In this game every behavioral strategy corresponds to a mixed strategy...
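One way to approach the "why not?" question is to test whether a mixed strategy factors into independent randomizations at player 1's two nodes, which is what a behavioral strategy prescribes. The helper names below are hypothetical, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def behavioral_to_mixed(prob_A, prob_G):
    """Mixed strategy induced by independent coin tosses at player 1's two nodes."""
    first = {"A": prob_A, "B": 1 - prob_A}
    second = {"G": prob_G, "H": 1 - prob_G}
    return {(a, g): first[a] * second[g] for a, g in product(first, second)}


def is_product_distribution(mixed):
    """True if the joint distribution equals the product of its per-node marginals."""
    p_first = {a: sum(p for (x, _), p in mixed.items() if x == a) for a in "AB"}
    p_second = {g: sum(p for (_, y), p in mixed.items() if y == g) for g in "GH"}
    return all(mixed.get((a, g), 0) == p_first[a] * p_second[g]
               for a, g in product("AB", "GH"))


# The behavioral strategy from the slide: A with probability .5, G with probability .3.
behavioral = behavioral_to_mixed(Fraction(1, 2), Fraction(3, 10))
# The mixed strategy from the slide: (.6 (A,G), .4 (B,H)).
correlated = {("A", "G"): Fraction(3, 5), ("B", "H"): Fraction(2, 5)}

print("behavioral strategy as a mixed strategy:", behavioral)
print("is a product of per-node choices:", is_product_distribution(behavioral))
print("(.6(A,G), .4(B,H)) is a product:", is_product_distribution(correlated))
```

The second strategy fails the test because it correlates the two choices: G is played exactly when A is, which independent coin tosses at the two nodes cannot reproduce.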
Games of imperfect recall
Figure 5.12: A game with imperfect recall. Player 1 chooses L or R. After L, player 1 moves again in the same information set: L gives (1,0) and R gives (100,100). After R, player 2 chooses U, giving (5,1), or D, giving (2,2).
What is the space of pure strategies in this game?
  1: (L, R); 2: (U, D)
What is the mixed strategy equilibrium?
  Observe that D is dominant for 2. R,D is better for 1 than L,D, so R,D is an equilibrium.
  Note in particular that in a mixed strategy, agent 1 decides probabilistically whether to play L or R in his information set, but once he decides he plays that pure strategy consistently. Thus the payoff of 100 is irrelevant in the context of mixed strategies.
Games of imperfect recall
Figure 5.12: A game with imperfect recall. Player 1 chooses L or R. After L, player 1 moves again in the same information set: L gives (1,0) and R gives (100,100). After R, player 2 chooses U, giving (5,1), or D, giving (2,2).
What is an equilibrium in behavioral strategies?
  again, D is weakly dominant for 2 (it is the unique best response to all strategies of agent 1 other than the pure strategy L)
  with behavioral strategies agent 1 gets to randomize afresh each time he finds himself in the information set
  if 1 uses the behavioral strategy $(p, 1-p)$ (that is, choosing L with probability $p$ each time he finds himself in the information set), his expected utility is $1 \cdot p^2 + 100 \cdot p(1-p) + 2 \cdot (1-p)$
  simplifies to $-99p^2 + 98p + 2$
  maximum at $p = 98/198$
  thus $(R, D) = ((0,1),(0,1))$ is no longer an equilibrium in behavioral strategies; instead the equilibrium is $((98/198, 100/198), (0,1))$
Thus, we can have behavioral strategies that are different from mixed strategies.
There is, however, a broad class of imperfect-information games in which the expressive power of mixed and behavioral strategies coincides. This is the class of games of perfect recall. Intuitively speaking, in these games no player forgets any information he knew about moves made so far; in particular, he remembers precisely all his own moves.
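The maximization can be double-checked numerically. The sketch below evaluates agent 1's expected payoff against D with exact fractions and confirms on a grid that no value of p does better than 98/198; the function name and the grid resolution are choices made for this illustration.

```python
from fractions import Fraction


def expected_payoff_agent1(p):
    """Agent 1's expected payoff against D when he plays L with probability p
    at every visit to his information set."""
    return 1 * p * p + 100 * p * (1 - p) + 2 * (1 - p)


# Vertex of -99 p^2 + 98 p + 2, i.e. the root of the derivative -198 p + 98.
p_star = Fraction(98, 198)

# Sanity check on a grid of exact rationals in [0, 1].
grid = [Fraction(k, 1000) for k in range(1001)]
no_better_on_grid = all(
    expected_payoff_agent1(p_star) >= expected_payoff_agent1(p) for p in grid
)

print("p* =", p_star)
print("payoff at p* =", expected_payoff_agent1(p_star))
print("payoff of pure R (p = 0):", expected_payoff_agent1(Fraction(0)))
print("payoff of pure L (p = 1):", expected_payoff_agent1(Fraction(1)))
print("no grid point beats p*:", no_better_on_grid)
```

The payoff at p* is roughly 26.25, far above the payoff of 2 that agent 1 gets from the pure strategy R, which is why (R, D) stops being an equilibrium once behavioral strategies are allowed.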