Math 464: Linear Optimization and Games. Haijun Li, Department of Mathematics, Washington State University, Spring 2013.
Game Theory
Game theory (GT) is a theory of rational behavior of people with nonidentical (self-)interests. Common features:
1. There is a set of at least two players (or entities);
2. all players follow the same set of rules;
3. the interests of different players are different, and each player acts in its own interest.
Game Theory
Game theory can be defined as the theory of mathematical models of conflict and cooperation between intelligent, rational decision-makers. Game theory is applicable whenever at least two individuals (people, species, companies, political parties, or nations) confront situations where the outcome for each depends on the behavior of all. Game theory proposes solution concepts, defining rational outcomes of games. Solution concepts may be hard to compute...
Early History...
Modern game theory began with the work of Ernst Zermelo (1913; well-ordering theorem, axiom of choice), Émile Borel (1921; symmetric two-player zero-sum matrix games), and John von Neumann (1928; two-player matrix games). The early results are summarized in the great seminal book Theory of Games and Economic Behavior by von Neumann and Oskar Morgenstern (1944).
Types of Games
In all GT models the basic entity is a player. Once we have defined the set of players, we may distinguish between two types of models:
1. Non-cooperative game: primitives are the sets of possible actions of individual players;
2. Cooperative game: primitives are the sets of possible joint actions of groups of players.
Game theory divides accordingly into Noncooperative GT (models of type I) and Cooperative GT (models of type II). Noncooperative games appear in strategic form or in extensive form (EFG), the latter with perfect or imperfect information.
Strategic-Form Games or Games in Normal Form
Basic ingredients:
N = {1, ..., n}, n ≥ 2, is a set of players.
S_i is a nonempty set of possible strategies (or pure strategies) of player i. Each player i must choose some s_i ∈ S_i.
S = {(s_1, ..., s_n) : s_i ∈ S_i} is the set of all possible outcomes (or pure strategy profiles).
u_i : S → R is a utility function of player i; that is, u_i(s) = payoff of player i if the outcome is s ∈ S.
Definition
A strategic-form game is Γ = (N, {S_i}, {u_i}).
Nash Equilibrium (John Nash, 1950)
Observe that a player's utility depends not just on his/her action, but on the actions of other players. For player i, finding the best action involves deliberating about what others would do.
Definition
1. All players in N are happy to find an outcome s* ∈ S such that u_i(s*) ≥ u_i(s) for all i ∈ N and all s ∈ S.
2. An outcome s* = (s*_1, ..., s*_n) ∈ S is a Nash equilibrium if for all i ∈ N,
u_i(s*_1, ..., s*_{i-1}, t_i, s*_{i+1}, ..., s*_n) ≤ u_i(s*) for all t_i ∈ S_i.
Example: Prisoner's Dilemma (RAND, 1950; Albert Tucker, 1950)
Two suspects (A and B) committed a crime. The court does not have enough evidence to convict them of the crime, but can convict them of a minor offense (1 year in prison each). If exactly one suspect confesses (acts as an informer), he walks free and the other suspect gets 20 years. If both confess, each gets 5 years. The suspects have no way of communicating or making binding agreements.
Prisoner's Dilemma: A Matrix Game (entries are the prison terms of A and B, in years)

                 B stays quiet   B confesses
A stays quiet       (1, 1)        (20, 0)
A confesses         (0, 20)       (5, 5)
Rationality = Best Solution
Suspect A's reasoning: if B stays quiet, I should confess; if B confesses, I should confess too. Suspect B reasons in the same way. Unique Nash equilibrium at (5, 5): both confess and each gets 5 years in prison.
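The equilibrium definition can be checked mechanically. Below is a minimal sketch (not from the lecture; all function and variable names are our own) of a brute-force search for pure-strategy Nash equilibria, applied to the Prisoner's Dilemma with utilities taken as negative prison years:

```python
from itertools import product

def pure_nash_equilibria(S, u):
    """Return all pure-strategy Nash equilibria of a finite game.
    S: list of strategy sets, one per player; u: list of payoff functions,
    each mapping a full strategy profile (tuple) to that player's utility."""
    equilibria = []
    for profile in product(*S):
        stable = True
        for i, s_set in enumerate(S):
            # best payoff player i can get by deviating unilaterally
            best = max(u[i](profile[:i] + (t,) + profile[i+1:]) for t in s_set)
            if u[i](profile) < best:
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria

# Prisoner's Dilemma: utilities are negative prison years.
years = {("quiet", "quiet"): (1, 1), ("quiet", "confess"): (20, 0),
         ("confess", "quiet"): (0, 20), ("confess", "confess"): (5, 5)}
u = [lambda s, i=i: -years[s][i] for i in range(2)]
S = [["quiet", "confess"], ["quiet", "confess"]]
print(pure_nash_equilibria(S, u))  # → [('confess', 'confess')]
```

The search confirms that mutual confession is the unique pure-strategy equilibrium, matching the reasoning above.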
Two-Player Matrix Games
Each player has a finite set of strategies. N = {1, 2}, S_1 = X = {x_1, ..., x_n}, S_2 = Y = {y_1, ..., y_m}. Payoffs: a_ij = u_1(x_i, y_j), b_ij = u_2(x_i, y_j).

          y_1            ...       y_m
x_1   (a_11, b_11)       ...   (a_1m, b_1m)
...
x_n   (a_n1, b_n1)       ...   (a_nm, b_nm)

Figure: Row player = player 1, column player = player 2
Example: Hawk-Dove
Two animals are fighting over some prey. Each can behave like a dove or like a hawk. The best outcome for each animal is that in which it acts like a hawk while the other acts like a dove; the worst outcome is that in which both animals act like hawks. Each animal prefers to be hawkish if its opponent is dovish and dovish if its opponent is hawkish. The game has two Nash equilibria, (dove, hawk) and (hawk, dove), corresponding to two different conventions about which player yields.

        dove   hawk
dove    3,3    1,4
hawk    4,1    0,0
Example: Matching Pennies
Each of two people chooses either Head or Tail. If the choices differ, person 1 pays person 2 $1; if they are the same, person 2 pays person 1 $1. Each person cares only about the amount of money that he receives. The game has no Nash equilibrium in pure strategies.

        head   tail
head    1,-1   -1,1
tail    -1,1   1,-1

Figure: No Nash equilibrium
Strictly Competitive Games
Definition
A strategic game Γ = ({1, 2}, {S_1, S_2}, {u_1, u_2}) is strictly competitive if for any outcome (s_1, s_2) ∈ S we have u_2(s_1, s_2) = -u_1(s_1, s_2) (zero-sum).
Remark
1. If u_1(s_1, s_2) = gain for player 1, then u_1(s_1, s_2) = loss for player 2.
2. If an outcome (s*_1, s*_2) is a Nash equilibrium, then
u_1(s_1, s*_2) ≤ u_1(s*_1, s*_2) ≤ u_1(s*_1, s_2) for all s_1 ∈ S_1, s_2 ∈ S_2.
That is, a Nash equilibrium is a saddle point.
1. Player 1 maximizes gain, whereas player 2 minimizes loss:
min_{y ∈ S_2} u_1(s_1, y) ≤ max_{x ∈ S_1} u_1(x, s_2) for all s_1 ∈ S_1, s_2 ∈ S_2.
2. In other words, player 1 maximizes player 2's loss, whereas player 2 minimizes player 1's gain:
max_{x ∈ S_1} min_{y ∈ S_2} u_1(x, y) ≤ min_{y ∈ S_2} max_{x ∈ S_1} u_1(x, y).
3. A best guaranteed outcome for player 1 would be x* with
min_{y ∈ S_2} u_1(x*, y) ≥ min_{y ∈ S_2} u_1(x, y) for all x ∈ S_1.
4. A best guaranteed outcome for player 2 would be y* with
max_{x ∈ S_1} u_1(x, y*) ≤ max_{x ∈ S_1} u_1(x, y) for all y ∈ S_2.
Combining these,
max_{x ∈ S_1} min_{y ∈ S_2} u_1(x, y) = min_{y ∈ S_2} u_1(x*, y) ≤ max_{x ∈ S_1} u_1(x, y*) = min_{y ∈ S_2} max_{x ∈ S_1} u_1(x, y).
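The inequality max-min ≤ min-max in point 2 is easy to confirm numerically for finite matrix games. A small sketch (the function names are our own) checking it on random payoff matrices:

```python
import random

def maxmin(A):
    """Largest row minimum: the gain player 1 can guarantee."""
    return max(min(row) for row in A)

def minmax(A):
    """Smallest column maximum: the loss cap player 2 can guarantee."""
    return min(max(row[j] for row in A) for j in range(len(A[0])))

# max-min <= min-max should hold for every payoff matrix.
random.seed(0)
for _ in range(1000):
    A = [[random.randint(-9, 9) for _ in range(4)] for _ in range(3)]
    assert maxmin(A) <= minmax(A)
print("max-min <= min-max held on 1000 random 3x4 matrices")
```

When the two quantities coincide, the game has a value in pure strategies; the MiniMax Theorem below makes this precise.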
MiniMax Theorem (Borel, 1921; von Neumann, 1928)
An outcome (s*_1, s*_2) is a Nash equilibrium in a strictly competitive game Γ = ({1, 2}, {S_1, S_2}, {u_1, u_2}) if and only if
max_{x ∈ S_1} min_{y ∈ S_2} u_1(x, y) = u_1(s*_1, s*_2) = min_{y ∈ S_2} max_{x ∈ S_1} u_1(x, y) =: game value,
where s*_1 is a best outcome for player 1 while s*_2 is a best outcome for player 2.
Two-Player Zero-Sum Matrix Games
A strictly competitive strategic game admits a simple and convenient representation in matrix form: N = {1, 2}, S_1 = X = {x_1, ..., x_n}, S_2 = Y = {y_1, ..., y_m}, and a_ij = u_1(x_i, y_j) = -u_2(x_i, y_j).

a_11  a_12  ...  a_1m   | row min
a_21  a_22  ...  a_2m   | row min
 ...   ...  ...   ...   |  ...
a_n1  a_n2  ...  a_nm   | row min
-------------------------
col   col   ...  col
max   max        max

Write the minimum of each row to its right and the maximum of each column below it: the game has a saddle point exactly when the largest row minimum equals the smallest column maximum.
Two-Player Constant-Sum Games
There are two players: player 1 is called the row player and player 2 is called the column player. The row player must choose one of n strategies, and the column player must choose one of m strategies. If the row player chooses the i-th strategy and the column player chooses the j-th strategy, then the row player receives a reward of a_ij and the column player receives a reward of c - a_ij. If c = 0, then we have a two-player zero-sum game.
Example: Competing Networks
Network 1 and Network 2 are competing for an audience of 100 million viewers in a certain time slot. The networks must simultaneously announce the type of show they will air in that time slot: western, soap opera, or comedy. If network 1 has a_ij million viewers, then network 2 will have 100 - a_ij million viewers (a constant-sum game with c = 100).
Game of Odds and Evens (or Matching Pennies, again) Two players (Odd and Even) simultaneously choose the number of fingers (1 or 2) to put out. If the sum of the fingers is odd, then Odd wins $1 from Even. If the sum of the fingers is even, then Even wins $1 from Odd. This game has no saddle point.
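A saddle point of a payoff matrix is an entry that is simultaneously a row minimum and a column maximum. A short sketch (the naming is our own) confirming that Odds and Evens has none:

```python
def saddle_points(A):
    """Entries that are both a minimum of their row and a maximum of their column."""
    n, m = len(A), len(A[0])
    return [(i, j) for i in range(n) for j in range(m)
            if A[i][j] == min(A[i]) and A[i][j] == max(A[k][j] for k in range(n))]

# Odds and Evens, payoff to Odd: +1 when the finger sum is odd, -1 when even.
A = [[-1,  1],
     [ 1, -1]]
print(saddle_points(A))  # → [] : no saddle point in pure strategies
```

On a matrix that does have a saddle point, such as [[3, 1], [4, 2]], the same function returns the position of the saddle entry.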
We Need More Strategies!
To analyze games without a saddle point, we introduce randomized strategies: each player chooses a strategy according to a probability distribution.
x_1 = probability that Odd puts out one finger
x_2 = probability that Odd puts out two fingers
y_1 = probability that Even puts out one finger
y_2 = probability that Even puts out two fingers
where x_1 + x_2 = 1 and y_1 + y_2 = 1, x_1 ≥ 0, x_2 ≥ 0, y_1 ≥ 0, y_2 ≥ 0. Odd tosses a loaded coin (with P(Head) = x_1, P(Tail) = x_2) to choose a strategy, and Even does likewise. If x_1 = 1 or x_2 = 1 (y_1 = 1 or y_2 = 1), then Odd (Even) plays a pure strategy.
Randomized Strategies
Let (x_1, ..., x_m) and (y_1, ..., y_n) be two probability vectors (i.e., entries are all non-negative and add up to 1 for each vector). There are two players: player 1 is called the row player and player 2 is called the column player. The row player must choose one of m strategies, and the column player must choose one of n strategies. If the row player chooses the i-th strategy with probability x_i and the column player chooses the j-th strategy with probability y_j, then the row player receives a reward of a_ij and the column player receives a reward of -a_ij. Given that one player chooses a strategy, how do we calculate the average reward of the other player?
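When the two mixes are independent, the row player's average reward is the double sum over all strategy pairs, Σ_i Σ_j x_i a_ij y_j. A minimal sketch (names are ours):

```python
def expected_reward(A, x, y):
    """Row player's expected reward sum_i sum_j x_i * a_ij * y_j when the
    players mix independently with distributions x and y."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

# Odds and Evens matrix (payoff to the row player, Odd):
A = [[-1,  1],
     [ 1, -1]]
print(expected_reward(A, [0.5, 0.5], [0.5, 0.5]))  # → 0.0
```

By the zero-sum structure, the column player's expected reward is simply the negative of this quantity.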
Odd's Optimal Strategy
Odd needs to minimize his loss (or find a loss floor). If Even puts out one finger, then Odd's average reward is
Odd's expected reward = (-1)x_1 + (+1)(1 - x_1) = 1 - 2x_1.
If Even puts out two fingers, then Odd's average reward is
Odd's expected reward = (+1)x_1 + (-1)(1 - x_1) = 2x_1 - 1.
Figure: Odd's Reward
Even's Optimal Strategy
Even needs to maximize his reward (or find a reward ceiling). If Odd puts out one finger, then Even's average reward is
Even's expected reward = (+1)y_1 + (-1)(1 - y_1) = 2y_1 - 1.
If Odd puts out two fingers, then Even's average reward is
Even's expected reward = (-1)y_1 + (+1)(1 - y_1) = 1 - 2y_1.
Figure: Even's Reward
Analysis
Figure: Value of Game with Randomized Strategies
Value of Game with Randomized Strategies
In the game of Odds and Evens, Odd's loss floor equals Even's reward ceiling when they both use the randomized strategy (1/2, 1/2). The common value of floor and ceiling is called the value of the game. A strategy that attains the value of the game is called an optimal strategy. This optimal randomized strategy (1/2, 1/2) can be obtained via the duality theorem.
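The claim can be verified numerically from the expected-reward lines derived above: Odd's floor min(1 - 2x_1, 2x_1 - 1) peaks, and Even's ceiling max(2y_1 - 1, 1 - 2y_1) bottoms out, at probability 1/2, where both equal 0. A small sketch (a grid scan, not a proof):

```python
# Odd's guaranteed (worst-case) expected reward and Even's worst-case
# concession, as functions of the mixing probabilities x_1 and y_1.
floor   = lambda x1: min(1 - 2*x1, 2*x1 - 1)   # Odd guarantees at least this
ceiling = lambda y1: max(2*y1 - 1, 1 - 2*y1)   # Even concedes at most this

grid = [k / 1000 for k in range(1001)]
best_x = max(grid, key=floor)    # maximize Odd's floor
best_y = min(grid, key=ceiling)  # minimize Even's ceiling
print(best_x, floor(best_x))     # → 0.5 0.0
print(best_y, ceiling(best_y))   # → 0.5 0.0
```

Both optima sit at 1/2 with common value 0, the value of the game.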
Randomized Strategies
Γ = (N, {S_i}, {u_i}) is a strategic game. A randomized strategy of player i is a probability distribution P_i over the set S_i of its pure strategies; P_i(s_i) = probability that player i chooses strategy s_i ∈ S_i. We assume that randomized strategies of different players are independent.
Definition
For any i ∈ N, the expected utility of player i, given that each player j ≠ i chooses strategy s_j ∈ S_j, is
E(P_i) := Σ_{s_i ∈ S_i} u_i(s_1, ..., s_{i-1}, s_i, s_{i+1}, ..., s_n) P_i(s_i).
Randomized Strategy Nash Equilibrium
Theorem (Nash, 1950)
Every finite strategic game has a randomized strategy Nash equilibrium.
Remark
For two-player matrix games this result was obtained by von Neumann in 1928.
Example: Stone, Paper, and Scissors
The two players (row and column) must each choose one of three strategies: stone, paper, or scissors. If both players use the same strategy, the game is a draw. Otherwise, one player wins $1 from the other according to the following rule: scissors cut paper, paper covers stone, stone breaks scissors.
Randomized Strategies x 1 = probability that row player chooses stone x 2 = probability that row player chooses paper x 3 = probability that row player chooses scissors y 1 = probability that column player chooses stone y 2 = probability that column player chooses paper y 3 = probability that column player chooses scissors where x 1 + x 2 + x 3 = 1 and y 1 + y 2 + y 3 = 1, x 1, x 2, x 3, y 1, y 2, y 3 are all non-negative. The row player chooses a randomized strategy (x 1, x 2, x 3 ). The column player chooses a randomized strategy (y 1, y 2, y 3 ).
Row Player's LP for Max. Reward v
max z = v
s.t.  v ≤ x_2 - x_3       (column plays stone)
      v ≤ -x_1 + x_3      (column plays paper)
      v ≤ x_1 - x_2       (column plays scissors)
      x_1 + x_2 + x_3 = 1
      x_1, x_2, x_3 ≥ 0, v urs.
Column Player's LP for Min. Loss w
min z = w
s.t.  w ≥ -y_2 + y_3      (row plays stone)
      w ≥ y_1 - y_3       (row plays paper)
      w ≥ -y_1 + y_2      (row plays scissors)
      y_1 + y_2 + y_3 = 1
      y_1, y_2, y_3 ≥ 0, w urs.
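The two LPs can be sanity-checked without an LP solver by computing security levels: with the uniform mix (1/3, 1/3, 1/3), every constraint above is tight at 0, so v = w = 0 and the mix is optimal by LP duality. A minimal sketch (names are ours):

```python
# Payoff matrix to the row player: rows and columns ordered stone, paper, scissors.
A = [[ 0, -1,  1],   # stone: loses to paper, beats scissors
     [ 1,  0, -1],   # paper: beats stone, loses to scissors
     [-1,  1,  0]]   # scissors: loses to stone, beats paper

def row_security(A, x):
    """Worst case over columns of the row player's expected reward under mix x."""
    return min(sum(x[i] * A[i][j] for i in range(3)) for j in range(3))

uniform = [1/3, 1/3, 1/3]
print(row_security(A, uniform))           # → 0.0 : uniform guarantees the value
print(row_security(A, [0.5, 0.3, 0.2]))   # a strictly worse (negative) guarantee
```

Any deviation from the uniform mix lowers the guarantee, which is consistent with (1/3, 1/3, 1/3) solving the row player's LP.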
Dual of Row's LP = Column LP
The optimal strategy for both players is (1/3, 1/3, 1/3).
Figure: Dual of Row's LP = Column LP
Proof Idea of Nash's Theorem via Duality
Given that the column player chooses his strategy, maximize the row player's expected reward over randomized strategies (x_1, ..., x_m). Given that the row player chooses his strategy, minimize the column player's expected loss over randomized strategies (y_1, ..., y_n).
Figure: Dual of Row's LP = Column LP
Γ = ({1, 2}, {S_1, S_2}, {u_1, u_2}) is a strategic game. A randomized strategy of player i is a probability distribution P_i over the set S_i of its pure strategies.
E_{s_2}(P_1) = expected utility of player 1 given that player 2 chooses strategy s_2 ∈ S_2.
E_{s_1}(P_2) = expected utility of player 2 given that player 1 chooses strategy s_1 ∈ S_1.
Primal LP: max z = v subject to v ≤ min_{s_2 ∈ S_2} E_{s_2}(P_1), v urs.
Dual LP: min z = w subject to w ≥ max_{s_1 ∈ S_1} E_{s_1}(P_2), w urs.
Duality, Again
An optimal solution exists such that
max_{P_1} min_{s_2 ∈ S_2} E_{s_2}(P_1) = min_{P_2} max_{s_1 ∈ S_1} E_{s_1}(P_2).
The common value is known as the value of the game. Nash's original proof (in his thesis) used Brouwer's fixed point theorem. When Nash made this point to John von Neumann in 1949, von Neumann famously dismissed it with the words, "That's trivial, you know. That's just a fixed point theorem." (Nasar, 1998)
Significance of Probabilistic Methods
Probabilistic methods are often used to model uncertainty. In contrast, the probabilistic method is used here to enlarge the solution set so that a Nash equilibrium can be achieved using randomized strategies. Probabilistic methods are also increasingly used to prove the existence of rare objects in mathematical structures.