Lect 15: Game Theory: the math of competition

Conflict has characterized human history. It arises whenever 2 or more individuals, with different values or goals, compete to try to control the course of events. Game theory (GT) uses mathematical tools, called games, to study situations that involve both conflict and cooperation. The landmark publication is the 1944 book, Theory of Games and Economic Behavior, by John von Neumann & Oskar Morgenstern.

The players in a game -- people, organizations, countries -- choose from a list of options available to them, called strategies. The strategies lead to outcomes, which describe the consequences of their choices. We assume that the players have preferences for the outcomes: they like some more than others. GT analyzes the rational choice of strategies, ie, how players select strategies to obtain preferred outcomes. The object in GT is to win the game, ie, to maximize gains, or to minimize losses.

When decision-making is individual, this is a problem in psychology, statistics, or other disciplines. GT analyzes situations with at least 2 players, who may feel themselves in conflict because of different goals or objectives. The outcome depends on the choices of all the players. Thus, decision-making is collective, though the players do not necessarily cooperate. Indeed, many strategy choices are noncooperative, such as those between combatants in warfare or in sports. The adversaries' objectives may be at cross-purposes: a gain for one means a loss for the other. In many activities, esp in economics and politics, there may be joint gains that can be realized from cooperation. Some applications are in bargaining tactics in labor-management disputes, resource-allocation decisions in political campaigns, military choices in international crises, and the use of threats by animals in habitat acquisition and protection.
We start with two-person games of total conflict, in which one player wins and the other loses, so cooperation never benefits the players. There are two types of strategies here: pure and mixed. Then, we analyze games of partial conflict, prisoner's dilemma and chicken, where the players can benefit by cooperation but may have strong incentives not to cooperate. We turn next to larger games, with 3 or more people involved, in which we show how to eliminate some undesirable strategies.

I. Two-person total conflict games
   A. Pure strategies
   B. Mixed strategies
II. Partial conflict
   A. Prisoner's dilemma
   B. Chicken
III. Larger games
IV. Some applications

Ex.1: Henry and Lisa plan to locate a new restaurant at a busy intersection in the nearby mountains. They agree on all aspects of the business, except one. Lisa likes low elevations; Henry wants heights: the higher, the better. Their preferences are diametrically opposed: what is better for Henry is worse for Lisa, and what is good for Lisa is bad for Henry. Layout: Fig 15.1/p470, Table 15.1 (Text). H chooses a route among the EW roads, L a highway among the NS highways; the site is their intersection. B/c their choices are made simultaneously, neither one can predict what the other will do.

H looks for the highest altitude he can guarantee along the 3 routes. For each choice of route, this means considering the worst-case (lowest) elevation: 4, 5, 2 (in thousands of ft), the row minima (Table 15.2). He picks the highest of these, 5. By choosing route B, H guarantees himself at least 5000 ft (maximin).

L likewise does a worst-case analysis: for each highway she lists the highest elevation, which is the worst for her: 10, 5, 9, the column maxima. From her POV, the best of these outcomes is 5. Thus, by picking Interstate Highway 2, she is assured of an elevation of no more than 5000 ft (minimax).
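The worst-case reasoning of Ex.1 can be sketched in a few lines of Python. Table 15.1 is not reproduced in these notes, so the elevations below are an assumed matrix, chosen only to agree with the row minima (4, 5, 2) and column maxima (10, 5, 9) quoted above; the names `heights`, `maximin_route`, and `minimax_highway` are likewise illustrative.

```python
# Assumed elevations in thousands of ft (NOT copied from Table 15.1);
# rows are Henry's routes A, B, C, columns are Lisa's highways 1, 2, 3.
heights = {
    "A": [10, 4, 6],
    "B": [6, 5, 9],
    "C": [2, 3, 7],
}
routes = list(heights)
n_highways = 3

# Henry: worst case (lowest) on each route, then the best of those worst cases.
row_minima = {r: min(heights[r]) for r in routes}
maximin_route = max(row_minima, key=row_minima.get)

# Lisa: worst case for her (highest) on each highway, then the lowest of those.
col_maxima = [max(heights[r][j] for r in routes) for j in range(n_highways)]
minimax_highway = min(range(n_highways), key=lambda j: col_maxima[j]) + 1

print(row_minima, "-> Henry picks route", maximin_route)
print(col_maxima, "-> Lisa picks highway", minimax_highway)
```

With these assumed entries, Henry's guarantee and Lisa's guarantee both come out to 5 (route B, highway 2), matching the analysis in the notes.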

When the maximin and minimax are equal, their common value is called a saddle-point. In total-conflict games, the value of the game is the best outcome that both players can guarantee. In this example, 5 is the saddle-point of the game, the guaranteed outcome if the players choose their maximin and minimax strategies.

Ex.2: The payoff matrix for a 2-person game (payoffs to R) is given as:

              C1     C2
      R1      14     -3
      R2      -6     -5

(a) What are the payoffs to R if R1 is selected?
(b) What are the payoffs to C if C2 is selected?
(c) What is the payoff if R selects R2 and C selects C1?
(d) Does this game have a saddle point? Which player is favored?
Solution:
(a) If R1 is selected, R receives 14 if C selects C1, and R pays 3 if C selects C2.
(b) If C2 is selected, C receives 3 or 5, if R selects R1 or R2, respectively.
(c) C receives 6, and R pays 6. [Zero-sum games are those where the gain of one player means the loss of another.]
(d) This game has a saddle-point: -3 (row minima -3, -6; column maxima 14, -3). It favors player C since the value is negative. [A game favors player R if the value is positive; a game is fair if the value is 0.]

Ex.3:
              C1     C2
      R1       4     -9
      R2       6      8

The game has saddle-point 6; this is also the value of the game. R plays row 2 and C plays column 1; these are fixed (or pure) strategies, and games with a saddle-point are strictly determined.

Ex.4: Some games have more than one saddle-point:

       1      2      1
       1      5      1
       0     -7     -1

The saddle-point is 1; when there is more than one saddle-point, they are all equal.

Ex.5: All zero-sum games can be recast to show zero value by adding a proper constant to all entries in the matrix. Suppose R & C play 1 or 2 fingers. They agree to show 1 or 2 fingers simultaneously, and C pays R an amount equal to the total number of fingers shown less $3. Find the optimal strategies for each player, and the value of the game.

                  C                         C
               1      2                  1      2
      R   1   2-3    3-3    =    R   1  -1      0
          2   3-3    4-3             2   0      1

The saddle-point is 0, which is also the value of the game. This occurs at row 2, column 1. This is a fair game since the value is 0 (favors neither player).
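As a check on Ex.2 to Ex.5, here is a minimal Python sketch of the saddle-point test: compute the row minima and column maxima, compare the maximin with the minimax, and list the cells where they meet. The function name `saddle_points` and the 1-indexed (row, column) output are choices made for this sketch, not notation from the text.

```python
def saddle_points(matrix):
    """Return (value, cells) if maximin == minimax, else None.

    matrix holds the payoffs to the row player R;
    cells are 1-indexed (row, column) positions of the saddle-points.
    """
    row_min = [min(row) for row in matrix]
    col_max = [max(col) for col in zip(*matrix)]
    maximin = max(row_min)   # best guarantee for R
    minimax = min(col_max)   # best guarantee for C
    if maximin != minimax:
        return None          # no saddle-point in pure strategies
    cells = [(i + 1, j + 1)
             for i, row in enumerate(matrix)
             for j, entry in enumerate(row)
             if entry == row_min[i] == col_max[j]]
    return maximin, cells

ex2 = [[14, -3], [-6, -5]]
ex3 = [[4, -9], [6, 8]]
ex4 = [[1, 2, 1], [1, 5, 1], [0, -7, -1]]
ex5 = [[-1, 0], [0, 1]]

for name, game in [("Ex.2", ex2), ("Ex.3", ex3), ("Ex.4", ex4), ("Ex.5", ex5)]:
    print(name, saddle_points(game))
```

For Ex.4 the function reports four cells, all with the same value 1, which illustrates the remark that multiple saddle-points are always equal.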

Ex.6: The exception to the rule on games favoring one player or another is when the payoffs are percentages. Stores R and C share the same Home-Theater market, as follows (entries are R's percentage of the market):

                   C
               $200    $250
      R  $200   55%     70%
         $225   40%     55%

(a) The saddle-point is 55%, at row 1, column 1.
(b) The optimal strategy for R is to play row 1; for C, to play column 1.
(c) The value of the game is also 55%. R has the advantage. [If value = 50%, no one is favored; if < 50%, then C is favored.]

* What to do when there is no saddle-point? The game still has a value, but this is now the expected value of the payoff when both players assign probabilities to their strategies. ==> I.B. Mixed strategies.

Ex.7: (Ex3/p474) A baseball pitcher (P) and batter (B) battle wits by trying to outguess each other's moves. P throws a fastball (F) or a curve (C); B guesses F or C. The batting averages of B are known: 0.300 if B guesses F & P throws F; 0.200 if B guesses F & P throws C; 0.100 if B guesses C & P throws F; 0.500 if B guesses C & P throws C. The game has no saddle-point, so the players use a mixed strategy, choosing each option with some probability (Table 15.8):

                     P
                  F       C
      B   F     0.300   0.200     q
          C     0.100   0.500    1-q
                  p      1-p

Against P's mix (p, 1-p), B's expected average is E_F if he guesses F and E_C if he guesses C:
   E_F = 0.3p + 0.2(1-p) = 0.1p + 0.2
   E_C = 0.1p + 0.5(1-p) = -0.4p + 0.5
P makes these equal: 0.1p + 0.2 = -0.4p + 0.5, so p = 3/5.

Likewise, against B's mix (q, 1-q), the expected average is
   0.3q + 0.1(1-q) = 0.2q + 0.1 when P throws F,
   0.2q + 0.5(1-q) = -0.3q + 0.5 when P throws C.
B makes these equal: 0.2q + 0.1 = -0.3q + 0.5, so q = 4/5.

   (p, 1-p) = (3/5, 2/5);  (q, 1-q) = (4/5, 1/5)

This means, in a purely random way, P must throw a fast ball 3/5 of the time, and a curve ball 2/5 of the time; similarly, B must guess a fast ball 4/5 of the time, and a curve ball 1/5 of the time. Both strategies give the expected value E_F = E_C = E = 0.260; ie, P using this strategy will ensure that B bats no higher than 0.260; B using this strategy will ensure that he bats no lower than 0.260.

Ex.8: Exer 11..

II. Partial-conflict games.
Games of partial conflict are variable-sum games, where the sum of the payoffs to the players varies at the different outcomes.
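For the mixed strategies of Ex.7 (section I.B), the equalizing argument can be written as a small Python sketch. It assumes a 2 x 2 matrix [[a, b], [c, d]] of payoffs to the row player with no saddle-point; the function name `solve_2x2_mixed` and the use of exact fractions are choices made here, not part of the text.

```python
from fractions import Fraction

def solve_2x2_mixed(a, b, c, d):
    """Mixed-strategy solution of the 2 x 2 game [[a, b], [c, d]].

    Entries are payoffs to the row player; the game is assumed to have
    no saddle-point (otherwise these formulas do not apply).
    Returns (p, q, value) where
      p = probability that the column player picks column 1,
      q = probability that the row player picks row 1.
    """
    a, b, c, d = (Fraction(x) for x in (a, b, c, d))
    denom = a - b - c + d
    p = (d - b) / denom          # solves a*p + b*(1-p) = c*p + d*(1-p)
    q = (d - c) / denom          # solves a*q + c*(1-q) = b*q + d*(1-q)
    value = a * p + b * (1 - p)  # common expected payoff
    return p, q, value

# Ex.7: rows = batter guesses F, C; columns = pitcher throws F, C.
p, q, value = solve_2x2_mixed("0.300", "0.200", "0.100", "0.500")
print(p, q, float(value))
```

Passing the averages as strings keeps the arithmetic exact; with these inputs the sketch reproduces p = 3/5, q = 4/5 and the value 0.260.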

There is some mutual gain to be realized by both players if they cooperate, but this may be difficult to do in the absence of either good communication or trust. When these are lacking, players are less likely to comply w/ any agreement that is made. Noncooperative games are games in w/c a binding agreement cannot be enforced. Even if communication is allowed in such games, there is no assurance that a player can trust an opponent to choose a particular strategy that s/he promises to select. Often, the players' self-interests lead them to make strategy choices that yield both players lower payoffs than they could have achieved by cooperating.

A. Prisoner's dilemma
Named by Princeton mathematician Albert W Tucker in 1950, it models the forces at work behind the arms race, price wars, and the population problem. The players can do better by cooperating, but there is no compelling reason to do so unless the players have credible threats of retaliation for not cooperating.

2 people are accused of a crime and held incommunicado. Each has 2 choices: to maintain his/her innocence, or to sign a confession accusing the partner of committing the crime. It is in each suspect's interest to confess and implicate the partner, thereby receiving a reduced sentence. Yet, if both suspects confess, they ensure a bad outcome, viz, they are both found guilty. What is good for the pair is to deny having committed the crime, leaving the state w/ insufficient evidence to convict them. But this is frustrated by their pursuit of their own individual rewards.

Ex.9: Arms race. In int'l relations, let the antagonistic countries be Red (R) & Blue (B). Each can independently select 1 of 2 policies:
A: arm in preparation for a possible war (noncooperation);
D: disarm [desist], or at least negotiate an arms-control agreement (cooperation).
There are 4 possible outcomes:
(D,D) R & B disarm; next best for both b/c, while advantageous to each, it entails certain risks;
(A,A) R & B arm; next worst for both b/c they spend a lot on arms and are no better off than at (D,D);
(A,D) R arms & B disarms; best for R, worst for B b/c R has a decided advantage over B;
(D,A) R disarms & B arms; worst for R, best for B b/c B has a decided advantage over R.

                   B
                A       D
      R   A   (2,2)   (4,1)
          D   (1,4)   (3,3)

In each entry, the 1st # is the payoff to R, the 2nd to B; these are rankings: 4 is best, 1 is worst.

Look at their strategies. From R's POV:
If B selects A, R gets 2 for A, 1 for D. So R chooses A.
If B selects D, R gets 4 for A, 3 for D. So R chooses A.
In both cases, strategy A gives R a more desirable outcome than D; thus, A is R's dominant strategy. Similarly, B will choose A as well. Thus, when each nation strives to maximize its own payoffs independently, the pair is driven to the outcome (A,A), w/ payoffs of (2,2). The better outcome of (D,D), w/ payoffs of (3,3), appears unobtainable when this game is played noncooperatively.

The outcome (A,A), w/c is a product of dominant strategy choices by both players, is a Nash equilibrium (p484). When no player can benefit by departing unilaterally (by itself) from an outcome, that outcome is a Nash equilibrium (ie, a player who moves away from that point alone does not gain; here its payoff drops from 2 to 1). When nations have no great confidence in the trustworthiness of other nations, they have good reason to try to protect themselves against the other's defection by arming themselves.

Prisoner's dilemma is a 2-person variable-sum game in w/c each player has 2 strategies: cooperate or defect.
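The dominance and Nash-equilibrium arguments for the arms race can be checked mechanically. The following Python sketch stores the ordinal payoffs of Ex.9 in a dictionary and tests every cell; the names `arms_race`, `dominant_strategy`, and `nash_equilibria` are illustrative choices for this sketch.

```python
# Ordinal payoffs (R, B) for the arms race of Ex.9; keys are (R's choice, B's choice).
arms_race = {
    ("A", "A"): (2, 2), ("A", "D"): (4, 1),
    ("D", "A"): (1, 4), ("D", "D"): (3, 3),
}
strategies = ["A", "D"]

def dominant_strategy(payoffs, player):
    """Return the strictly dominant strategy of player 0 (rows) or 1 (columns), or None."""
    def payoff(own, other):
        cell = (own, other) if player == 0 else (other, own)
        return payoffs[cell][player]
    for s in strategies:
        # s must beat every alternative t against every choice o of the opponent.
        if all(payoff(s, o) > payoff(t, o)
               for t in strategies if t != s
               for o in strategies):
            return s
    return None

def nash_equilibria(payoffs):
    """Return the cells from which neither player gains by deviating alone."""
    result = []
    for r in strategies:
        for b in strategies:
            r_ok = all(payoffs[(r, b)][0] >= payoffs[(r2, b)][0] for r2 in strategies)
            b_ok = all(payoffs[(r, b)][1] >= payoffs[(r, b2)][1] for b2 in strategies)
            if r_ok and b_ok:
                result.append((r, b))
    return result

print(dominant_strategy(arms_race, 0), dominant_strategy(arms_race, 1))
print(nash_equilibria(arms_race))
```

Both players come out with A as the dominant strategy and (A,A) as the only equilibrium. The same two functions accept any other 2 x 2 payoff dictionary keyed by the strategy labels in `strategies`.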

Defect dominates cooperate for both players, even though the mutual-defection outcome -- the unique Nash equil -- is worse for both players than the mutual-cooperation outcome. If we add the numerical payoffs in the arms-race matrix: top left: 2 + 2 = 4; bottom right: 3 + 3 = 6; other corners: 1 + 4 = 5; illustrating the variable-sum game.

B. Chicken
2 drivers approach each other at high speed. Each must decide at the last minute to swerve to the right or not to swerve. Possible consequences are:
1. Neither driver swerves and the cars collide head-on: the worst outcome for both, since both drivers are killed (payoff 1);
2. Both drivers swerve; each is mildly disgraced for chickening out, but they survive: the next best outcome for both (payoff 3);
3. One driver swerves and badly loses face, his next worst outcome (payoff 2); the other does not swerve and is perceived as the winner, her best outcome (payoff 4).

                        Driver2
                       S       ~S
      Driver1    S   (3,3)   (2,4)
                ~S   (4,2)   (1,1)

(S = swerve, ~S = do not swerve; the 1st # is the payoff to Driver1, the 2nd to Driver2.)

If both drivers persist in trying to win w/ payoff 4 by not swerving, the result will be mutual disaster w/ payoff 1. It is better for both to back down, each obtaining 3 by swerving, but neither wants to be in the position of being chicken w/ payoff 2 when the other does not swerve (payoff 4). Thus, neither player has a dominant strategy. His/her better strategy depends on what the other player does: swerve if the other does not, don't swerve if the other does, making this game highly interdependent, w/c is characteristic of many games. There are 2 Nash equilibria -- (4,2) and (2,4) -- making the compromise (3,3) not easy to achieve b/c both players will have an incentive to deviate in order to try to be the winner.

International crises, labor-management disputes, and other conflicts in w/c escalating demands may end in wars, strikes, and other catastrophic outcomes have been modeled by the game of Chicken. [Prisoner's dilemma and Chicken are 2 of the 78 different 2 x 2 ordinal games in w/c each player ranks the 4 possible outcomes from best to worst.]

III. Larger games: truel
A truel is like a duel, except there are 3 players. Each player can either fire, or not fire, at either of the other 2 players. The goal of each player is, 1st, to survive, and, 2nd, to survive w/ as few other players as possible. Each player has one bullet, and is a perfect shot; no communication (eg, to pick out a common target) leading to a binding agreement w/ other players is allowed, making the game noncooperative. There are 2 cases:

(1) Simultaneous shots. At the start of play, each player will fire at one of the other 2 players, killing that player. Their own survival does not depend on their actions; they can only affect what happens to the others. They cannot shoot at themselves b/c of goal #1; they shoot at the others by goal #2. There are 2 possible outcomes: no player survives if each one fires at a different target; or one player survives if 2 players fire at the same target, that target firing at one of them.

(2) Sequential shots. Suppose choices are sequential, ie, before the game starts, a random assignment has been made of who shoots first, then next, then last. Then no player will choose to fire at any other, so all will survive. At the start of the game, everybody is alive. If A contemplates shooting B, thus killing her, he will be left defenseless against C, who will then shoot him, making C the only survivor. The same argument works for everybody, so that nobody will shoot at anybody else, and everybody lives.
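The claim about the simultaneous truel (either nobody survives or exactly one player does) can be verified by brute force, since there are only 2 x 2 x 2 = 8 ways for the three players to pick targets. The Python sketch below assumes, as in case (1), that every player fires and every shot kills; the player labels and variable names are illustrative.

```python
from itertools import product
from collections import Counter

players = ["A", "B", "C"]

# Each shooter may aim at either of the other two players.
choices = [[t for t in players if t != shooter] for shooter in players]

survivor_counts = Counter()
for targets in product(*choices):          # targets[i] is the target of players[i]
    survivors = [p for p in players if p not in targets]
    survivor_counts[len(survivors)] += 1

print(dict(survivor_counts))
```

Of the 8 target profiles, 2 leave no survivor (each player fires at a different target) and 6 leave exactly one, so no profile lets two or three players live.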

Remarks:
(1) In the simultaneous case, if shooting to deliberately miss is allowed, then you can increase your chance of survival by shooting in the air, making sure your opponents see you do this. Once you have lost your bullet, the others perceive you as no threat to them, and you hope they will leave you alone and try to kill each other. There is some element of risk, but you hope you don't antagonize them any further.

(2) Sequential choices produce a happier outcome (where everybody survives). They also provide a plausible model of a strategic situation that mimics what people might actually think and do. Like the players in a truel, people would be motivated to think ahead before saying or doing anybody harm, given the dire consequences of their actions. They would hold their fire, knowing that if they fired first, they would be the next target, as in gossip, or malicious mischief to a friend or neighbor. This analysis suggests that truels -- by inducing the players to think ahead -- might be more effective than duels in preventing the outbreak of conflict.

*****
Final Exam: Thursday 6/23/11, pm.