Lecture 14
Friday, February 10
CS 430 Artificial Intelligence
Questions?
Outline: Chapter 5 - Adversarial Search
- Alpha-Beta Pruning
- Imperfect Real-Time Decisions
- Stochastic Games
Alpha-Beta Pruning
As noted at the end of last class, the number of game states that the minimax algorithm examines is exponential in the depth of the tree. We can't eliminate the exponent, but we can effectively cut it in half: it is possible to compute the correct minimax decision without looking at every node, by pruning the search tree. Consider the game tree from last time.
Minimax Value
Recall the definition of the minimax value:
Minimax(s) =
  Utility(s)                                           if Terminal-Test(s)
  max over a in Actions(s) of Minimax(Result(s,a))     if Player(s) = MAX
  min over a in Actions(s) of Minimax(Result(s,a))     if Player(s) = MIN
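As a concrete illustration, here is a minimal Python sketch of this definition, assuming a hypothetical dict-based encoding of the lecture's example tree. The leaf values 3, 12, 8 and 14, 5, 2 are from the slides; the middle MIN node's leaves are taken as 2, 4, 6 for the sake of a runnable example.

```python
def minimax(state, tree, utilities, to_move):
    """Return the minimax value of `state`.
    `tree` maps internal states to successor lists, `utilities` maps
    terminal states to payoffs, `to_move` says whose turn it is."""
    if state in utilities:                    # Terminal-Test
        return utilities[state]
    values = [minimax(s, tree, utilities, to_move) for s in tree[state]]
    return max(values) if to_move[state] == 'MAX' else min(values)

# Toy encoding (assumed, not from the slides): root A is a MAX node
# over three MIN nodes B, C, D.
tree = {'A': ['B', 'C', 'D'],
        'B': ['b1', 'b2', 'b3'],
        'C': ['c1', 'c2', 'c3'],
        'D': ['d1', 'd2', 'd3']}
utilities = {'b1': 3, 'b2': 12, 'b3': 8,
             'c1': 2, 'c2': 4, 'c3': 6,
             'd1': 14, 'd2': 5, 'd3': 2}
to_move = {'A': 'MAX', 'B': 'MIN', 'C': 'MIN', 'D': 'MIN'}

print(minimax('A', tree, utilities, to_move))  # 3: max(min(3,12,8), min(2,4,6), min(14,5,2))
```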
Pruning Example
Alpha-Beta Pruning
Can also view the computation as a simplification of the Minimax formula. Let x and y be the values of the unevaluated successors of node C. Then the value of the root node is:
Minimax(root) = max(min(3,12,8), min(2,x,y), min(14,5,2))
             = max(3, min(2,x,y), 2)
             = max(3, z, 2)   where z = min(2,x,y) <= 2
             = 3
This shows that the value at the root is independent of the values of the pruned leaves x and y.
Alpha-Beta Pruning
When applied to taller trees, this prunes entire subtrees. The general principle: consider a node n somewhere in the tree, such that Player has a choice of moving to n. If Player has a better choice m, either at the parent node of n or at any choice point further up, then n will never be reached in actual play.
Alpha-Beta Pruning
Alpha-beta pruning gets its name from the two parameters that describe the bounds on the backed-up values:
α = value of the best (i.e., highest-value) choice found so far along the path for MAX
β = value of the best (i.e., lowest-value) choice found so far along the path for MIN
Alpha-Beta-Search Algorithm
As before, at the root we want the action that has the maximum value.
Function: Alpha-Beta-Search
Receives: state; Returns: action
1. v = Max-Value(state, -∞, +∞)   // initial range [-∞, +∞]
2. Return the action in Actions(state) with value v
Alpha-Beta-Search: Max-Value
Function: Max-Value
Receives: state, α, β; Returns: utility value
1. If Terminal-Test(state) then return Utility(state)
2. v = -∞
3. For each a in Actions(state) do
   3.1 v = Max(v, Min-Value(Result(state,a), α, β))
   3.2 If v ≥ β then return v   // prune: node is worse than β for MIN above
   3.3 α = Max(α, v)
4. Return v
Alpha-Beta-Search: Min-Value
Function: Min-Value
Receives: state, α, β; Returns: utility value
1. If Terminal-Test(state) then return Utility(state)
2. v = +∞
3. For each a in Actions(state) do
   3.1 v = Min(v, Max-Value(Result(state,a), α, β))
   3.2 If v ≤ α then return v   // prune: node is worse than α for MAX above
   3.3 β = Min(β, v)
4. Return v
Alpha-Beta-Search Algorithm
Note that Max-Value and Min-Value are the same as those for the Minimax algorithm except for the bookkeeping code that maintains α and β. The search updates the values of α and β as it goes, and prunes (by terminating the recursive call) when the value of the current node is worse than β (for MAX) or α (for MIN).
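The Max-Value and Min-Value pseudocode can be sketched directly in Python. This is a minimal version over a hypothetical dict-based game tree encoding the lecture's example (the middle MIN node's unevaluated leaves x, y are filled in as 4 and 6 purely so the example runs; they get pruned regardless).

```python
import math

def max_value(state, alpha, beta, tree, utilities):
    if state in utilities:                       # Terminal-Test
        return utilities[state]
    v = -math.inf
    for s in tree[state]:
        v = max(v, min_value(s, alpha, beta, tree, utilities))
        if v >= beta:        # MIN above already has a better option: prune
            return v
        alpha = max(alpha, v)
    return v

def min_value(state, alpha, beta, tree, utilities):
    if state in utilities:
        return utilities[state]
    v = math.inf
    for s in tree[state]:
        v = min(v, max_value(s, alpha, beta, tree, utilities))
        if v <= alpha:       # MAX above already has a better option: prune
            return v
        beta = min(beta, v)
    return v

tree = {'A': ['B', 'C', 'D'],
        'B': ['b1', 'b2', 'b3'],
        'C': ['c1', 'c2', 'c3'],
        'D': ['d1', 'd2', 'd3']}
utilities = {'b1': 3, 'b2': 12, 'b3': 8,
             'c1': 2, 'c2': 4, 'c3': 6,   # c2, c3 stand in for x, y
             'd1': 14, 'd2': 5, 'd3': 2}

print(max_value('A', -math.inf, math.inf, tree, utilities))  # 3
```

After B is evaluated, α = 3 at the root; C's first leaf (2) is ≤ α, so C returns immediately without ever looking at x or y, exactly as the slides describe.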
Pruning Example Again
Alpha-Beta Pruning
The effectiveness of alpha-beta pruning is highly dependent on the order in which states are examined. E.g., we could not prune any successors of D at all, because the worst successors were generated first. There are various ways to order moves: iterative deepening search can be used to find the better (killer) moves first, and a transposition table can keep track of previously seen states. With good ordering, alpha-beta examines O(b^(m/2)) nodes, so it can search twice as deep as Minimax in the same time.
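To see what O(b^(m/2)) buys in practice, a quick back-of-the-envelope computation (the branching factor 35 and depth 8 below are assumed, chess-like numbers, not from the slides):

```python
b, m = 35, 8  # hypothetical branching factor and search depth

minimax_nodes = b ** m            # plain minimax: ~2.3e12 nodes
alphabeta_nodes = b ** (m // 2)   # alpha-beta, perfect ordering: ~1.5e6 nodes

print(minimax_nodes)
print(alphabeta_nodes)
```

With perfect move ordering, alpha-beta at depth 8 costs about what plain minimax costs at depth 4, which is the "search twice as deep" claim.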
Imperfect Real-Time Decisions
The alpha-beta search algorithm still searches all the way to terminal states, which usually lie at a depth that is not practical to reach. Shannon (1950) proposed that chess-playing programs should cut off the search earlier and apply a heuristic evaluation function, effectively turning non-terminal nodes into terminal leaves. Modify Minimax or Alpha-Beta to use a cutoff test and an Eval function.
H-Minimax Value
Minimax value using cutoff and Eval:
H-Minimax(s, d) =
  Eval(s)                                                   if Cutoff-Test(s, d)
  max over a in Actions(s) of H-Minimax(Result(s,a), d+1)   if Player(s) = MAX
  min over a in Actions(s) of H-Minimax(Result(s,a), d+1)   if Player(s) = MIN
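A minimal Python sketch of H-Minimax, reusing a hypothetical dict-based tree and an assumed table of Eval estimates for the non-terminal nodes (neither is from the slides; they only illustrate the cutoff mechanics):

```python
def h_minimax(state, depth, d_max, tree, utilities, to_move, eval_fn):
    """Depth-limited minimax: apply eval_fn at the cutoff instead of
    searching all the way down to terminal states."""
    if state in utilities:          # true terminal: exact utility
        return utilities[state]
    if depth >= d_max:              # Cutoff-Test fires: heuristic estimate
        return eval_fn(state)
    values = [h_minimax(s, depth + 1, d_max, tree, utilities, to_move, eval_fn)
              for s in tree[state]]
    return max(values) if to_move[state] == 'MAX' else min(values)

tree = {'A': ['B', 'C', 'D'],
        'B': ['b1', 'b2', 'b3'], 'C': ['c1', 'c2', 'c3'], 'D': ['d1', 'd2', 'd3']}
utilities = {'b1': 3, 'b2': 12, 'b3': 8,
             'c1': 2, 'c2': 4, 'c3': 6,
             'd1': 14, 'd2': 5, 'd3': 2}
to_move = {'A': 'MAX', 'B': 'MIN', 'C': 'MIN', 'D': 'MIN'}
estimate = {'B': 4, 'C': 1, 'D': 3}  # hypothetical Eval values

print(h_minimax('A', 0, 1, tree, utilities, to_move, estimate.get))  # 4 (cut off at depth 1)
print(h_minimax('A', 0, 2, tree, utilities, to_move, estimate.get))  # 3 (reaches the leaves)
```

Note how the shallow search returns 4, not the true minimax value 3: with an early cutoff the answer is only as good as Eval's estimates.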
Evaluation Functions
An evaluation function returns an estimate of the expected utility of the game from a given position. Desired properties:
- Should order the terminal states in the same way as the utility function (otherwise play may be suboptimal)
- Must be efficient to compute; in particular, faster than computing the minimax value
- Should be strongly correlated with the actual chance of winning (uncertainty is due to the early cutoff)
Evaluation Functions
Simple evaluation functions are often weighted linear functions of the values of various features of the state:
Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)
E.g. in chess, pieces have a material value: 1 for a pawn, 3 for a knight or bishop, 5 for a rook, 9 for the queen; fi(s) = number of pieces of category i.
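A minimal material-only evaluation in this weighted-linear form. The string encoding of a position (uppercase for MAX's pieces, lowercase for MIN's) is an assumption made for the example, not a standard representation:

```python
# Material values from the slide: pawn 1, knight/bishop 3, rook 5, queen 9.
WEIGHTS = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_eval(board):
    """Eval(s) = sum over piece categories of w_i * f_i(s), where f_i
    counts MAX's pieces minus MIN's pieces of category i."""
    score = 0
    for piece in board:
        if piece.upper() in WEIGHTS:
            value = WEIGHTS[piece.upper()]
            score += value if piece.isupper() else -value
    return score

print(material_eval('QRRp'))  # 9 + 5 + 5 - 1 = 18
```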
Evaluation Functions
Linear combinations assume that the contribution of each feature is independent of the values of the other features. More sophisticated functions also use nonlinear combinations; e.g., a bishop is more valuable in the endgame. Note that the notion of features and weights is not part of the rules of chess; it comes from human chess-playing experience. Where such experience is unavailable, machine learning techniques can be used.
Cutting Off Search
Cutoff-Test replaces Terminal-Test in the algorithm. It must return true for all terminal states. The easiest implementation is to set a fixed depth limit, with d chosen so that a move is selected within the allotted time. A more robust approach is iterative deepening: when time runs out, the program returns the move selected by the deepest completed search.
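The iterative-deepening approach can be sketched as follows. `search_to_depth` is a hypothetical helper standing in for a full depth-d alpha-beta search; a real program would also check the clock inside the search and abort mid-iteration, which this sketch omits:

```python
import time

def id_search(state, time_limit, search_to_depth):
    """Run deeper and deeper searches until the time budget expires,
    returning the move chosen by the deepest *completed* search."""
    deadline = time.monotonic() + time_limit
    best_move, d = None, 1
    while time.monotonic() < deadline:
        best_move = search_to_depth(state, d)  # assumed: returns a best move
        d += 1
    return best_move
```

Because `best_move` is only overwritten after a full iteration finishes, a timeout never leaves the program with a half-searched answer.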
Search vs. Lookup
Beginnings and endings of games are usually stored in lookup tables rather than generated. For chess openings, the expertise of humans is copied from books. For chess endgames, computers have been used to solve all endgames involving small numbers of pieces (currently 6 pieces).
Stochastic Games
Many games add a random element, such as throwing dice. Called stochastic games. E.g. backgammon uses dice to determine legal moves. (White moves towards 25; Black moves towards 0.)
Stochastic Games
White knows what its own legal moves are, but not what Black's legal moves will be, so it cannot build a standard minimax tree. Instead, add chance nodes. Branches from a chance node are labeled with the possible dice rolls and the probability of each roll. E.g. rolling 1-1 has probability 1/36. Since a 5-6 is the same as a 6-5, there are only 21 distinct rolls: the six doubles each have probability 1/36, the remaining fifteen have probability 1/18.
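The 21 distinct rolls and their probabilities can be enumerated directly, confirming the arithmetic above:

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Unordered dice pairs (i, j) with i <= j: doubles occur one way out of
# 36 ordered outcomes; a pair i < j occurs two ways (i-j and j-i).
rolls = {}
for i, j in combinations_with_replacement(range(1, 7), 2):
    rolls[(i, j)] = Fraction(1, 36) if i == j else Fraction(2, 36)

print(len(rolls))           # 21 distinct rolls
print(rolls[(1, 1)])        # 1/36
print(rolls[(5, 6)])        # 1/18
print(sum(rolls.values()))  # 1
```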
Stochastic Games
Expectiminimax Value
Use the probabilities to compute an expectiminimax value for games with chance nodes:
Expectiminimax(s) =
  Utility(s)                                                if Terminal-Test(s)
  max over a in Actions(s) of Expectiminimax(Result(s,a))   if Player(s) = MAX
  min over a in Actions(s) of Expectiminimax(Result(s,a))   if Player(s) = MIN
  sum over r of P(r) * Expectiminimax(Result(s,r))          if Player(s) = CHANCE
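A minimal Python sketch of this definition, over a tiny made-up tree (the states, probabilities, and payoffs below are assumptions for illustration, not from the slides):

```python
def expectiminimax(state, tree, utilities, node_type):
    """`tree` maps a state to a list of (successor, probability) pairs;
    the probability entry only matters at CHANCE nodes."""
    if state in utilities:                                  # Terminal-Test
        return utilities[state]
    kind = node_type[state]
    if kind == 'CHANCE':                                    # expected value
        return sum(p * expectiminimax(s, tree, utilities, node_type)
                   for s, p in tree[state])
    values = [expectiminimax(s, tree, utilities, node_type)
              for s, _ in tree[state]]
    return max(values) if kind == 'MAX' else min(values)

# Hypothetical example: MAX picks one of two chance nodes.
tree = {'root':  [('roll1', None), ('roll2', None)],
        'roll1': [('u2', 0.5), ('u4', 0.5)],    # expected value 3.0
        'roll2': [('u1', 0.9), ('u10', 0.1)]}   # expected value 1.9
utilities = {'u2': 2, 'u4': 4, 'u1': 1, 'u10': 10}
node_type = {'root': 'MAX', 'roll1': 'CHANCE', 'roll2': 'CHANCE'}

print(expectiminimax('root', tree, utilities, node_type))  # 3.0
```

Note that MAX prefers the safe 3.0 over the long shot at 10: chance nodes average, they don't gamble.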
Evaluation Functions with Chance
As before, we need to use a cutoff and an evaluation function. However, this is not as straightforward with chance nodes: in the left tree the best move is a1, but in the right tree it is a2, even though the leaf values are ordered the same in both.
Evaluation Functions with Chance
To avoid this problem, the evaluation function must be a positive linear transformation of the probability of winning from a position. The addition of chance nodes makes the complexity O(b^m n^m), where n is the number of distinct rolls. For backgammon, b is around 20 and n is 21, so 3 plies is about as deep as the search can get. Might use Monte Carlo simulation instead to determine the value of a position.
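The O(b^m n^m) growth is easy to make concrete. Plugging in the backgammon numbers from the slide:

```python
b, n = 20, 21  # backgammon: ~20 legal moves, 21 distinct dice rolls

# Nodes examined at m plies grow like (b*n)^m.
for m in range(1, 5):
    print(m, (b * n) ** m)
```

At 3 plies the tree already has about 74 million nodes, and a 4th ply multiplies that by another 420, which is why deeper search is impractical and Monte Carlo rollouts become attractive.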