Adversarial Search. Soleymani. Artificial Intelligence: A Modern Approach, 3rd Edition, Chapter 5


Adversarial Search
CE417: Introduction to Artificial Intelligence, Sharif University of Technology, Spring 2017. Soleymani.
Based on Artificial Intelligence: A Modern Approach, 3rd Edition, Chapter 5

Outline
- Game as a search problem
- Minimax algorithm
- α-β pruning: ignoring a portion of the search tree
- Time limit problem: cutoff and evaluation function

Games as search problems
- Games are adversarial search problems: the goals of the agents are in conflict.
- They arise in competitive multi-agent environments.
- Games in AI are a specialized kind of game (in the sense of game theory).

Primary assumptions
Common games in AI are:
- Two-player
- Turn-taking: the agents act alternately
- Zero-sum: the agents' goals are in conflict, and the sum of the utility values at the end of the game is zero or constant
- Deterministic
- Perfect information: fully observable

Game as a kind of search problem
- Initial state S0, set of states (each state also records whose turn it is), ACTIONS(s), RESULT(s, a): as in a standard search problem
- PLAYER(s): defines which player has the move in state s
- TERMINAL_TEST(s): tells whether the game has ended
- UTILITY(s, p): utility or payoff function U: S × P → R (how good the terminal state s is for player p)
- Zero-sum (constant-sum) game: the total payoff to all players is zero (or constant) for every terminal state
- We have utilities at the end of the game instead of a sum of action costs
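
As a sketch, this formulation maps onto a small Python interface. The game itself (a toy count-down game: players alternately remove 1 or 2 items from a pile, and whoever takes the last item wins) and all class and method names are illustrative assumptions, not from the slides.

```python
class CountdownGame:
    """Toy two-player, zero-sum game in the AIMA-style formulation."""

    def initial_state(self):
        return (5, "MAX")               # (items left, player to move)

    def player(self, s):                # PLAYER(s)
        return s[1]

    def actions(self, s):               # ACTIONS(s)
        return [a for a in (1, 2) if a <= s[0]]

    def result(self, s, a):             # RESULT(s, a)
        n, p = s
        return (n - a, "MIN" if p == "MAX" else "MAX")

    def terminal_test(self, s):         # TERMINAL_TEST(s)
        return s[0] == 0

    def utility(self, s, p="MAX"):      # UTILITY(s, p); zero-sum: U_MIN = -U_MAX
        # At an empty pile, the player to move has just lost:
        # the opponent took the last item.
        winner = "MIN" if s[1] == "MAX" else "MAX"
        u = 1 if winner == "MAX" else -1
        return u if p == "MAX" else -u
```

Note the zero-sum property is built in: for every terminal state, `utility(s, "MAX") + utility(s, "MIN") == 0`.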

Game tree (tic-tac-toe)
- Two players: P1 (playing X) and P2 (playing O); P1 is searching to find a good move
- Zero-sum game: P1 gets U(t) and P2 gets C − U(t) for terminal node t
- 1 ply = one half-move
[Figure: tic-tac-toe game tree with alternating P1/P2 levels; utilities are shown from the point of view of P1]

Optimal play
- The opponent is assumed to play optimally.
- The minimax function is used to find the utility of each state.
- MAX wants to maximize, and MIN to minimize, the terminal payoff; MAX gets U(t) for terminal node t.

Minimax
MINIMAX(s) =
  UTILITY(s, MAX)                               if TERMINAL_TEST(s)
  max_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MAX
  min_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MIN
MINIMAX(s) gives the best achievable outcome of being in state s (assumption: optimal opponent).

Minimax (cont.)
- Optimal strategy: move to the state with the highest minimax value
- This is the best achievable payoff against best play
- It maximizes the worst-case outcome for MAX
- It works for zero-sum games

Minimax algorithm (depth-first search)

function MINIMAX_DECISION(state) returns an action
  return arg max_{a ∈ ACTIONS(state)} MIN_VALUE(RESULT(state, a))

function MAX_VALUE(state) returns a utility value
  if TERMINAL_TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN_VALUE(RESULT(state, a)))
  return v

function MIN_VALUE(state) returns a utility value
  if TERMINAL_TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX_VALUE(RESULT(state, a)))
  return v
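
The pseudocode above transcribes directly to Python. The tree here is a hand-made 2-ply example (the classic three-branch tree with leaf values 3, 12, 8 / 2, 4, 6 / 14, 5, 2); `TREE` and the function names are mine, not from the slides.

```python
# Non-terminal nodes map to lists of children; terminals are bare utilities
# (from MAX's point of view). MAX moves at the root, MIN one ply below.
TREE = {
    "A": ["B", "C", "D"],
    "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2],
}

def minimax(node, is_max):
    if not isinstance(node, str):          # TERMINAL_TEST: a bare number
        return node                        # UTILITY for MAX
    values = [minimax(child, not is_max) for child in TREE[node]]
    return max(values) if is_max else min(values)

def minimax_decision(root):
    # Index of the move with the best backed-up value for MAX.
    values = [minimax(child, False) for child in TREE[root]]
    return max(range(len(values)), key=values.__getitem__)

print(minimax("A", True))        # 3
print(minimax_decision("A"))     # 0, i.e. move to B
```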

Properties of minimax
- Complete? Yes (when the tree is finite)
- Optimal? Yes (against an optimal opponent)
- Time complexity: O(b^m)
- Space complexity: O(bm) (depth-first exploration)
- For chess, b ≈ 35 and m > 50 for reasonable games: finding the exact solution is completely infeasible

Pruning
- Compute the correct minimax decision without looking at every node in the game tree
- α-β pruning: a branch & bound algorithm
- It prunes away branches that cannot influence the final decision

α-β pruning examples
[Figures: a step-by-step α-β pruning example over several slides, showing the progress of the α and β bounds during the search]

α-β pruning
- Assumes depth-first generation of the tree
- We prune node n when the player has a better choice m at the parent of n, or at any ancestor of n
- Two types of pruning (cuts): pruning of MAX nodes (α-cuts) and pruning of MIN nodes (β-cuts)

Why is it called α-β?
- α: the value of the best (highest-value) choice found so far at any choice point along the path for MAX
- β: the value of the best (lowest-value) choice found so far at any choice point along the path for MIN
- α and β are updated during the search:
  - At a MAX node, once the value is known to be at least the current β (v ≥ β), its remaining branches are pruned.
  - At a MIN node, once the value is known to be at most the current α (v ≤ α), its remaining branches are pruned.

α-β pruning (another example)
[Figure: a second worked α-β pruning example with backed-up values 3 and 2]

function ALPHA_BETA_SEARCH(state) returns an action
  v ← MAX_VALUE(state, −∞, +∞)
  return the action in ACTIONS(state) with value v

function MAX_VALUE(state, α, β) returns a utility value
  if TERMINAL_TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN_VALUE(RESULT(state, a), α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

function MIN_VALUE(state, α, β) returns a utility value
  if TERMINAL_TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX_VALUE(RESULT(state, a), α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
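
As a sketch, the α-β functions above can be implemented in Python and instrumented to count leaf evaluations. The tree is a hand-made example (the classic three-branch tree with leaves 3, 12, 8 / 2, 4, 6 / 14, 5, 2); the names are mine, not from the slides.

```python
TREE = {
    "A": ["B", "C", "D"],
    "B": [3, 12, 8], "C": [2, 4, 6], "D": [14, 5, 2],
}
leaves_seen = 0

def alphabeta(node, alpha, beta, is_max):
    global leaves_seen
    if not isinstance(node, str):            # terminal node
        leaves_seen += 1
        return node
    if is_max:
        v = float("-inf")
        for child in TREE[node]:
            v = max(v, alphabeta(child, alpha, beta, False))
            if v >= beta:                    # β-cut: MIN ancestor won't allow this
                return v
            alpha = max(alpha, v)
    else:
        v = float("inf")
        for child in TREE[node]:
            v = min(v, alphabeta(child, alpha, beta, True))
            if v <= alpha:                   # α-cut: MAX ancestor already has better
                return v
            beta = min(beta, v)
    return v

print(alphabeta("A", float("-inf"), float("inf"), True))  # 3, same as minimax
print(leaves_seen)   # 7: pruning skips 2 of the 9 leaves on this tree
```

The pruning never changes the root value, only the number of nodes examined.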

Order of moves
- Good move ordering improves the effectiveness of pruning
- Best order: time complexity is O(b^(m/2))
- Random order: time complexity is about O(b^(3m/4)) for moderate b
- So α-β pruning only partly improves the search time
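
A small self-contained experiment (not from the slides) illustrating the effect: the same random game tree is searched twice, once with children in their natural order and once sorted best-first, counting leaf evaluations. Here the best-first sort cheats by using exact minimax values as an oracle; a real program would order moves by a cheap heuristic.

```python
import random

def build(depth, b, rng):
    # Full tree: internal nodes are lists, leaves are random utilities.
    if depth == 0:
        return rng.randint(0, 99)
    return [build(depth - 1, b, rng) for _ in range(b)]

def mm(node, is_max):
    # Plain minimax, used only as an oracle for the best-first ordering.
    if isinstance(node, int):
        return node
    vals = [mm(c, not is_max) for c in node]
    return max(vals) if is_max else min(vals)

def ab(node, alpha, beta, is_max, stats, ordered):
    if isinstance(node, int):
        stats[0] += 1                        # count leaf evaluations
        return node
    kids = node
    if ordered:
        # Best move first: highest-value child first for MAX, lowest for MIN.
        kids = sorted(node, key=lambda c: mm(c, not is_max), reverse=is_max)
    v = float("-inf") if is_max else float("inf")
    for c in kids:
        w = ab(c, alpha, beta, not is_max, stats, ordered)
        if is_max:
            v = max(v, w); alpha = max(alpha, v)
        else:
            v = min(v, w); beta = min(beta, v)
        if alpha >= beta:                    # cutoff: remaining siblings can't matter
            break
    return v

rng = random.Random(0)
tree = build(8, 3, rng)                      # depth 8, branching 3: 3^8 = 6561 leaves
for ordered in (False, True):
    stats = [0]
    value = ab(tree, float("-inf"), float("inf"), True, stats, ordered)
    print(ordered, value, stats[0])          # same value, fewer leaves when ordered
```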

Computational time limit (example)
- 100 seconds is allowed for each move (game rule)
- 10^4 nodes/sec (processor speed)
- So we can explore just 10^6 nodes for each move
- b^m = 10^6 with b = 35 gives m ≈ 4: a 4-ply look-ahead is a hopeless chess player!
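
The slide's arithmetic can be checked in a couple of lines (assuming the stated budget and speed):

```python
import math

budget = 100 * 10**4          # 100 s at 10^4 nodes/s = 10^6 nodes per move
b = 35                        # rough branching factor of chess
m = math.log(budget) / math.log(b)   # solves b^m = budget
print(round(m, 2))            # about 3.89, i.e. only ~4 plies of look-ahead
```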

Computational time limit: solution
- We must make a decision even when finding the optimal move is infeasible: cut off the search and apply a heuristic evaluation function
- Cutoff test: turns selected non-terminal nodes into terminal leaves (used instead of the terminal test, e.g., a depth limit)
- Evaluation function: estimated desirability of a state (a heuristic used instead of the utility function)
- This approach does not guarantee optimality.

Heuristic minimax
H-MINIMAX(s, d) =
  EVAL(s, MAX)                                          if CUTOFF_TEST(s, d)
  max_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1)   if PLAYER(s) = MAX
  min_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1)   if PLAYER(s) = MIN
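
A compact sketch of H-MINIMAX, assuming a hypothetical `game` bundle of the functions from the game-formulation slide. The toy pile game (remove 1 or 2 items; taking the last item wins) and its crude EVAL are illustrative, not from the slides.

```python
def h_minimax(game, s, d, depth_limit):
    # CUTOFF_TEST(s, d): terminal state or depth limit reached
    if game["terminal_test"](s) or d >= depth_limit:
        return game["eval"](s)                   # EVAL(s, MAX) at the cutoff
    values = [h_minimax(game, game["result"](s, a), d + 1, depth_limit)
              for a in game["actions"](s)]
    return max(values) if game["player"](s) == "MAX" else min(values)

# Toy pile game; states are (items left, player to move).
game = {
    "player":        lambda s: s[1],
    "actions":       lambda s: [a for a in (1, 2) if a <= s[0]],
    "result":        lambda s, a: (s[0] - a, "MIN" if s[1] == "MAX" else "MAX"),
    "terminal_test": lambda s: s[0] == 0,
    # EVAL: exact at terminals (the mover at an empty pile has lost),
    # a crude guess of 0 everywhere else.
    "eval":          lambda s: 0 if s[0] else (1 if s[1] == "MIN" else -1),
}

print(h_minimax(game, (6, "MAX"), 0, depth_limit=10))  # deep enough: exact value -1
print(h_minimax(game, (6, "MAX"), 0, depth_limit=1))   # shallow cutoff: only the guess 0
```

With a deep enough limit the recursion reaches true terminals and reproduces minimax; with a shallow limit the answer depends entirely on the quality of EVAL, which is the point of the next slides.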

Evaluation functions
- For terminal states, an evaluation function should order them in the same way as the true utility function.
- For non-terminal states, it should be strongly correlated with the actual chances of winning.
- It must not be computationally expensive.

Evaluation functions based on features
Example: features for evaluating chess states
- Number of each kind of piece: number of white pawns, black pawns, white queens, black queens, etc.
- King safety
- Good pawn structure

Evaluation functions as a weighted sum of features
- Assumption: the contribution of each feature is independent of the values of the other features
  EVAL(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
- Weights can be assigned based on human experience or by machine learning methods.
- Example (chess): features are the number of white pawns (f1), white bishops (f2), white rooks (f3), black pawns (f4), …, with weights w1 = 1, w2 = 3, w3 = 5, w4 = −1, …
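
As a sketch, the weighted sum can be coded directly. The piece-count dictionary and the weights here are illustrative (standard material values, with black pieces negated from White's point of view); this is not a real chess library.

```python
# Feature weights: pawn 1, bishop 3, rook 5; black pieces count negatively
# from White's (MAX's) point of view. Keys are hypothetical feature names.
WEIGHTS = {"wp": 1, "wb": 3, "wr": 5, "bp": -1, "bb": -3, "br": -5}

def eval_material(counts):
    # EVAL(s) = w1*f1(s) + w2*f2(s) + ... where each feature is a piece count.
    return sum(WEIGHTS[f] * counts.get(f, 0) for f in WEIGHTS)

# White: 8 pawns + 2 rooks; Black: 7 pawns + 1 bishop.
print(eval_material({"wp": 8, "wr": 2, "bp": 7, "bb": 1}))  # 8 + 10 - 7 - 3 = 8
```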

Cutting off search: simple depth limit
Simple cutoff: a depth limit d0
CUTOFF_TEST(s, d) =
  true    if d > d0 or TERMINAL_TEST(s) = true
  false   otherwise

Cutting off search: simple depth limit
- Problem 1: non-quiescent positions. A few more plies can make a big difference in the evaluation value.
- Problem 2: the horizon effect. Delaying tactics against an opponent's move that causes serious, unavoidable damage (by pushing the damage beyond the horizon the player can see).

More sophisticated cutting off
- Cut off only at quiescent positions. Quiescence search: expand non-quiescent positions until quiescent ones are reached.
- Against the horizon effect, use singular extensions: a singular extension is a move that is clearly better than all other moves in a given position. On reaching the depth limit, check whether the singular extension is a legal move and, if so, search it. This makes the tree deeper, but adds few nodes, because there are few possible singular extensions.

Speeding up the search process
- Use table lookup rather than search for some states, e.g., for the openings and endings of games (where there are few choices).
- Example (chess): for each opening, the best advice of human experts (from books describing good play) can be copied into tables. For endgames, computer analysis is usually used (solving the endgames by computer).

Stochastic games: Backgammon
[Figure: a backgammon board]

Stochastic games
- Expected utility: chance nodes take the average (expectation) over all possible outcomes, i.e., the average of the child values weighted by their probabilities.
- This is consistent with the definition of rational agents as maximizing expected utility.
[Figure: a tree whose chance nodes average leaf values such as 2, 3, 1, 4 into expected values 2.1 and 1.3]

Stochastic games
EXPECT_MINIMAX(s) =
  UTILITY(s, MAX)                                         if TERMINAL_TEST(s)
  max_{a ∈ ACTIONS(s)} EXPECT_MINIMAX(RESULT(s, a))       if PLAYER(s) = MAX
  min_{a ∈ ACTIONS(s)} EXPECT_MINIMAX(RESULT(s, a))       if PLAYER(s) = MIN
  Σ_r P(r) EXPECT_MINIMAX(RESULT(s, r))                   if PLAYER(s) = CHANCE
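
EXPECT_MINIMAX transcribes to Python the same way as minimax, with one extra node type. The tree below is hand-made for illustration (the node names, probabilities, and utilities are invented, not from the slides): MAX chooses between two chance nodes, each a fair coin flip over MIN nodes.

```python
# Each entry: node -> (kind, children). Chance children carry probabilities.
TREE = {
    "root": ("max",    ["c1", "c2"]),
    "c1":   ("chance", [(0.5, "m1"), (0.5, "m2")]),
    "c2":   ("chance", [(0.5, "m3"), (0.5, "m4")]),
    "m1": ("min", [2, 3]), "m2": ("min", [1, 4]),
    "m3": ("min", [5, 2]), "m4": ("min", [0, 6]),
}

def expect_minimax(node):
    if not isinstance(node, str):                 # terminal utility
        return node
    kind, children = TREE[node]
    if kind == "max":
        return max(expect_minimax(c) for c in children)
    if kind == "min":
        return min(expect_minimax(c) for c in children)
    # Chance node: probability-weighted average of the outcomes.
    return sum(p * expect_minimax(c) for p, c in children)

print(expect_minimax("root"))   # 1.5: c1 averages to 1.5, c2 to 1.0, MAX picks c1
```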

Evaluation functions for stochastic games
- An order-preserving transformation on leaf values is no longer a sufficient condition for correct play.
- The evaluation function must be a positive linear transformation of the expected utility of a position.

Properties of the search space for stochastic games
- Time complexity: O(b^m n^m), where n is the number of distinct chance outcomes
- Backgammon: b ≈ 20 (but can be up to 4000 for double dice rolls), n = 21 (number of different possible dice rolls)
- 3 plies is manageable (≈ 10^8 nodes)
- The probability of reaching a given node decreases enormously with depth (probabilities multiply), so forming detailed plans of action may be pointless
- Limiting the depth is therefore not so damaging, particularly when the probability values (for each non-deterministic situation) are close to each other
- But pruning is not straightforward.

Search algorithms for stochastic games
- Advanced alpha-beta pruning: prune MIN and MAX nodes as in alpha-beta, and also prune chance nodes (by putting bounds on the utility values and thereby an upper bound on the value of a chance node).
- Monte Carlo simulation to evaluate a position: starting from the position, the algorithm plays thousands of games against itself using random dice rolls; the win percentage is taken as the approximate value of the position (effective for backgammon).

State-of-the-art game programs
Chess (b ≈ 35)
- In 1997, Deep Blue defeated Kasparov. It ran on a parallel computer doing alpha-beta search, routinely reached depth 14 plies, and used techniques to extend the effective search depth.
- Hydra reaches depth 18 plies using more heuristics.
Checkers (b < 10)
- Chinook (which ran on a regular PC and used alpha-beta search) ended the 40-year reign of human world champion Tinsley in 1994.
- Since 2007, Chinook has been able to play perfectly, using alpha-beta search combined with a database of 39 trillion endgame positions.

State-of-the-art game programs (cont.)
Othello (b is usually between 5 and 15)
- Logistello defeated the human world champion six games to none in 1997.
- Human champions are no match for computers at Othello.
Go (b > 300)
- For years, human champions refused to compete against computers, whose programs were only at an advanced amateur level; MoGo avoided alpha-beta search and used Monte Carlo rollouts.
- AlphaGo (2016) has beaten top professionals without handicaps.

State-of-the-art game programs (Cont.) Backgammon (stochastic) TD-Gammon (1992) was competitive with top human players. Depth 2 or 3 search along with a good evaluation function developed by learning methods Bridge (partially observable, multiplayer) In 1998, GIB was 12 th in a filed of 35 in the par contest at human world championship. In 2005, Jack defeated three out of seven top champions pairs. Overall, it lost by a small margin. Scrabble (partially observable & stochastic) In 2006, Quackle defeated the former world champion 3-2. 43