Adversarial Search Chapter 5 Mausam (Based on slides of Stuart Russell, Andrew Parks, Henry Kautz, Linda Shapiro) 1

Game Playing Why do AI researchers study game playing? 1. It's a good reasoning problem, formal and nontrivial. 2. Direct comparison with humans and other computer programs is easy. 2

What Kinds of Games? Mainly games of strategy with the following characteristics: 1. Sequence of moves to play 2. Rules that specify possible moves 3. Rules that specify a payment for each move 4. Objective is to maximize your payment 3

Games vs. Search Problems Unpredictable opponent: the solution is a strategy specifying a move for every possible opponent reply. Time limits: unlikely to find the goal, must approximate. 4

Two-Player Game [Slide 5: flowchart of the play loop - the opponent moves and a new position is generated; if the game is not over, generate the successors, evaluate them, move to the highest-valued successor, and test again whether the game is over.] 5

Games as Adversarial Search
States: board configurations
Initial state: the board position and which player will move
Successor function: returns a list of (move, state) pairs, each indicating a legal move and the resulting state
Terminal test: determines when the game is over
Utility function: gives a numeric value in terminal states (e.g., -1, 0, +1 for loss, tie, win) 6
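
As a concrete aside (a minimal sketch of our own, not part of the original slides), these components map directly onto a small Python interface; the class and method names are illustrative assumptions:

    from typing import List, Tuple

    class Game:
        """Hypothetical two-player game interface mirroring the components above."""
        def initial_state(self):
            # The board position plus which player will move.
            raise NotImplementedError
        def successors(self, state) -> List[Tuple[str, object]]:
            # List of (move, resulting state) pairs for every legal move.
            raise NotImplementedError
        def is_terminal(self, state) -> bool:
            # Terminal test: is the game over in this state?
            raise NotImplementedError
        def utility(self, state) -> int:
            # Numeric value of a terminal state, e.g. -1 / 0 / +1 for loss / tie / win.
            raise NotImplementedError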

Game Tree (2-player, Deterministic, Turns) [Figure: tree whose levels alternate between the computer's turn and the opponent's turn.] The computer is Max; the opponent is Min. At the leaf nodes, the utility function is applied: a big value is good for Max, a small value is bad. 7

Mini-Max Terminology
move: a move by both players
ply: a half-move
utility function: the function applied to leaf nodes
backed-up value of a max-position: the value of its largest successor
backed-up value of a min-position: the value of its smallest successor
minimax procedure: search down several levels; at the bottom level apply the utility function, back up values all the way to the root node, and at that node select the move. 8

Minimax Perfect play for deterministic games. Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play. E.g., 2-ply game: 9

[Slides 10-20: step-by-step minimax example worked on a sample tree; figures by Patrick Winston.]

Minimax Strategy Why do we take the min value at every other level of the tree? These nodes represent the opponent's choice of move. The computer assumes that the human will choose the move that is of least value to the computer. 21

Minimax algorithm The adversarial analogue of DFS. 22
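
A minimal runnable sketch of this recursion, under a toy encoding of our own (not from the slides): an inner node is a Python list of children, a leaf is its utility value for Max.

    def minimax(node, maximizing):
        # Leaf: apply the utility function.
        if not isinstance(node, list):
            return node
        # Back up values: Max takes the largest, Min the smallest.
        values = [minimax(child, not maximizing) for child in node]
        return max(values) if maximizing else min(values)

    # 2-ply example: Max moves, then Min; the backed-up root value is 3.
    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    assert minimax(tree, True) == 3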

Properties of Minimax
Complete? Yes (if the tree is finite)
Optimal? Yes against an optimal opponent; No in the sense that it does not exploit weaknesses of a suboptimal opponent
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration) 23

Chess: Good Enough? Branching factor b ≈ 35, game length m ≈ 100, so the search space is b^m ≈ 35^100 ≈ 10^154. The Universe: number of atoms ≈ 10^78, age ≈ 10^18 seconds. Even at 10^8 moves/sec: 10^8 x 10^78 x 10^18 = 10^104. Exact solution completely infeasible. 24
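
These magnitudes are easy to sanity-check (a quick calculation of our own, not on the slide):

    import math
    print(100 * math.log10(35))   # ~154.4, so 35**100 is about 10**154
    print(8 + 78 + 18)            # exponents: 10**8 * 10**78 * 10**18 = 10**104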

Alpha-Beta Procedure The alpha-beta procedure can speed up a depth-first minimax search. Alpha: a lower bound on the value that a maximizing node may ultimately be assigned (v > α). Beta: an upper bound on the value that a minimizing node may ultimately be assigned (v < β). 25

[Slides 26-29: alpha-beta pruning walkthrough figures; credit Patrick Winston.]

Do we need to check this node??? 30

No - this branch is guaranteed to be worse than what Max already has. 31

Alpha-Beta
MinVal(state, alpha, beta){
  if (terminal(state)) return utility(state);
  for (s in children(state)){
    child = MaxVal(s, alpha, beta);
    beta = min(beta, child);          // Min lowers its upper bound
    if (alpha >= beta) return beta;   // prune: Max already has a better option
  }
  return beta;
}
alpha = the highest value for MAX along the path
beta = the lowest value for MIN along the path 32

Alpha-Beta
MaxVal(state, alpha, beta){
  if (terminal(state)) return utility(state);
  for (s in children(state)){
    child = MinVal(s, alpha, beta);
    alpha = max(alpha, child);         // Max raises its lower bound
    if (alpha >= beta) return alpha;   // prune: Min above will never allow this
  }
  return alpha;
}
alpha = the highest value for MAX along the path
beta = the lowest value for MIN along the path 33
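
The same procedure as a runnable Python sketch, reusing the nested-list toy encoding from the minimax sketch above (an illustrative assumption); on this 2-ply example the last two leaves of the middle subtree are pruned:

    import math

    def alphabeta(node, alpha, beta, maximizing):
        if not isinstance(node, list):   # leaf: apply the utility function
            return node
        if maximizing:
            for child in node:
                alpha = max(alpha, alphabeta(child, alpha, beta, False))
                if alpha >= beta:        # prune: Min above will never allow this
                    break
            return alpha
        else:
            for child in node:
                beta = min(beta, alphabeta(child, alpha, beta, True))
                if alpha >= beta:        # prune: Max already has a better option
                    break
            return beta

    tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
    assert alphabeta(tree, -math.inf, math.inf, True) == 3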

[Slides 34-44: a step-by-step alpha-beta trace on a sample tree. Each node is annotated with its current window, where α is the best value for MAX along the path and β is the best value for MIN along the path; α starts at -infinity and β at +infinity. Whenever β < α at a node (e.g., when a min node's β drops to -37 while α is -29), the remaining branches below that node are pruned.]

Properties of α-β Pruning does not affect the final result: it returns exactly the same value and move as full minimax. Good move ordering improves the effectiveness of pruning. With "perfect ordering," time complexity = O(b^(m/2)), which doubles the reachable depth of search. A simple example of reasoning about which computations are relevant (a form of metareasoning). 45
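
One simple way to get "good move ordering" (our own sketch; shallow_eval is a hypothetical one-ply evaluator, not part of the slides) is to sort the successors before the alpha-beta recursion, so the likely-best child is searched first and the bounds tighten early:

    def ordered_children(children, maximizing, shallow_eval):
        # Search the most promising moves first; tighter bounds mean more pruning.
        return sorted(children, key=shallow_eval, reverse=maximizing)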

Shallow Search Techniques 1. limited search for a few levels 2. reorder the level-1 successors 3. proceed with α-β minimax search 46

Good Enough? Chess: branching factor b ≈ 35, game length m ≈ 100, so the alpha-beta search space is b^(m/2) ≈ 35^50 ≈ 10^77. The Universe: number of atoms ≈ 10^78, age ≈ 10^18 seconds, 10^8 moves/sec x 10^78 x 10^18 = 10^104. The universe can play chess - can we? 47

Cutting off Search MinimaxCutoff is identical to MinimaxValue except: 1. Terminal? is replaced by Cutoff? 2. Utility is replaced by Eval. Does it work in practice? For b^m = 10^6 and b = 35 we get m = 4, and 4-ply lookahead is a hopeless chess player! 4-ply ≈ human novice; 8-ply ≈ typical PC, human master; 12-ply ≈ Deep Blue, Kasparov. 48
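
The two substitutions can be shown as a small variation on the earlier minimax sketch (the depth bound and eval_fn are our illustrative assumptions):

    def minimax_cutoff(node, depth, maximizing, eval_fn):
        # Cutoff? replaces Terminal?; Eval replaces Utility.
        if not isinstance(node, list) or depth == 0:
            return eval_fn(node)
        values = [minimax_cutoff(c, depth - 1, not maximizing, eval_fn) for c in node]
        return max(values) if maximizing else min(values)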

Cutoff [Slide 49: figure.] 49

Evaluation Functions Tic Tac Toe Let p be a position in the game. Define the utility function f(p) by: f(p) = largest positive number if p is a win for the computer; smallest negative number if p is a win for the opponent; otherwise RCDC - RCDO, where RCDC is the number of rows, columns and diagonals in which the computer could still win and RCDO is the number of rows, columns and diagonals in which the opponent could still win. 50
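
A hedged Python rendering of this f(p); the board encoding (a 3x3 list of 'X', 'O', or None, with 'X' as the computer) is our own assumption:

    import math

    # The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
    LINES = ([[(r, c) for c in range(3)] for r in range(3)]
             + [[(r, c) for r in range(3)] for c in range(3)]
             + [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])

    def f(board):
        def won(p):
            return any(all(board[r][c] == p for r, c in line) for line in LINES)
        def open_lines(p):
            # Lines in which player p could still win: no opposing piece yet.
            other = 'O' if p == 'X' else 'X'
            return sum(all(board[r][c] != other for r, c in line) for line in LINES)
        if won('X'):
            return math.inf    # win for the computer
        if won('O'):
            return -math.inf   # win for the opponent
        return open_lines('X') - open_lines('O')   # RCDC - RCDO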

Sample Evaluations X = Computer; O = Opponent [Slide 51: two example boards, each annotated with the rows, columns, and diagonals still open for X and for O, from which f(p) = RCDC - RCDO is computed.] 51

Evaluation functions For chess/checkers, typically a linear weighted sum of features: Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s), e.g., w_1 = 9 with f_1(s) = (number of white queens) - (number of black queens), etc. 52
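
In code this is just a dot product of weights and feature values (a generic sketch; the helper names are assumptions, the queen-count feature is the slide's own example):

    def linear_eval(state, weights, features):
        # Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)
        return sum(w * f(state) for w, f in zip(weights, features))

    # e.g. weights = [9, ...] and
    # features = [lambda s: num_white_queens(s) - num_black_queens(s), ...]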

Example: Samuel's Checker-Playing Program It uses a linear evaluation function f(n) = a_1 x_1(n) + a_2 x_2(n) + ... + a_m x_m(n). For example: f = 6K + 4M + U, where K = king advantage, M = man advantage, and U = undenied mobility advantage (the number of moves that Max has where Min has no jump moves). 53

Samuel's Checker Player In learning mode, the computer acts as 2 players: A and B. A adjusts its coefficients after every move; B uses the static utility function. If A wins, its function is given to B. 54

Samuel's Checker Player How does A change its function? 1. Coefficient replacement: Δ(node) = backed-up value(node) - initial value(node). If Δ > 0, then terms that contributed positively are given more weight and terms that contributed negatively get less weight; if Δ < 0, then terms that contributed negatively are given more weight and terms that contributed positively get less weight. 55

Samuel's Checker Player How does A change its function? 2. Term replacement: there are 38 terms altogether, 16 of which are used in the utility function at any one time. Terms that consistently correlate low with the function value are removed and added to the end of the term queue. They are replaced by terms from the front of the term queue. 56

Additional Refinements Waiting for Quiescence: continue the search until no drastic change occurs from one level to the next. Secondary Search: after choosing a move, search a few more levels beneath it to be sure it still looks good. Openings/Endgames: for some parts of the game (especially initial and end moves), keep a catalog of best moves to make. 57

Horizon Effect The problem with abruptly stopping a search at a fixed depth is called the 'horizon effect' 58

Chess: Rich history of cumulative ideas Minimax search, evaluation function learning (1950). Alpha-beta search (1966). Transposition tables (1967). Iterative-deepening DFS (1975). Endgame databases, singular extensions (1977, 1980). Parallel search and evaluation (1983, 1985). Circuitry (1987). 59

Chess game tree 60

Problem with Fixed-Depth Searches If we only search n moves ahead, it may be possible that a catastrophe can be delayed by a sequence of moves that do not make any progress. This also works in the other direction (good moves may not be found). 61

Quiescence Search This involves searching past the terminal search nodes (depth of 0) and testing all the non-quiescent or 'violent' moves until the situation becomes calm, and only then applying the evaluator. It enables programs to detect long capture sequences and calculate whether or not they are worth initiating, and it extends the search to avoid evaluating a position while a tactical disruption is in progress. 62
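
A common shape for this idea, sketched in negamax style under our own assumptions (is_quiet, violent_moves, apply_move, and a side-to-move-relative eval_fn are hypothetical helpers, not from the slides):

    def quiescence(state, alpha, beta, eval_fn, is_quiet, violent_moves, apply_move):
        stand_pat = eval_fn(state)         # evaluate as if we stopped here
        if is_quiet(state):                # calm position: safe to evaluate
            return stand_pat
        alpha = max(alpha, stand_pat)
        for move in violent_moves(state):  # only captures, checks, etc.
            score = -quiescence(apply_move(state, move), -beta, -alpha,
                                eval_fn, is_quiet, violent_moves, apply_move)
            alpha = max(alpha, score)
            if alpha >= beta:              # prune as in alpha-beta
                break
        return alpha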

End-Game Databases Ken Thompson - all 5 piece end-games Lewis Stiller - all 6 piece end-games Refuted common chess wisdom: many positions thought to be ties were really forced wins -- 90% for white Is perfect chess a win for white? 63

The MONSTER White wins in 255 moves (Stiller, 1991) 64

Deterministic Games in Practice
Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Checkers is now solved!
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses a very sophisticated evaluation function, and undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic!
Othello: human champions refuse to compete against computers, who are too good.
Go: human champions refuse to compete against computers, who are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves, along with aggressive pruning. 65

Game of Go Human champions refuse to compete against computers, because the software is too bad.

                               Chess    Go
Size of board                  8 x 8    19 x 19
Average no. of moves per game  100      300
Avg branching factor per turn  35       235
Additional complexity          -        Players can pass 66

Recent Successes in Go MoGo defeated a human expert in 9x9 Go, but programs are still far away from 19x19 Go. This is a hot area of research, leading to the development of novel techniques such as Monte Carlo tree search (UCT). 67

Other Games
                       deterministic                  chance
perfect information    chess, checkers, go, othello   backgammon, monopoly
imperfect information  stratego                       bridge, poker, scrabble 68

Games of Chance What about games that involve chance, such as rolling dice or picking a card? Use three kinds of nodes: max nodes, min nodes, and chance nodes. 69

Games of Chance Expectiminimax Let c be a chance node with possible outcomes d_1, ..., d_k, and let S(c, d_i) be the set of positions reachable from c when the outcome is d_i. Then:
expectimax(c) = sum_i P(d_i) * max over s in S(c, d_i) of backed-up-value(s)
expectimin(c') = sum_i P(d_i) * min over s in S(c, d_i) of backed-up-value(s) 70
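
A runnable sketch with the tree encoded as tagged tuples (the encoding is our own assumption; the .4/.6 probabilities mirror the example tree on the next slide):

    def expectiminimax(node):
        # Leaf: a bare number. Inner nodes: ('max', children), ('min', children),
        # or ('chance', [(probability, child), ...]).
        if not isinstance(node, tuple):
            return node
        kind, kids = node
        if kind == 'max':
            return max(expectiminimax(c) for c in kids)
        if kind == 'min':
            return min(expectiminimax(c) for c in kids)
        return sum(p * expectiminimax(c) for p, c in kids)  # chance: weighted average

    # A chance node over two max choices: 0.4 * max(3, 5) + 0.6 * max(1, 4) = 4.4
    print(expectiminimax(('chance', [(0.4, ('max', [3, 5])), (0.6, ('max', [1, 4]))])))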

Example Tree with Chance [Slide 71: a game tree mixing max, min, and chance nodes; the chance nodes have outcome probabilities .4 and .6, and the leaf values are 3, 5, 1, 4, 1, 2, 4, 5.] 71

Complexity Instead of O(b^m), the complexity is O(b^m n^m), where n is the number of chance outcomes. Since the complexity is higher (in both time and space), we cannot search as deeply. Pruning algorithms may be applied. 72

Imperfect Information E.g., card games, where the opponents' initial cards are unknown. Idea: for all deals consistent with what you can see, compute the minimax value of the available actions for each possible deal, then compute the expected value over all deals. 73
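
A sketch of that averaging idea (all helper names are assumptions; real programs typically sample deals rather than enumerating them all):

    def best_action(actions, consistent_deals, minimax_value):
        # Average each action's minimax value over every deal consistent
        # with what we can see, then pick the action with the best average.
        deals = list(consistent_deals())
        def expected_value(action):
            return sum(minimax_value(action, d) for d in deals) / len(deals)
        return max(actions, key=expected_value)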

Summary Games are fun to work on! They illustrate several important points about AI. Perfection is unattainable, so we must approximate. Game-playing programs have shown the world what AI can do. 74