Artificial Intelligence. Minimax and alpha-beta pruning

In which we examine the problems that arise when we try to plan ahead to get the best result in a world that includes a hostile agent (another agent planning against us).

Games
- Adversarial search problems: competitive environments in which the goals of multiple agents are in conflict (often known as games)
- Game theory views any multi-agent environment as a game, provided the impact of each agent on the others is significant
- Game playing is an idealization of worlds in which hostile agents act so as to diminish one's well-being
- Game problems are like real-world problems
- Classic AI games: deterministic, turn-taking, two-player, perfect information

Classic AI Games
- The state of the game is easy to represent
- Agents are usually restricted to a fairly small number of well-defined actions
- The opponent introduces uncertainty
- Games are usually much too hard to solve
  - Chess: branching factor about 35, often 50 moves by each player, about 35^100 nodes!
- Good domain to study

AI Game Play
- Define the optimal move and an algorithm for finding it
- Ignore portions of the search tree that make no difference to the final choice (pruning)

A Game Defined as a Search Problem
- Initial state: board position and whose move it is
- Operators (successor function): define the legal moves and the resulting states
- Terminal (goal) test: determines when the game is over (terminal states)
- Utility (objective, payoff) function: gives a numeric value for the game outcome at terminal states, e.g., {win = +1, loss = -1, draw = 0}
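
To make this formulation concrete, here is a minimal Python sketch of such a game definition for Tic-Tac-Toe. The class and method names (TicTacToe, initial_state, actions, result, is_terminal, utility) are illustrative assumptions chosen for this sketch, not names given in the slides.

class TicTacToe:
    """Tic-Tac-Toe as a search problem: a state is (board, player to move)."""

    def initial_state(self):
        # 3x3 board of empty cells; 'X' (MAX) moves first.
        return (('.',) * 9, 'X')

    def actions(self, state):
        """Operators: the legal moves are the indices of empty cells."""
        board, _ = state
        return [i for i, cell in enumerate(board) if cell == '.']

    def result(self, state, move):
        """Successor function: the state reached by playing `move`."""
        board, player = state
        new_board = board[:move] + (player,) + board[move + 1:]
        return (new_board, 'O' if player == 'X' else 'X')

    def is_terminal(self, state):
        """Terminal test: someone has three in a row, or the board is full."""
        return self.utility(state) != 0 or '.' not in state[0]

    def utility(self, state):
        """Payoff for MAX ('X'): win = +1, loss = -1, draw or ongoing = 0."""
        board, _ = state
        lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
                 (0, 4, 8), (2, 4, 6)]              # diagonals
        for a, b, c in lines:
            if board[a] != '.' and board[a] == board[b] == board[c]:
                return +1 if board[a] == 'X' else -1
        return 0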

Partial search tree for the game Tic-Tac-Toe (you are X)
[Figure: the tree expands from the empty board down to terminal boards, which are labeled with utilities -1, 0, and +1.]

Optimal Strategies: Perfect Decisions in Two-Person Games
- Two players: MAX and MIN
- (Assume) MAX moves first, then they take turns moving until the game is over
- At the end, points are awarded to the winning player (or penalties given to the loser)
- This gaming structure can be formulated as a search problem

An Opponent
- If this were a normal search problem, MAX (you/the agent) would only need to search for a sequence of moves leading to a winning state
- But MIN (the opponent) has input
- MAX must use a strategy that will lead to a winning state regardless of what MIN does
- The strategy picks the best move for MAX for each possible move by MIN

Partial search tree for the game Tic-Tac-Toe
[Figure: the same tree with the plies labeled: MAX (X) moves, then MIN (O), alternating down to terminal boards with utilities -1, 0, and +1.]

Techniques
- Minimax
  - Determines the best moves for MAX, assuming that MAX and the opponent (MIN) play perfectly
  - MAX attempts to maximize its score; MIN attempts to minimize MAX's score
  - Decides the best first move for MAX
  - Serves as the basis for the analysis of games and algorithms
- Alpha-beta pruning
  - Ignores portions of the search tree that make no difference to the final choice

Playing Perfectly?
[The game hasn't begun.]
FRATBOT #2: "Mate in 143 moves."
FRATBOT #3: "Oh, poo, you win again!"
(Futurama, episode "Mars University")

Minimax
- Perfect play for deterministic, perfect-information games
- Two players: MAX, MIN
- MAX moves first, then they take turns until the game is over
- Points are awarded to the winner; sometimes penalties may be given to the loser
- Choose the move to the position with the highest minimax value
  - Best achievable payoff against best play
  - Maximizes the worst-case outcome for MAX

Minimax Algorithm
- Generate the whole game tree (or explore depth-first from the current state downward, online), from the initial state(s) to the terminal states
- Apply the utility function to terminal states to get the payoff for MAX's final move
- Use the utilities at terminal states to determine the utility of the nodes one level higher in the tree (MIN's best attempt to minimize the high payoff for MAX at the terminal level)
- Continue backing up the values to the root, one layer at a time
- The value at the root determines the best payoff and the opening move for MAX (the minimax decision); see the sketch below
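
A minimal sketch of the minimax decision procedure just described, assuming the hypothetical game interface sketched earlier (actions, result, is_terminal, utility); the function names are illustrative.

def minimax_decision(game, state):
    """Return the move for MAX with the highest backed-up minimax value."""
    return max(game.actions(state),
               key=lambda move: min_value(game, game.result(state, move)))

def max_value(game, state):
    """Value MAX can guarantee from `state` when it is MAX's turn to move."""
    if game.is_terminal(state):
        return game.utility(state)
    return max(min_value(game, game.result(state, a)) for a in game.actions(state))

def min_value(game, state):
    """Value MAX is held to from `state` when it is MIN's turn to move."""
    if game.is_terminal(state):
        return game.utility(state)
    return min(max_value(game, game.result(state, a)) for a in game.actions(state))

Under the hypothetical TicTacToe sketch above, minimax_decision(TicTacToe(), TicTacToe().initial_state()) would compute an optimal opening move, albeit slowly, since the entire game tree is enumerated.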

2-Ply Minimax Game (one move for each player)
[Figure: MAX's actions A1, A2, A3 each lead to a MIN node; the terminal children (MAX's final scores) are (3, 12, 8), (2, 4, 6), and (14, 5, 2). The MIN nodes back up the values 3, 2, and 2, so the root's minimax value is 3 and the minimax decision is A1.]

Properties of Minimax
- Complete: yes, if the tree is finite
- Time: depth-first exploration, O(b^m) for maximum depth m with b legal moves at each point (impractical for real games)
- Space: depth-first exploration, O(bm)
- Optimality: yes, against an optimal opponent; does even better when MIN does not play optimally

Inappropriate Game for Minimax
[Figure: a two-ply tree where the left MIN node's terminal values are 99, 1000, 1000, 1000 and the right MIN node's are 100, 101, 102, 100, giving backed-up values 99 and 100.]
- Minimax suggests taking the right-hand branch (100 is better than 99)
- But the 99 is most likely an error in payoff estimation, and the left branch's other outcomes are far better
- A remedy is to use a probability distribution over node values instead of a single number

Pruning
- Minimax search has to examine a large number of states
- But it is possible to compute the correct minimax decision without looking at every node in the search tree
- Eliminating a branch of the search tree from consideration (without looking at it) is called pruning
- Alpha-beta pruning
  - Prunes away branches that cannot possibly influence the final minimax decision
  - Returns the same move as plain minimax

Alpha-Beta Pruning
- Can be applied to trees of any depth
- Often possible to prune entire subtrees rather than just leaves
- The "alpha-beta" name:
  - Alpha = value of the best (highest-value) choice found so far at any choice point along the path for MAX; in other words, the worst score (lowest) MAX could possibly get. Alpha is updated only during MAX's turn/ply.
  - Beta = value of the best (lowest-value) choice found so far at any choice point along the path for MIN; in other words, the worst score (highest) MIN could possibly get. Beta is updated only during MIN's turn/ply.

Alpha-Beta Pruning
[Figure: a MAX node whose best alternative found so far on the current path has value m; deeper along the path, under a MIN node, a value n is found.]
- m is the best value (to MAX) found so far on the current path; it plays the role of alpha, while n plays the role of beta
- If m > n (equivalently, alpha > beta), then n is worse than m, so MAX will never choose this branch and the search can prune it
- A sketch of the resulting algorithm is given below
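
Putting the definitions of alpha and beta and the pruning condition together, here is a minimal Python sketch of alpha-beta search over the same hypothetical game interface used in the earlier sketches; the function names are illustrative, and the comments mark where alpha and beta are updated and where branches are cut off.

import math

def alphabeta_decision(game, state):
    """Return the same move as minimax, pruning branches that cannot matter."""
    best_move, best_value = None, -math.inf
    for move in game.actions(state):
        value = ab_min(game, game.result(state, move), best_value, math.inf)
        if value > best_value:
            best_move, best_value = move, value
    return best_move

def ab_max(game, state, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state)
    value = -math.inf
    for a in game.actions(state):
        value = max(value, ab_min(game, game.result(state, a), alpha, beta))
        if value >= beta:          # MIN above will never allow this branch
            return value           # prune the remaining successors
        alpha = max(alpha, value)  # update alpha only at MAX nodes
    return value

def ab_min(game, state, alpha, beta):
    if game.is_terminal(state):
        return game.utility(state)
    value = math.inf
    for a in game.actions(state):
        value = min(value, ab_max(game, game.result(state, a), alpha, beta))
        if value <= alpha:         # MAX above already has something at least as good
            return value           # prune the remaining successors
        beta = min(beta, value)    # update beta only at MIN nodes
    return value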

Alpha-Beta Pruning
[Figure: alpha-beta search traced step by step on the 2-ply example tree. The first MIN node's leaves 3, 12, 8 give it the value 3, so the best value for MAX so far (alpha) becomes 3. At the second MIN node, the first leaf 2 shows that node's value is at most 2 < 3, so its remaining leaves are pruned. At the third MIN node, the leaves 14, 5, and 2 are examined and the node evaluates to 2. The backed-up value at the root is 3.]

In-Class Exercise
[Figure: a game tree for an in-class alpha-beta pruning exercise.]

Node Ordering
- Good move ordering improves the effectiveness of pruning
- Try to examine first the successors that are likely to be best; this prunes faster
- e.g., at a MIN node you would rather see children with values ordered 1, 10, 100 than 100, 10, 1: the later, larger children then have a better chance of being pruned
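
As a small illustration of move ordering, the sketch below sorts successors by a heuristic estimate before they are searched. The evaluate function is an assumption (the slides do not define a heuristic); any scorer of states for MAX could be plugged in.

def ordered_actions(game, state, maximizing, evaluate):
    """Return the legal moves sorted so the most promising (for the player to
    move) come first: highest-estimated results for MAX, lowest for MIN."""
    return sorted(game.actions(state),
                  key=lambda a: evaluate(game.result(state, a)),
                  reverse=maximizing)

Searching the most promising move first tends to tighten alpha (or beta) early, so later siblings are more likely to be cut off.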

Properties of Alpha-Beta
- Pruning does not affect the final result
- With perfect ordering, the time complexity is O(b^(m/2))
- A simple example of the value of reasoning about which computations are relevant: meta-reasoning (reasoning about reasoning)

Games with Chance
- Many games have a random element, e.g., throwing dice to determine the next move
- Cannot construct a standard game tree as before (as in Tic-Tac-Toe)
- Need to include CHANCE nodes
- Branches leading from a chance node represent the possible chance outcomes and their probabilities
  - e.g., die rolls: each branch has the roll value (1-6) and its chance of occurring (1/6)

Expectiminimax
- TERMINAL, MAX, and MIN nodes work the same way as before
- CHANCE nodes are evaluated by taking the weighted average of the values resulting from all possible chance outcomes (e.g., die rolls)
- The process is backed up recursively all the way to the root (as before)
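
A minimal sketch of the expectiminimax backup rule, using a hypothetical tuple representation of the tree: ('max', children), ('min', children), ('chance', [(probability, child), ...]), or a bare number for a terminal utility. The values at the end reproduce the loaded-coin example that follows.

def expectiminimax(node):
    kind = node[0] if isinstance(node, tuple) else 'terminal'
    if kind == 'terminal':
        return node                                     # utility at a leaf
    if kind == 'max':
        return max(expectiminimax(c) for c in node[1])
    if kind == 'min':
        return min(expectiminimax(c) for c in node[1])
    if kind == 'chance':                                # weighted average over outcomes
        return sum(p * expectiminimax(c) for p, c in node[1])

# Loaded-coin example: each of MAX's two moves leads to a chance node with
# heads probability 0.9 and tails probability 0.1, then MIN nodes, then leaves.
move_a1 = ('chance', [(0.9, ('min', [2, 2])), (0.1, ('min', [3, 3]))])   # -> 2.1
move_a2 = ('chance', [(0.9, ('min', [1, 1])), (0.1, ('min', [4, 4]))])   # -> 1.3
assert abs(expectiminimax(move_a1) - 2.1) < 1e-9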

Simple Example
[Figure: MAX has 2 possible start moves, A1 and A2; each leads to a CHANCE node (a loaded/trick coin flip for MIN, heads with probability 0.9 and tails with probability 0.1), then to MIN nodes, then to terminal values.]
- Expected value of A1 = 0.9 × 2 + 0.1 × 3 = 2.1
- Expected value of A2 = 0.9 × 1 + 0.1 × 4 = 1.3
- Move A1 is expected to be best for MAX

Alpha-Beta with Chance?
- The analysis for MAX and MIN nodes is the same as before
- But CHANCE nodes can also be pruned
- The idea is to use an upper bound on the value of a CHANCE node
- Example: if bounds can be placed on the possible utility values underneath (say, -3 to +3), they can be used to put an upper bound on the expected value at the CHANCE nodes above
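
A rough sketch of this bounding idea, reusing the hypothetical (probability, child) outcome representation from the expectiminimax sketch above and assuming every utility is known to lie below some bound hi (the slides' example uses +3). The function name and the evaluate callback are illustrative assumptions.

def chance_value_with_cutoff(outcomes, alpha, evaluate, hi=3.0):
    """outcomes: [(probability, child), ...]; evaluate: child -> exact value.
    Returns the exact expected value, or stops early with an upper bound <= alpha
    once the chance node provably cannot beat alpha (so a MAX parent can prune)."""
    expected = 0.0      # contribution of the outcomes evaluated so far
    remaining = 1.0     # probability mass not yet evaluated
    for p, child in outcomes:
        expected += p * evaluate(child)
        remaining -= p
        # Optimistically assume every unexamined outcome attains the maximum hi.
        if expected + remaining * hi <= alpha:
            return expected + remaining * hi   # cutoff: no need to look further
    return expected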

Game Programs: Chess
- Chess has received the most attention
- In 1957 it was predicted that a computer would beat the world champion within 10 years (off by about 40 years)
- Deep Blue defeated Garry Kasparov in a 6-game match (1997)
- "The decisive game of the match was Game 2, which left a scar in my memory... we saw something that went well beyond our wildest expectations of how well a computer would be able to foresee the long-term positional consequences of its decisions. The machine refused to move to a position that had a decisive short-term advantage, showing a very human sense of danger." (Kasparov, 1997)

Game Programs: Chess (cont'd)
- Deep Blue searched 126 million nodes per second on average, with a peak speed of 330 million nodes per second
- It generated up to 30 billion positions per move, routinely reaching depth 14
- The heart of the machine was an iterative-deepening alpha-beta search
- It also generated extensions beyond the depth limit for sufficiently interesting lines of moves
- Later, Deep Fritz played world champion Vladimir Kramnik to a draw in 2002, running on an ordinary PC (not a supercomputer)

Game Programs: Others
- Checkers
- Othello (Reversi): smaller search space than chess (5-15 legal moves)
- Backgammon: a neural network system was ranked #3 in the world (1992)
- Bridge
- Go: branching factor of 361 (chess is about 35); regular search methods are no good, but strong programs now exist

Summary
- Games can be defined as search problems, with the complexity of real-world problems
- The minimax algorithm determines the best move for a player, assuming the opponent plays perfectly; it enumerates the entire game tree
- The alpha-beta algorithm is similar to minimax, but prunes away branches that are irrelevant to the final outcome
- Search may need to be cut off at some point if the tree is too deep
- Chance can be incorporated