Adversarial Search Robert Platt Northeastern University Some images and slides are used from: 1. CS188 UC Berkeley 2. RN, AIMA

What is adversarial search? Adversarial search: planning used to play a game such as chess or checkers. The algorithms are similar to graph search, except that we plan under the assumption that our opponent will maximize their own advantage...

Examples of adversarial search Chess Checkers Tic-tac-toe Go

Examples of adversarial search Chess Solved/unsolved? Checkers Solved/unsolved? Tic-tac-toe Solved/unsolved? Go Solved/unsolved? Outcome of game can be predicted from any initial state assuming both players play perfectly

Examples of adversarial search Chess Unsolved Checkers Solved Tic-tac-toe Solved Go Unsolved Outcome of game can be predicted from any initial state assuming both players play perfectly

Examples of adversarial search Chess Unsolved ~10^40 states Checkers Solved ~10^20 states Tic-tac-toe Solved Fewer than 9! = 362,880 states Go Unsolved? Outcome of game can be predicted from any initial state assuming both players play perfectly

Different types of games Deterministic / stochastic Two player / multi player? Zero-sum / non zero-sum Fully observable / partially observable

What is a zero-sum game? Zero-sum: the sum of the agents' utilities is zero; in the case of a two-player game, this means pure competition. Not zero-sum: agents have arbitrary utilities, which might induce cooperation or competition.

A formal definition of a deterministic game Problem: State set: S (start at s0) Players: P = {1...N} (usually take turns) Action set: A Transition function: S x A -> S Terminal test: S -> {t, f} Terminal utilities: S x P -> R Solution: a policy, S -> A Objective: find an optimal policy, i.e., a policy that maximizes utility assuming that the adversary acts optimally.

A formal definition of a deterministic game Problem: State set: S (start at s0) Players: P = {1...N} (usually take turns) Action set: A Transition function: S x A -> S Terminal test: S -> {t, f} Terminal utilities: S x P -> R How is this similar to / different from the definition of a standard search problem? Solution: a policy, S -> A Objective: find an optimal policy, i.e., a policy that maximizes utility assuming that the adversary acts optimally.

A formal definition of a deterministic game Problem: State set: S (start at s0) Players: P = {1...N} (usually take turns) Action set: A Transition function: S x A -> S Terminal test: S -> {t, f} Terminal utilities: S x P -> R How do we solve this problem? Solution: a policy, S -> A Objective: find an optimal policy, i.e., a policy that maximizes utility assuming that the adversary acts optimally.
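To make the definition above concrete, here is a minimal sketch (not from the original slides) of how the problem description could be written as an abstract interface in Python; the class and method names are illustrative assumptions.

class Game:
    # Abstract deterministic, turn-taking game (illustrative sketch).
    def initial_state(self):            # the start state s0
        raise NotImplementedError
    def player(self, state):            # which player in P = {1...N} moves in this state
        raise NotImplementedError
    def actions(self, state):           # the legal actions A available in this state
        raise NotImplementedError
    def result(self, state, action):    # transition function: S x A -> S
        raise NotImplementedError
    def is_terminal(self, state):       # terminal test: S -> {t, f}
        raise NotImplementedError
    def utility(self, state, player):   # terminal utilities: S x P -> R
        raise NotImplementedError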

Adversarial search Image: Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe You Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe You Them Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe You Them You Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe You Them You Them Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

This is a game tree for tic-tac-toe You Them You Them Utility Images: AIMA, Berkeley CS188 course notes (downloaded Summer 2015)

What is Minimax? Consider a simple game: 1. you make a move 2. your opponent makes a move 3. game ends

What is Minimax? Consider a simple game: 1. you make a move 2. your opponent makes a move 3. game ends What does the minimax tree look like in this case?

What is Minimax? Max (you) Consider a simple game: 1. you make a move 2. your opponent makes a move 3. game ends What does the minimax tree look like in this case? Min (them) Max (you) 3 12 8 2 4 6 14 5 2

What is Minimax? Max (you) Min (them) Max (you) 3 12 8 2 4 6 14 5 2 These are terminal utilities assume we know what these values are

What is Minimax? Max (you) Min (them) 3 2 2 Max (you) 3 12 8 2 4 6 14 5 2

What is Minimax? Max (you) 3 Min (them) 3 2 2 Max (you) 3 12 8 2 4 6 14 5 2

What is Minimax? Max (you) Min (them) 3 2 2 3 This is called backing up the values Max (you) 3 12 8 2 4 6 14 5 2

What is Minimax? Okay, so we know how to back up values... but how do we construct the tree? 3 12 8 2 4 6 14 5 2 This tree is already built...

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense.

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3 12

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3 12 8

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3 3 12 8

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3 2 3 12 8 2 4 6

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. 3 3 2 2 3 12 8 2 4 6 14 5 2

What is Minimax? Notice that we only get utilities at the bottom of the tree; therefore, DFS makes sense. Since most games have forward progress, the distinction between tree search and graph search is less important.

What is Minimax?
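The slides above back values up the example tree by hand. As a minimal sketch (assuming the hypothetical Game interface from earlier, not the exact code used in the lecture), recursive depth-first minimax can be written as:

def minimax_value(game, state, max_player, maximizing):
    # Utilities exist only at terminal states, so recurse to the leaves
    # and back the values up: MAX takes the max, MIN takes the min.
    if game.is_terminal(state):
        return game.utility(state, max_player)
    values = [minimax_value(game, game.result(state, a), max_player, not maximizing)
              for a in game.actions(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(game, state):
    # The player to move at the root is MAX; pick the action whose
    # successor has the highest backed-up minimax value.
    max_player = game.player(state)
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a),
                                           max_player, maximizing=False))

On the example tree above, this backs up 3, 2, 2 at the MIN nodes and 3 at the MAX root, matching the hand-worked values.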

Minimax properties Is it always correct to assume your opponent plays optimally? max min 10 10 9 100 Slide: Berkeley CS188 course notes (downloaded Summer 2015)

Minimax vs expectimax Slide: Berkeley CS188 course notes (downloaded Summer 2015)
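When the opponent is not assumed to play optimally (for example, it moves randomly), one common alternative is expectimax, which replaces the MIN step with an average over the opponent's moves. A minimal sketch, assuming a uniformly random opponent and the same hypothetical Game interface as above:

def expectimax_value(game, state, max_player, maximizing):
    if game.is_terminal(state):
        return game.utility(state, max_player)
    values = [expectimax_value(game, game.result(state, a), max_player, not maximizing)
              for a in game.actions(state)]
    if maximizing:
        return max(values)               # our move: still maximize
    return sum(values) / len(values)     # opponent modeled as uniformly random: expected value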

Minimax properties Is minimax optimal? Is it complete?

Minimax properties Is minimax optimal? Is it complete? Time complexity =? Space complexity =?

Minimax properties Is minimax optimal? Is it complete? Time complexity = O(b^d) Space complexity = O(bd)

Minimax properties Is minimax optimal? Is it complete? Time complexity = O(b^d) Space complexity = O(bd) Is it practical? In chess, b = 35, d = 100

Minimax properties Is minimax optimal? Is it complete? Time complexity = O(b^d) Space complexity = O(bd) Is it practical? In chess, b = 35, d = 100, so b^d = 35^100 ≈ 10^154 is a big number...

Minimax properties Is minimax optimal? Is it complete? Time complexity = O(b^d) Space complexity = O(bd) Is it practical? In chess, b = 35, d = 100, so b^d = 35^100 ≈ 10^154 is a big number... So what can we do?

Evaluation functions Key idea: cut off search at a certain depth and give the corresponding nodes an estimated value. [Figure: a minimax tree truncated partway down, annotated "Cut it off here?"] Image: Berkeley CS188 course notes (downloaded Summer 2015)

Evaluation functions Key idea: cut off search at a certain depth and give the corresponding nodes an estimated value; the evaluation function makes this estimate. [Same figure] Image: Berkeley CS188 course notes (downloaded Summer 2015)

Evaluation functions How does the evaluation function make the estimate? It depends upon the domain. For example, in chess, the value of a state might equal the sum of piece values: a pawn counts for 1, a rook counts for 5, a knight counts for 3...

A weighted linear evaluation function Eval(s) = w1*f1(s) + w2*f2(s) + ..., e.g., f1 = number of pawns on the board (a pawn counts for 1, so w1 = 1) and f2 = number of knights on the board (a knight counts for 3, so w2 = 3).
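A toy, self-contained illustration of such a weighted linear evaluation (the feature values below are made up for the example, not taken from the lecture):

def material_eval(my_pawns, their_pawns, my_knights, their_knights):
    # Weighted linear evaluation: Eval = 1*(pawn difference) + 3*(knight difference).
    weights = {"pawn": 1.0, "knight": 3.0}
    return (weights["pawn"] * (my_pawns - their_pawns)
            + weights["knight"] * (my_knights - their_knights))

print(material_eval(my_pawns=8, their_pawns=7, my_knights=1, their_knights=2))  # 1*1 + 3*(-1) = -2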

At what depth do you run the evaluation function? Option 1: cut off search at a fixed depth. Option 2: cut off search at quiescent states deeper than a certain threshold. Option 3: ? The deeper your threshold, the less the quality of the evaluation function matters...
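One way to read Option 1 in code: depth-limited minimax that stops at a fixed depth and returns an evaluation-function estimate instead of recursing further. This is a sketch under the same hypothetical Game interface as before; eval_fn stands for any evaluation function, e.g. one built from material features as sketched above.

def depth_limited_value(game, state, max_player, depth, maximizing, eval_fn):
    if game.is_terminal(state):
        return game.utility(state, max_player)
    if depth == 0:
        return eval_fn(state)            # cut off: estimate the value instead of searching deeper
    values = [depth_limited_value(game, game.result(state, a), max_player,
                                  depth - 1, not maximizing, eval_fn)
              for a in game.actions(state)]
    return max(values) if maximizing else min(values)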

At what depth do you run the evaluation function? Search depth=2 Slide: Berkeley CS188 course notes (downloaded Summer 2015)

At what depth do you run the evaluation function? Search depth=10 Slide: Berkeley CS188 course notes (downloaded Summer 2015)

Alpha/Beta pruning Image: Berkeley CS188 course notes (downloaded Summer 2015)

Alpha/Beta pruning 3 3 12 8

Alpha/Beta pruning 3 3 12 8 2

Alpha/Beta pruning 3 3 12 8 2 4

Alpha/Beta pruning 3 We don't need to expand this node! 3 12 8 2 4

Alpha/Beta pruning 3 We don't need to expand this node! Why? 3 12 8 2 4

Alpha/Beta pruning Max Min 3 We don't need to expand this node! 3 12 8 2 4 Why?

Alpha/Beta pruning Max 3 Min 3 2 2 3 12 8 2 14 5 2

Alpha/Beta pruning So, we don't need to expand these nodes in order to back up correct values! Max 3 Min 3 2 2 3 12 8 2 14 5 2

Alpha/Beta pruning So, we don't need to expand these nodes in order to back up correct values! That's alpha-beta pruning. Max 3 Min 3 2 2 3 12 8 2 14 5 2

Alpha/Beta pruning: algorithm idea General configuration (MIN version): We're computing the MIN-VALUE at some node n. We're looping over n's children, so n's estimate of the children's min is dropping. Who cares about n's value? MAX. Let a be the best value that MAX can get at any choice point along the current path from the root. If n becomes worse than a, MAX will avoid it, so we can stop considering n's other children (it's already bad enough that it won't be played). The MAX version is symmetric. Slide: Berkeley CS188 course notes (downloaded Summer 2015)

Alpha/Beta pruning: algorithm α: best value so far for MAX along path to root; β: best value so far for MIN along path to root

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β: return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α: return v
        β = min(β, v)
    return v

Slide: adapted from Berkeley CS188 course notes (downloaded Summer 2015)
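For concreteness, here is a small runnable version of the pseudocode above, applied to the example tree used throughout these slides (leaf groups 3, 12, 8 / 2, 4, 6 / 14, 5, 2). Representing the tree as nested lists is an assumption made just for this example.

import math

def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):       # leaf: return its terminal utility
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False))
            if v >= beta:                # MIN above will never let this value through
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True))
            if v <= alpha:               # MAX above already has something at least this good
                return v
            beta = min(beta, v)
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]          # the example tree from these slides
print(alphabeta(tree, -math.inf, math.inf, True))   # prints 3; the leaves 4 and 6 are never examined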

Alpha/Beta pruning (-inf,+inf)

Alpha/Beta pruning (-inf,+inf) (-inf,+inf)

Alpha/Beta pruning Best value so far for MIN along path to root (-inf,+inf) (-inf,3) 3 3

Alpha/Beta pruning Best value so far for MIN along path to root (-inf,+inf) (-inf,3) 3 3 12

Alpha/Beta pruning Best value so far for MIN along path to root (-inf,+inf) (-inf,3) 3 3 12 8

Alpha/Beta pruning Best value so far for MAX along path to root (3,+inf) (-inf,3) 3 3 12 8

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 3 12 8

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 3 12 8 2

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 Prune because value (2) is out of alpha-beta range 3 12 8 2

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 (3,+inf) 3 12 8 2

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 (3,14) 14 3 12 8 2 14

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 (3,5) 5 3 12 8 2 14 5

Alpha/Beta pruning (3,+inf) (-inf,3) (3,+inf) 3 2 (3,5) 2 3 12 8 2 14 5 2

Alpha/Beta properties Is it complete?

Alpha/Beta properties Is it complete? How much does alpha/beta help relative to minimax? Minimax time complexity = O(b^d) Alpha/beta time complexity >= O(b^(d/2)) (best case, with perfect move ordering) The improvement w/ alpha/beta depends upon move ordering...

Alpha/Beta properties Is it complete? How much does alpha/beta help relative to minimax? Minimax time complexity = O(b^d) Alpha/beta time complexity >= O(b^(d/2)) (best case, with perfect move ordering) The improvement w/ alpha/beta depends upon move ordering, i.e., the order in which we expand a node's children. 3 3 2 2 3 12 8 2 4 6 14 5 2

Alpha/Beta properties Is it complete? How much does alpha/beta help relative to minimax? Minimax time complexity = O(b^d) Alpha/beta time complexity >= O(b^(d/2)) (best case, with perfect move ordering) The improvement w/ alpha/beta depends upon move ordering, i.e., the order in which we expand a node's children. 3 3 2 2 3 12 8 2 4 6 14 5 2 How to choose move ordering? Use IDS: on each iteration of IDS, use the prior run to inform the ordering of node expansions on the next iteration.
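As a rough, illustrative experiment (not from the slides), the sketch below counts how many leaf utilities alpha-beta evaluates on the example tree when each MIN node's best (smallest) leaf is visited first versus last; the specific orderings are assumptions chosen to make the contrast visible. In practice the ordering can come from iterative deepening, as the slide suggests: values backed up on the depth-d iteration are used to sort moves before searching to depth d+1.

import math

def alphabeta_count(node, alpha, beta, maximizing, counter):
    # Same alpha-beta recursion as before, but counts how many leaves get evaluated.
    if not isinstance(node, list):
        counter[0] += 1                  # leaf: count it and return its utility
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        w = alphabeta_count(child, alpha, beta, not maximizing, counter)
        if maximizing:
            v = max(v, w)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, w)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v

good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # each MIN node's smallest leaf comes first
bad_order  = [[8, 12, 3], [6, 4, 2], [14, 5, 2]]   # each MIN node's smallest leaf comes last
for tree in (good_order, bad_order):
    counter = [0]
    alphabeta_count(tree, -math.inf, math.inf, True, counter)
    print(counter[0])                    # 5 leaves with the good ordering, 9 with the bad one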