Adversarial Search. Rob Platt Northeastern University. Some images and slides are used from: AIMA CS188 UC Berkeley


What is adversarial search? Adversarial search: planning used to play a game such as chess or checkers. The algorithms are similar to graph search, except that we plan under the assumption that our opponent will act to maximize their own advantage.

Some types of games Is each of these solved or unsolved? A game is solved when the outcome can be predicted from any initial state, assuming both players play perfectly.
Chess: unsolved (~10^40 states)
Checkers: solved (~10^20 states)
Tic-tac-toe: solved (fewer than 9! = 362,880 states)
Go: unsolved

Different types of games
Deterministic / stochastic
Two-player / multi-player
Zero-sum / non-zero-sum
Perfect information / imperfect information
Zero-sum: the utilities of all players sum to zero; pure competition. Non-zero-sum: the utility function of each player could be arbitrary; optimal strategies could involve cooperation.

Formalizing a game Given a formal description of the game, calculate a policy: the action that player p should take from state s. How do we solve for a policy? Use adversarial search: build a game tree.

This is a game tree for tic-tac-toe: the plies alternate between you and your opponent (You, Them, You, Them, ...), and the leaves are labeled with utilities.

What is Minimax? Consider a simple game: 1. you make a move 2. your opponent makes a move 3. the game ends. What does the minimax tree look like in this case? A Max layer (you) at the root, a Min layer (them) below it, and terminal leaves with utilities 3 12 8, 2 4 6, 14 5 2. These are terminal utilities; assume we know what these values are. Each Min node backs up the minimum of its children (3, 2, 2), and the Max node backs up the maximum of those (3). This is called backing up the values.

Minimax Okay, so we know how to back up values... but how do we construct the tree? Notice that we only get utilities at the bottom of the tree; therefore, depth-first search makes sense: expand the left Min node's leaves (3, 12, 8) and back up 3; then the middle node's leaves (2, 4, 6) and back up 2; then the right node's leaves (14, 5, 2) and back up 2; finally the Max node backs up 3. Since most games have forward progress, the distinction between tree search and graph search is less important.
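The depth-first backup described above can be sketched in a few lines (a minimal sketch; the 9-leaf example tree from these slides is hard-coded as nested lists, and `minimax` is an illustrative name):

```python
# Minimax by depth-first search: leaves are utilities, internal
# nodes are lists of children. Max and Min plies alternate.

def minimax(node, is_max):
    if not isinstance(node, list):      # leaf: return its utility
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# The 2-ply example tree from the slides: Max over three Min nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))   # backs up min values 3, 2, 2, then max -> 3
```

Because the recursion only bottoms out at leaves, this is exactly the DFS traversal the slides step through.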

Minimax properties Is it always correct to assume your opponent plays optimally? Consider a Max node (you) over two Min nodes (them) with leaves 10, 10 and 9, 100.

Minimax properties Is minimax optimal? Yes, against an optimal opponent. Is it complete? Yes, for a finite tree. Time complexity = O(b^m) Space complexity = O(bm) Is it practical? In chess, b ≈ 35, d ≈ 100, and 35^100 ≈ 10^154 is a big number... So what can we do?

Evaluation functions Key idea: cut off search at a certain depth and give the corresponding nodes an estimated value. The evaluation function makes this estimate. Cut off recursion here: below the cutoff depth, nodes get estimated values instead of backed-up utilities.

Evaluation functions How does the evaluation function make the estimate? It depends upon the domain. For example, in chess, the value of a state might equal the sum of piece values: a pawn counts for 1, a rook counts for 5, a knight counts for 3, ...

A weighted linear evaluation function Features: the number of pawns on the board, the number of knights on the board, ... A pawn counts for 1, a knight counts for 3. Example evaluations from the slide: Eval = 3 - 2.5 = 0.5 and Eval = 3 + 2.5 + 1 + 1 - 2.5 = 5. Maybe consider other factors as well?
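A weighted linear evaluation like this is just a dot product of weights and features (a minimal sketch; the feature counts and the `evaluate` helper are illustrative, and only the pawn and knight terms from the slide are shown):

```python
# Weighted linear evaluation: Eval(s) = w1*f1(s) + w2*f2(s) + ...
# Features here are material differences (our count minus theirs).

WEIGHTS = {"pawn": 1.0, "knight": 3.0}

def evaluate(counts_ours, counts_theirs):
    """Sum weight * (our count - their count) over piece types."""
    return sum(w * (counts_ours.get(p, 0) - counts_theirs.get(p, 0))
               for p, w in WEIGHTS.items())

# One knight up, pawns even:
print(evaluate({"knight": 2, "pawn": 4}, {"knight": 1, "pawn": 4}))  # 3.0
```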

Evaluation functions Problem: in realistic games, we cannot search to the leaves! Solution: depth-limited search. Instead, search only to a limited depth in the tree, and replace terminal utilities with an evaluation function for non-terminal positions. Example: suppose we have 100 seconds and can explore 10K nodes/sec; then we can check 1M nodes per move. The guarantee of optimal play is gone, but more plies makes a BIG difference. Use iterative deepening for an anytime algorithm.

At what depth do you run the evaluation function? Option 1: cut off search at a fixed depth. Option 2: cut off search at particular states deeper than a certain threshold. The deeper your threshold, the less the quality of the evaluation function matters...
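Option 1 (a fixed cutoff depth) can be sketched by adding a depth counter to minimax (a minimal sketch; `minimax_cutoff` and the crude `flat_average` evaluation function are illustrative, reusing the slides' nested-list tree):

```python
# Depth-limited minimax: cut off search at a fixed depth and apply
# an evaluation function to cut-off nodes instead of recursing.

def minimax_cutoff(node, is_max, depth, evaluate):
    if not isinstance(node, list):        # true terminal: exact utility
        return node
    if depth == 0:                        # cutoff: estimate with eval fn
        return evaluate(node)
    values = [minimax_cutoff(c, not is_max, depth - 1, evaluate)
              for c in node]
    return max(values) if is_max else min(values)

# Crude stand-in evaluation: the average of the utilities below a node.
def flat_average(node):
    leaves, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, list):
            stack.extend(n)
        else:
            leaves.append(n)
    return sum(leaves) / len(leaves)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_cutoff(tree, True, 2, flat_average))  # full depth: exact value 3
print(minimax_cutoff(tree, True, 1, flat_average))  # cut off at the Min ply
```

With depth 1 the Min nodes are never expanded, so Max chooses by the (here quite bad) estimate; this is the sense in which the cutoff depth trades off against evaluation quality.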

Alpha/Beta pruning Run depth-first minimax on the example tree (Max over Min). The left Min node backs up 3 from its leaves 3, 12, 8, so Max knows it can achieve at least 3. Expanding the middle Min node, its first leaf is 2. We don't need to expand the remaining leaves (4 and 6)! Why? A Min node's backed-up value can only go down, so this node is worth at most 2, and Max already has a better option (3). So, we don't need to expand these nodes in order to back up correct values! That's alpha-beta pruning.

Alpha/Beta pruning: algorithm
α: MAX's best option on path to root
β: MIN's best option on path to root
def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β: return v
        α = max(α, v)
    return v
def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α: return v
        β = min(β, v)
    return v
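The pseudocode above translates directly into runnable form (a minimal sketch on the same nested-list tree; `alphabeta` and the `visited` list, which records every node touched so we can see the pruning, are illustrative):

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    if visited is not None:
        visited.append(node)                # record that we expanded this node
    if not isinstance(node, list):          # leaf: return its utility
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, visited))
            if v >= beta:                   # MIN above will never allow v
                return v
            alpha = max(alpha, v)
    else:
        v = math.inf
        for child in node:
            v = min(v, alphabeta(child, True, alpha, beta, visited))
            if v <= alpha:                  # MAX above already has >= alpha
                return v
            beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, True, visited=seen))  # 3, same answer as plain minimax
print(4 in seen, 6 in seen)                 # False False: those leaves were pruned
```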

Alpha/Beta pruning with (α, β) windows Start at the root with (-inf, +inf); the left Min node also starts with (-inf, +inf). As its leaves 3, 12, 8 are expanded, its window narrows to (-inf, 3): β = 3 is the best value so far for MIN along the path to the root. Backing up 3 to the root sets α = 3, the best value so far for MAX along the path to the root, so the remaining children are searched with the window (3, +inf). The middle Min node sees its first leaf, 2, and prunes immediately, because its value (2) is out of the alpha-beta range. The right Min node narrows its window from (3, +inf) to (3, 14) after leaf 14, then to (3, 5) after leaf 5; its last leaf drops its value to 2 ≤ α. The root's value is 3.

Alpha/Beta algorithm

Alpha/Beta properties Is it complete? Yes, for a finite tree, and it backs up exactly the minimax value at the root. How much does alpha/beta help relative to minimax? Minimax time complexity = O(b^m); with perfect move ordering, alpha/beta time complexity = O(b^(m/2)). The improvement with alpha/beta depends upon move ordering: the order in which we expand the children of a node. How to choose a move ordering? Use IDS: on each iteration of iterative deepening, use the prior run to inform the ordering of the next node expansions.
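The effect of move ordering can be demonstrated by counting leaf visits with and without a good ordering (a minimal sketch; `alphabeta_count` and the `exact_order` "oracle", which tries each node's best child first and so plays the role that an IDS-informed ordering would, are illustrative):

```python
import math

def alphabeta_count(node, is_max, alpha, beta, order=None):
    """Return (value, leaves_visited); `order` optionally sorts children."""
    if not isinstance(node, list):
        return node, 1
    children = order(node, is_max) if order else node
    v = -math.inf if is_max else math.inf
    visits = 0
    for child in children:
        cv, n = alphabeta_count(child, not is_max, alpha, beta, order)
        visits += n
        if is_max:
            v = max(v, cv)
            if v >= beta: return v, visits
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            if v <= alpha: return v, visits
            beta = min(beta, v)
    return v, visits

def exact_order(node, is_max):
    # "Oracle" ordering: best child first (what IDS tries to approximate).
    def value(n):
        return alphabeta_count(n, not is_max, -math.inf, math.inf)[0]
    return sorted(node, key=value, reverse=is_max)

# Worst-first layout: the best Min node for Max comes second.
tree = [[2, 4, 6], [3, 12, 8], [14, 5, 2]]
_, plain = alphabeta_count(tree, True, -math.inf, math.inf)
_, ordered = alphabeta_count(tree, True, -math.inf, math.inf, exact_order)
print(plain, ordered)   # good ordering visits fewer leaves (9 vs 5 here)
```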

Expectimax What if your opponent does not maximize his/her utility, e.g. suppose he/she picks moves uniformly at random? Consider the Max node over two Min nodes with leaves 10, 10 and 9, 100. Minimax backup for a rational agent: the Min nodes back up 10 and 9, and Max chooses 10. Backup for an agent who selects actions uniformly at random: the opponent nodes back up averages, (10+10)/2 = 10 and (9+100)/2 = 54.5, and Max chooses 54.5. Instead of backing up min values for min plies, back up the average. We could also account for agents who are somewhere in between rational and uniformly random. How? Later, this idea will be generalized using Markov Decision Processes.
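The averaging backup amounts to replacing Min nodes with expectation nodes (a minimal sketch; `expectimax` is an illustrative name, and the opponent is assumed uniformly random):

```python
# Expectimax: Max plies back up the max; opponent plies back up the
# EXPECTED value, here uniform over the opponent's moves.

def expectimax(node, is_max):
    if not isinstance(node, list):        # leaf: return its utility
        return node
    values = [expectimax(c, not is_max) for c in node]
    if is_max:
        return max(values)
    return sum(values) / len(values)      # uniform-random opponent

# The slide's example: Min nodes over (10, 10) and (9, 100).
tree = [[10, 10], [9, 100]]
print(expectimax(tree, True))   # avg(10,10)=10, avg(9,100)=54.5 -> 54.5
```

Against this opponent model the risky branch becomes the better choice, which is exactly the point of the slide.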

Mixing these ideas: nondeterministic games, e.g., Backgammon.

Nondeterministic games in general In nondeterministic games, chance is introduced by dice or card-shuffling. Simplified example with coin-flipping: a Max node sits above two chance nodes, and each chance node averages its Min children with probability 0.5 per branch. The Min nodes back up 2, 4, 0, 2 from the leaf pairs (2, 4), (7, 4), (6, 0), (5, 2); the chance nodes back up 0.5·2 + 0.5·4 = 3 and 0.5·0 + 0.5·2 = 1; Max picks 3.
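The coin-flip example can be computed with an explicit chance layer (a minimal sketch; `chance_value` is an illustrative helper, and the max → chance → min layering with probability 0.5 per branch follows the slide):

```python
# Expectiminimax for the coin-flip example: a Max node over chance
# nodes; each chance node is the probability-weighted average of
# Min nodes, which in turn take the minimum of their leaves.

def chance_value(min_nodes, probs):
    return sum(p * min(leaves) for p, leaves in zip(probs, min_nodes))

# Left chance node: Min nodes (2, 4) and (7, 4); right: (6, 0) and (5, 2).
left = chance_value([[2, 4], [7, 4]], [0.5, 0.5])    # 0.5*2 + 0.5*4 = 3.0
right = chance_value([[6, 0], [5, 2]], [0.5, 0.5])   # 0.5*0 + 0.5*2 = 1.0
print(max(left, right))   # Max picks 3.0
```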