Game Playing: Outline, Game Problems, Types of Games, Playing a Perfect Game, Playing an Imperfect Game


Game Playing
ECE457 Applied Artificial Intelligence, Fall 2007, Lecture #5
R. Khoury (2007)

Outline
- Types of games
- Playing a perfect game
  - Minimax search
  - Alpha-beta pruning
- Playing an imperfect game
  - Real-time
  - Imperfect information
  - Chance
- Russell & Norvig, chapter 6

Game Problems
- Games are well-defined search problems
  - Well-defined board configurations (states)
  - Limited set of well-defined moves (actions)
  - Well-defined victory conditions (goal)
  - Values assigned to pieces, moves, outcomes (cost)
- ...that are hard to solve by searching
  - A search tree for chess has an average branching factor of 35
  - An average chess game lasts for 50 moves per player (100 ply)
  - The average search tree has 35^100 nodes!

Game Problems
- The opponent
  - He wants to win and make our agent lose
  - We have no control over his actions
  - He prevents us from reaching the optimal solution
- Introduces uncertainty in the search
  - We don't know what moves the opponent will make
  - We will assume perfect play behaviour
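To give a feel for the size of the chess search tree mentioned above, here is a minimal sketch that evaluates b^m, assuming the usual textbook figures of about 35 legal moves per position and 100 ply per game. The function name `tree_size` is ours and is used only for illustration.

```python
import math


def tree_size(branching_factor: int, depth: int) -> int:
    """Size of the deepest level of a uniform tree: b ** m."""
    return branching_factor ** depth


# Assumed textbook figures for chess: b = 35, m = 100 ply.
nodes = tree_size(35, 100)

# Python integers are arbitrary precision, so the exact count is available;
# we only report its order of magnitude.
print(f"about 10^{math.floor(math.log10(nodes))} nodes")
```

Even at a billion nodes per second, a number with more than 150 digits is far out of reach, which is why the rest of the lecture looks at pruning and cut-offs.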

Types of Games

                 Perfect information       Imperfect information
Deterministic    Chess, Checkers, Go       Stratego, Battleship
Chance           Backgammon, Monopoly      Bridge, Poker, Scrabble

Game-Playing Strategy
- Our agent and the opponent play sequentially
- We assume the opponent plays perfectly
- Our agent cannot get to the optimal goal
  - The opponent won't allow it
- Our agent must find the best achievable goal

- Payoff (utility) function assigns a value to each leaf node in the tree
- Value then propagates up to non-leaf nodes
- Two players
  - MAX wants to maximise payoff
  - MIN wants to minimise payoff
  - MAX is the player currently looking for a move (i.e. at root of tree)
- Payoff function
  - Simple: 1 = win / 0 = draw / -1 = lose
  - Complex for different victory conditions
  - Win/lose for MAX
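The simple win / draw / lose payoff described above could be written as follows. This is only a sketch: tic-tac-toe is our choice of example game, and the board encoding and the names `winner` and `payoff` are assumptions made for illustration.

```python
from typing import Optional, Sequence

# Assumed encoding: 9 cells in row-major order, each "X", "O" or None.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board: Sequence[Optional[str]]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def payoff(board: Sequence[Optional[str]], max_player: str = "X") -> Optional[int]:
    """Payoff from MAX's point of view: 1 win, 0 draw, -1 loss.

    Returns None when the board is not a leaf (the game is still running).
    """
    won_by = winner(board)
    if won_by is not None:
        return 1 if won_by == max_player else -1
    if all(cell is not None for cell in board):
        return 0
    return None
```

The payoff is always expressed for MAX, so a win for the opponent is simply a payoff of -1; MIN does not need a separate function.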

[Figure: example minimax tree with a MAX root, three MIN nodes, and leaf payoffs (3, 18, 5), (1, 15, 42), (56, 2, -5)]

Game of Nim
- Initial state: 7 matches in a pile
- Each player must divide a pile into two nonempty unequal piles
- Player who can't do that, loses
- Payoff: +1 win, -1 loss

[Figure: game tree for Nim starting from 7 matches, with alternating MAX and MIN levels; leaves are labelled +1 (MAX wins) or -1 (MAX loses)]

- Generate entire game tree
- Compute payoff of leaf nodes
- For each non-leaf node, from the lowest in the tree to the root
  - If MAX level, then assign value of the child with maximum payoff
  - If MIN level, then assign value of the child with minimum payoff
- At the root, select action with maximum payoff
- The value of each node is the value of the best leaf the current player (MAX or MIN) can reach
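The procedure above can be sketched in Python for the pile-splitting game just described. This is a minimal illustration, not the lecture's own code: the state encoding (a sorted tuple of pile sizes) and the names `moves` and `minimax` are our assumptions, and the recursion computes values on the way back up instead of building the whole tree first.

```python
from functools import lru_cache
from typing import List, Tuple

State = Tuple[int, ...]  # assumed encoding: sorted pile sizes


def moves(state: State) -> List[State]:
    """States reachable by splitting one pile into two nonempty, unequal piles."""
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small < pile - small guarantees the two new piles are unequal.
        for small in range(1, (pile - 1) // 2 + 1):
            successors.add(tuple(sorted(rest + (small, pile - small))))
    return sorted(successors)


@lru_cache(maxsize=None)
def minimax(state: State, max_to_move: bool) -> int:
    """Payoff for MAX under perfect play: +1 if MAX wins, -1 if MAX loses."""
    successors = moves(state)
    if not successors:
        # Leaf: the player to move cannot split any pile and loses.
        return -1 if max_to_move else 1
    values = [minimax(s, not max_to_move) for s in successors]
    # MAX level takes the maximum child value, MIN level the minimum.
    return max(values) if max_to_move else min(values)


print(moves((7,)))          # the three opening moves
print(minimax((7,), True))  # value of the start position for MAX
```

With these rules the 7-match start position evaluates to -1: whichever opening split MAX chooses, a perfectly playing MIN can force MAX into the position with no legal split.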

- Complete, if tree is finite
- Optimal against a perfect opponent
- Time complexity = O(b^m)
- Space complexity = O(bm)
- But remember, b and m can be huge
  - For chess, b ≈ 35 and m ≈ 100

Alpha-Beta Pruning
- MAX takes the max of its children
- MIN gives each of those children the min of its own children
  max(min(3,18,5), min(1,15,42), min(56,2,-5))
- We don't need to compute the values of all the grandchildren!
  - Only until we find a value lower than the highest child's value
  max(min(3,18,5), min(1,?,?), min(56,2,?))

Alpha-Beta Pruning
- Maintain values α and β
  - α is the maximum value that MAX is assured of at any point in the search
  - β is the minimum value that MIN is assured of at any point in the search
  - Both computed using payoff propagated through the tree
- Start with α = -∞ and β = ∞
- As the search goes on, the number of possible values of α and β decreases
- When β ≤ α
  - Current path is not the result of best play by both players, so no need to explore further

Alpha-Beta Pruning
[Figure: alpha-beta trace on the example tree, showing [α, β] at each step:
 1. [-∞, ∞]   2. [-∞, ∞]   3. [-∞, 3]   4. [3, ∞]   5. [3, ∞]   6. [3, 1]   7. [3, ∞]   8. [3, 56]   9. [3, 2]
 Leaves visited: 3, 18, 5, 1, 56, 2]
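The pruning argument for the example tree can be written out as a short worked bound. This is a sketch that assumes the first leaf of the example is 3; the symbols x, y, z are ours and stand for the three leaves that never need to be evaluated.

```latex
\begin{aligned}
\max\bigl(\min(3,18,5),\ \min(1,x,y),\ \min(56,2,z)\bigr)
  &= \max\bigl(3,\ \min(1,x,y),\ \min(56,2,z)\bigr) \\
\min(1,x,y) &\le 1 < 3 \\
\min(56,2,z) &\le 2 < 3 \\
\Rightarrow\quad \text{root value} &= 3 \quad \text{for any } x,\ y,\ z
\end{aligned}
```

Each MIN node is abandoned as soon as its running minimum drops to or below the best value MAX already holds, which is the condition β ≤ α.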

Alpha-Beta Pruning
Called as rootvalue = Evaluate(root, -∞, ∞)

Evaluate(node, α, β)
  If node is leaf
    Return payoff
  If node is MAX
    v = -∞
    For each child of node
      v = max(v, Evaluate(child, α, β))
      Break if v ≥ β
      α = max(α, v)
    Return v
  If node is MIN
    v = ∞
    For each child of node
      v = min(v, Evaluate(child, α, β))
      Break if v ≤ α
      β = min(β, v)
    Return v

Alpha-Beta Pruning
- Efficiency dependent on ordering of children
  - Will check each of MAX's children until finding one with a value higher than beta
  - Will check each of MIN's children until finding one with a value lower than alpha
- Use heuristics to order the nodes to check
  - Check the highest-value children first for MAX
  - Check the lowest-value children first for MIN
- Good ordering can reduce time complexity to O(b^(d/2))
  - Random ordering gives roughly O(b^(3d/4))
  - Minimax is O(b^d)

Minimax Exercise
[Figure: exercise tree with nodes A to O]

Pruning Exercise
[Figure: the same exercise tree annotated with [α, β] intervals at each step]
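One possible Python rendering of the Evaluate procedure, together with a small experiment on child ordering, is sketched below. The nested-list tree encoding, the `visited` list and the three example orderings are our assumptions for illustration; the leaf values follow the lecture's example tree with the first leaf taken to be 3.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]  # a leaf payoff, or a list of subtrees


def evaluate(node: Tree, alpha: float, beta: float,
             is_max: bool, visited: List[int]) -> float:
    """Alpha-beta search; records every leaf it actually looks at."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, evaluate(child, alpha, beta, False, visited))
            if v >= beta:   # MIN above would never allow this branch
                break
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, evaluate(child, alpha, beta, True, visited))
        if v <= alpha:      # MAX above already has something at least as good
            break
        beta = min(beta, v)
    return v


def leaves_visited(tree: Tree) -> List[int]:
    visited: List[int] = []
    evaluate(tree, -math.inf, math.inf, True, visited)
    return visited


example = [[3, 18, 5], [1, 15, 42], [56, 2, -5]]
well_ordered = [[3, 5, 18], [1, 15, 42], [-5, 2, 56]]    # lowest child first under MIN
badly_ordered = [[42, 15, 1], [56, 2, -5], [18, 5, 3]]   # best branch searched last

for name, tree in [("example", example), ("well ordered", well_ordered),
                   ("badly ordered", badly_ordered)]:
    print(name, len(leaves_visited(tree)), "leaves visited")
```

All three trees hold the same nine leaves and have the same root value, but the number of leaves inspected changes with the ordering: the badly ordered tree prunes nothing, which is the behaviour the ordering heuristics above try to avoid.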

Imperfect Play
- Real-time or time constraints
- Chance
- Hidden information

Real-Time Games
- Sometimes we can't search the entire tree
  - Real-time games
  - Time constraints (playing against a clock)
  - Tree too big (e.g. chess)

Real-Time Games
- Evaluation function
  - Estimate value of a non-leaf node in the tree
  - Cut off search at a given level
  - Chess: count value of pieces, available moves, board configurations, etc.

Real-Time
- Generate entire game tree down to maximum number of ply
- Evaluate lowest nodes
- For each non-leaf node, from the lowest in the tree to the root
  - If MAX level, then assign value of the child with maximum payoff
  - If MIN level, then assign value of the child with minimum payoff
- At the root, select action with maximum payoff
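The depth-limited procedure above can be sketched as follows. The tiny tree, the state names and the numbers in `ESTIMATE` are all hypothetical; in a real game the evaluation function would be computed from the board and not looked up in a table.

```python
from typing import Dict, List

# Hypothetical game tree: A is the root (MAX to move), B and C are MIN nodes.
CHILDREN: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
}

# Hypothetical evaluation-function estimates for each position.
ESTIMATE: Dict[str, float] = {"B": 4, "C": 6, "D": 7, "E": 5, "F": 9, "G": 2}


def depth_limited_minimax(state: str, depth: int, is_max: bool) -> float:
    """Minimax that stops after `depth` ply and falls back on the estimate."""
    kids = CHILDREN.get(state, [])
    if depth == 0 or not kids:
        return ESTIMATE[state]
    values = [depth_limited_minimax(k, depth - 1, not is_max) for k in kids]
    return max(values) if is_max else min(values)


def choose(state: str, depth: int) -> str:
    """MAX's move at `state` when looking `depth` ply ahead (depth >= 1)."""
    return max(CHILDREN[state],
               key=lambda k: depth_limited_minimax(k, depth - 1, False))


print(choose("A", 1))  # decision after a 1-ply search
print(choose("A", 2))  # decision after a 2-ply search
```

In this made-up tree the 1-ply search prefers C because its estimate is higher, while the 2-ply search prefers B because MIN can steer C towards a poor outcome. The cut-off level changes the decision, so the quality of the evaluation function matters.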

Real-Time Alpha-Beta Pruning
Called as rootvalue = Evaluate(root, -∞, ∞)

Evaluate(node, α, β)
  If node is at lowest level
    Return evaluation
  If node is MAX
    v = -∞
    For each child of node
      v = max(v, Evaluate(child, α, β))
      Break if v ≥ β
      α = max(α, v)
    Return v
  If node is MIN
    v = ∞
    For each child of node
      v = min(v, Evaluate(child, α, β))
      Break if v ≤ α
      β = min(β, v)
    Return v

Real-Time Games: Problems
- Non-quiescent positions
  - Some board configurations cause value to change wildly
  - Solved with quiescence search
    - Expand non-quiescent boards deeper, until you reach stable quiescent boards
- Horizon effect
  - A singular move is considerably better than all others
  - But a damaging unavoidable move is (or can be pushed) just beyond the search depth limit (the "horizon")
  - Solved with singular extension
    - Expand singular state deeper

Games of Chance
- Minimax requires planning for upcoming moves
- If moves depend on dice rolls, random draws, etc., planning is impossible
- We need to add all possible outcomes in the tree!

Recall
[Figure: the example minimax tree with leaf payoffs (3, 18, 5), (1, 15, 42), (56, 2, -5)]
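The quiescence idea described above (keep expanding unstable positions past the cut-off) can be sketched on a toy tree. Everything here is hypothetical: the tree, the estimates, and in particular the hand-written `NOISY` set, which stands in for whatever test a real program uses to decide that a board is non-quiescent.

```python
from typing import Dict, List, Set

# Hypothetical tree: A is the root (MAX to move), B and C are MIN nodes.
CHILDREN: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
}
ESTIMATE: Dict[str, float] = {
    "A": 5, "B": 4, "C": 6, "D": 7, "E": 5, "F": 9, "G": 2,
}
NOISY: Set[str] = {"C"}  # assumed: positions whose estimate is unstable


def search(state: str, depth: int, is_max: bool, quiescence: bool) -> float:
    """Depth-limited minimax that, if asked, expands noisy positions deeper."""
    kids = CHILDREN.get(state, [])
    at_cutoff = depth <= 0
    unstable = quiescence and state in NOISY
    if not kids or (at_cutoff and not unstable):
        return ESTIMATE[state]
    values = [search(k, depth - 1, not is_max, quiescence) for k in kids]
    return max(values) if is_max else min(values)


def choose(state: str, depth: int, quiescence: bool) -> str:
    """MAX's move at `state` with a cut-off of `depth` ply (depth >= 1)."""
    return max(CHILDREN[state],
               key=lambda k: search(k, depth - 1, False, quiescence))


print(choose("A", 1, quiescence=False))
print(choose("A", 1, quiescence=True))
```

With a plain 1-ply cut-off the search trusts the optimistic estimate at C; with the quiescence extension it looks one level further at C only, discovers the bad reply, and picks B. The extra work is spent only where the estimate is flagged as unreliable.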

Expectiminimax
- MAX has already rolled the dice and has three possible moves
- Then, MIN rolls the dice
  - There are three possible outcomes to the roll
- And MIN picks an action based on the roll result
[Figure: expectiminimax tree; the MAX root is valued 4.45 over three chance nodes, each with three roll outcomes (probabilities including 0.8 and 0.05)]

Expectiminimax
[Figure: the same tree expanded one level further, showing the actions MIN can pick after each roll]

Problems with Expectiminimax
[Figure: the same tree with the leaf 58 changed to 800; the last chance node and the root now evaluate to 26.65]

Problems with Expectiminimax
- Time complexity: O(b^m n^m)
  - n is the number of possible outcomes of a chance node
  - Recall: minimax is O(b^m)
  - Trees can grow very large very quickly
- Minimax & pruning limits search to likely sequences of actions given perfect play
  - With randomness, there is no likely sequence of actions
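A compact sketch of expectiminimax is given below. The tuple-based node encoding and every number in the example tree are hypothetical (they are not the values from the lecture's figures); the second tree changes one rarely reached payoff to show how a single extreme leaf can flip the decision.

```python
import math

# Assumed node encoding:
#   a number                           -> leaf payoff
#   ("max", [child, ...])              -> MAX picks the largest value
#   ("min", [child, ...])              -> MIN picks the smallest value
#   ("chance", [(prob, child), ...])   -> probability-weighted average


def expectiminimax(node) -> float:
    if isinstance(node, (int, float)):
        return float(node)
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


def chance(outcomes):
    """Helper: a dice roll whose outcomes each lead to a MIN decision."""
    return ("chance", [(p, ("min", leaves)) for p, leaves in outcomes])


# Hypothetical payoffs; the probabilities 0.8 / 0.15 / 0.05 are assumed.
move_a = chance([(0.8, [10, 12]), (0.15, [0, 4]), (0.05, [-20, 6])])
move_b = chance([(0.8, [5, 9]), (0.15, [20, 30]), (0.05, [40, 60])])
root = ("max", [move_a, move_b])

# Same tree, but the rare outcome of move A now has a very large payoff.
move_a_big = chance([(0.8, [10, 12]), (0.15, [0, 4]), (0.05, [100, 120])])
root_big = ("max", [move_a_big, move_b])

print(expectiminimax(move_a), expectiminimax(move_b), expectiminimax(root))
print(expectiminimax(move_a_big), expectiminimax(root_big))
```

In the first tree move B has the higher expected value; after changing one leaf that is reached only 5% of the time, move A wins instead. Expected values depend on the magnitudes of the payoffs and not only on their order, unlike plain minimax.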

Imperfect Information
- Algorithms so far require knowing everything about the game
- In some games, information about the opponent is hidden
  - Cards in poker, pieces in Stratego, etc.
- We could approximate hidden information as random events
  - The probability that the opponent has a flush, the probability that a piece is a bomb, etc.
- Then use expectiminimax to get best action

Imperfect Information
[Figure: road 1 and road 2, with road 2 forking into branches a and b]
- List all possible outcomes, then average best action overall
- Can lead to irrational behaviour!
- Possible cases:
  - Road 1 leads to money, road 2-a leads to gold, road 2-b leads to death (rational action is road 2, then a)
  - Road 1 leads to money, road 2-a leads to death, road 2-b leads to gold (rational action is road 2, then b)
- But the real situation is:
  - Road 1 leads to money, road 2 leads to gold or death (rational action is road 1)

Imperfect Information
- It's a useful approximation, but it's not exact!
- Pros:
  - Works in many cases
  - Doesn't require new techniques to handle information discovery
- Cons:
  - In reality, hidden information is not the same as random events
  - Can lead to irrational behaviour

Imperfect Information
- Need to handle information
  - Gather information
  - Plan based on what information we will have at a given point in the future
- Leads to more rational behaviour
  - Acting to gain information
  - Acting to give information to partners
  - Acting to conceal information from the opponents
- We will learn to do that later in the course
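The road example above can be made concrete with a few lines of Python. The utilities for money, gold and death and the assumption that the two hidden cases are equally likely are ours, chosen only to illustrate why averaging the best action per case overrates road 2.

```python
from typing import Dict, List

MONEY, GOLD, DEATH = 1, 10, -100  # hypothetical utilities

# Two equally likely hidden worlds: which branch of road 2 holds the gold.
WORLDS: List[Dict[str, int]] = [
    {"a": GOLD, "b": DEATH},
    {"a": DEATH, "b": GOLD},
]


def value_if_world_known(road: int, world: Dict[str, int]) -> float:
    """Best outcome when the agent could see the hidden information."""
    if road == 1:
        return MONEY
    return max(world["a"], world["b"])


def averaged_value(road: int) -> float:
    """The approximation: average, over worlds, of the best action per world."""
    return sum(value_if_world_known(road, w) for w in WORLDS) / len(WORLDS)


def true_value(road: int) -> float:
    """Value when the branch must be picked without knowing the world."""
    if road == 1:
        return MONEY
    return max(sum(w[branch] for w in WORLDS) / len(WORLDS)
               for branch in ("a", "b"))


for road in (1, 2):
    print(road, averaged_value(road), true_value(road))
```

The averaged estimate rates road 2 as highly as gold itself, because in each individual case the agent is assumed to know which branch is safe. Once the branch has to be chosen blind, road 2 is a coin flip between gold and death and road 1 is the rational choice.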

IBM Deep Blue
- First chess computer to defeat a reigning world champion (Garry Kasparov) under normal chess tournament constraints, in 1997
- Relied on brute hardware search power
  - 30 processors for the search
  - 480 custom VLSI chess processors for move generation and ordering, and leaf node evaluation

IBM Deep Blue
- Searched a minimax tree
  - 100-200M states per second, maximum 330M
  - Average 6 to 16 ply, maximum 40 ply
  - Decide which moves are worth expanding, giving priority to singular expansion and chess threats
- Null-window alpha-beta pruning
  - Alpha-beta pruning but limited to a window of moves rather than the entire tree
  - Faster and easier to implement on hardware
  - Approximate, can only return bounds on the minimax value
  - Allows for a highly non-uniform, more selective and human-like search of the tree

IBM Deep Blue
- Two board evaluation heuristics
  - Fast evaluation to get a quick approximate value
    - Considers piece position value
  - Slow evaluation to get an exact value
    - Considers 8,000 features
    - Includes common chess concepts and specific Kasparov strategies
    - Features have programmable weights learned automatically from 700,000 grandmaster games and fine-tuned manually by a chess grandmaster

Assumptions
- Utility-based agent
- Environment
  - Fully observable
  - Deterministic
  - Sequential
  - Static
  - Discrete / Continuous
  - Single agent

Assumptions Updated
- Utility-based agent
- Environment
  - Fully observable / Partially observable (approximation)
  - Deterministic / Strategic / Stochastic
  - Sequential
  - Static / Semi-dynamic
  - Discrete / Continuous
  - Single agent / Multi-agent