Artificial Intelligence 1: game playing


Lecturer: Tom Lenaerts, Institut de Recherches Interdisciplinaires et de Développements en Intelligence Artificielle (IRIDIA), Université Libre de Bruxelles

Outline
What are games?
Optimal decisions in games: which strategy leads to success?
α-β pruning
Games of imperfect information
Games that include an element of chance

TLo (IRIDIA)

What are games, and why study them?
Games are a form of multi-agent environment: what do the other agents do, and how do they affect our success? Multi-agent environments can be cooperative or competitive. Competitive multi-agent environments give rise to adversarial problems, a.k.a. games.
Why study games? They are fun and historically entertaining; they are an interesting subject of study because they are hard; and they are easy to represent, with agents restricted to a small number of actions.

Relation of games to search
Search (no adversary): the solution is a (heuristic) method for finding a goal. Heuristic and CSP techniques can find an optimal solution. The evaluation function estimates the cost from start to goal through a given node. Examples: path planning, scheduling activities.
Games (adversary): the solution is a strategy, which specifies a move for every possible opponent reply. Time limits force an approximate solution. The evaluation function evaluates the goodness of a game position. Examples: chess, checkers, Othello, backgammon.

Types of games

Game setup
Two players: MAX and MIN. MAX moves first, and they take turns until the game is over. The winner gets a reward, the loser a penalty.
Games as search:
Initial state: e.g. the board configuration in chess.
Successor function: a list of (move, state) pairs specifying the legal moves.
Terminal test: is the game finished?
Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe (next).
MAX uses the search tree to determine its next move.

Partial game tree for tic-tac-toe

Optimal strategies
Find the contingent strategy for MAX assuming an infallible MIN opponent. Assumption: both players play optimally! Given a game tree, the optimal strategy can be determined from the minimax value of each node:

MINIMAX-VALUE(n) =
  UTILITY(n)                                         if n is a terminal node
  max over s in SUCCESSORS(n) of MINIMAX-VALUE(s)    if n is a MAX node
  min over s in SUCCESSORS(n) of MINIMAX-VALUE(s)    if n is a MIN node

Two-ply game tree

Two-ply game tree: the minimax decision. Minimax maximizes the worst-case outcome for MAX.

What if MIN does not play optimally?
The definition of optimal play for MAX assumes MIN plays optimally: it maximizes the worst-case outcome for MAX. If MIN does not play optimally, MAX will do at least as well, and usually better. [This can be proved.]

Minimax algorithm

function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do v ← MAX(v, MIN-VALUE(s))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do v ← MIN(v, MAX-VALUE(s))
  return v
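As a concrete sketch, the pseudocode above can be written in Python. The `Node` class is a hypothetical stand-in for a real game's state interface (`is_terminal`, `utility`, `successors`); the tree below is the standard two-ply example.

```python
class Node:
    """Minimal stand-in for a game state (hypothetical interface)."""
    def __init__(self, value=0, children=()):
        self.value, self.children = value, list(children)
    def is_terminal(self):
        return not self.children
    def utility(self):
        return self.value
    def successors(self):            # list of (action, state) pairs
        return list(enumerate(self.children))

def minimax_decision(state):
    """Return the action whose successor has the highest minimax value."""
    return max(state.successors(), key=lambda pair: min_value(pair[1]))[0]

def max_value(state):
    if state.is_terminal():
        return state.utility()
    return max(min_value(s) for _, s in state.successors())

def min_value(state):
    if state.is_terminal():
        return state.utility()
    return min(max_value(s) for _, s in state.successors())

# Two-ply example: MAX chooses among three MIN nodes.
root = Node(children=[
    Node(children=[Node(3), Node(12), Node(8)]),
    Node(children=[Node(2), Node(4), Node(6)]),
    Node(children=[Node(14), Node(5), Node(2)]),
])
print(max_value(root))         # 3
print(minimax_decision(root))  # 0 (the leftmost move)
```

The MIN nodes back up 3, 2 and 2 respectively, so MAX's minimax value is 3 and the leftmost move is chosen.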

Properties of minimax
Complete? Yes (for finite trees)
Time: O(b^m)
Space: O(bm) (depth-first exploration)
Optimal? Yes (against an optimal opponent)

Multiplayer games
Games can allow more than two players; the single minimax value then becomes a vector of values, one per player.

Problem of minimax search
The number of game states is exponential in the number of moves. Solution: do not examine every node ⇒ alpha-beta pruning.
Alpha = the value of the best choice found so far at any choice point along the path for MAX.
Beta = the value of the best choice found so far at any choice point along the path for MIN.
Revisit the example.

Alpha-beta example
Do depth-first search until the first leaf. Range of possible values at each node: [-∞, +∞].

Alpha-beta example (continued)
After the first MIN node's leaves are seen, its range narrows from [-∞, +∞] to [-∞, 3] and then closes to [3, 3], so the root becomes [3, +∞].
The second MIN node's first leaf gives [-∞, 2]: this node is worse for MAX, so its remaining successors are pruned.
The third MIN node narrows from [-∞, 14] (root [3, 14]) to [-∞, 5] (root [3, 5]) and finally to [2, 2], leaving the root at [3, 3]: the minimax decision is the first move.

Alpha-beta algorithm

function ALPHA-BETA-SEARCH(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state, -∞, +∞)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← -∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
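A runnable sketch of the same algorithm, again using a hypothetical minimal `Node` state interface; on the example tree the second MIN node is pruned after its first leaf.

```python
import math

class Node:
    """Minimal stand-in for a game state (hypothetical interface)."""
    def __init__(self, value=0, children=()):
        self.value, self.children = value, list(children)
    def is_terminal(self):
        return not self.children
    def utility(self):
        return self.value
    def successors(self):
        return list(enumerate(self.children))

def alpha_beta_search(state):
    """Pick the root action with the best MIN-value, pruning as we go."""
    best_action, alpha, beta = None, -math.inf, math.inf
    for action, s in state.successors():
        v = min_value(s, alpha, beta)
        if v > alpha:
            alpha, best_action = v, action
    return best_action

def max_value(state, alpha, beta):
    if state.is_terminal():
        return state.utility()
    v = -math.inf
    for _, s in state.successors():
        v = max(v, min_value(s, alpha, beta))
        if v >= beta:            # MIN above will never let play reach here
            return v
        alpha = max(alpha, v)
    return v

def min_value(state, alpha, beta):
    if state.is_terminal():
        return state.utility()
    v = math.inf
    for _, s in state.successors():
        v = min(v, max_value(s, alpha, beta))
        if v <= alpha:           # MAX above already has something better
            return v
        beta = min(beta, v)
    return v

root = Node(children=[
    Node(children=[Node(3), Node(12), Node(8)]),
    Node(children=[Node(2), Node(4), Node(6)]),   # pruned after seeing 2
    Node(children=[Node(14), Node(5), Node(2)]),
])
print(max_value(root, -math.inf, math.inf))  # 3
print(alpha_beta_search(root))               # 0
```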

General alpha-beta pruning
Consider a node n somewhere in the tree. If the player has a better choice at the parent node of n, or at any choice point further up, then n will never be reached in actual play. Hence, once enough is known about n, it can be pruned.

Final comments about alpha-beta pruning
Pruning does not affect the final result, and entire subtrees can be pruned. Good move ordering improves the effectiveness of pruning: with perfect ordering, the time complexity is O(b^(m/2)), an effective branching factor of √b. Alpha-beta pruning can thus look ahead twice as far as minimax in the same amount of time. Repeated states are again possible; store them in memory in a transposition table.

Games of imperfect information
Minimax and alpha-beta pruning require too many leaf-node evaluations, which may be impractical within a reasonable amount of time. Shannon (1950) proposed:
Cut off the search earlier (replace TERMINAL-TEST by CUTOFF-TEST).
Apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta).

Cutting off search
Change:
  if TERMINAL-TEST(state) then return UTILITY(state)
into:
  if CUTOFF-TEST(state, depth) then return EVAL(state)
This introduces a fixed depth limit. The limit is selected so that the amount of time used will not exceed what the rules of the game allow. When the cutoff occurs, the heuristic evaluation is performed.
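A minimal sketch of this change, assuming a hypothetical node interface where `value` serves as the true UTILITY at leaves and as the heuristic EVAL estimate at non-terminal nodes:

```python
class Node:
    """Hypothetical game node: `value` is UTILITY at leaves, EVAL elsewhere."""
    def __init__(self, value=0, children=()):
        self.value, self.children = value, list(children)
    def is_terminal(self):
        return not self.children

def cutoff_value(node, depth, limit, maximizing):
    """Depth-limited minimax: CUTOFF-TEST replaces TERMINAL-TEST."""
    if node.is_terminal():
        return node.value                      # true UTILITY
    if depth >= limit:
        return node.value                      # heuristic EVAL at the cutoff
    values = [cutoff_value(c, depth + 1, limit, not maximizing)
              for c in node.children]
    return max(values) if maximizing else min(values)

# MAX root; each MIN child carries a heuristic estimate and two true leaves.
root = Node(children=[
    Node(value=5, children=[Node(3), Node(9)]),   # EVAL says 5, true value 3
    Node(value=1, children=[Node(8), Node(7)]),   # EVAL says 1, true value 7
])
print(cutoff_value(root, 0, 1, True))  # 5  (the cutoff trusts EVAL)
print(cutoff_value(root, 0, 2, True))  # 7  (deep enough to see the truth)
```

The example also shows why EVAL quality matters: with a shallow cutoff, a misleading estimate changes the chosen move.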

Heuristic EVAL
Idea: produce an estimate of the expected utility of the game from a given position. Performance depends on the quality of EVAL. Requirements:
EVAL should order terminal nodes in the same way as UTILITY.
The computation must not take too long.
For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
It is only reliable for quiescent states (no wild swings in value in the near future).

Heuristic EVAL example: a weighted linear sum of features
Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)

The addition assumes the features are independent of one another.

Heuristic difficulties
Example: a heuristic that simply counts pieces won can misjudge positions, since positions with equal material may have very different prospects.
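For illustration, a material-count EVAL for chess written as this weighted linear sum. The weights below (pawn 1, knight/bishop 3, rook 5, queen 9) are the classic textbook choice, and each feature f_i(s) is the difference in piece counts; both are illustrative assumptions, not part of the slides.

```python
# Illustrative weights; f_i(s) is (own pieces - opponent pieces) of each type.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def eval_material(features):
    """Eval(s) = w1*f1(s) + ... + wn*fn(s), assuming independent features."""
    return sum(WEIGHTS[piece] * diff for piece, diff in features.items())

# One pawn up, a knight traded for a rook: 1*1 + 3*(-1) + 5*1 = 3
print(eval_material({"pawn": 1, "knight": -1, "rook": 1}))  # 3
```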

Horizon effect
A fixed-depth search may think it can avoid a bad event (e.g. the opponent's queening move) when it is merely pushing it beyond the search horizon.

Games that include chance
In backgammon, the possible moves here are (5-10, 5-11), (5-11, 19-24), (5-10, 10-16) and (5-11, 11-16).

Games that include chance: chance nodes
Chance nodes are added to the game tree for the dice rolls. The rolls [1,1] and [6,6] each have probability 1/36; every other roll has probability 1/18. With chance nodes we cannot calculate a definite minimax value, only an expected value.

Expected minimax value

EXPECTED-MINIMAX-VALUE(n) =
  UTILITY(n)                                                       if n is a terminal node
  max over s in SUCCESSORS(n) of EXPECTED-MINIMAX-VALUE(s)         if n is a MAX node
  min over s in SUCCESSORS(n) of EXPECTED-MINIMAX-VALUE(s)         if n is a MIN node
  sum over s in SUCCESSORS(n) of P(s) · EXPECTED-MINIMAX-VALUE(s)  if n is a chance node

These equations can be backed up recursively all the way to the root of the game tree.

Position evaluation with chance nodes
In the left tree A1 wins; in the right tree A2 wins. With chance nodes, the move chosen by the evaluation function may change when its values are rescaled: behavior is preserved only under a positive linear transformation of EVAL.
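The backed-up computation can be sketched directly from these equations. The `N` node class and the two-gamble example below are hypothetical illustrations, not from the slides.

```python
class N:
    """Hypothetical game-tree node: kind is 'leaf', 'max', 'min' or 'chance'."""
    def __init__(self, kind, value=0, children=(), probs=()):
        self.kind, self.value = kind, value
        self.children, self.probs = list(children), list(probs)

def expected_minimax(node):
    if node.kind == "leaf":
        return node.value                       # UTILITY(n)
    values = [expected_minimax(c) for c in node.children]
    if node.kind == "max":
        return max(values)
    if node.kind == "min":
        return min(values)
    # chance node: probability-weighted sum of successor values
    return sum(p * v for p, v in zip(node.probs, values))

# MAX chooses between two chance nodes (two different gambles).
left  = N("chance", children=[N("leaf", 2), N("leaf", 8)],  probs=[0.5, 0.5])    # EV 5.0
right = N("chance", children=[N("leaf", 0), N("leaf", 12)], probs=[0.75, 0.25])  # EV 3.0
print(expected_minimax(N("max", children=[left, right])))  # 5.0
```

Note that scaling the leaf values non-linearly (say, squaring them) would flip MAX's choice here, which is exactly the positive-linear-transformation point above.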

Discussion
Examine the section on state-of-the-art game programs yourself. Minimax assumes the right tree is better than the left; one could instead return a probability distribution over possible values, yet that is an expensive calculation.

Discussion (continued)
Utility of node expansion: only expand those nodes which lead to significantly better moves. Both suggestions require meta-reasoning.

Summary
Games are fun (and dangerous). They illustrate several important points about AI:
Perfection is unattainable, so we must approximate.
It is a good idea to think about what to think about.
Uncertainty constrains the assignment of values to states.
Games are to AI as Grand Prix racing is to automobile design.