Solving Problems by Searching: Adversarial Search


Course 440: Introduction to Artificial Intelligence
Lecture 5: Solving Problems by Searching: Adversarial Search
Abdeslam Boularias
Friday, October 7, 2016

Outline

We examine the problems that arise when we make decisions in a world where other agents are also acting, possibly against us.

1. The minimax algorithm
2. Alpha-beta pruning
3. Imperfect fast decisions
4. Stochastic games
5. Partially observable games

Games

Multiagent environments are environments where more than one agent is acting, simultaneously or at different times. Contingency plans are necessary to account for the unpredictability of the other agents. Each agent has its own utility function, and the corresponding decision-making problem is called a game. A game is competitive if the utilities of the different agents are maximized in different states. In zero-sum games, the sum of the utilities of all agents is constant; zero-sum games are purely competitive.

Games

The abstract nature of games, such as chess, makes them appealing to study in AI. The state of a game is easy to represent, and agents typically have a small number of actions to choose from.

[Photographs. Left: computer chess pioneers Herbert Simon and Allen Newell (1958). Right: John McCarthy and the Kotok-McCarthy program on an IBM 7090 (1967).]

Games

Physical games, such as soccer, are more difficult to study due to their continuous state and action spaces.

[Photograph: robot soccer (from ri.cmu.edu).]

Games

A game is described by:
- S0: the initial state (how is the game set up at the start?).
- PLAYER(s): indicates which player has the move in state s (whose turn is it?).
- ACTIONS(s): the set of legal actions in state s.
- RESULT(s, a): returns the next state after action a is played in state s.
- TERMINAL-TEST(s): indicates whether s is a terminal state.
- UTILITY(s, p): (also called the objective or payoff function) defines a numerical value for a game that ends in terminal state s for player p.
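As a minimal sketch (not from the lecture), the six components above can be instantiated for tic-tac-toe. The function and variable names simply mirror the slide's notation; the state encoding (a 9-tuple board plus the player to move) is my own assumption.

```python
# Sketch of the formal game description for tic-tac-toe.
# A state is (board, player): a 9-tuple of "X"/"O"/None, and "MAX" or "MIN".

S0 = ((None,) * 9, "MAX")  # initial state: empty 3x3 board, MAX ("X") moves first

def PLAYER(s):
    """Which player has the move in state s."""
    return s[1]

def ACTIONS(s):
    """Legal actions: indices of empty squares."""
    board, _ = s
    return [i for i in range(9) if board[i] is None]

def RESULT(s, a):
    """Next state after playing action a in state s."""
    board, player = s
    mark = "X" if player == "MAX" else "O"
    new_board = board[:a] + (mark,) + board[a + 1:]
    return (new_board, "MIN" if player == "MAX" else "MAX")

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for i, j, k in LINES:
        if board[i] is not None and board[i] == board[j] == board[k]:
            return board[i]
    return None

def TERMINAL_TEST(s):
    board, _ = s
    return winner(board) is not None or all(c is not None for c in board)

def UTILITY(s, p="MAX"):
    """+1/-1/0 for a win/loss/draw from player p's point of view."""
    w = winner(s[0])
    if w is None:
        return 0
    value = 1 if w == "X" else -1
    return value if p == "MAX" else -value
```

For example, after the move sequence 0, 3, 1, 4, 2 (MAX completes the top row), TERMINAL_TEST holds and UTILITY is +1 for MAX.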

Example: tic-tac-toe

We suppose there are two players in a zero-sum game. We call our player MAX: she tries to maximize our utility. We call our opponent MIN: she tries to minimize our utility (i.e., maximize her own utility).

[Figure: the partial game tree of tic-tac-toe, with MAX playing X and MIN playing O. MAX moves at the root, MIN at the next level, and so on down to terminal states with utility −1, 0, or +1.]

Optimal games

[Figure: a two-ply game tree. The root is a MAX node A with actions a1, a2, a3 leading to MIN nodes B, C, D, whose leaves have utilities (3, 12, 8), (2, 4, 6), and (14, 5, 2) respectively. The backed-up values of B, C, D are 3, 2, and 2, and the value of the root is 3. Upward-pointing triangles indicate states where MAX should play; downward-pointing triangles indicate states where MIN should play.]

Which action among {a1, a2, a3} should MAX play?

Minimax strategy

(Same example tree: MAX node A with actions a1, a2, a3 leading to MIN nodes B, C, D, whose leaves have utilities (3, 12, 8), (2, 4, 6), and (14, 5, 2).)

We don't take any risk: we assume that MIN will play optimally, and we look for the best action under the worst possible scenario.

What if our opponent is not optimal? Can we learn the opponent's behaviour? What if our opponent is trying to fool us by behaving in a certain way?

Minimax strategy

The minimax value of a state is defined recursively:

MINIMAX(s) =
  UTILITY(s)                                    if TERMINAL-TEST(s) = true,
  max_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MAX,
  min_{a ∈ ACTIONS(s)} MINIMAX(RESULT(s, a))    if PLAYER(s) = MIN.

Minimax algorithm

function MINIMAX-DECISION(state) returns an action
  return argmax_{a ∈ ACTIONS(state)} MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v
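The pseudocode above can be sketched in Python on an explicit game tree. This is my own encoding, not the lecture's code: terminal states are plain numbers (their utilities), internal nodes are lists of children, and the example tree is the one from the slides.

```python
# Minimax on an explicit game tree: leaves are utilities,
# internal nodes are lists of child subtrees.

def max_value(node):
    if isinstance(node, (int, float)):        # terminal: return UTILITY(s)
        return node
    return max(min_value(child) for child in node)

def min_value(node):
    if isinstance(node, (int, float)):
        return node
    return min(max_value(child) for child in node)

def minimax_decision(actions):
    """actions: list of (name, subtree) pairs available to MAX at the root."""
    return max(actions, key=lambda pair: min_value(pair[1]))[0]

# The slides' example: MIN nodes B, C, D under the MAX root.
tree = [("a1", [3, 12, 8]), ("a2", [2, 4, 6]), ("a3", [14, 5, 2])]
best = minimax_decision(tree)   # "a1": min(3,12,8) = 3 beats min(2,4,6) = 2 and min(14,5,2) = 2
```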

Minimax strategy in multiplayer games

[Figure: a game tree for three players, where each node is annotated with a utility vector; the backed-up value at the root is (1, 2, 6).]

We have three players, A, B, and C. The utilities are represented by a 3-dimensional vector (v_A, v_B, v_C). We apply the same principle: assume that every player plays optimally. If the game is not zero-sum, implicit collaborations may occur.

Alpha-beta pruning

Time is a major issue in game search trees. Searching the complete tree takes O(b^m) operations, where b is the branching factor and m is the depth of the tree (the horizon). Do we really need to parse the whole tree to find a minimax strategy?

(Same example tree: MAX node A over MIN nodes B, C, D with leaf utilities (3, 12, 8), (2, 4, 6), and (14, 5, 2).)

Alpha-beta pruning: example

Assume that the search tree has been parsed except for actions c2 and c3, and denote the utilities of c2 and c3 by x and y respectively. Then

MINIMAX(root) = max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))
             = max(3, min(2, x, y), 2)
             = max(3, z, 2), where z = min(2, x, y) ≤ 2
             = 3.

The value of the root does not depend on x and y, so the leaves c2 and c3 never need to be examined.

Alpha-beta pruning

[Figure: stages (a)-(f) of the alpha-beta computation on the example tree. Each node is annotated with its current interval of possible values, starting from [−∞, +∞]. B's interval narrows to [3, 3], C is abandoned as soon as its interval drops to [−∞, 2], and D's interval narrows through [−∞, 14] to [2, 2], giving the root the value 3.]
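The pruning stages above can be sketched in code. This is my own compact version on the same explicit-tree encoding as before (leaves are numbers, internal nodes are lists); a `visited` list records leaf evaluations so the pruning of c2 and c3 (the leaves 4 and 6) is observable.

```python
# Alpha-beta search on an explicit game tree, counting leaf visits.

def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    if isinstance(node, (int, float)):        # terminal leaf
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:
                return v      # beta cutoff: MIN will never let us reach here
            alpha = max(alpha, v)
        return v
    else:
        v = float("inf")
        for child in node:
            v = min(v, alphabeta(child, alpha, beta, True, visited))
            if v <= alpha:
                return v      # alpha cutoff: MAX already has a better option
            beta = min(beta, v)
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
value = alphabeta(tree, visited=seen)
# value == 3, and the leaves 4 and 6 under C are never visited
```

After evaluating B (value 3), the root has α = 3; inside C, the first leaf 2 gives v = 2 ≤ α, so the remaining leaves of C are cut off, exactly as in the example on the previous slide.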

Imperfect fast decisions

The minimax algorithm generates the entire search tree. The alpha-beta algorithm allows us to prune large parts of the search tree, but its complexity is still exponential in the depth of the tree. This is still impractical, because moves must be made very quickly.

Cutting off the search: a cutoff test is used to decide when to stop looking further, and a heuristic evaluation function is used to estimate the utility where the search is cut off.

H-MINIMAX(s, d) =
  EVAL(s)                                               if CUTOFF-TEST(s, d) = true,
  max_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1)   if PLAYER(s) = MAX,
  min_{a ∈ ACTIONS(s)} H-MINIMAX(RESULT(s, a), d + 1)   if PLAYER(s) = MIN.
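As a sketch (mine, under stated assumptions), H-MINIMAX can be written generically over the game-description functions, with a fixed depth limit as the cutoff test and a caller-supplied EVAL. On the explicit-tree example, a crude heuristic (average of the leaves below a node) stands in for EVAL.

```python
# Depth-limited H-MINIMAX over a generic game description.
# The cutoff test here is simply "depth >= limit" (an assumption).

def h_minimax(state, depth, limit, eval_fn, actions, result,
              terminal_test, utility, is_max):
    if terminal_test(state):
        return utility(state)
    if depth >= limit:                 # CUTOFF-TEST(s, d)
        return eval_fn(state)          # EVAL(s): estimated utility
    values = (h_minimax(result(state, a), depth + 1, limit, eval_fn,
                        actions, result, terminal_test, utility, not is_max)
              for a in actions(state))
    return max(values) if is_max else min(values)

# Toy instantiation on an explicit tree (leaves are utilities).
def leaves(node):
    if isinstance(node, (int, float)):
        return [node]
    return [x for child in node for x in leaves(child)]

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
args = dict(eval_fn=lambda n: sum(leaves(n)) / len(leaves(n)),  # crude EVAL
            actions=lambda n: range(len(n)),
            result=lambda n, a: n[a],
            terminal_test=lambda n: isinstance(n, (int, float)),
            utility=lambda n: n)

full = h_minimax(tree, 0, 10, is_max=True, **args)     # cutoff never fires: 3
shallow = h_minimax(tree, 0, 1, is_max=True, **args)   # EVAL at depth 1: max of averages
```

With the deep limit the cutoff never triggers and the result matches plain minimax (3); with limit 1 the search returns the maximum of the heuristic estimates of B, C, D instead.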

Evaluation functions

Human chess players have ways of judging the value of a position without imagining all the moves ahead until a checkmate. A good evaluation function should order the actions correctly according to their true utilities, and it should be computed very quickly.

Example: evaluation functions in chess

Evaluation functions use features of given positions in the game, for example the number of pawns in a position. If we know from experience that 72% of "two pawns vs. one pawn" positions lead to a win (utility +1), 20% to a loss (0), and 8% to a draw (1/2), then the expected value of these positions is 0.72 × 1 + 0.20 × 0 + 0.08 × 0.5 = 0.76.

Other features f_i, such as the advantage in each piece type, good pawn structure, and king safety, can also be used. The evaluation function can then be given as a weighted linear model

EVAL(s) = Σ_{i=1}^{n} w_i f_i(s),

where w_i is the importance (weight) of feature f_i.
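The weighted linear model can be sketched in a few lines. The feature names and weight values below are illustrative assumptions of mine, not from the lecture.

```python
# A toy weighted-linear evaluation: EVAL(s) = sum_i w_i * f_i(s).

def linear_eval(features, weights):
    """Weighted sum of feature values; features and weights share keys."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical weights and feature values for one chess position:
weights = {"material": 1.0, "mobility": 0.1, "king_safety": 0.5}
features = {"material": 2.0,       # e.g. up two pawns
            "mobility": 5.0,       # five more legal moves than the opponent
            "king_safety": -1.0}   # slightly exposed king

score = linear_eval(features, weights)   # 1.0*2 + 0.1*5 + 0.5*(-1) = 2.0
```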

Example: evaluation functions in chess

Linear models assume that the features are independent, which is not always true (e.g., bishops are more efficient in the endgame). The values of some features also do not increase linearly (two knights are far more useful than one knight).

[Figure: two chess positions, both with White to move, in which the two players have the same number of pieces. The position on the right is much worse for Black than the one on the left. What if the search cutoff happens in the position on the left?]

Stochastic games

In some games, the state of the game changes randomly depending on the selected actions. Backgammon is a typical game that combines luck and skill.

[Figure: a backgammon board, with points numbered 1-24.]

Stochastic games

We can use the same minimax strategy, but we need to take the randomness into account by computing expected utilities.

[Figure: a game tree for backgammon that alternates MAX, CHANCE, MIN, and CHANCE levels. Each chance node branches over the 21 distinct dice rolls, with probability 1/36 for doubles such as (1,1) or (6,6) and 1/18 for the other rolls such as (1,2) or (6,5), down to terminal utilities.]

Stochastic games

The minimax value generalizes to the expectiminimax value:

EXPECTIMINIMAX(s) =
  UTILITY(s)                                            if TERMINAL-TEST(s) = true,
  max_{a ∈ ACTIONS(s)} EXPECTIMINIMAX(RESULT(s, a))     if PLAYER(s) = MAX,
  min_{a ∈ ACTIONS(s)} EXPECTIMINIMAX(RESULT(s, a))     if PLAYER(s) = MIN,
  Σ_r P(r) EXPECTIMINIMAX(RESULT(s, r))                 if PLAYER(s) = CHANCE,

where r is a chance event (e.g., a dice roll).
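The recursion above can be sketched on explicit trees. The node encoding (tagged tuples, with probability-child pairs under chance nodes) is my own assumption; the small example mirrors the kind of two-action, two-outcome chance tree used on the next slide.

```python
# Expectiminimax on an explicit tree. Internal nodes are tagged:
#   ("max", [children]), ("min", [children]),
#   ("chance", [(prob, child), ...]); leaves are utilities.

def expectiminimax(node):
    if isinstance(node, (int, float)):
        return node                                   # UTILITY(s)
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over chance events r
    return sum(p * expectiminimax(c) for p, c in children)

# Two moves, each leading to a chance node with outcomes of
# probability 0.9 and 0.1 over backed-up MIN values:
game = ("max", [("chance", [(0.9, 2), (0.1, 3)]),
                ("chance", [(0.9, 1), (0.1, 4)])])
best_value = expectiminimax(game)   # max(2.1, 1.3) ≈ 2.1
```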

Trouble with evaluation functions in stochastic games

[Figure: two game trees that are identical except for the scale of the leaf values, with chance-node probabilities 0.9 and 0.1. Leaves (2, 3) and (1, 4) give expectations 2.1 and 1.3, so action a1 is preferred; the rescaled leaves (20, 30) and (1, 400), in the same order, give expectations 21 and 40.9, so a2 becomes preferred. Unlike minimax, expectiminimax is sensitive not only to the ordering of the evaluation values but also to their magnitudes.]
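This sensitivity is easy to check numerically; the leaf values and probabilities below are the ones read off the figure.

```python
# An order-preserving rescaling of the leaves flips which action
# a chance-node expectation prefers.

def expected(probabilities, values):
    """Expected value of `values` under `probabilities`."""
    return sum(p * v for p, v in zip(probabilities, values))

probs = (0.9, 0.1)

# Original leaves: a1 over (2, 3), a2 over (1, 4).
a1 = expected(probs, (2, 3))       # 2.1
a2 = expected(probs, (1, 4))       # 1.3  -> a1 preferred

# Rescaled leaves keep the same ordering: a1 over (20, 30), a2 over (1, 400).
b1 = expected(probs, (20, 30))     # 21
b2 = expected(probs, (1, 400))     # 40.9 -> a2 preferred
```

A minimax agent would choose the same action in both trees, since only the ordering of the leaves matters to it; the expectation does not.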

Partially observable games

In some games, the state of the game is not fully known. Card games and Kriegspiel are examples of such games. The state of the game can be tracked by remembering past actions and observations.

[Figure: a Kriegspiel position on files a-d and ranks 1-4, where White proposes moves and a referee announces the outcomes: "Kc3?" OK; "Illegal"; "Rc3?" OK; "Check".]