Foundations of AI. 6. Adversarial Search. Search Strategies for Games, Games with Chance, State of the Art. Wolfram Burgard & Bernhard Nebel


Contents
- Game Theory
- Board Games
- Minimax Search
- Alpha-Beta Search
- Games with an Element of Chance
- State of the Art

Games & Game Theory
- When there is more than one agent, the future is no longer easily predictable for the agent.
- In competitive environments (when there are conflicting goals), adversarial search becomes necessary.
- Mathematical game theory provides the theoretical framework (even for non-competitive environments).
- In AI, we usually consider only a special type of game, namely board games, which can be characterized in game-theoretic terminology as extensive, deterministic, two-player, zero-sum games with perfect information.

Why Board Games?
- Board games are one of the oldest branches of AI (Shannon, Turing, Wiener, around 1950).
- Board games present a very abstract and pure form of competition between two opponents and clearly require a form of intelligence.
- The states of a game are easy to represent.
- The possible actions of the players are well defined.
- This allows realizing the game as a search problem: the world states are fully accessible.
- It is nonetheless a contingency problem, because the characteristics of the opponent are not known in advance.
Note: Nowadays, we also consider sports games.

Problems
Board games are not only difficult because they are contingency problems, but also because the search trees can become astronomically large.
Examples:
- Chess: On average 35 possible actions from every position and about 100 moves per game, giving 35^100 nodes in the search tree (although there are only approx. 10^40 legal chess positions).
- Go: On average 200 possible actions and approx. 300 moves, giving 200^300 nodes.

What are Our Goals?
Good game programs try to
- look ahead as many moves as possible,
- delete irrelevant branches of the game tree, and
- use good evaluation functions to estimate how good a position is.

Terminology of Two-Person Board Games
- Players are MAX and MIN, where MAX begins.
- Initial position (e.g., the board arrangement).
- Operators are the legal moves.
- Termination test determines when the game is over (terminal state = game over).
- Utility function computes the value of a terminal state, e.g., -1, 0, or +1.
- Strategy: In contrast to regular search, where a path from start to goal is a solution, MAX must come up with a strategy that reaches a terminal state regardless of what MIN does, i.e., correct reactions to all of MIN's moves.
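
These ingredients can be captured in a small interface. The following Python sketch is purely illustrative (the names Game, moves, result, etc. are our own, not from the slides):

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

@dataclass
class Game:
    """A two-person board game as defined above."""
    initial_state: Any                       # initial position
    moves: Callable[[Any], Iterable[Any]]    # operators: legal moves in a state
    result: Callable[[Any, Any], Any]        # successor state after a move
    is_terminal: Callable[[Any], bool]       # termination test
    utility: Callable[[Any], float]          # value of a terminal state (-1, 0, +1)
```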

Tic-Tac-Toe Example
Every level of the search tree, also called the game tree, is labeled with the name of the player whose turn it is (MAX and MIN levels). When it is possible, as it is here, to produce the full game tree, the minimax algorithm computes an optimal strategy for MAX.

Minimax
1. Generate the complete game tree using depth-first search.
2. Apply the utility function to each terminal state.
3. Beginning with the terminal states, determine the utility of the predecessor nodes as follows:
   - If the node is a MIN node, its value is the minimum of the values of its successor nodes.
   - If the node is a MAX node, its value is the maximum of the values of its successor nodes.
4. From the initial state (the root of the game tree), MAX chooses the move that leads to the highest value (the minimax decision).
Note: Minimax assumes that MIN plays perfectly. Every weakness (i.e., every mistake MIN makes) can only improve the result for MAX.
Note: A human player's strategy may differ, e.g., trying to exploit the opponent's weaknesses.

Minimax Example

Minimax Algorithm Recursively calculates the best move from the initial state.
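
As a minimal, runnable sketch, the recursion can be written over a toy game tree in which inner nodes are lists of children and leaves are utility values (this representation is ours, for illustration only):

```python
def minimax(node, is_max):
    """Return the minimax value of a node.
    Leaves are numeric utilities; inner nodes are lists of children."""
    if isinstance(node, (int, float)):                 # terminal state
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

# MAX to move at the root; the three children are MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # -> 3 (MIN values are 3, 2, 2; MAX picks 3)
```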

Evaluation Function
When the search space is too large, the game tree can only be created to a certain depth. The art is to correctly evaluate the playing positions at the leaves, which are not terminal states.
Example of simple evaluation criteria in chess:
- Material value: pawn = 1, knight = 3, rook = 5, queen = 9.
- Other criteria: king safety, good pawn structure.
- Rule of thumb: a 3-point advantage = certain victory.
The choice of evaluation function is decisive! The value assigned to a position should reflect the chances of winning, i.e., the chance of winning with a 1-point advantage should be rated lower than with a 3-point advantage.

Evaluation Function - General
The preferred evaluation functions are weighted linear functions (easy to compute): w1*f1 + w2*f2 + ... + wn*fn, where the w's are the weights and the f's are the features (e.g., w1 = 3, f1 = number of our own knights on the board).
Assumption: the criteria are independent.
The weights can be learned. The criteria, however, must be given (no one knows how they can be learned).
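
A sketch of such a function, using the material weights from the previous slide as features (the feature names and the example are illustrative):

```python
# Illustrative material weights (chess rule of thumb from the previous slide).
WEIGHTS = {"pawn": 1, "knight": 3, "rook": 5, "queen": 9}

def evaluate(features, weights=WEIGHTS):
    """Weighted linear evaluation: w1*f1 + ... + wn*fn.
    `features` maps each feature name to its value, e.g. the difference
    between our piece count and the opponent's."""
    return sum(weights[name] * features.get(name, 0) for name in weights)

# Example: a knight up, a pawn down -> evaluation +2.
print(evaluate({"knight": 1, "pawn": -1}))  # -> 2
```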

Cutting Off Search
- Fixed-depth search (so that the time limit is not exceeded).
- Better: iterative deepening search (with a cutoff at the time limit).
- But: only evaluate quiescent positions, i.e., positions that will not cause large fluctuations in the evaluation function over the following moves.
- If bad situations can be pushed just behind the horizon, try to search further in order to find out (the horizon problem, see below).
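
A minimal sketch of a depth cutoff with a quiescence test, again on the toy list-based trees used above (`evaluate` and `quiescent` are assumed, game-specific callbacks):

```python
def depth_limited(node, is_max, depth, evaluate, quiescent):
    """Depth-limited minimax: cut off at `depth`, but only in quiescent
    positions, so we never evaluate in the middle of a wild exchange."""
    if isinstance(node, (int, float)):                 # true terminal state
        return node
    if depth <= 0 and quiescent(node):
        return evaluate(node)                          # heuristic estimate
    values = [depth_limited(c, not is_max, depth - 1, evaluate, quiescent)
              for c in node]
    return max(values) if is_max else min(values)

# Toy usage: every position counts as quiescent, inner nodes evaluate to 0.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(depth_limited(tree, True, 1, lambda n: 0, lambda n: True))  # -> 0
```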

Two Similar Positions
The figure shows two very similar positions, but in (b) Black will lose. This is why we search for a quiescent position before applying the evaluation function.

Horizon Problem
Black has a slight material advantage but will eventually lose (the pawn becomes a queen). A fixed-depth search (depth < 14) will not detect this, because it thinks the loss can be avoided: it lies on the other side of the horizon, since Black concentrates on checking with the rook, to which White must react.

Pruning Branches
Often it becomes clear early on that a branch cannot lead to a better result than one we have already explored. Such branches, which cannot improve our result, can be pruned away. What are the conditions under which we are allowed to do that?

Pruning Irrelevant Branches

Pruning Branches: General Idea
If m > n, we will never reach node n in the game. So once we have enough information about node n (an upper bound on its value), we can prune it.

Alpha-Beta Pruning: The Method
α = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX (in the example: m).
β = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.

When Can We Prune?
The following applies:
- α values of MAX nodes can never decrease.
- β values of MIN nodes can never increase.
(1) Prune below a MIN node whose β-bound is less than or equal to the α-bound of its MAX predecessor node.
(2) Prune below a MAX node whose α-bound is greater than or equal to the β-bound of its MIN predecessor node.
This delivers results that are just as good as a complete minimax search to the same depth (because only irrelevant nodes are eliminated).

Alpha-Beta Search Algorithm
Initial call: MAX-VALUE(initial-state, game, -∞, +∞).
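
A compact Python sketch of alpha-beta on the toy list-based trees (our own rendering, not the slides' pseudocode):

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search: same result as minimax, but branches that
    cannot influence the decision are cut off."""
    if isinstance(node, (int, float)):                 # terminal state
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:      # rule (2): MIN above never allows this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:          # rule (1): MAX above never allows this
            break
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, True))  # -> 3 (the leaves 4 and 6 are never examined)
```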

Alpha-Beta Trace
(Figure: the α and β bounds, starting from α = -∞ and β = +∞, are updated as the game tree is traversed.)

Efficiency Gain
- Alpha-beta search cuts the largest amount off the tree when we examine the best move first.
- In the best case (always the best move first), the search cost is reduced to O(b^(d/2)).
- In the average case (randomly distributed moves), the search cost is reduced to O((b/log b)^d); for b < 100, we get O(b^(3d/4)).
- In practice, a simple ordering heuristic brings the performance close to the best case, i.e., we can search roughly twice as deep in the same amount of time. (A sketch of such an ordering follows.)
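
One simple way to obtain such an ordering is to sort moves by a cheap heuristic before the deep search. A sketch on the toy trees (the proxy heuristic `min_leaf` is just an illustration):

```python
def ordered_children(node, key, best_first):
    """Sort children so the most promising is searched first,
    making early alpha-beta cutoffs more likely."""
    return sorted(node, key=key, reverse=best_first)

def min_leaf(node):
    """Cheap proxy for a MIN child's value: its smallest leaf."""
    if isinstance(node, (int, float)):
        return node
    return min(min_leaf(c) for c in node)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
# At a MAX node, examine the child with the highest proxy value first.
print(ordered_children(tree, min_leaf, True))  # [[3, 12, 8], ...] first
```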

Transposition Tables
As in regular search trees, game trees contain repeated states. In chess, e.g., the game tree may have 35^100 nodes, but there are only about 10^40 different board positions. Similar to the closed list in regular search, we can maintain a transposition table. It gets its name from the fact that the same state can be reached by a transposition of moves.
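
A sketch of minimax with a transposition table, memoizing values by position (tuples are used instead of lists so positions are hashable; the representation is ours):

```python
def minimax_tt(state, is_max, table=None):
    """Minimax with a transposition table: each (position, player to move)
    is evaluated only once, however many move orders reach it."""
    if table is None:
        table = {}
    key = (state, is_max)
    if key in table:                                   # transposition hit
        return table[key]
    if isinstance(state, (int, float)):                # terminal state
        value = state
    else:
        children = [minimax_tt(c, not is_max, table) for c in state]
        value = max(children) if is_max else min(children)
    table[key] = value
    return value

shared = (2, 4, 6)                       # one position, reached twice
tree = ((3, 12, 8), shared, shared)
print(minimax_tt(tree, True))  # -> 3; `shared` is evaluated only once
```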

Games that Include an Element of Chance White has just rolled 6-5 and has 4 legal moves.

Game Tree for Games with an Element of Chance In addition to MIN and MAX nodes, we need chance nodes (for the dice rolls).

Calculation of the Expected Value
Expectiminimax instead of minimax:

Expectiminimax(n) =
  Utility(n)                                          if n is a terminal state
  max_{s in Successors(n)} Expectiminimax(s)          if n is a MAX node
  min_{s in Successors(n)} Expectiminimax(s)          if n is a MIN node
  sum_{s in Successors(n)} P(s) * Expectiminimax(s)   if n is a chance node
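
A direct transcription into Python on a toy tagged-tuple representation (the tags 'max', 'min', 'chance' are our own encoding):

```python
def expectiminimax(node):
    """node is a number (terminal utility) or a tagged tuple:
    ('max', children), ('min', children), ('chance', [(prob, child), ...])."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over the outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between a safe 2 and a fair coin flip between 0 and 5.
tree = ("max", [2, ("chance", [(0.5, 0.0), (0.5, 5.0)])])
print(expectiminimax(tree))  # -> 2.5 (the gamble beats the safe 2)
```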

Problems
- Order-preserving transformations of the evaluation values can change the best move: unlike in deterministic games, the absolute magnitudes of the values matter, not just their order.
- Search costs increase: instead of O(b^d), we get O((b*n)^d), where n is the number of possible dice outcomes. In backgammon (n = 21, b = 20, but b can be as large as 4000), the maximum feasible d is about 3.
- A variant of alpha-beta search can still be used.

Card Games
Recently, card games such as bridge and poker have been addressed as well. One approach: simulate play with open cards and then average over all possible deals (or run a Monte Carlo simulation), so-called "averaging over clairvoyance". Although theoretically incorrect, this seems to give reasonable results.
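
A minimal sketch of this averaging, assuming a game-specific perfect-information solver `play_out` (a name we introduce for illustration):

```python
import random

def monte_carlo_value(my_hand, unseen_cards, num_samples, play_out):
    """Averaging over clairvoyance: deal the unseen cards at random,
    solve each deal as an open-cards game, and average the results."""
    total = 0.0
    for _ in range(num_samples):
        deal = random.sample(unseen_cards, len(unseen_cards))
        total += play_out(my_hand, deal)   # perfect-information value
    return total / num_samples

# Toy usage with a dummy solver (always returns 1.0).
print(monte_carlo_value(["A", "K"], ["Q", "J", "9"], 10,
                        lambda hand, deal: 1.0))  # -> 1.0
```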

State of the Art (1)
- Checkers, draughts (by international rules): The program CHINOOK is the official world champion in man-computer competition (acknowledged by the ACF and EDA) and the highest-rated player.
- Backgammon: The BKG program defeated the official world champion in 1980. A newer program, TD-Gammon (which used reinforcement learning to learn its evaluation function), is among the top 3 players.
- Othello: Very good, even on normal computers; programs are not allowed at tournaments. In 1997, Logistello defeated the human world champion.

State of the Art (2)
- Bridge: The Bridge Baron program won the 1997 computer bridge championship. GIB (which uses averaging over clairvoyance) won in 2000. In general, these programs are not yet a match for humans, though.
- Tic-Tac-Toe, Go-Moku (five in a row), and Nine Men's Morris have all been solved by exhaustive analysis.
- Go: The best programs play only a little better than beginners (around 10 kyu); the branching factor is about 200 on average. There is a 2 million US$ prize for the first program to defeat a world master.

Chess (1)
- Chess is the "Drosophila" of AI research.
- A limited number of rules produces a virtually unlimited number of courses of play: in a game of 40 moves, there are 1.5 x 10^128 possible courses of play.
- Victory comes through logic, intuition, creativity, and previous experience.
- In 1997, the world chess champion Garry Kasparov was beaten by Deep Blue in a match of 6 games. More recently, Kasparov played a draw against Deep Junior.

Chess (2)
Deep Blue (IBM Thomas J. Watson Research Center):
- Special hardware (32 processors with 8 chips each, 2 million calculations per second)
- Heuristic search
- Case-based reasoning and learning techniques
- 1996: knowledge based on 600,000 chess games
- 1997: knowledge based on 2 million chess games
- Training by grand masters

Chess (3)
Kasparov: "There were moments when I had the feeling that these boxes are possibly closer to intelligence than we are ready to admit. From a certain point on it seems, in chess at least, that great quantity translates into quality. I see rather a great chance for fine creativity and brute-force computational capacity to complement each other in a new form of information acquisition. The human and electronic brain together would produce a new quality of intelligence, an intelligence worthy of this name."

The Reasons for Success
- Alpha-beta search, with dynamic decision-making for uncertain positions
- Good (but usually simple) evaluation functions
- Large databases of opening moves
- Very large end-game databases (for checkers, all 8-piece situations)
- And very fast, parallel processors!

Summary
- A game can be defined by the initial state, the operators (legal moves), a termination test, and a utility function (the outcome of the game).
- In two-player games, the minimax algorithm can determine the best move by enumerating the entire game tree.
- The alpha-beta algorithm produces the same result but is more efficient because it prunes away irrelevant branches.
- Usually, it is not feasible to construct the complete game tree, so the utility of some states must be estimated by an evaluation function.
- Games of chance can be handled by an extension of the alpha-beta algorithm.