
INF 4130 September 25, 2017 New deadline for mandatory assignment 1: the deadline is postponed to Tuesday, October 3. Today: in the hope that as many as possible will turn up to the important lecture on undecidability (Dino Karabeg, Monday, October 2). First hour: Ch. 23.5: Game trees and strategies for two-player games. Second hour: Rune Djurhuus: about chess-playing programs (slides will appear). Our teaching assistant is leaving us! We have difficulties finding a new one. Maybe one or two of you can take the job?

Ch. 23.5: Games, game trees and strategies We looked at «one-player games» (= search) and their decision trees earlier, in Ch. 23 (from the start to 23.4). That is search for a goal node that everybody agrees is «good», and one can then e.g. use A*-search, for example to solve the 15-puzzle from a given position, or to find the shortest path between nodes in a graph (better than plain Dijkstra). BUT: When two players play against each other, things get very different. What is good for one player is bad for the other, and the trees of possible plays are often enormous. The game tree for chess is estimated to have 10^100 nodes, and can therefore never (?) be searched exhaustively! We only look at zero-sum games. This means: what one player gains in a move is lost by the other. The quality of a situation is represented by numbers.

Example: Tic-tac-toe and game trees The board has 3 x 3 squares, and the players alternate moves: player A chooses an unused square and writes x in it, then player B does the same but writes o. When a player has three-in-a-row, he/she has won. The root of the game tree for tic-tac-toe is the empty board. Player A (always) starts, and we will here do all our considerations from A's point of view. We use numbers for node quality: high numbers are good for A, and low numbers are good for B.

Number of nodes in a tic-tac-toe game tree: 9 nodes at depth 1, 9*8 = 72 nodes at depth 2, 9*8*7 = 504 nodes at depth 3, 9*8*7*6 = 3024 nodes at depth 4, 9*8*7*6*5 = 15120 nodes at depth 5, and 9*8*7*6*5*4*3*2*1 = 9! (9 factorial) = 362 880 nodes at depth 9. By searching depth-first in this tree you never need to store more than 9 nodes at a time, but it will take some time to go through all 362 880 nodes (and for interesting games there are usually a lot more!).

But if we use some type of breadth-first strategy, this usually requires a lot of memory: at each depth we must keep all the nodes of that level, and the levels have 1, 9, 72, 504, 3024, 15120, 60480, ..., 362 880 nodes. BUT: In some games you can gain a lot by recognizing equal nodes, and not repeating the analysis for these (see next slide).

But if we represent each game position only once (also usable for «one-player games»), the tree collapses to a DAG. Compared to the tree-node counts 1, 9, 72, 504, 3024, 15120, 60480, ..., 362 880, the numbers of *different* positions per level are only 1, 9, 72, 252, 756, 1260, 1680, ..., and finally 126 = (9 choose 4) full boards. BUT: In some games you can gain a lot by recognizing equal nodes, and not repeating the analysis for these (this is somewhat like dynamic programming). In the above simple game we never need more than 1680 nodes during breadth-first search, but in general this, too, usually requires a lot of memory.
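The level sizes above are easy to check with a few lines of code. A small sketch (function names are my own): the number of tree nodes at depth k is 9*8*...*(9-k+1), while the number of different boards after k moves, ignoring early wins and symmetry, is the number of ways to place ceil(k/2) x's and floor(k/2) o's on 9 squares.

```python
from math import comb

def tree_nodes(k):
    """Nodes at depth k in the full tic-tac-toe game tree: 9 * 8 * ... * (9-k+1)."""
    n = 1
    for i in range(k):
        n *= 9 - i
    return n

def distinct_positions(k):
    """Different boards after k moves (ignoring early wins and symmetry)."""
    x, o = (k + 1) // 2, k // 2          # A has moved ceil(k/2) times, B floor(k/2)
    return comb(9, x) * comb(9 - x, o)   # choose squares for the x's, then for the o's

print(tree_nodes(9))                                   # 362880
print(max(distinct_positions(k) for k in range(10)))   # 1680, reached at depth 6
```

This confirms the claim on the slide that breadth-first search over distinct positions never needs more than 1680 boards at once.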

Representing symmetric positions by one node (also usable for «one-player games») One can also gain a lot by looking at symmetries: represent positions that are symmetries of each other only once. In tic-tac-toe, symmetric positions always occur at the same depth, but this is not generally the case! Using this will often reduce the memory needs even further. In e.g. chess, however, there are few symmetries to utilize.
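One way to exploit the symmetries is to map every board to a canonical representative before storing it. A minimal sketch (representation and names are my own): generate the 8 symmetries of the 3 x 3 board (4 rotations, each optionally mirrored) and pick, say, the lexicographically smallest string.

```python
def symmetries(board):
    """Yield all 8 symmetric variants of a 3x3 board given as a 9-char string."""
    rows = [board[0:3], board[3:6], board[6:9]]
    for _ in range(4):
        yield ''.join(rows)
        yield ''.join(r[::-1] for r in rows)           # horizontal mirror
        rows = [''.join(c) for c in zip(*rows[::-1])]  # rotate 90 degrees clockwise

def canonical(board):
    """A unique representative of the board's symmetry class."""
    return min(symmetries(board))

# The four opening moves in a corner are all the same position:
print(canonical("x" + " " * 8) == canonical("  x" + " " * 6))  # True
```

Storing only canonical boards shrinks the table further, at the cost of eight transformations per lookup.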

Zero-sum games Seen from A, high values are good and low ones are bad; for B the opposite is true. We will (for the time being) look at all values as seen from A! A strategy for A means: a rule telling A what to do in every possible A-situation. Aim: We will look for a strategy (if one exists) by which A is sure to win. Fully analyzable games means: the full tree can be analyzed. Then there are three possibilities for each A-situation S: 1. A has a strategy from S so that it will win whatever B does, and chooses its moves from S according to that (score: +1 for A). 2. Whatever A does from S, B has a winning strategy from the new situation (score: -1 for A). 3. If A and B both play perfectly, the game ends in a tie (score: 0 for both), or it can go on forever. The last alternative (playing forever) can occur only in some games. Tic-tac-toe ends in a tie if both players play perfectly.

Another example: The game Nim We start with two (or more?) piles of sticks, with m and n sticks. In each move a player can take any number of sticks from one pile, but must take at least 1. The player taking the last stick has lost. Nim can never end in a tie. With m = 3 and n = 2, the full game tree is shown to the right in the slide. The value seen from A is indicated for the final situations (leaf nodes). Next problem: what is the value of the rest of the nodes? NB: We could reduce the number of separate nodes by recognizing symmetries and equivalent nodes (see the red circles in the figure).
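Whether a Nim position is a win for the player to move can be computed directly from the game tree by recursion, exactly as the slides suggest. A small memoized sketch (names are my own; the misère rule applies: whoever takes the last stick loses):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def wins(piles):
    """True if the player to move wins the position (piles: sorted tuple)."""
    if sum(piles) == 0:
        return True   # the opponent just took the last stick and lost
    for i, n in enumerate(piles):
        for take in range(1, n + 1):
            rest = list(piles)
            rest[i] -= take
            if not wins(tuple(sorted(rest))):
                return True   # some move leaves the opponent in a lost position
    return False

print(wins((2, 3)))   # True: A has a winning strategy from m = 3, n = 2
print(wins((2, 2)))   # False: from two equal piles (> 1), the mover loses
```

Sorting the piles before each recursive call is exactly the "recognize equivalent nodes" trick from the slide: (3, 2) and (2, 3) share one table entry.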

How can we find a strategy so that A wins, or prove that no such strategy exists? A wants to find an optimal move, and we must assume that B will also make optimal moves seen from its point of view. Thus B will move to the subnode with the smallest value (since +1 and -1 are as seen from A). Min-max strategy: to compute the value of a node, we must know the values of all its subnodes. This can be done by a depth-first search, computing node values on the way back up (postorder). Strategy for A: if possible, move to a node with value +1; otherwise make a random move. Strategy for B: if possible, move to a node with value -1; otherwise make a random move.
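The min-max computation itself is a short postorder recursion. A minimal sketch over an explicit tree (the representation is my own: a terminal is its value as seen from A, an inner node is a list of children):

```python
def minimax(node, a_to_move):
    """Game value seen from A: A maximizes, B minimizes."""
    if not isinstance(node, list):
        return node                     # terminal: +1, -1 or 0, seen from A
    values = [minimax(c, not a_to_move) for c in node]
    return max(values) if a_to_move else min(values)

# A moves first; whatever A does here, B can answer with a -1 node:
print(minimax([[+1, -1], [-1, -1]], a_to_move=True))   # -1
```

Each node's value is computed only after all its children, which is exactly the "values on the way back up" order described above.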

The min-max algorithm in action, with simple alpha-beta cutoff (pruning). Red arrows in the figure: good moves from winning situations for A. Blue arrows: good moves from winning situations for B. As on the previous slide, this is done by a depth-first traversal of the game tree, computing values on the way back up (postorder); the results are given in the figure as + and -. Possible optimization: from the start position S, assume that A has looked at three of its subtrees (from the left) and has found a winning node U (marked +1). Then the values of V and W do not matter. This is a simple version of alpha-beta cutoff (pruning).

Usually, the game tree is too large to traverse One then searches only to a certain depth, and estimates (with some heuristic function) how good the situation is for A at the nodes at that depth. We then usually also use other numbers than -1, 0 and +1. In the figure we go to depth 2. The heuristic function used is: the number of «winning lines» still open for A minus the same number for B (this value is given below each leaf node). According to this heuristic, the best move for A from the start position is to go to C2.
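The «winning lines» heuristic is easy to state in code. A sketch under my own representation (a board is a 9-character string of 'x', 'o' and spaces): a line is still open for A if it contains no o, and vice versa.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def heuristic(board):
    """Lines still open for A (no 'o') minus lines still open for B (no 'x')."""
    open_a = sum(all(board[i] != 'o' for i in line) for line in LINES)
    open_b = sum(all(board[i] != 'x' for i in line) for line in LINES)
    return open_a - open_b

print(heuristic("    x    "))   # 4: x in the centre keeps all 8 lines open for A,
                                # while only the 4 lines avoiding the centre stay open for B
```

Note that this value is bounded by the 8 lines of the board, which matters for how winning positions are scored on the next slide.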

However, this heuristic is not good later in the game: it does not take into account that winning is better than any heuristic value. We therefore, in addition, give winning nodes the value +∞ (here any value larger than the heuristic can produce, e.g. 9, is fine). This gives quite a good strategy. But, as said above, tic-tac-toe ends in a tie if both players play perfectly, so we must add that tie situations, e.g. the board

o o x
x x o
o x x

get the value 0. Thus, if we fully analyze the game, the value of the root node will be 0. NOTE: The difficult choice for a game programmer is between searching very deep and using a good, but time-consuming, heuristic function!

Alpha-beta cutoff (pruning) This technique is only usable for two-player games! Intuitively, alpha-beta cutoff goes as follows (assuming it is A's move): A considers all the possible moves from the current situation, one after the other. A has already seen a move by which it can obtain the value u (after C1 and C2, u = 1). A looks at the next potential move, which leads to situation C3. However, A then observes that from C3, B has a very good move (= bad for A), which seen from A has the value v = -1. Then the value of C3 cannot be better than -1, independent of what the other subtrees of C3 give (as B will minimize at C3). As v < u, player A has no interest in looking for even better moves for B from situation C3: A already knows that it has a better move than going to C3, namely going to C2. (In the figure, the value of C3 should have become -2, but the value -1 is enough for A to conclude that a move to C3 is not the best; a move to C2 is better.)

Examples showing alpha-beta cutoff Cutoffs made when A considers its next move, i.e. from A-situations, are called alpha-cutoffs; the corresponding cutoffs from B-situations are called beta-cutoffs. The figures in the slides show alpha- and beta-cutoffs at different stages of a depth-first search of a game tree. When implementing alpha-beta cutoffs during a depth-first search, it is usual to switch viewpoints between the levels. Then we can always maximize the value, but we have to negate all values for each new level. Such an implementation is given on the next slide.

Alpha-beta search (negating the values for each level)

real function ABNodeValue(
        X,          // The node we compute the alpha/beta value for; children: C[1], C[2], ..., C[k]
        numlev,     // Number of levels left
        parentval   // The alpha-beta value from the parent node (= -LB of the parent)
)
// Returned value: the final alpha/beta value for the node X
{
    real LB;        // Current lower bound for the alpha/beta value of this node (X)
    if <X is a terminal node> or numlev = 0 then {
        return <an estimate of the quality of the situation (the heuristic)>;
    } else {
        LB := -ABNodeValue(C[1], numlev - 1, ∞);     // Recursive call, first child
        for i := 2 to k do {
            if LB >= parentval then {
                return LB;                           // Cutoff, no further calculation
            } else {
                LB := max(LB, -ABNodeValue(C[i], numlev - 1, -LB));   // Recursive call
            }
        }
    }
    return LB;
}

Start the recursion for the (current) root node, down to depth 10, by calling ABNodeValue(rootnode, 10, ∞).

Misprints in the textbook There are some simple misprints in the program on page 741 in the textbook (probably not corrected in any edition): «AB» is missing from the name of the procedure in the recursive call, and a right parenthesis is missing at the end of the line where max is called. These errors are corrected on the previous slide!