CSE 399-004: Python Programming


CSE 399-004: Python Programming
Lecture 3.5: Alpha-beta Pruning
January 22, 2007
http://www.seas.upenn.edu/~cse39904/

Slides mostly as shown in lecture

Scoring an Othello board and AIs
A simple way to "score" an Othello board: (number of white pieces) - (number of black pieces). The white player wants to maximize this number; the black player wants to minimize it. An AI for each side is either trying to maximize or minimize that number for the final board position.
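A minimal sketch of such a scoring function. The board representation here (a list of rows whose cells are 'W', 'B', or None) is an assumption for illustration, not necessarily the one used in the homework:

```python
def score(board):
    # (number of white pieces) - (number of black pieces).
    # Assumed representation: a list of rows, each cell
    # 'W', 'B', or None (empty).
    white = sum(row.count('W') for row in board)
    black = sum(row.count('B') for row in board)
    return white - black

# A tiny 4x4 position: 3 white pieces, 2 black pieces.
board = [
    [None, None, None, None],
    [None, 'W',  'B',  None],
    [None, 'B',  'W',  'W' ],
    [None, None, None, None],
]
print(score(board))  # 1
```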

Minimax
But there's no way to predict what the final board position will be. So white could choose moves such that the minimum score at any potential final board is as great as possible. This gives rise to so-called minimax algorithms.
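As a sketch, here is plain minimax (no pruning yet) on a two-level tree of made-up scores. The nested-list representation and the particular numbers are illustrative assumptions:

```python
def minimax(node, maximizing):
    # A leaf is a number (the score of a final board);
    # an internal node is a list of child positions.
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# White moves first (maximizing); black replies (minimizing).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3: the greatest minimum white can guarantee
```

Each of white's three moves is scored by the worst reply black can make (3, 2, and 2 here), and white picks the move whose worst case is best.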

Drawing games as trees
We can draw two-player games as trees. Each node in the tree represents a "position", for example the board in an Othello game. At each position, it is one of the players' turns. An arrow points from position X to position Y if the player's move at position X results in Y.

current position (your move), with the scores reachable down the tree: 45 47 64 50 33 41 20 37 52

One move is bad: you'll end at 41. Another move is bad: you'll end at 37. The best move: you'll end at 45. We've maximized the minimum score we can get.

Computation is expensive
Looking ahead to the end of the game is usually not feasible: too much computation involved.
Cheap solution: compute the score for the board after each possible move you can make, then choose the move that gives you the highest score.
Problem: what looks good here might look really bad after a few more moves.
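The cheap solution can be sketched as a one-move lookahead. The `moves`, `apply_move`, and `score` names are hypothetical hooks standing in for a real game's functions:

```python
def greedy_move(position, moves, apply_move, score):
    # The cheap solution: score the board after each possible
    # move and pick the move with the highest immediate score.
    return max(moves, key=lambda m: score(apply_move(position, m)))

# Toy stand-in game: a position is a number, a move adds to it,
# and the score of a position is the number itself.
best = greedy_move(10, [-2, 3, 7], lambda p, m: p + m, lambda p: p)
print(best)  # 7
```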

Alpha-beta pruning
We can't scan the entire tree, and the cheap solution is too cheap. Intermediate solution: alpha-beta pruning. It looks ahead some number of moves (called the "depth") and attempts to avoid examining moves that are obviously bad. The trick: stop examining a move as soon as it's worse than one you've already found.

A note about the following code
The code is very similar to that found on Wikipedia: http://en.wikipedia.org/wiki/alpha-beta_pruning

Alpha-beta pruning: pseudocode

function abp(node, depth, alpha, beta):
    if depth is zero: return the score of node
    if node has no children: return the score of node
    for each child of node:
        alpha = max(alpha, -abp(child, depth-1, -beta, -alpha))
        if alpha >= beta: return alpha
    return alpha

Notes on the pseudocode:
- alpha is the maximum minimum-score you've found so far
- beta is the minimum maximum-score your opponent has found so far
- thus, when alpha >= beta, give up, because someone did something non-optimal
- initially, call abp() with alpha = -infinity and beta = +infinity
- the recursive call reverses the roles of alpha and beta because it is from the viewpoint of your opponent
- for the same reason, the result of that call is negated

Problems with the pseudocode
It returns a score, which is great, but you really need to return the move that gets you that score! The ±infinity values are for a generic version: you can replace -infinity with the least possible score and +infinity with the greatest possible score.

Completely new slides (mainly for Homework 4)

Pseudocode

function abp(node, depth, alpha, beta):
    if depth is zero: return the score of node
    if node has no children: return the score of node
    for each child of node:
        alpha = max(alpha, -abp(child, depth-1, -beta, -alpha))
        if alpha >= beta: return alpha
    return alpha

See that minus sign in front of the recursive call? Forgetting it is the big bug I accidentally made in class. This version of the alpha-beta pruning algorithm attempts to maximize alpha; it is the corrected version of what I showed in class.

Pseudocode: Problems!
It doesn't return an actual move. It doesn't worry about the case where one player goes twice in a row (this can happen in Othello!).

The example tree's leaf scores: 10 20 5 60 30 30 15. Circles: our move; squares: opponent's move. We want to maximize the minimum score we can get.

The tree here has depth 2 and its root is node A. It's our move, so we call: abp(a, 2, -infinity, +infinity)

At node A, alpha = -infinity. We'll keep track of the current values of alpha and beta in the call to abp() on a given node.

At A, alpha = -infinity. A has children, so make a recursive call to abp on its first child, node B: abp(b, 1, -infinity, +infinity)

At B, alpha = -infinity. B has children, so make a recursive call to abp on its first child, node C: abp(c, 0, -infinity, +infinity)

At C, depth is zero, so just return 10. At B, -10 is greater than -infinity, so update alpha to -10. (Remember that minus sign in the algorithm?)

Let's grey out nodes that we've already considered.

We now call abp() on the other child of B, which returns 20. At B, -20 is not better than -10, so no update.

That call to abp(b, 1, -infinity, +infinity) from before now returns -10. At A, we update alpha to 10. (Again, remember that minus sign?)

Move on to the next child of A, node D: abp(d, 1, -infinity, -10). Note the values here! At D, alpha = -infinity and beta = -10.

We're at D. Call abp() on the first child of D; that returns 5. Update alpha at D to -5 (beta is still -10).

We're at D. Fiddlesticks: alpha >= beta, since -5 >= -10. Simply return alpha and skip the rest of D's children.

I've erased the nodes we skipped. We're done with D. At A, 5 is not greater than 10, so no update.

A has one more child, node E. Call abp(e, 1, -infinity, -10).

After the first child of E: at E, alpha = -30 and beta = -10.

After the second child of E: at E, alpha = -15.

The call to abp() on E returns -15. At A, 15 > 10, so update alpha to 15. Our original call to abp(a, 2, -infinity, +infinity) returns 15!

Looking back at the entire original tree (10 20 5 60 30 30 15), this looks right.
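The walkthrough can be checked by translating the pseudocode directly into Python and running it on this tree. The grouping below (B = [10, 20], D = [5, 60, 30], E = [30, 15]) is inferred from the slides:

```python
import math

def abp(node, depth, alpha, beta):
    # Direct translation of the slides' pseudocode. A node is either
    # a number (a leaf's score) or a list of child nodes.
    children = node if isinstance(node, list) else []
    if depth == 0 or not children:
        return node  # the score of a leaf is the leaf itself
    for child in children:
        alpha = max(alpha, -abp(child, depth - 1, -beta, -alpha))
        if alpha >= beta:
            return alpha  # prune the remaining children
    return alpha

A = [[10, 20], [5, 60, 30], [30, 15]]   # nodes B, D, and E
print(abp(A, 2, -math.inf, math.inf))   # 15, matching the slides
```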

Additional resources
http://en.wikipedia.org/wiki/alpha-beta_pruning
Probably overkill, as usual. In fact, maybe only read this if you like really dry text, but it is where I got the pseudo-code from.
http://www.seanet.com/~brucemo/topics/alphabeta.htm
I don't like the code example here so much, but the introductory text is pretty good. So read the intro stuff, skip the code.