Intuition Mini-Max


Games Today

"Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings." - Drew McDermott

"I could feel - I could smell - a new kind of intelligence across the table." - Garry Kasparov

Deep Blue beats Garry Kasparov, 1997 (2 wins, 1 loss, 3 draws).
Deep Blue: 32 RISC processors + 256 VLSI chess engines; 200 million positions per second, searching about 16 plies.

Tonight
- Games in AI
- Game tree search (40 min): minimax, alpha-beta pruning
- Group exercise: Reversi (50 min)
- Reversi tournament (20 min)
- Games of chance (30 min)

Types of Games
In AI, "games" usually refers to deterministic, turn-taking, two-player, zero-sum games of perfect information.
- Deterministic: the next state of the environment is completely determined by the current state and the action executed by the agent (not probabilistic)
- Turn-taking: two agents whose actions must alternate
- Zero-sum: if one agent wins, the other loses
- Perfect information: the environment is fully observable

                       deterministic                  chance
perfect information    chess, checkers, go, othello   backgammon, monopoly
imperfect information  stratego                       bridge, poker, scrabble, nuclear war

Games as Search
- States: board configurations
- Initial state: the board position and which player will move
- Successor function: returns a list of (move, state) pairs, each indicating a legal move and the resulting state
- Terminal test: determines when the game is over
- Utility function: gives a numeric value in terminal states (e.g., -1, 0, +1 for loss, tie, win)
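The "Games as Search" components can be sketched as a tiny interface. The game used here (players alternately take 1 or 2 stones; whoever takes the last stone wins) and all function names are illustrative assumptions, not from the lecture:

```python
# Hypothetical "game as search" formulation for a toy Nim variant:
# players alternately take 1 or 2 stones; taking the last stone wins.

def initial_state():
    # (stones remaining, player to move); +1 is MAX, -1 is MIN
    return (5, +1)

def successors(state):
    """Successor function: list of (move, state) pairs for all legal moves."""
    stones, player = state
    return [(take, (stones - take, -player))
            for take in (1, 2) if take <= stones]

def terminal(state):
    """Terminal test: the game is over when no stones remain."""
    return state[0] == 0

def utility(state):
    """Utility of a terminal state: +1 if MAX took the last stone, -1 if MIN did.
    The winner is the player who is NOT to move in the terminal state."""
    return -state[1]
```

Any deterministic two-player game fits this shape; only these four functions change.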

Intuition Mini-Max


Mini-Max Properties
- Complete? Yes, if the tree is finite.
- Optimal? Against an optimal opponent, yes. Otherwise, no: minimax still does at least as well, but may not exploit the opponent's weaknesses.
- Time complexity? O(b^m)
- Space complexity? O(bm)

Good Enough?
Chess: branching factor b ≈ 35 and game length m ≈ 100, so the search space is b^m ≈ 35^100 ≈ 10^154.
The Universe: about 10^78 atoms, about 10^18 seconds old. Even at 10^8 moves/sec per atom since the beginning of time, that is only 10^8 x 10^78 x 10^18 = 10^104 positions.
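The recursion behind these properties fits in a few lines. The nested-list tree encoding below (leaves are utilities, internal nodes are lists of children) is an illustrative assumption, not the lecture's notation:

```python
def minimax(node, maximizing):
    """Minimax value of a game-tree node. Explores the whole tree,
    hence the O(b^m) time bound."""
    if isinstance(node, (int, float)):   # terminal node: its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX to move at the root, MIN one level down:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

Here `minimax(tree, True)` backs up 3: MIN picks the worst leaf in each subtree (3, 2, 2), and MAX picks the best of those.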

Alpha-Beta Pruning
Do we need to check this node?

Alpha-Beta
No - this branch is guaranteed to be worse than what MAX already has.

MinVal(state, alpha, beta){
  if (terminal(state)) return utility(state);
  for (s in children(state)){
    child = MaxVal(s, alpha, beta);
    beta = min(beta, child);
    if (alpha >= beta) return child;   // prune the remaining children
  }
  return beta;
}

MaxVal(state, alpha, beta){
  if (terminal(state)) return utility(state);
  for (s in children(state)){
    child = MinVal(s, alpha, beta);
    alpha = max(alpha, child);
    if (alpha >= beta) return child;   // prune the remaining children
  }
  return alpha;
}

alpha = the highest value for MAX along the path
beta  = the lowest value for MIN along the path
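The pseudocode translates into a runnable sketch. The nested-list tree encoding (leaves are utilities, internal nodes are lists of children) is an illustrative assumption:

```python
import math

def max_val(node, alpha, beta):
    """MAX node: raise alpha; stop as soon as alpha >= beta (cutoff)."""
    if isinstance(node, (int, float)):   # terminal: utility value
        return node
    for child in node:
        alpha = max(alpha, min_val(child, alpha, beta))
        if alpha >= beta:                # MIN above would never allow this line of play
            break
    return alpha

def min_val(node, alpha, beta):
    """MIN node: lower beta; stop as soon as alpha >= beta (cutoff)."""
    if isinstance(node, (int, float)):
        return node
    for child in node:
        beta = min(beta, max_val(child, alpha, beta))
        if alpha >= beta:                # MAX above already has something better
            break
    return beta

# MAX at the root, MIN one level down:
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On this tree `max_val(tree, -math.inf, math.inf)` returns 3, and the leaves 4 and 6 of the second subtree are never examined: its first leaf (2) already drives beta below the alpha of 3 established by the first subtree.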

[worked alpha-beta trace: β is updated at MIN nodes (β = -43, then β = -75) and a branch is pruned whenever β < α]

Alpha-Beta Properties
- Still guaranteed to find the best move.
- Best-case time complexity: O(b^(m/2)) - can double the depth of search!
- The best case occurs when the best moves are tried first; a good static evaluation function helps with move ordering.
- But still too slow for chess...

Good Enough?
Chess: branching factor b ≈ 35, game length m ≈ 100, so the best-case search space is b^(m/2) ≈ 35^50 ≈ 10^77.
The Universe: about 10^78 atoms, about 10^18 seconds old, 10^8 moves/sec x 10^78 x 10^18 = 10^104.
The universe can play chess - can we?

Partial Space Search
Strategies:
- search to a fixed depth (cutoff)
- iterative deepening (most common)
- ignore quiescent nodes
A static evaluation function assigns a score to a non-terminal state at the cutoff.
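Fixed-depth search with a static evaluation at the cutoff can be sketched as follows. The toy game (take 1 or 2 stones; whoever takes the last stone wins) and the zero-valued placeholder evaluation are illustrative assumptions:

```python
def dl_minimax(stones, depth, maximizing):
    """Depth-limited minimax for a toy Nim variant: players alternately
    take 1 or 2 stones, and whoever takes the last stone wins."""
    if stones == 0:
        # Terminal: the previous mover took the last stone and won.
        return -1 if maximizing else +1
    if depth == 0:
        # Cutoff: stand-in static evaluation of a non-terminal state.
        return 0
    values = [dl_minimax(stones - take, depth - 1, not maximizing)
              for take in (1, 2) if take <= stones]
    return max(values) if maximizing else min(values)
```

With enough depth the cutoff never triggers and the exact value comes back (positions with a multiple of 3 stones are losses for the player to move); at depth 0 only the static evaluation is returned.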

Evaluation Functions
Chess:
  eval(s) = w1 * material(s) + w2 * mobility(s) + w3 * king_safety(s) + w4 * center_control(s) + ...
In practice, minimax search improves the accuracy of a heuristic evaluation function. But one can construct pathological games where more search hurts performance! (Nau 1981)

Evaluation Functions: Reversi
Number of squares held? Better: the number of squares held that cannot be flipped. Prefer valuable squares:
- an N x N array w[i,j] of position values
- highest values for corners and edges; lowest values for squares next to a corner or edge
- s[i,j] = +1 for the player's disc, 0 for empty, -1 for the opponent's
- score = Σ w[i,j] s[i,j], summed over all i, j

End-Game Databases
Ken Thompson: all 5-piece endgames. Lewis Stiller: all 6-piece endgames.
Refuted common chess wisdom: many positions thought to be ties were really forced wins -- 90% for white.
Is perfect chess a win for white?
The MONSTER: white wins in 255 moves (Stiller, 1991).

Deterministic Games in Practice
- Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994; it used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions (!)
- Chess: Deep Blue defeated human world champion Garry Kasparov in a 6-game match in 1997.
- Reversi: human champions refuse to play against computers, because the software is too good.
- Go: human champions refuse to compete against computers, because the software is too bad.

        Size of board   Avg moves per game   Avg branching factor   Additional complexity
Chess   8 x 8           100                  35
Go      19 x 19         300                  235                    players can pass
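The weighted-square score can be sketched directly. The 4x4 board size and the particular weight values below are illustrative assumptions, not the lecture's:

```python
# Positional evaluation for a toy 4x4 Reversi board:
# score = sum of w[i][j] * s[i][j] over all squares, where s[i][j] is
# +1 for the player's disc, -1 for the opponent's, and 0 for empty.
W = [[ 8, -2, -2,  8],    # corners valuable; squares next to corners penalized
     [-2, -4, -4, -2],
     [-2, -4, -4, -2],
     [ 8, -2, -2,  8]]

def positional_score(s):
    return sum(W[i][j] * s[i][j]
               for i in range(len(W)) for j in range(len(W[0])))

board = [[+1,  0,  0, -1],
         [ 0,  0,  0,  0],
         [ 0,  0,  0,  0],
         [ 0,  0,  0, +1]]
```

For this board the player holds two corners and the opponent one, so `positional_score(board)` is 8 + 8 - 8 = 8.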

Nondeterministic Games
Involve chance: dice, shuffling, etc.
Chance nodes: calculate the expected value, e.g., a weighted average over all possible dice rolls.

Imperfect Information
E.g., card games, where the opponents' initial cards are unknown.
Idea: for all deals consistent with what you can see, compute the minimax value of the available actions for each possible deal, then compute the expected value over all deals.
What is the expected reward?

In Practice...
Chance adds dramatically to the size of the search space.
Backgammon: there are 21 distinct possible dice rolls. The branching factor is usually around 20, but can be as high as 4000 (for rolls that are doubles).
Alpha-beta pruning is generally less effective, and the best backgammon programs use other methods.

Probabilistic STRIPS Planning
Example domain: Hungry Monkey, with two actions:
  first action:  if (ontable)   Prob(2/3) -> +1 banana;  Prob(1/3) -> no change
                 else           Prob(1/6) -> +1 banana;  Prob(5/6) -> no change
  second action: if (~ontable)  Prob(2/3) -> ontable;    Prob(1/3) -> ~ontable
                 else           ontable

ExpectiMax
ExpectiMax(n) =
  U(n)                                      if n is a terminal node
  max{ ExpectiMax(s) : s in children(n) }   if n is a max node
  Σ_{s in children(n)} P(s) ExpectiMax(s)   if n is a chance node
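The ExpectiMax recurrence can be sketched directly. The tuple-based tree encoding is an illustrative assumption, and the example probabilities simply echo the Hungry Monkey numbers:

```python
def expectimax(node):
    """U(n) at terminal nodes, max over children at max nodes, and the
    probability-weighted sum of children at chance nodes."""
    kind = node[0]
    if kind == 'leaf':
        return node[1]                                     # U(n)
    if kind == 'max':
        return max(expectimax(c) for c in node[1])         # max node
    if kind == 'chance':
        return sum(p * expectimax(c) for p, c in node[1])  # chance node
    raise ValueError('unknown node type: %r' % kind)

# One decision at the root, each choice followed by a chance node:
tree = ('max', [
    ('chance', [(2/3, ('leaf', 1)), (1/3, ('leaf', 0))]),  # expected value 2/3
    ('chance', [(1/6, ('leaf', 1)), (5/6, ('leaf', 0))]),  # expected value 1/6
])
```

Here `expectimax(tree)` picks the first choice, whose expected value is 2/3.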

Hungry Monkey: 2-Ply Game Tree
[figure: expectimax on the 2-ply Hungry Monkey tree - chance and max levels alternate; leaf utilities (0, 1, 2 bananas) are weighted by the 1/6, 1/3, 2/3, 5/6 outcome probabilities and backed up to the root]

Policies
The result of the ExpectiMax analysis is a conditional plan (also called a policy):
- the optimal plan for 2 steps is a fixed pair of actions
- the optimal plan for 3 steps branches on the state: an action, then if (ontable) { ... } else { ... }
Probabilistic planning can be generalized in many ways, including action costs and hidden state. The general problem is that of solving a Markov Decision Process (MDP).

Summary
- Deterministic games: minimax search, alpha-beta pruning, static evaluation functions
- Games of chance: expected values, probabilistic planning
- Strategic games with large branching factors (Go): relatively little progress