CS17 Integrated Introduction to Computer Science (Hughes)

(Provisional) Lecture 31: Games, Round 2
10:00 AM, Nov 17, 2017

Contents

1 Review from Last Class
2 Finishing the Code for Yucky Chocolate
3 Other Game Components
  3.1 Human Players
  3.2 Referee
  3.3 AI Players
4 The Minimax Algorithm
  4.1 Implementing Minimax
5 Summary

Objectives

By the end of this lecture, you will know:

- a more advanced and more efficient algorithm for determining a player's best move, which you will use in your Game project
- the structure of the Human and AI player modules, as well as the Referee module, for your Game project

1 Review from Last Class

Last time I introduced the notion of a two-person, finite, complete-information, alternating-move, zero-sum game. Everything about the game is clearly visible to all players; there is no random drawing of cards from a deck. I talked about strategies for figuring out how to win such a game. You may recall that I drew a game tree for a 2x2 game of Yucky Chocolate and, from the values at the terminal nodes (i.e., the leaves), filled in values at the other nodes. One thing we worked out is that once you know the values of all the leaves, you can figure out the values of the nodes above them, and a smart player moves to the child that is best for them: if I'm player 1, I want to maximize the values below; if I'm player 2, I want to minimize them. We wrote a good chunk of code for Yucky Chocolate, which we'll go through as a review:

    type which_player = P1 | P2 ;;

    type state = int * int * which_player ;;

    let initial_state = (2, 2, P1) ;;

    type move = Row of int | Col of int ;;

As you can probably guess, the first type, which_player, is used to keep track of players. The second type, state, is used to keep track of a state. Since the state of Yucky Chocolate depends on how many rows and columns are left, as well as whose turn it is, we define a state as holding all of this information. The initial_state simply records how the game should start: for our implementation, a 2x2 game with player 1 going first. Finally, a move is simply the number of rows or columns a player eats, and is represented as such.

Our next_state code would look something like this:

    let next_state ((n, k, w): state) (m: move) : state =
      match m, w with
      | Row p, P1 when p <= n -> (n - p, k, P2)
      | Row p, P2 when p <= n -> (n - p, k, P1)
      ...

Your next_state will behave differently, but it should take in the same things: a state and a move. This procedure produces the state of the game after the move m has been applied to the current state. For Yucky Chocolate, this simply means changing whose turn it is and subtracting the number of rows or columns eaten.

We also wrote game_status:

    type status = Win of which_player | Draw | Ongoing of which_player

    let game_status (s: state) : status =
      match s with
      | (0, 0, w) -> Win w
      | (_, _, w) -> Ongoing w

Your game_status will be very similar. It takes in a state s and produces the status of the game: whether a player has won (and if so, which player), whether the game is a draw, or whether the game is ongoing (and if so, whose turn it is).

Finally, we wrote a value procedure that produces the value of terminal nodes:

    let value (s: state) : float =
      match s with
      | (0, 0, P1) -> 1.0
      | (0, 0, P2) -> -1.0
      | _ -> failwith "value undefined for nonterminal states"

The procedure takes in a state, s, and produces a value: a higher positive number is desired by player 1, and a more negative number is desired by player 2. This value procedure only works for terminal nodes. We're going to need more information to figure out the best moves to make, and doing so is the subject of the next lecture and a half.
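To make the review concrete, here is a short, illustrative toplevel session using the procedures above. The expected results, shown in comments, follow directly from the definitions; this is just an illustration, not part of the project stencil.

    next_state (2, 2, P1) (Row 1) ;;   (* (1, 2, P2): one row eaten, now P2's turn *)

    game_status (1, 2, P2) ;;          (* Ongoing P2: chocolate remains and P2 is to move *)
    game_status (0, 0, P2) ;;          (* Win P2: the board is empty on P2's turn *)

    value (0, 0, P1) ;;                (* 1.0: a terminal state that is good for player 1 *)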

2 Finishing the Code for Yucky Chocolate

There are a few more things we need to do before we finish writing the code for Yucky Chocolate.

First up, we need to write a procedure called string_of_player. This function should take as input an argument, w, of type which_player, and output a string representing that player. For a game like Tic-Tac-Toe, this might output "X" for player one and "O" for player two. For other games, the strings "Player 1" and "Player 2" work just fine.

Next, we should write string_of_state. This procedure takes in a state and returns a string that represents it, which typically means returning a string representation of the game board. For example, string_of_state might return "[ ][ ]\n[X][ ]\n", which prints out as:

    [ ][ ]
    [X][ ]

to represent the starting state of a 2x2 Yucky Chocolate game.

The third additional piece is string_of_move, which returns the string representation of an input move. For Yucky Chocolate, string_of_move (Row 3) might return "3 rows", which could be used to print out:

    Player 1 makes the move: 3 rows.

The final procedure we need to write is move_of_string, which takes in a string as input and returns a move. It's used to transform human input into the internal representation of a move. For example, for Connect 4, move_of_string "4" might produce Col 4, which represents a move in which the player puts a game piece in the fourth column. If the input string is nonsense, this procedure should fail.
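As one possible illustration (not the required implementation), here is a sketch of these procedures for Yucky Chocolate. The "r 3" / "c 2" move syntax and the board format are arbitrary choices made for this example, and the yucky piece is assumed to sit in the bottom-left corner, as in the picture above.

    let string_of_player (w: which_player) : string =
      match w with
      | P1 -> "Player 1"
      | P2 -> "Player 2"

    (* Draw the remaining n-by-k bar, marking the yucky corner piece with [X]. *)
    let string_of_state ((n, k, _): state) : string =
      let cell r c = if r = n - 1 && c = 0 then "[X]" else "[ ]" in
      let row r = String.concat "" (List.init k (cell r)) ^ "\n" in
      String.concat "" (List.init n row)

    let string_of_move (m: move) : string =
      match m with
      | Row p -> string_of_int p ^ " rows"
      | Col p -> string_of_int p ^ " columns"

    (* Moves are typed as "r 3" (eat 3 rows) or "c 2" (eat 2 columns);
     * anything else, including a non-numeric count, causes a failure. *)
    let move_of_string (str: string) : move =
      match String.split_on_char ' ' (String.trim str) with
      | ["r"; p] -> Row (int_of_string p)
      | ["c"; p] -> Col (int_of_string p)
      | _ -> failwith "invalid move string"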

3 Other Game Components

Our game needs players, and we're going to implement both human and artificial-intelligence players. Human players type in their next move, whereas AI players select their move based on the approach we just talked about. We'll also implement a referee, which starts the game, manages plays, updates the game status, and reports who won. (AI players will be covered later.)

3.1 Human Players

The signature for a PLAYER is as follows:

    module type PLAYER =
    sig
      module PlayerGame : GAME
      val next_move : PlayerGame.state -> PlayerGame.move
    end

Below, we've written a struct that partially implements this signature. The human module only has two things in it. One of them is the game that this is a player for: you cannot just have an arbitrary player; it has to be a chess player or a tic-tac-toe player. The other is a next_move procedure that takes the current state of the game and produces the next move that this player wants to make. For a human player, we read a line of input from the keyboard and call the move_of_string procedure, which produces a move from the entered string.

You may wonder, what is the try thing in the code below? What if the player, rather than entering something nice and clear like "r 3", which could represent eating three rows in Yucky Chocolate, has typed something like "Hello!", thinking they are working with Eliza? If you call move_of_string on "Hello!", it's going to fail. So try says: try to execute this code, and if there are any problems, do something special.

Assuming move_of_string did not fail, we then need to check whether m is a legal move. If it is, we should return the move m. If it isn't, we should print a message saying it was an illegal move and call next_move again to re-prompt the user. If something goes wrong along the way, the try block catches the error and executes the code after the word with. One possible error is End_of_file, which is raised when CTRL+D is pressed; this is a good way to exit your program. If there is some other failure, such as a problem inside move_of_string, we can match on Failure message, print out the failure message, and re-prompt.

    module TestHumanPlayer =
    struct
      module PlayerGame = Game
      open PlayerGame

      let rec next_move s =
        try
          let m = move_of_string (read_line ()) in
          (* TODO: replace the below expression (between the if and then)
           * with the proper functionality *)
          if m is a legal move
          then m
          else
            let () = print_endline "Illegal move." in
            next_move s
        with
        | End_of_file -> failwith "exiting."
        | Failure message -> print_endline message; next_move s
    end ;;
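For the TODO above, one natural way to express the legality check, assuming your game module also provides a legal_moves procedure like the one discussed in Section 4, is a simple membership test. This is only a sketch of one option, not the required approach:

    if List.mem m (legal_moves s)
    then m
    else
      let () = print_endline "Illegal move." in
      next_move s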

3.2 Referee

The referee is also a module. It's what sets up and runs the game. It has a notion of the game being played. It also has two modules, each of which implements the PLAYER signature: one a human module and one an AI module (the AI module will be covered later). Don't worry too much about the code below; it's a bit convoluted.

    module Referee =
    struct
      (* Change these module names to what you've named them *)
      module CurrentGame = Game
      module Human : PLAYER with module PlayerGame := CurrentGame = HumanPlayer

      open CurrentGame

      let play_game () : unit =
        let rec game_loop (s: state) : unit =
          print_endline (string_of_state s);
          match game_status s with
          | Win p -> print_endline ((string_of_player p) ^ " wins!")
          | Draw -> print_endline "Draw."
          | Ongoing p ->
            print_endline ((string_of_player p) ^ "'s turn.");
            let move = Human.next_move s in
            print_endline
              ((string_of_player p) ^ " makes the move " ^ (string_of_move move));
            game_loop (next_state s move)
        in
        try game_loop initial_state
        with Failure message -> print_endline message
    end ;;

    Referee.play_game () ;;

The function play_game runs the game_loop procedure over and over again. game_loop prints out the state of the game and checks whether the game is over. If one of the players won, it prints out which player won. If the game was a draw, it prints out "Draw." If the game is ongoing, it prints out the current player, then asks the current player to calculate their next move for the given state s. After that, it prints out the move that the player made and runs game_loop again with the game state that results from that move. The last line of code, Referee.play_game () ;;, is what actually runs the game.

The actual referee that you will be implementing will be more complicated than this. It allows the person running the program to choose whether they want the game to run with two human players, two AI players, or an AI and a human player.

3.3 AI Players

Like a human player, an AI player has to have a game associated with it. It also has to have a way, given a state, to choose a next move. It decides what move to make by looking at the game state itself!

    module TestAIPlayer =
    struct
      module PlayerGame = Game
      open PlayerGame

      (* TODO *)
      let next_move s = failwith "not yet implemented"
    end ;;

4 The Minimax Algorithm

Our value procedure determines values at terminal nodes. When the game is over, it tells us how good the game was for player 1. For Yucky Chocolate, it's +1 or -1. For checkers, it might be: how many more pieces did player 1 capture than player 2? Player 1's goal is generally to maximize the value at the end of the game, because the value represents how happy player 1 is at the end of the game. Player 2, then, should try to minimize the value.

We also worked out a naive algorithm last time: we said we could assign a value to each node that isn't terminal, and call this nvalue. Player 1 can look at the child nodes and pick the maximum of them; likewise, player 2 can look at the child nodes and pick the minimum of them. To do so, player 1, in a given state s, looks at the value of every state that arises from taking each legal move starting at s, and picks the move that leads to the highest value. This algorithm requires a few procedures:

1. legal_moves s: given a state s, produce a list of all legal moves at that state.
2. next_state s m: given a state s and a move m, produce the state you get to by making move m.
3. map (fun m -> next_state s m) (legal_moves s): given a state s, produce a list of all possible next states.
4. argmax and argmin: which of the legal moves led to the best outcomes?

Player 1 would select the move that produces the state with the highest nvalue:

    move = argmax (map (fun m -> next_state s m) (legal_moves s)) nvalue

Player 2, on the other hand, would select the move that produces the state with the lowest nvalue; this uses argmin:

    move = argmin (map (fun m -> next_state s m) (legal_moves s)) nvalue

With this approach, we can associate a computed value with any node. A terminal node has its given value. Every other node's value is the maximum value of its children, if it's player 1's turn, or the minimum value of its children, if it's player 2's turn.

However, we can take this a step further: as we move up the theoretical tree of game states, we know that whenever we move up a level, the players switch. We know that player 2 will make the move that leads to the game state with the lowest value possible, and that player 1 will make the move that leads to the game state with the highest value possible. So, as we move up the tree, we switch between storing the minimum and the maximum value of the child nodes in the current node. In doing so, we take into account the fact that the two players have different goals. This process of switching between the min and max of the child-node values is called minimax.
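Putting these pieces together, a minimal sketch of this full-depth minimax value computation might look like the following. It assumes the game module provides game_status, value, next_state, and a legal_moves procedure as described above; it is an illustration of the idea, not the code you are required to write.

    (* The minimax value of a state, recursing all the way down to terminal nodes. *)
    let rec nvalue (s: state) : float =
      match game_status s with
      | Win _ | Draw -> value s
      | Ongoing p ->
        let child_values =
          List.map (fun m -> nvalue (next_state s m)) (legal_moves s)
        in
        (match p with
         | P1 -> List.fold_left max neg_infinity child_values  (* player 1 maximizes *)
         | P2 -> List.fold_left min infinity child_values)     (* player 2 minimizes *)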

4.1 Implementing Minimax

For small games, we can run minimax until we reach the terminal nodes of the game and plan moves accordingly. For larger games, however, this simply isn't possible. Moreover, knowing the value of a node only tells you whether you can win; it doesn't give you any information about how to get to the winning state.

To address the former problem, we give our minimax implementation a maximum depth to analyze. To address the latter, we modify the procedures argmin and argmax so that they output a pair containing both the move to make and the value of the game state that results from that move. Overall, our improved implementation of minimax is as follows:

- Input: a game state to start at.
- Output: a (value, move option) pair, where the value is the value of the game to P1 if everyone moves optimally at each state, and the move option is the optimal move (if any) for whichever player is supposed to move at the input state.
- Algorithm: If the input state is terminal, return its value (there is no move to make). Otherwise, calculate the game state for each valid move from the current state and pass each resulting state as recursive input to minimax. This gives a value for each child state (relative to the input state). If it's P1's turn, take the maximum of the child states' values and the move that achieves it; if it's P2's turn, take the minimum and its move.

However, we mentioned earlier that it's not feasible to calculate the game-state values for every terminal state in a complicated game. So what we can do is create another function, estimate_value, that estimates the value of a nonterminal node. Then, once the minimax function reaches the given maximum depth, we call estimate_value on the child nodes and pass those values back up (instead of calling minimax until we reach terminal nodes).
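To tie this together, here is a minimal sketch of the depth-limited minimax just described, again assuming legal_moves and estimate_value exist with the meanings discussed above; your project's version may differ in details such as the exact return type.

    (* Depth-limited minimax: returns (value to P1, best move for the player to move).
     * No move is returned at terminal states or at the depth cutoff. *)
    let rec minimax (s: state) (depth: int) : float * move option =
      match game_status s with
      | Win _ | Draw -> (value s, None)
      | Ongoing p ->
        if depth = 0 then (estimate_value s, None)
        else
          (* Score every legal move by the minimax value of the state it leads to.
           * An Ongoing state is assumed to have at least one legal move. *)
          let scored =
            List.map
              (fun m -> (fst (minimax (next_state s m) (depth - 1)), m))
              (legal_moves s)
          in
          (* P1 keeps the highest-valued child; P2 keeps the lowest-valued one. *)
          let better (v1, m1) (v2, m2) =
            match p with
            | P1 -> if v1 >= v2 then (v1, m1) else (v2, m2)
            | P2 -> if v1 <= v2 then (v1, m1) else (v2, m2)
          in
          let (best_value, best_move) =
            List.fold_left better (List.hd scored) (List.tl scored)
          in
          (best_value, Some best_move)

With something like this in place, the TODO in the TestAIPlayer module from Section 3.3 could, for example, simply ask minimax for the best move from the current state:

    let next_move (s: state) : move =
      match minimax s 4 with            (* 4 is an arbitrary example depth *)
      | (_, Some m) -> m
      | (_, None) -> failwith "no legal moves available"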

5 Summary

Ideas

- We introduced the idea of a subtree search algorithm to efficiently implement an AI player.
- We discussed the benefits and tradeoffs of searching deeper in the subtree versus writing a more efficient or more accurate estimate_value procedure.

Skills

- We discussed implementation details for several procedures that you will write in the Game project.
- We discussed implementation details of a Human module, an AI module, and a Referee module.

Please let us know if you find any mistakes, inconsistencies, or confusing language in this or any other CS17 document by filling out the anonymous feedback form: http://cs.brown.edu/courses/cs017/feedback.