Assignment 2, University of Toronto, CSC384 - Intro to AI, Winter

Assignment 2, University of Toronto, CSC384 - Intro to AI, Winter 2014

Computer Science 384
March 5, 2014
St. George Campus, University of Toronto

Homework Assignment #2: Game Tree Search
Due: Mon March 24, 2014. Electronic submission due by 23:59.

Silent Policy: A silent policy will take effect 24 hours before this assignment is due. This means that no question about this assignment will be answered, whether it is asked by email or in person.

Late Policy: 15% per day after the use of 2 grace days.

Total Marks: This assignment represents 13% of the course grade. Part B is a fun optional bonus question. If you complete it sufficiently well, you will qualify for 1 extra grace day, which can be applied to a past, present, or future assignment.

Fun Competition: Finally, we will be running a competition among the heuristics written for Part B. Your performance in the competition will not influence your mark on this assignment.

Handing in this Assignment

What to hand in electronically: You must submit your code electronically. Download othello.pl and play.pl from the course web page. For Part A of this assignment, write your name and student number in othello.pl, and fill in the definitions of predicates in their designated space (feel free to define helper predicates as necessary). DO NOT implement anything in play.pl and do not submit it. For optional Part B you must additionally download, modify, and submit the file heuristic_c9jdoe.pl, where c9jdoe is to be replaced by your cdf userid in the file name and inside the file in the definition of predicates. We would also like a brief description of your heuristic in hdescription_c9jdoe.txt, again with c9jdoe replaced by your cdf userid in the file name. Part C is mandatory. Submission instructions will follow.

To submit your file electronically, use the CDF secure Web site: https://www.cdf.utoronto.ca/students

Warning: marks will be deducted for an incorrect submission. Since we will test your code electronically, you must:
- make certain that your code runs on CDF,
- use the exact predicate names and argument(s) (including the order of arguments) specified,
- use the exact file names specified in the questions,
- include all your code for Parts A & B in othello.pl and do not load any file of yours from within this file (other than the already given play.pl),
- not display anything but the predicate output (no text messages to the user, fancy formatting, etc.; just what is in the assignment handout).

Clarification Page: Important corrections (hopefully few or none) and clarifications to the assignment will be posted on the Assignment 2 Clarification page, linked from the CSC384 A2 web page. You are responsible for monitoring the A2 Clarification page.

Questions: Questions concerning the assignment should be directed by email to the Assignment 2 TA: Brent Mombourquette. He can be reached by email at bgmomb -at- cs.toronto.edu. Please place 384 and A2 in your email subject header.

In Assignment 2 you are supplied with some starter code to build a program that plays the game Othello against a human opponent.

1 Othello

Othello is a 2-player board game that is played with distinct pieces that are typically black on one side and white on the other side, each side belonging to one player. Our version of the game is played on a 6x6 chess board, but the typical game is played on an 8x8 board. Players (black and white) take turns placing their pieces on the board. Placement is dictated by the rules of the game, and can result in the flipping of coloured pieces from white to black or black to white.

Objective: The player's goal is to have a majority of their coloured pieces showing at the end of the game.

Game Ending: Occasionally a player will have nowhere to place their coloured piece. In this case their only valid move is to pass and place no piece on the board. The next player then takes their turn. A state where neither player can place a piece is a terminal state. The winner of a terminal state is the player who has more pieces of their colour exposed on the board. A tie is declared in a terminal state if the number of white and black pieces is equal.

Rules: The game begins with four pieces placed in a square in the middle of the grid, two white pieces and two black pieces (Figure 1). Black makes the first move.

Figure 1: Initial State
Figure 2: Possible moves of black player are shown in grey

At each player's turn, the player may place a piece of their own colour on an unoccupied square, if it brackets one or more opponent pieces in a straight line along at least one axis (vertical, horizontal, or diagonal). For example, from the initial state black can achieve this bracketing by placing a black piece in any of the positions indicated by grey pieces in Figure 2. Each of these potential placements would create a Black-White-Black sequence, thus bracketing the White piece. Once the piece is placed, all opponent pieces that are bracketed, along any axis, are flipped to become the same colour as the current player's. Returning to our example, if black places a piece in Position 6-d in Figure 1, the white piece in position 5-d will become bracketed and consequently will be flipped to black, resulting in the board depicted in Figure 3.

Figure 3: State after the move of black player.
Figure 4: Possible moves of white player

Now it's white's turn to play. All of white's possibilities at this time are shown as grey pieces in Figure 4. If white places a piece on 4-c it will cause the black piece in 4-d to be bracketed, resulting in the 4-d piece being flipped to white as shown in Figure 5.

To summarize, a legal move for a player is one that results in at least one of its opponent's pieces being flipped. If a player cannot make a legal move, they pass and do not place a piece on the board. Players alternate turns until neither player can make a move, at which point the game ends. The player with the most pieces on the board that display the player's colour wins the game.

Othello is Pressman's marketing name for an old game called Reversi. You can get a better feel for the game by playing it at a number of on-line sites. For example, http://gameknot.com/pg/reversi.htm shows the allowed moves when it is your turn. In the case where the online description of the game differs from what is specified here, you must follow the description in the assignment.
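The bracketing rule can be illustrated along a single direction. The following is only a sketch under stated assumptions: the predicate names bracketed/3 and run/3 and the atoms black, white and empty are hypothetical and are not part of the starter code. Line is assumed to list the squares met when walking away from the newly placed piece in one direction, nearest square first.

```prolog
% bracketed(+Me, +Line, -Flips)
% Hypothetical helper, not part of the starter code.
% Flips is the non-empty run of opponent pieces at the front of Line
% that is closed off by one of Me's own pieces.
bracketed(Me, Line, Flips) :-
    run(Me, Line, Flips),
    Flips \== [].

% run(+Me, +Line, -Run)
% Collects opponent pieces until one of Me's pieces ends the run.
% Fails if an empty square or the edge of the board is reached first.
run(Me, [Me|_], []).
run(Me, [Sq|Rest], [Sq|Run]) :-
    Sq \== Me,
    Sq \== empty,
    run(Me, Rest, Run).
```

For instance, the query bracketed(black, [white, black, empty], Flips) gives Flips = [white], matching the Black-White-Black example above, while bracketed(black, [white, white, empty], Flips) fails because no black piece closes the run. A full move check would repeat this test in all eight directions.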

Figure 5: State after the move of white player.

2 The Assignment

You will be provided with the following Prolog code, available for download from the Assignment 2 web page:

- An implementation of an interactive depth-first minimax game tree search routine in the file play.pl. This file will not work on its own as it needs the definitions of several game-specific predicates. You will not change this file, but please read the code carefully to see what predicates need to be implemented and how they are used in the tree search. To invoke the interactive shell you need to type the query play. Assuming all required predicates have already been defined, the interactive game playing shell will prompt the human player to input moves. The player can enter a move (for this game it's just a position pair like [1,3], i.e. 2nd row and 4th column because indices start from 0), which will then be checked for validity (using a predicate you have to write). To play a pass move simply enter n. Your validmove predicate should check the proposed move, allowing a pass only if no other move is legal. (Observe that the predicate read(Proposed) is used to read the user's move; this will bind the variable Proposed to anything the user enters, so you have to check that they have entered a valid move in the right syntax, i.e. a pair of numbers enclosed in brackets, or the character n.) When it is the computer's turn the engine will invoke a minimax search for the best move. This search is done to a bounded depth, and you can set the depth bound. You should set a bound that yields reasonable performance.

- An implementation of an interactive game tree search with alpha-beta pruning in the file abplay.pl. This is very similar to play.pl except it does the alpha-beta pruning. You will NOT change this file.

- Some starter code for your Othello implementation in the file othello.pl. You are given a prespecified state representation of the game as a list of lists. The board is treated as a two dimensional 6x6 array indexed by a pair of numbers [X,Y] where X is the row position and Y is the column position, each in the range 0 to 5. The file also contains a number of utility routines that allow you to set and get indexed squares on the board. You have to define various predicates to interface with the game tree search routine. This involves writing code to generate moves in the game, testing whether or not positions are terminal, evaluating the heuristic merit of positions in the game, etc. Full documentation on the predicates needed by the game tree search routine is provided at the beginning of the file play.pl. Please do not change play.pl; all your implementation must be done in othello.pl.

- A sample implementation of an interactive tic-tac-toe game in the file ttt.pl, where player 1 (MAX) is a human and player 2 (MIN) is the computer. This sample game illustrates how to implement the routines required by game tree search. To run the game, simply load the file ttt.pl and then ask the query play. You will be prompted to choose your first move (i.e. a number from 1 to 9 followed by a period). Then the computer will choose a move, and it's your turn again, and so on.
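To make the difference between the two search routines described above concrete, here is a small sketch on a toy tree. It is not the code in play.pl or abplay.pl: the tree encoding (leaf/1, node/1), the predicate names minimax/4, alphabeta/6 and example_tree/1, and the bounds -100 and 100 are all assumptions chosen for illustration. Both searches return a value and the number of tree nodes they visited.

```prolog
:- use_module(library(lists)).   % max_list/2, min_list/2

% Toy game tree (hypothetical encoding): leaf(Value) or node(Subtrees).

% minimax(+Tree, +Player, -Value, -Visited): exhaustive search.
minimax(leaf(V), _, V, 1).
minimax(node(Kids), Player, Value, Visited) :-
    other(Player, Opp),
    minimax_all(Kids, Opp, Values, N),
    best(Player, Values, Value),
    Visited is N + 1.

minimax_all([], _, [], 0).
minimax_all([K|Ks], P, [V|Vs], N) :-
    minimax(K, P, V, N1),
    minimax_all(Ks, P, Vs, N2),
    N is N1 + N2.

best(max, Vs, V) :- max_list(Vs, V).
best(min, Vs, V) :- min_list(Vs, V).

other(max, min).
other(min, max).

% alphabeta(+Tree, +Player, +Alpha, +Beta, -Value, -Visited)
% Same search, but stops expanding siblings once Alpha >= Beta.
alphabeta(leaf(V), _, _, _, V, 1).
alphabeta(node(Kids), Player, A, B, Value, Visited) :-
    ab_kids(Player, Kids, A, B, Value, N),
    Visited is N + 1.

ab_kids(max, [], A, _, A, 0).
ab_kids(min, [], _, B, B, 0).
ab_kids(max, [K|Ks], A, B, V, N) :-
    alphabeta(K, min, A, B, V1, N1),
    A1 is max(A, V1),
    (   A1 >= B
    ->  V = A1, N = N1                 % cutoff: remaining siblings skipped
    ;   ab_kids(max, Ks, A1, B, V, N2),
        N is N1 + N2
    ).
ab_kids(min, [K|Ks], A, B, V, N) :-
    alphabeta(K, max, A, B, V1, N1),
    B1 is min(B, V1),
    (   B1 =< A
    ->  V = B1, N = N1                 % cutoff: remaining siblings skipped
    ;   ab_kids(min, Ks, A, B1, V, N2),
        N is N1 + N2
    ).

% A three-branch example tree with MAX to move at the root.
example_tree(node([node([leaf(3),  leaf(12), leaf(8)]),
                   node([leaf(2),  leaf(4),  leaf(6)]),
                   node([leaf(14), leaf(5),  leaf(2)])])).
```

On this tree, example_tree(T), minimax(T, max, V, N) gives V = 3 after visiting N = 13 nodes, while example_tree(T), alphabeta(T, max, -100, 100, V, N) gives the same V = 3 after visiting only N = 11 nodes, because the second branch is abandoned as soon as its first leaf (2) falls below the value already secured by the first branch (3).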

This assignment is broken into 3 subparts: (A) implementing Othello on a 6x6 board, (B) designing a heuristic function, and (C) testing your code with given test boards, comparing the simple minimax and alpha-beta pruning. Each part is described in detail below. Please note that your implementation should contain sufficient comments and not be contorted or overly complex. Poor implementation style may cause deductions up to 10%.

Part A [75 Marks]: Othello

Implement the Othello game by adding your code to the supplied starter file othello.pl. In order to accomplish this you have to implement several predicates (feel free to define your own helper predicates for more complex predicates like nextstate):

1. initialize(InitialState,InitialPlyr)
2. winner(State,Plyr)
3. tie(State)
4. terminal(State)
5. moves(Plyr,State,MvList)
6. nextstate(Plyr,Move,State,NewState,NextPlyr)
7. validmove(Plyr,State,Proposed)
8. h(State,Val)
9. lowerbound(B)
10. upperbound(B)

Most of these predicates are based on the given state representation. Utilize the given utilities (e.g. get and set a value at a position) to determine the possible next moves: you must implement the predicate moves(Plyr,State,MvList) so that it returns a list MvList of all legal moves Plyr can make in the given state State. The list of moves returned by this predicate must be sorted by position from top row to bottom row, and within a row from left column to right column. E.g., if moves into positions [1,1], [0,0], [2,7], [0,2], [1,5] are all possible, then you must return this list of moves in the following order: [0,0], [0,2], [1,1], [1,5], [2,7].

Similarly, you must implement the predicate nextstate(Plyr,Move,State,NewState,NextPlyr) that changes the current board State by playing Move. (Remember that applying a move can cause changes along several different directions.) You can use the given helper predicate showstate to debug nextstate.
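The ordering requirement above happens to coincide with Prolog's standard order of terms when each move is a two-element list [Row,Col]. As a minimal sketch (the name sorted_moves/2 is hypothetical and not one of the required predicates), the built-in sort/2 is enough:

```prolog
% sorted_moves(+Moves, -Sorted)
% Hypothetical helper: orders [Row,Col] moves top row first and, within a
% row, left column first. sort/2 compares lists element by element in the
% standard order of terms and also removes duplicate moves.
sorted_moves(Moves, Sorted) :-
    sort(Moves, Sorted).
```

With the example from the text, sorted_moves([[1,1],[0,0],[2,7],[0,2],[1,5]], S) gives S = [[0,0],[0,2],[1,1],[1,5],[2,7]].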
In your implementation, account for the fact that the game can end with a tie and implement the tie and winner predicates.

The predicate h(State,Val) requires that you design a heuristic function for the game. See Part B before doing so. If you decide NOT to do Part B, to get credit for Part A, you still need to define a very simple heuristic: your h(S,V) returns V = 0 for any non-terminal state S. If S is a terminal state, h must return a positive value (say 100) for a win state, a negative value (say -100) for a lose state, and 0 for a tie state. You can see that this h provides no guidance in the depth-bounded search.

What to hand in for Part A:

1. Electronic Submission: Submit your othello.pl (using submit). Make sure you put your name in the comments of that file. (You must not include any of the code in play.pl, nor should you submit it.) Be sure to document your predicate definitions.

Part B [Optional Bonus Question - 1 Grace Day + Competition Fun]: State Evaluation Function

This question is optional. If you complete it sufficiently well, you will earn one extra grace day for past, present, or future use. Also, all implemented heuristics will be entered into the 2nd Annual Danny Ferreira Memorial Competition. The winner(s) of this competition will be awarded a prize. The competition and prize are unrelated to the mark you receive on this assignment.

As mentioned above, play.pl requires implementing the heuristic function h(S,V). If you decide to do Part B, you have to implement a smarter heuristic function as described below. In order to receive the Bonus of 1 Grace Day you must perform the experiments in Part C with the heuristic you defined here.

Heuristic Functions for Othello Game

Since at the end the player with more pieces wins the game, you might think that the evaluation function h(s) = V1 - V2 (where V1 and V2 are the number of pieces for player 1 and 2, respectively) is ideal. This is only true if we expand all nodes in the search tree to reach the terminal nodes (which is practically impossible). For a non-terminal state, having more pieces has no meaning (it could even be worse, as seen in Figure 6) as many flips might occur in future moves. Instead, we focus more on the stable pieces on the board, i.e. those pieces that cannot be flipped anymore.

Figure 6: Maximum pieces is not a good strategy: white has a lot more pieces, while black has only 1. It is black's turn. So, she puts a piece in position a1, white has to pass, then black puts a piece in h8, white passes, black plays h1, white passes, and finally black plays a8. Black wins: 40 black pieces versus 24 white pieces!

Corner positions, once played, remain immune to flipping for the rest of the game (because there is no adjacent opposite colour to cause a flip): thus a player can use a piece in a corner of the board to anchor groups of pieces (starting with the adjacent edges) permanently. So capturing a corner often proves an effective strategy when the opportunity arises. More generally, a piece is stable when, along all four axes (horizontal, vertical, and each diagonal), it is either on a boundary of the game board, or in a filled row, or next to a stable piece of the same colour. The more stable pieces you have (and the fewer stable pieces your opponent has) the better. So, you may count the number of stable pieces for both players and use them as a good measure to evaluate states.

Another idea is mobility. An opponent playing with a reasonable strategy will not easily relinquish the corner or any other good moves for you to play. So to achieve these good moves, you must force your opponent to play moves which make these good moves available. The best way to achieve this involves reducing the number of moves available to your opponent. If you consistently restrict the number of legal moves your opponent can make, then sooner or later they will have to make an undesirable move. An ideal position involves having all your pieces in the center surrounded by your opponent's pieces. In such situations you can dictate what moves your opponent can make.

You can also do your own research to find a wide range of other good heuristics (for example, here is a good start: http://www.radagast.se/othello/help/strategy.html).
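As a rough illustration of how the corner and mobility ideas above could be turned into numbers, consider the following sketch. Everything in it is an assumption made for the example: the predicate names, the encoding of squares as 1, 2 or '.', the choice to score from player 1's point of view, and the weights 10 and 1. It also takes the two move lists as arguments rather than calling moves/3, and it does not attempt the full stability test.

```prolog
:- use_module(library(lists)).   % nth0/3, member/2

% corner_squares(-Corners): corners of the 6x6 board as [Row,Col] pairs.
corner_squares([[0,0], [0,5], [5,0], [5,5]]).

% cell(+Board, +[R,C], ?Value)
% Hypothetical accessor; Board is assumed to be a list of six rows.
cell(Board, [R,C], Value) :-
    nth0(R, Board, Row),
    nth0(C, Row, Value).

% count_corners(+Board, +Piece, -N): number of corners occupied by Piece.
count_corners(Board, Piece, N) :-
    corner_squares(Corners),
    findall(Sq, (member(Sq, Corners), cell(Board, Sq, Piece)), Held),
    length(Held, N).

% feature_score(+Board, +Moves1, +Moves2, -Score)
% Weighted sum of corner difference and mobility difference, seen from
% player 1. Moves1 and Moves2 are the legal move lists of players 1 and 2.
% The weights 10 and 1 are arbitrary illustration values.
feature_score(Board, Moves1, Moves2, Score) :-
    count_corners(Board, 1, C1),
    count_corners(Board, 2, C2),
    length(Moves1, M1),
    length(Moves2, M2),
    Score is 10 * (C1 - C2) + (M1 - M2).
```

Whatever features are combined this way, the resulting values must stay within the limits declared by lowerbound and upperbound.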

The Competition: How to package your code so we can run it in the competition

If you decide to participate in the competition (please do, it will be fun!), you should submit a file named heuristic_<id>.pl, where <id> should be your cdf userid. For example, a student John Doe with cdf userid c9jdoe would submit the file named heuristic_c9jdoe.pl. This file should contain a predicate named <id>_h/2 (John would name it c9jdoe_h). This predicate can rely on the predicates you are required to implement (e.g. tie, winner, nextstate, etc.). Any helper predicates should also be included in the heuristic file. To avoid name clashes, prefix any helper predicates with your id. Note that any calls made to them from this file should use the new predicate names. You should also duplicate the predicates lowerbound/1 and upperbound/1 in this file, again prefixing them with your cdf userid as noted in the example below.

The following is an example submission by John. He clearly did not understand the game and didn't use comments properly, but he got the naming convention right!

    %-------------------------
    % Surname: Doe
    % First Name: John
    % Student Number: 123456789

    % Helper predicates
    c9jdoe_max(C1, C2, C) :- C1 >= C2, !, C = C1.
    c9jdoe_max(C1, C2, C2) :- C1 < C2.

    c9jdoe_count([], 0).
    c9jdoe_count([E|L], C) :-
        length(E, C1),
        c9jdoe_count(L, C2),
        c9jdoe_max(C1, C2, C).

    % The actual heuristic
    c9jdoe_h(State, Val) :- terminal(State), !, Val = 42.
    c9jdoe_h(State, Val) :- c9jdoe_count(State, Val2), Val is Val2 - 4.

    % The bounds
    c9jdoe_lowerbound(-3).
    c9jdoe_upperbound(300).

Note: In the competition, the search will expand as many nodes as it is able to within a given timeout. So, there is a tradeoff between the quality of the heuristic and how fast it is to compute. A better heuristic would provide better guidance to the search, but a fast one would allow more nodes to be expanded.

What to hand in for Part B: To get credit for Part B, you must do the following:

1. Electronic Submission: Create an English description and justification of the heuristic you implemented. You are welcome to do a little bit of research of your own to come up with a better evaluation function. Make sure to cite all references you used (if any) for this question. Please submit this in a file called hdescription_c9jdoe.txt where c9jdoe is replaced with your cdf userid.

2. Electronic Submission: Your implementation of predicate h must be in the othello.pl which you will have submitted in Part A. If you are participating in the competition, you also submit heuristic_c9jdoe.pl, appropriately renamed to replace c9jdoe with your cdf userid, and with the content as described above.

Part C [10 Marks]: Testing Boards & Comparing MiniMax and α-β Pruning

In this part, you will test your code by running it with the given test boards. You will also compare simple MiniMax and Alpha-Beta Pruning. We have provided you with the implementation of Alpha-Beta Pruning in abplay.pl. Download it from the assignment web page. Students who did not elect to try the bonus question should run their tests on the code from Part A. Students who did try Part B should test their code on Part B as well, if they wish to receive the bonus grace day. (It's fun to see how well your heuristic is performing, for us and for you!)

1. Testing your code with MiniMax: Trace your code on test boards 1 to 3 (provided in testboards.pl) using the MiniMax algorithm with a depth bound of 5. I.e.,

    testboard1(St), mmeval(2,St,Val,BestMv,5,SeF)

This will bind SeF to the number of states searched, and BestMv to the computed move for board1. Repeat this for testboard2 and testboard3. Submit your results following the submission instructions.

2. Testing your code with alpha-beta pruning: Trace your code on test boards 1 to 3 (provided in testboards.pl) using the alpha-beta algorithm with a depth bound of 5. To do so, you must load the file abplay.pl instead. (Make sure NOT to load play.pl!) Then execute the following:

    testboard1(St), lowerbound(Alpha), upperbound(Beta), abeval(2,St,Val,BestMv,5,SeF,Alpha,Beta)

This will bind SeF to the number of states searched, and BestMv to the computed move for board1. Repeat this for testboard2 and testboard3. Submit your results following the submission instructions.

3. In one or two paragraphs, compare the results displayed in your two tables. Submit this description following the submission instructions.

Submission instructions to follow shortly.

Good Luck and Have Fun!