Implementation of Upper Confidence Bounds for Trees (UCT) on Gomoku


Guanlin Zhou (gz2250), Nan Yu (ny2263), Yanqing Dai (yd2369), Yingtao Zhong (yz3276)

1. Introduction: Reinforcement Learning for the Gomoku Game

Gomoku, or Five in a Row, is a two-player strategy board game played with black and white stones. Players alternately place stones on the board, and the first player to get five stones of the same color in a line wins the game [13]. Gomoku has several rule variants, such as free-style Gomoku and Renju. Although it has been proved that the first player has a winning strategy in both free-style Gomoku [1] and Renju [8], play from complex board states still needs further study.

Many algorithms for playing Gomoku have been developed over several decades. Traditional approaches use tree search with alpha-beta pruning [4, 7]. In 2006, Coulom proposed a novel approach named Monte Carlo Tree Search (MCTS), which combines tree search with Monte-Carlo evaluation [3]. MCTS is a heuristic search algorithm that uses the simulation results from child nodes as the evaluation function for the tree search. One of its greatest advantages is that it does not require domain-specific knowledge, so the model is easy to apply to other domains. However, while expanding the tree it is hard to balance the exploitation of deep variants after moves with a high average win rate against the exploration of moves with few simulations. In 2006, Kocsis and Szepesvári proposed the UCT algorithm to address exactly this problem [5, 6]. UCT is based on UCB1, one of the standard algorithms for the multi-armed bandit problem. In our project we implement the UCT algorithm for Gomoku, and our revised version can beat amateur human players with a good level of confidence.

2. Preliminaries: The UCT Algorithm

We used the Upper Confidence Bounds for Trees (UCT) algorithm from the paper "A Survey of Monte Carlo Tree Search Methods" [2] as the base model of our implementation. UCT is one of the most popular algorithms in the Monte Carlo Tree Search family. It runs from the current state and outputs a suggested best action given a certain computational allowance. Each step of the UCT algorithm consists of four parts: 1. picking a suitable move; 2. simulating the following moves until termination; 3. recording the rewards; 4. selecting the best-rewarded move (after repeating 1 through 3 for a certain number of rounds). The full algorithm goes as follows, with the notation explained in the context of a Gomoku game.

s_0: the current configuration of the board, as an 11 x 11 double array;
v_0: a node representing the current state s_0, with extra information such as parent, children, and rewards;
v_l: a node in the next layer, a child of v_0 reached by one extra move from v_0;
nonterminal v: the state associated with v is not a winning, losing, or tie state;
fully expanded v: every possible move in the state associated with v has been explored;
C_p: set to 1/√2 to satisfy the Hoeffding inequality with rewards in the range [0, 1] [3][8];
action: a possible move, not overlapping with the pieces currently on the board;
f(s(v), a): generates a new state (board configuration) from a move and the current board configuration;
Q(v): the reward currently accumulated at v;
N(v): the number of simulations (games) v has been involved in so far;
Δ: the reward of a game: +1 for winning, -1 for losing, and 0 for a tie.
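To make the selection step concrete, the following minimal Python sketch (not our exact source code; the node fields q, n, and children are assumed names standing in for Q(v), N(v), and the child list) picks the child that maximizes the UCB1 bound used by UCT:

    import math

    CP = 1 / math.sqrt(2)  # the constant C_p from the notation above

    def best_child(node, cp=CP):
        """Return the child maximizing Q(v')/N(v') + cp * sqrt(2 ln N(v) / N(v'))."""
        def ucb(child):
            exploit = child.q / child.n                               # average reward
            explore = cp * math.sqrt(2 * math.log(node.n) / child.n)  # exploration bonus
            return exploit + explore
        return max(node.children, key=ucb)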

On top of the UCT algorithm we made many changes to form our actual implementation, which we explain in detail in the following sections.

3. Implementation and Algorithm Improvements

3.1 Environment and Required Packages

We implemented our Gomoku game environment and all related parts in Python. The required packages are pygame, which provides a simple and elegant game UI environment, and numpy. Before testing our code, please make sure both packages are installed successfully.

3.2 Basic Structures

Our project lives in the folder rl-gomoku. All game-related images are under the subfolder rl-gomoku/imgs, and our source code is under rl-gomoku/src, which contains five Python files:

main.py: the main entry point of our Gomoku game environment; its behavior is controlled by command-line parameters (see the example invocations after this list). There are two game modes. (1) Human vs. AI: run python main.py n t, where n is the total number of games and t is the computational budget of the MCTS AI, a floating-point number giving the seconds allowed per step. (2) AI vs. AI: run python main.py n t1 t2, where t1 and t2 are the computational budgets of the two AI players.

Fig. 3.1 Gomoku game environment

gomuku.py: implements the game environment. It defines the class gomuku_game, which stores all current game data, such as the current turn and the current board state, and contains the functions to display the game UI, update the game state, receive I/O input, and so on.

simple_ai.py: provides a basic AI agent for Gomoku. It defines some naive playing rules and supplies prior information to our more advanced game AI, which is implemented using UCT and MCTS.

mcts_ai.py and uct_tree.py: implement the UCT Monte Carlo Tree Search algorithm. These two files are the core of our project; we discuss their details in the next section.
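For example, with the layout above, a ten-game human-vs-AI session with a 2.0-second budget per AI move would be launched as follows (the game count and budget values here are illustrative, not prescribed):

    cd rl-gomoku/src
    python main.py 10 2.0        # human vs. AI: 10 games, 2.0 s per AI step
    python main.py 10 1.0 2.0    # AI vs. AI: 10 games, budgets of 1.0 s and 2.0 s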

3.3 Simple AI for Gomoku

With MCTS and UCT alone, finding the next move means exhaustively searching the whole board for the position the algorithm thinks is best. This process is very time-consuming and often ineffective, because the algorithm does not know how to play Gomoku: each position it picks is only slightly better than a random choice.

In our very early versions we implemented the Gomoku AI using the original UCT MCTS algorithm as illustrated in the previous section. However, we found that unless the computational budget of each step is very large (20 minutes or more), the purely random algorithm does not return good results, because the number of possibilities grows rapidly during tree search; for example, exploring the next ten moves in the early game already yields an astronomically large number of possibilities. So, apart from implementing MCTS and UCT, we also employed some predefined, knowledge-based, simple but effective AI rules for choosing the next move.

The key to this AI is the predefined rules, which divide into two main parts: rules for attack and rules for defence. We defined 7 rules in total. Specifically, if an empty position:

1. Can contribute to at least five consecutive stones, then attack on that position.
2. Is next to four consecutive stones, then attack on that position.
3. Is next to three consecutive stones with neither end blocked, then attack on that position.
4. Can contribute to at least five consecutive adversary stones, then defend on that position.
5. Is next to four consecutive adversary stones, then defend on that position.
6. Is next to three consecutive adversary stones with neither end blocked, and there are fewer than three consecutive stones on the player's own side, then defend on that position.
7. If none of the above cases applies, attack on the best position recorded (this is used only for generating the default policy).

To enforce these rules, we scan the whole board and check every case in all four directions: whenever we see an empty cell, we check whether there are at least four consecutive stones next to it in the horizontal, vertical, left-diagonal, and right-diagonal directions, and similar scans cover the other cases.
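A minimal sketch of one such scan, the rule-1 check (the 0 = empty board encoding and the helper names are assumptions for illustration, not our exact code):

    SIZE = 11
    DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, two diagonals

    def run_length(board, r, c, dr, dc, color):
        """Count consecutive stones of `color` starting one step from (r, c) along (dr, dc)."""
        count = 0
        r, c = r + dr, c + dc
        while 0 <= r < SIZE and 0 <= c < SIZE and board[r][c] == color:
            count += 1
            r, c = r + dr, c + dc
        return count

    def completes_five(board, r, c, color):
        """Rule 1 (and rule 4, applied to the adversary's color): would playing
        `color` at the empty cell (r, c) produce at least five in a row?"""
        if board[r][c] != 0:  # 0 = empty in this assumed encoding
            return False
        return any(run_length(board, r, c, dr, dc, color)
                   + run_length(board, r, c, -dr, -dc, color) >= 4
                   for dr, dc in DIRECTIONS)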

3.4 UCT Algorithm for Gomoku

The basic UCT algorithm was illustrated in the previous sections; here we discuss how we implemented it for the Gomoku game and the improvements we made to the original algorithm. The first part to introduce is the basic data structure of the tree node and our method for finding the node for a given game state.

The tree node class is defined in uct_tree.py. Like an ordinary search-tree node, each node stores its tree-structure information (parents, children), the number of times it has been visited, its total reward, the current turn, and whether it is a terminal node. One thing to note is that in a Gomoku game a single state can have several different parents. As an example, let the locations of three black pieces be B1 (4, 5), B2 (5, 4), B3 (8, 6) and those of three white pieces be W1 (6, 4), W2 (6, 6), W3 (7, 6). The sequences B1 W1 B2 W2 B3 W3, B2 W1 B1 W2 B3 W3, B1 W1 B3 W2 B2 W3, and all other possible orderings lead to the same state; yet no matter how the previous moves were ordered, given the same current state the best next move is always the same.

Fig. 3.2 An example game state

Thus in our tree structure each node can have multiple children and multiple parents. This feature is actually helpful for our tree-search algorithm: when the search reaches a node that has already been detected and developed, we can keep using the information of that node rather than doing the same work again. Previous search results thereby accumulate more information for the next step, and the overall search efficiency increases.

Rather than saving each game state inside the tree-node class, we created a multi-level hash dictionary that records the mapping between game states and nodes. The idea is inspired by the way Linux manages its huge virtual memory space (Linux uses a four-level page table to map all frame numbers onto real physical addresses). Our state-to-node dictionary has three levels, and each level uses a different band of rows of the state for indexing (rows 4-6 for the first level, rows 0-3 for the second level, and rows 7-10 for the last level). We use the middle rows for the first level because in most games players tend to place their pieces in the middle first, so indexing on these rows first distributes the data more evenly across the dictionary. A function in mcts_ai.py named state2node(game_state) uses this layered dictionary to determine whether a state has been seen before: for an existing node it simply returns that node, and for an unexplored state it creates a new entry in the dictionary and returns the new node.
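A minimal sketch of this layered lookup (the Node fields and the exact three-level split are illustrative; the real uct_tree.py node carries more bookkeeping):

    class Node:
        def __init__(self, state):
            self.state = state
            self.parents, self.children = [], {}
            self.n, self.q = 0, 0.0          # visit count N(v) and accumulated reward Q(v)
            self.terminal = False

    state_index = {}  # level-1 dictionary, keyed on the middle rows

    def state2node(state):
        """Return the existing node for `state`, creating and registering one if needed."""
        k1 = tuple(map(tuple, state[4:7]))   # level 1: middle rows, most occupied early on
        k2 = tuple(map(tuple, state[0:4]))   # level 2: top rows
        k3 = tuple(map(tuple, state[7:]))    # level 3: bottom rows
        level2 = state_index.setdefault(k1, {})
        level3 = level2.setdefault(k2, {})
        if k3 not in level3:
            level3[k3] = Node(state)
        return level3[k3]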

Having introduced how we manage the search-tree data structure, we next discuss how we adapted the UCT tree-search algorithm to the Gomoku game and our efforts to make the original algorithm more efficient. Most of our adaptations and improvements focus on making the original tree policy and default policy fit Gomoku better and generate better decisions within a limited time budget.

3.4.1 Tree Policy with Advice

In the original version of UCT illustrated by Browne et al. [2], the tree policy tries to explore all possible actions (moves, in a board game) when a node is not fully expanded, and its expansion step simply picks the next node uniformly at random among the nodes that have not been explored. This may work fine with a narrow range of next actions and a huge time budget, but that is not the case in our Gomoku game. To make our tree policy more efficient, we introduced a coach system and a piece-influence map.

First, the coach system: when there are moves that guarantee a win if the player takes them, or a loss if the player does not, the search should focus on those moves. Second, when the coach offers no suggestions, the moves near the locations of the current pieces should be preferred, since in Gomoku a piece placed far away from all other pieces makes very little short-term contribution to the game. To capture this, we give every black and white piece the same initial weight of 1.0, so the whole board forms an 11 x 11 matrix, and we convolve this matrix with a 5 x 5 Gaussian-like kernel. Setting the values of already-occupied locations to zero then gives the probability distribution over the next expansion choice. The two images below show the exploration probability distribution for the next move: dark blue means the probability of exploring that move is nearly zero, and the yellow blocks mean the probability of exploring that location is higher.

Fig. 3.3 From game state to explore probability
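A minimal numpy sketch of this influence map, assuming a 0 = empty board encoding and a hand-rolled convolution (the kernel width and sigma are example values, not our tuned constants):

    import numpy as np

    def gaussian_kernel(size=5, sigma=1.0):
        """Build a normalized 5 x 5 Gaussian-like kernel."""
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
        return k / k.sum()

    def explore_prob_map(board):
        """Turn an 11 x 11 board (0 = empty, nonzero = occupied) into an exploration distribution."""
        weight = (np.asarray(board) != 0).astype(float)  # every piece gets weight 1.0
        kernel = gaussian_kernel()
        pad = kernel.shape[0] // 2
        padded = np.pad(weight, pad)
        influence = np.zeros_like(weight)
        for r in range(weight.shape[0]):                 # plain 2-D convolution over the board
            for c in range(weight.shape[1]):
                window = padded[r:r + kernel.shape[0], c:c + kernel.shape[1]]
                influence[r, c] = (window * kernel).sum()
        influence[weight > 0] = 0.0                      # occupied cells can never be expanded
        total = influence.sum()
        return influence / total if total > 0 else influence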

In our final version of the tree policy we kept both of these ideas. Since the suggestions generated by SimpleAI are usually smarter than moves randomly picked from the probability-distribution map, we give SimpleAI higher priority. Our pseudocode is shown in Fig. 3.4, where \ denotes set difference, RandMove(A) returns a uniformly random pick from A, and RandMoveProbMap(A, M) returns a random pick from A with probabilities given by the distribution map M. One more thing worth mentioning: some of the moves on the probability map M may already have been explored, and we are not interested in expanding them again, so we set the probability of those explored moves to zero and renormalize M before picking the next move.

function TreePolicy(v)
    while v is nonterminal do
        suggestions <- suggestions generated by SimpleAI
        if suggestions is not empty do
            unexplored <- suggestions \ v.children
            if unexplored is not empty do
                v' <- RandMove(unexplored)
                return ExpandNode(v, v')
        else
            prob_map <- GenerateProbMap(v.state)
            possible_moves <- moves with non-zero probability on prob_map
            unexplored <- possible_moves \ v.children
            if unexplored is not empty do
                v' <- RandMoveProbMap(unexplored, prob_map)
                return ExpandNode(v, v')
        v <- BestChild(v, Cp)
    return v

function ExpandNode(v, v')
    add the new child v' to v
    return v'

Fig. 3.4 Tree Policy Pseudocode

3.4.2 Smarter Default Policy

In the original algorithm, the default policy simply takes random actions from the possible action set. This is not a good fit for our game, because the state space is huge while the space of terminal states is relatively small. We therefore introduced our Gomoku SimpleAI, which generates moves smarter than randomly picked ones. This yields better results because in a real Gomoku game both players are trying to win; the moves generated by SimpleAI are more player-like, so the estimated reward values are more accurate.

This method has a flaw, however. In a Gomoku game there are multiple ways to attack: some are defendable while others are not. Fig. 3.4.a shows a situation that is easy to defend if White places a piece at (2, 6) or (6, 2), while in Fig. 3.4.b White will lose no matter what it does if Black keeps attacking 3 -> 4 and then 4 -> 5.

Fig. 3.4.a Defendable attack (Black)    Fig. 3.4.b Undefendable attack (Black)

Although the second attack is much stronger, a defendable attacking move can still be a good move, since the opponent may fail to detect it and fall into the trap. But SimpleAI is smart enough to detect and defend every defendable attack, so under the default policy the reward of a defendable attack drops to zero. A smarter opponent makes our AI less aggressive and less willing to attack, which is not what we want. Our final method therefore makes it possible for both sides to make mistakes, and it is very easy to implement: for each move of the default policy, we give both sides some probability of making a random choice instead of using SimpleAI. After this adjustment we found that the reward of defendable attacking moves in the search tree becomes higher; since it is still lower than that of undefendable attacking moves, the undefendable attacks keep higher priority and are picked first.
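A minimal sketch of this mistake-prone playout (simple_ai_move, random_move, the state API, and the 0.2 mistake probability are illustrative assumptions, not our exact code or tuned constant):

    import random

    MISTAKE_PROB = 0.2  # illustrative value: chance a side ignores SimpleAI and plays randomly

    def default_policy(state):
        """Play a game out from `state`, letting both sides occasionally blunder."""
        while not state.is_terminal():
            if random.random() < MISTAKE_PROB:
                move = random_move(state)     # hypothetical helper: uniform pick among legal moves
            else:
                move = simple_ai_move(state)  # hypothetical helper: rule-based SimpleAI move
            state = state.apply(move)
        return state.reward()                 # +1 win, -1 loss, 0 tie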

3.4.3 Backward Propagation with Decay

One more difference between our algorithm and the original one is that we added decay to the backward propagation (the BACKUP function). Every time the default policy reaches a terminal state, the reward is propagated back to all of the node's ancestors; however, instead of passing the reward value on directly, we multiply it by a constant decay factor (< 1.0; we chose 0.85) at each level. We added this decay because in our game a short-term reward is always more promising than a long-term one. Suppose you are a player who really wants to win and you are offered two moves: the first has an 80% probability of winning within three rounds, and the other has an 80% probability of winning within ten rounds. A rational player usually chooses the first. Moreover, in an adversarial game the long-term future is hard to predict with high accuracy. Hence this rule.
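A minimal sketch of this decayed backup, written for a single-parent chain for clarity (our actual tree allows multiple parents, and per-player sign handling of the reward is omitted here):

    DECAY = 0.85  # the constant decay factor we chose

    def backup(node, reward):
        """BACKUP with decay: propagate a playout reward to all ancestors, shrinking it per level."""
        while node is not None:
            node.n += 1          # one more simulation passed through this node
            node.q += reward     # accumulate the (already decayed) reward
            reward *= DECAY      # shrink the reward before passing it one level up
            node = node.parent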

4. Test Results

To demonstrate the effectiveness of our algorithm we ran the following experiments: 1. AI against another AI with different computational limits; 2. AI against different human players.

4.1 Experiment between Two Gomoku AIs

In this experiment we let two Gomoku AIs, player1 and player2, play against each other. We ran five groups of experiments with five different time bases (0.2 s, 0.4 s, 0.8 s, 1.6 s, and 3.2 s), where player1 uses the time base as its computational budget and player2 uses double player1's budget. In each group the players take turns using the black and white pieces; for example, the first group consists of 400 games in total, with player1 playing White for 200 games and Black for 200 games, to ensure the move order does not influence the result (under Gomoku rules, the black side always moves first). Our results are shown in Table 4.1 and Figs. 4.1 and 4.2.

Table 4.1 Winning rates of alternating play under different computational limits
(column headers: n games | player1 budget (ms) | player2 budget (ms) | player1 win rate (%) | player2 win rate (%) | draw rate (%); the numeric entries were not recoverable)

Fig. 4.1 Computational limit vs. player win rate
Fig. 4.2 Computational limit vs. draw rate

One clear observation from Table 4.1 and Figs. 4.1 and 4.2 is that the draw rate drops dramatically as the base run time increases. With a larger computational budget, the tree search has more time to explore deeper nodes, which represent moves further into the game's future. The algorithm therefore finds smarter moves, and these moves create traps that are harder for the opponent to detect and defend against. As a result the game tends to end in fewer steps, and there are fewer draws (a draw being the case where every empty location is occupied and no one has won).

The second finding is that player2 always has a higher win rate than player1. This matches our expectation, since with a larger computational budget the online UCT algorithm has more time to reach more precise results. However, as the base time increases, the difference between the win rates actually decreases. Two reasons may explain this. First, although the reward and its UCB and LCB are continuous, the final output of the online UCT algorithm is always a single location on the board, which is a discrete value. With a larger computational budget these estimates (reward, UCB, and LCB) become more precise, but there may be a threshold beyond which the results are already precise enough to produce a rational discrete decision, so additional precision no longer improves the move much. Second, our SimpleAI and the exploration probability-distribution map help the online training converge faster, and in a learning-based system more training after convergence usually brings no obvious further improvement.

4.2 Experiment between Humans and the Gomoku AI

Finally, we let the AI play against Gomoku hobbyists and amateurs of different levels. The results in Table 4.2 show that our AI can sometimes beat human players; it can even sometimes win against the best Gomoku hobbyist on our team.

Table 4.2 Results of human players against the AI
(column headers: human player | level | AI time budget (s) | number of games | human wins | AI wins | AI winning rate (%); players A and B were intermediate level, C high level, and D expert level; the numeric entries were not recoverable)

From the results in Table 4.2 we conclude that our Gomoku AI does have the ability to beat human players at times. Under some circumstances it generates moves good enough to genuinely surprise its human opponent. We believe our UCT Gomoku algorithm has the potential to beat general human players consistently, and we are all willing to keep refining it to make our Gomoku AI better in the future.

5. References

[1] L. V. Allis. Searching for Solutions in Games and Artificial Intelligence. Ponsen & Looijen, 1994.
[2] C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1-43, 2012.
[3] R. Coulom. Efficient selectivity and backup operators in Monte-Carlo tree search. In International Conference on Computers and Games, pages 72-83. Springer, 2006.
[4] D. E. Knuth and R. W. Moore. An analysis of alpha-beta pruning. Artificial Intelligence, 6(4):293-326, 1975.
[5] L. Kocsis and C. Szepesvári. Bandit based Monte-Carlo planning. In ECML, volume 6, pages 282-293. Springer, 2006.
[6] L. Kocsis, C. Szepesvári, and J. Willemson. Improved Monte-Carlo search. Univ. Tartu, Estonia, Tech. Rep. 1, 2006.
[7] J. Schaeffer. The history heuristic and alpha-beta search enhancements in practice. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(11):1203-1212, 1989.
[8] J. Wágner and I. Virág. Solving Renju. ICGA Journal, 24(1):30-35, 2001.
[9] D. Silver. Reinforcement Learning and Simulation-Based Search in Computer Go. ProQuest Dissertations Publishing, 2009.
[10] S. Gelly and D. Silver. Combining online and offline knowledge in UCT. In Proceedings of the 24th International Conference on Machine Learning. ACM, 2007.
[11] D. Silver et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484-489, 2016.
[12] D. Silver et al. Mastering the game of Go without human knowledge. Nature, 550(7676):354-359, 2017.
[13] Wikipedia. Gomoku - Wikipedia, the free encyclopedia. [Online; accessed 01-Dec-2017]
[14] Wikipedia. Q-learning - Wikipedia, the free encyclopedia. [Online; accessed 03-Dec-2017]
[15] Wikipedia. Monte Carlo tree search - Wikipedia, the free encyclopedia. [Online; accessed 05-Dec-2017]
