Monte Carlo Tree Search Method for AI Games

1 Tejaswini Patil, 2 Kalyani Amrutkar, 3 Dr. P. K. Deshmukh
1,2 Pune University, JSPM, Rajashri Shahu College of Engineering, Tathawade, Pune
3 JSPM, Rajashri Shahu College of Engineering, Tathawade, Pune

Abstract: Older approaches to game AI require either a great deal of domain knowledge or a long time to generate effective AI behaviour. Both characteristics hamper the goal of building challenging game AI. In this paper, we put forward Monte-Carlo Tree Search (MCTS) as a general framework for game AI, in which randomized explorations of the search space are used to predict the most promising game actions. We demonstrate that MCTS can be applied effectively to (a) classic board games, (b) modern board games, and (c) video games. The algorithms required to implement these games are also explained, including variants that handle different kinds of information, and a few games are described to illustrate the algorithms in action.

Keywords: Artificial Intelligence, Monte Carlo methods, Game AI

1. INTRODUCTION

When implementing AI for computer games, the most important component is the evaluation function that estimates the quality of a game state. The classic approach is to use heuristic domain knowledge to establish such estimates. However, building an adequate evaluation function based on heuristic knowledge for a non-terminal game state is a domain-dependent and complex task. This is probably one of the main reasons why game AI in complex game environments did not reach a strong level for a long time, despite intensive research and the additional use of knowledge-based methods. In the last few years, several Monte-Carlo based techniques have emerged in the field of computer games and have already been applied successfully to many games.
Monte-Carlo Tree Search (MCTS), a Monte-Carlo based technique first established in 2006, is implemented in top-rated GO programs, which have defeated strong human players on the 9x9 board. However, the technique is not specific to GO or to classical board games: it generalizes easily to modern board games and video games, and its implementation is quite straightforward. In the proposed demonstration, we illustrate that MCTS can be applied effectively to (1) classic board games (such as GO), (2) modern board games (such as SETTLERS OF CATAN), and (3) video games (such as the SPRING RTS game). In this paper, we study several variants of MCTS for games of imperfect information. Determinization is a popular approach to such games, but it has several shortcomings. Two of these are strategy fusion (assuming the ability to make different decisions from different future states in the same information set) and nonlocality (ignoring the ability of other players to direct play towards some states in an information set and away from others). For MCTS, a third shortcoming is that the computational budget must be shared between searching several independent trees rather than devoting all iterations to exploring a single tree deeply. Some techniques required to solve particular problems in the MCTS algorithm are not discussed in this paper.

2. REVIEW BACKGROUND

2.1 Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) methods have gained popularity in recent years due to success in domains such as Computer Go. In particular, the upper confidence bound for trees (UCT) algorithm forms the basis of successful MCTS applications across a wide variety of domains. Many of the domains in which MCTS has proved successful, including Go, were considered challenging for traditional AI techniques (such as minimax search with pruning), particularly due to the difficulty of forecasting the winner from a nonterminal game state.
MCTS has several strengths. It requires little domain knowledge, although including domain knowledge can be beneficial. It is an anytime algorithm, able to produce good results with as much or as little computational time as is available. It also lends itself to parallel execution. In this paper we investigate the application of MCTS to games of imperfect information. In particular, we consider games with three different types of imperfect information. Information sets are collections of states, which appear in a game when one player has knowledge about the state that another player does not. For example, in a card game each player hides his own cards from his opponents; here an information set contains all states corresponding to all possible permutations of the opponents' cards. A player knows which information set he is in, but not which state within that information set. Volume 2, Issue 2, March-April 2013, Page 405

Partially observable moves appear in games where a player performs an action but some detail of the action is hidden from an opponent. Simultaneous moves arise when multiple players reveal decisions simultaneously without knowing what decision the other players have made; the effect of the decisions is resolved simultaneously. The well-known game of Rock-Paper-Scissors is an example of this. Monte-Carlo Tree Search (MCTS) is a class of game tree search algorithms that have recently proven successful for deterministic games of perfect information, particularly the game of Go. Determinization is an AI technique for making decisions in games of imperfect information by analyzing several instances of the equivalent deterministic game of perfect information. In this paper we combine determinization techniques with MCTS for the popular Chinese card game Dou Di Zhu, the board game Lord of the Rings, and the phantom (4,4,4) game. In determinized MCTS, there is a trade-off between the number of determinizations searched and the time spent searching each one; however, we show that this trade-off does not significantly affect the performance of determinized MCTS, as long as both quantities are sufficiently large. MCTS algorithms build a subtree of the entire decision tree, where usually one new node is added after every simulation. Each node stores estimates of the rewards obtained by selecting each action, and an improved estimate is available after every simulation step. Each decision in the tree is treated as a multi-armed bandit problem where the arms are actions and the rewards are the results of performing a Monte Carlo simulation after selecting that action. MCTS is an anytime algorithm, requiring little domain knowledge.

2.2 Structure of MCTS

The structure of an MCTS algorithm is generally quite similar across variants: a discrete number of iterations is performed, after which an action from the root node is selected according to statistics collected about each action.
Monte-Carlo Tree Search (MCTS), as shown in Fig. 1, is a best-first search technique which uses stochastic simulations. MCTS can be applied to any game of finite length. Its basis is the simulation of games where both the AI-controlled player and its opponents play random, or better, pseudo-random moves. From a single random game (where every player selects his actions randomly), very little can be learnt. But from simulating a multitude of random games, a good strategy can be inferred [1]. Each iteration performs four operations on the sub-tree built by the algorithm: Selection, Expansion, Simulation and Backpropagation.

Selection: While the state is found in the tree, the next action is chosen according to the statistics stored, in a way that balances exploitation and exploration. On the one hand, the task is often to select the game action that has led to the best results so far (exploitation). On the other hand, less promising actions still have to be explored, due to the uncertainty of the evaluation (exploration).

Expansion: When the game reaches a state that cannot be found in the tree, that state is added as a new node. This way, the tree is expanded by one node for each simulated game.

Simulation: For the rest of the game, actions are selected at random until the end of the game. Naturally, the weighting of the action selection probabilities has a significant effect: if all actions are selected with equal probability, the strategy played is often weak, and the level of the Monte-Carlo program is suboptimal. We can use heuristic knowledge to give larger weights to actions that look more promising.

Backpropagation: After reaching the end of the simulated game, we update each tree node that was traversed during that game. The visit counts are increased and the win/loss ratio is modified according to the outcome.
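The four phases above can be sketched in a few dozen lines of Python. The game and all class/function names below are our own illustration (a toy Nim game, where players alternately take 1-3 stones and taking the last stone wins), not code from the paper; selection uses the standard UCB1 rule.

```python
import math, random

class NimState:
    """Toy game: players alternately take 1-3 stones; taking the last stone wins."""
    def __init__(self, stones, player=1):
        self.stones, self.player = stones, player
    def legal_moves(self):
        return list(range(1, min(3, self.stones) + 1))
    def apply(self, move):
        return NimState(self.stones - move, -self.player)
    def is_terminal(self):
        return self.stones == 0
    def winner(self):
        # The previous mover took the last stone and won.
        return -self.player

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.untried = [], state.legal_moves()
        self.visits, self.wins = 0, 0.0
    def ucb1(self, c=1.4):
        # Exploitation (mean reward) plus exploration bonus.
        return (self.wins / self.visits +
                c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts_search(root_state, iters=1000, seed=None):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. Selection: descend with UCB1 while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one new child node per iteration.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            node.children.append(Node(node.state.apply(move), node, move))
            node = node.children[-1]
        # 3. Simulation: random playout to the end of the game.
        state = node.state
        while not state.is_terminal():
            state = state.apply(rng.choice(state.legal_moves()))
        winner = state.winner()
        # 4. Backpropagation: credit a win to the player who moved into each node.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.state.player:
                node.wins += 1
            node = node.parent
    # Final decision: the most-visited child of the root.
    return max(root.children, key=lambda n: n.visits).move
```

With enough iterations the most-visited child converges on the strongest move; for instance, with three stones left the search settles on taking all three.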
The game action finally executed by the program in the actual game is the one corresponding to the child which was explored most often.

Figure 1: Basic algorithm of MCTS

2.3 Node Selection

In this section we describe the node selection process used in the basic algorithm explained above. Node selection during tree descent is achieved by choosing the node that maximises some quantity, analogous to the multi-armed bandit problem in which a player must choose the slot machine that maximises the estimated reward each turn. An Upper Confidence Bounds (UCB) formula is typically used for such node selection [3]. The UCB formula used is

UCB1 = vi + C * sqrt(ln N / ni)

where vi is the estimated value of the node, ni is the number of times the node has been visited and N is the total number of times its parent has been visited. C is a tunable bias parameter. The UCB formula balances the exploitation of known rewards with the exploration of relatively unvisited nodes. Reward estimates are based on random simulations, so nodes must be visited a number of times before these estimates become reliable; MCTS estimates will typically be unreliable at the start of a search but converge to more reliable estimates given sufficient time, and to perfect estimates given infinite time.

2.4 Benefits of MCTS

MCTS offers several benefits over traditional tree search algorithms. A few of them are listed below:
1. Aheuristic: MCTS does not require any strategic or tactical knowledge about the given domain to make reasonable decisions. The algorithm can function effectively with no knowledge of a game apart from its legal moves and end conditions; this means that a single MCTS implementation can be reused for a number of games with little modification, and makes MCTS a potential boon for general game playing.
2. Asymmetric: MCTS performs asymmetric tree growth that adapts to the topology of the search space. The algorithm visits more interesting nodes more often, and focuses its search time on more relevant parts of the tree. This makes MCTS suitable for games with large branching factors such as 19x19 Go. Such large combinatorial spaces typically cause problems for standard depth- or breadth-based search methods, but the adaptive nature of MCTS means that it will (eventually) find those moves that appear optimal and focus its search effort there.
3. Anytime: The algorithm can be halted at any time to return the current best estimate. The search tree built thus far may be discarded or preserved for future reuse.
4. Elegant: The algorithm is simple to implement.

2.5 Drawbacks of MCTS Algorithms

MCTS has a few drawbacks, which can be major:
1. Playing Strength: The MCTS algorithm, in its basic form, can fail to find reasonable moves even for games of medium complexity within a reasonable amount of time. This is mostly due to the sheer size of the combinatorial move space and the fact that key nodes may not be visited enough times to give reliable estimates.
2.
Speed: MCTS search can take many iterations to converge to a good solution, which can be an issue for more general applications that are difficult to optimize. For example, the best Go implementations can require millions of playouts, in conjunction with domain-specific optimizations and enhancements, to make expert moves, whereas the best GGP implementations may only make tens of (domain-independent) playouts per second for more complex games. For reasonable move times, such GGPs may barely have time to visit each legal move, and it is unlikely that significant search will occur.

2.6 Improvements over the drawbacks

Many MCTS enhancements have been suggested to date. These can generally be described as either domain-knowledge or domain-independent enhancements.
1. Domain Knowledge: Domain knowledge specific to the current game can be exploited in the tree to filter out implausible moves, or in the simulations to produce heavy playouts that are more similar to playouts that would occur between human opponents. This means that playout results will be more realistic than random simulations, and nodes will require fewer iterations to yield realistic reward values. Domain knowledge can yield significant improvements, at the expense of speed and loss of generality.
2. Domain Independent: Domain-independent enhancements apply to all problem domains. These are typically applied in the tree (e.g. AMAF), although again some apply to the simulations (e.g. preferring winning moves during playouts). Domain-independent enhancements do not tie the implementation to a particular domain, maintaining generality, and are hence the focus of most current work in the area.

3. DESCRIPTION OF ALGORITHMS

We study two types of algorithms in this paper which elaborate on MCTS.

A.
Single Observer Information Set Monte Carlo Tree Search (SO-ISMCTS)

To overcome the problems associated with other MCTS-based approaches used in game AI, we propose searching a single tree whose nodes correspond to information sets rather than states. In SO-ISMCTS, nodes in the tree correspond to information sets from the root player's point of view, and edges correspond to actions [8] (i.e., moves from the point of view of the player who plays them). The correspondence between nodes and information sets is not one-to-one: partially observable opponent moves that are indistinguishable to the root player have separate edges in the tree, and thus the resulting information set may be represented by several nodes in the tree.

Figure 2: Single Observer Information Set Monte Carlo Tree Search

Figure 2 shows a game tree for a simple single-player game of imperfect information. The root information set contains two states: x and y. The player first selects one of two actions: a1 or a2. Selecting a2 yields an immediate reward of +0.5 and ends the game. If the player instead selects a1, he must then select an action a3 or a4. If the game began in state x, then a3 and a4 lead to rewards of -1 and +1, respectively. If the game began in state y, then the rewards are interchanged.

If states x and y are equally likely, action a1 has an expectimax value of 0: upon choosing a1, both a3 and a4 have an expectimax value of 0. Thus, the optimal action from the root is a2. However, a determinizing player searches trees corresponding to each state x and y individually and assigns a minimax value of +1 in each (by assuming that the correct choice of a3 or a4 can always be made), and thus wrongly believes a1 to be optimal. High-level pseudocode for SO-ISMCTS is given below:
1. Create a single tree whose root corresponds to the root information set.
2. Repeat the following for n iterations:
// Selection
3. Select a node and traverse the tree until reaching an action that yields some points or rewards.
// Expansion
4. If the selected node is not a terminal node, add a new node and expand the tree.
// Simulation
5. From the last node, run a simulation to the end of the game using a determinization algorithm.
// Backpropagation
6. Return the action from the root node whose corresponding child node has the greatest number of visits.

B. Multiple Observer Information Set Monte Carlo Tree Search (MO-ISMCTS)

SO-ISMCTS+POM solves the strategy fusion problem of SO-ISMCTS, at the expense of significantly weakening the opponent model: in particular, it is assumed that the opponent chooses randomly between actions that are indistinguishable to the root player. In the extreme case, when SO-ISMCTS+POM is applied to a phantom game, all opponent actions are indistinguishable and so the opponent model is essentially random. To address this, we propose multiple-observer information set MCTS (MO-ISMCTS). This algorithm maintains a separate tree for each player, whose nodes correspond to that player's information sets and whose edges correspond to moves from that player's point of view. Each iteration of the algorithm descends all of the trees simultaneously.
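The expectimax discrepancy in the x/y example above can be checked directly. The sketch below is our own encoding of the rewards, not code from the paper: it computes the information-set value (where one action must serve every state in the set) and the per-determinization minimax values that mislead a determinizing player.

```python
# Rewards for the second decision (after a1), per hidden state, as in Figure 2.
REWARDS = {"x": {"a3": -1.0, "a4": +1.0},
           "y": {"a3": +1.0, "a4": -1.0}}

def info_set_value():
    """Expectimax value when one action must be chosen for the whole
    information set {x, y}, each state being equally likely."""
    a1_value = max(sum(REWARDS[s][a] for s in ("x", "y")) / 2
                   for a in ("a3", "a4"))
    return max(a1_value, 0.5)  # a2 always pays +0.5 and ends the game

def determinized_value(state):
    """Perfect-information (minimax) value when the hidden state is
    assumed known, as a determinizing player does."""
    a1_value = max(REWARDS[state][a] for a in ("a3", "a4"))
    return max(a1_value, 0.5)
```

Here info_set_value() is 0.5 (choose a2), while each determinization reports +1 for a1: averaging the determinized values overstates a1, which is exactly the strategy-fusion error described above.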
Each selection step uses statistics in the tree belonging to the player about to act in the current determinization to select an action. Each tree is then descended by following the branch corresponding to the move obtained when the corresponding player observes the selected action, adding new branches if necessary. The information set trees can be seen as projections of the underlying game tree: each iteration induces a path through the game tree, which projects onto a path through each information set tree.

4. GAMES

A. Lord of the Rings

The Lord of the Rings is a board game with elements similar to Stratego, and it has several features which make it even more challenging from an AI perspective. It has hidden information, partially observable moves, and simultaneous moves, all of which make the decision-making process highly complex. The game also has an asymmetry between the two players, since they have different win conditions and different resources available to them, which necessitates different tactics and strategies. In this experiment, the following algorithms play in a round-robin tournament: determinized UCT, SO-ISMCTS, SO-ISMCTS+POM, and MO-ISMCTS. Each algorithm runs for a fixed number of iterations per decision. Determinized UCT uses ten determinizations with 1000 iterations each for the Dark player, and applies all iterations to a single determinization for the Light player. These values were chosen based on the results in Section V-B. Cheating ensemble UCT uses ten trees with 1000 iterations each for both Light and Dark; devoting all iterations to a single tree would be equivalent to cheating single-tree UCT.

B. Phantom (4,4,4) Games

Phantom (m,n,k) games are a generalization of games such as Noughts and Crosses and Renju, where players try to place k pieces in a row on an m x n grid. We investigate the phantom (4,4,4) game, in which players cannot see each other's pieces. This leads to a game with hidden information and partially observable moves.

5.
EXPERIMENTAL RESULTS FOR THE DOU DI ZHU GAME

A. Background

Dou Di Zhu is played among three people with one pack of cards, including the two jokers. The game starts with players bidding for the Landlord position. Those who lose the bid enter the game as the team competing against the Landlord. The objective of the game is to be the first player to have no cards left. The game was only played in a few regions of China until quite recently, when versions of the game on the Internet led to an increase in its popularity throughout the whole country. Today Dou Di Zhu is played by millions of people online, although almost exclusively in China. In addition, there have been several major Dou Di Zhu tournaments, including one in 2008 which attracted a large number of players.

B. Game Play

A shuffled pack of 54 cards is dealt to the three players. Each player is dealt 17 cards, with the last three leftover "kitty" cards set aside on the table, face down.

All players first review and appraise their own cards without showing them to the other players. Then, players take turns to bid for the Landlord position by telling the other players the risk stake they are willing to accept. There are three kinds of risk stakes, 1, 2, and 3, with 1 being the lowest and 3 the highest. Generally, the more confident a player is in the strength of his cards, the higher the risk stake he is willing to bid. In most online game rooms, the first bidder is chosen randomly by the system; in reality, players usually make up their own rules as to who gets to bid first. A player may accept the prior player's bid by passing on his turn to bid, or may try to outbid the prior player as long as the prior player did not bid 3. In other words, 1 can be outbid by 2 or 3; 2 can only be outbid by 3; and 3 cannot be outbid. The highest bidder takes the Landlord position, and the remaining players enter the Farmer team competing against the Landlord. The three leftover kitty cards are then revealed to all players before being dealt to the Landlord. The Landlord wins if he or she has no cards left; the Farmer team wins if either of the Farmers has no cards left.

C. Rules

Dou Di Zhu uses a standard 52-card deck with the addition of a black joker and a red joker. Suit is irrelevant, but the cards are ranked in ascending order. A bidding phase, which is not considered here, designates one of the players as the Landlord. The Landlord receives 20 cards dealt from a shuffled deck, while the other players receive 17 each. The goal of the game is to be the first to get rid of all cards in hand. If the Landlord wins, the other two players must each pay the stake to the Landlord. However, if either of the other two players wins, the Landlord pays the stake to both opponents. This means the two non-Landlord players must cooperate to beat the Landlord.
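The outbidding rule above is simple enough to state as a predicate. The helper below is our own hypothetical illustration, not code from the paper (None denotes that no bid has been made yet):

```python
def can_outbid(current_bid, new_bid):
    """Dou Di Zhu bidding: a new bid must be 1, 2 or 3 and strictly exceed
    the standing bid, so a bid of 3 can never be outbid."""
    if new_bid not in (1, 2, 3):
        raise ValueError("bids must be 1, 2 or 3")
    return current_bid is None or new_bid > current_bid
```

For example, can_outbid(1, 3) holds, while can_outbid(3, b) is false for every legal b.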
The non-Landlord players do not see each other's cards, so the game cannot be reduced to a two-player game with perfect recall. Card play takes place in a number of rounds until a player has no cards left. The Landlord begins the game by making a leading play, which can be any group of cards from his hand provided this group is a member of one of the legal move categories. The next player can play a group of cards from his hand provided this group is in the same category and has a higher rank than the group played by the previous player, or he may pass. A player who holds no compatible group has no choice but to pass. This continues until two players pass, at which point the next player may start a new round by making a new leading play of any category.

D. Implementation

The branching factor for leading plays is typically around 40, and for non-leading plays it is much smaller. However, in situations where moves with kickers are available, each combination of move and kicker must be considered as a separate move, leading to a combinatorial explosion in the branching factor for leading plays. It should be noted that this is a problem specific to Dou Di Zhu, caused by the game mechanic of being able to attach kicker cards to a play. To address this, an approach similar to the move grouping approach of Childs et al. is used: the player first chooses the base move and then the kicker, as two separate consecutive decision nodes in the tree. The overall win rate for determinized UCT was 43.6%, for ISMCTS it was 42.3%, and for cheating UCT it was 56.5%. These win rates are approximately the same as those we previously obtained. This is unsurprising: the 1000 deals we originally selected were chosen to be a good indicator of typical playing strength. Each deal was then put into one of three categories according to the difference in win rate between cheating UCT and determinized UCT.
If cheating UCT outperformed determinized UCT (with 95% significance), the deal was put into the first category. If determinized UCT outperformed cheating UCT (also with 95% significance), the deal was put into the second category.

6. THE OBJECTIVE: HANDLING UNCERTAINTY IN GAMES

The objective of the new algorithm is to handle uncertainty and hidden information in games. Below are two techniques which help to avoid the drawbacks of the current algorithm and achieve this objective.

A. Simultaneous Moves

Simultaneous moves are a special case of imperfect information, in which each player independently chooses an action and these actions are applied at the same time. Simultaneous moves can be modelled by having players choose their actions sequentially, while hiding their choices from the other players, until finally an environment action reveals the chosen actions and resolves their effects. With this in mind, any algorithm that can handle imperfect information in general can handle simultaneous moves in particular. However, some of our algorithms (particularly those not designed to handle partially observable moves) perform poorly under this model. Under a simple determinization approach, the first player is overly pessimistic (assuming the opponent can observe the chosen move and select the best response to it), while the second player is overly optimistic (assuming the first player's move is fixed at the point of the second player's decision, and thus determinizing it randomly). For this reason, we add a mechanism to the algorithms studied in this paper specifically to handle simultaneous moves. The UCT algorithm has been applied to the simultaneous move game Rock-Paper-Scissors, using an approach where each player's choice of action is treated
as a separate, independent multi-armed bandit problem. In other words, instead of selecting player 1's move, descending the corresponding tree branch, and selecting player 2's move from the resulting child node, both moves are selected independently from the same node, and the tree branch corresponding to the resulting pair of moves is descended.

B. Chance Nodes

Handling of chance events is not a primary focus of this paper. However, chance nodes do occur under certain circumstances in one of our test domains, so they cannot be ignored completely. Note that our chance nodes have a small number of possible outcomes (at most four, but rarely more than two), all with equal probability. Technically, another test domain includes a chance event with combinatorially many outcomes, corresponding to shuffling and dealing a deck of cards at the beginning of the game, but since this occurs before any player has made a decision it never occurs as a chance node in our search tree. Consider a chance node with several branches. To ensure that each branch is explored approximately equally, the first block of visits selects all outcomes in a random permutation, the second block of visits selects all outcomes in another random permutation, and so on. This is almost trivial to implement in UCT: since we already use UCB with random tie-breaking for action selection, it suffices to treat the environment player as a decision-making agent who has perfect information and receives a reward of zero for all terminal states. The UCB exploration term then ensures that the branches are visited in the manner described above.

7. FUTURE WORK

The ISMCTS family of algorithms has been demonstrated here for several domains. It is clear that an enhanced version of ISMCTS should yield better playing strength, especially for domains such as Dou Di Zhu, where there is a need for some mechanism to handle the large branching factor at opponent nodes.
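The random-permutation visiting scheme for chance nodes described in Section 6B can be sketched as a generator. This is our own illustration of the idea, not the authors' implementation:

```python
import random

def chance_outcome_scheduler(outcomes, rng=None):
    """Cycle through chance-node outcomes so that every block of
    len(outcomes) consecutive visits covers each outcome exactly once,
    using a fresh random permutation for each block."""
    rng = rng or random.Random()
    while True:
        block = list(outcomes)
        rng.shuffle(block)       # new random permutation per block
        yield from block
```

In practice the same visiting pattern falls out of UCB with random tie-breaking, as the text notes; the generator just makes the pattern explicit.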
It remains to establish the theoretical properties of these algorithms and their potential for converging to game-theoretic solutions. MO-ISMCTS is arguably the most theoretically defensible of the three ISMCTS algorithms, as it most accurately models the differences in information available to each player. A subject for future work is to conduct a full theoretical analysis of MO-ISMCTS and to investigate the situations under which it converges to an optimal policy. The SO-ISMCTS+POM algorithm currently assumes the opponent chooses indistinguishable moves at random, which is clearly incorrect as a decision model for the opponent. There is room for improvement in this aspect of the algorithm.

REFERENCES

[1] Determinization in Monte-Carlo Tree Search for the card game Dou Di Zhu [Online].
[2] Chang Liu and Andrew D. Tremblay, "Monte-Carlo Search Algorithm," March 28, 2011.
[3] Monte Carlo Tree Search [Online].
[4] Fight the Landlord (Dou Di Zhu) [Online].
[5] E. K. P. Chong, R. L. Givan, and H. S. Chang, "A framework for simulation-based network control via hindsight optimization," in Proc. IEEE Conf. Decision Control, Sydney, Australia, 2000.
[6] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed. Upper Saddle River, NJ: Prentice-Hall.
[7] Board Game Geek, Stratego, 2011 [Online]; P. I. Cowling, E. J. Powley, and D. Whitehouse, "Information Set Monte Carlo Tree Search," IEEE Trans. Comput. Intell. AI Games, vol. 4, no. 2, June 2012.
[8] Algorithms for abstracting and solving imperfect information games [Online].
[9] The Monte Carlo Method for Game AI, blog [Online].
[10] D. Whitehouse, E. J. Powley, and P. I. Cowling, "Determinization and information set Monte Carlo tree search for the card game Dou Di Zhu," in Proc. IEEE Conf. Comput. Intell. Games, Seoul, Korea, 2011.

AUTHORS

Dr. Pradeep K. Deshmukh, B.E., M.E. and Ph.D. in Computer Science and Engineering. His key research interests include cloud computing, network security, and ANN. He is currently working as a Professor in Rajarshi Shahu College of Engineering, Pune, India.

Mrs. Tejaswini Patil, B.E. Computers. She has over 6 years of industrial experience and is currently pursuing her M.E. from Rajarshi Shahu College of Engineering, Pune, India.

Mrs. Kalyani Amrutkar, B.E. Computers from Pune University. She is currently pursuing her M.E. from Pune University and working as a Lecturer in a reputed institute.


Monte Carlo Tree Search for games with Hidden Information and Uncertainty. Daniel Whitehouse PhD University of York Computer Science Monte Carlo Tree Search for games with Hidden Information and Uncertainty Daniel Whitehouse PhD University of York Computer Science August, 2014 Abstract Monte Carlo Tree Search (MCTS) is an AI technique

More information

A Bandit Approach for Tree Search

A Bandit Approach for Tree Search A An Example in Computer-Go Department of Statistics, University of Michigan March 27th, 2008 A 1 Bandit Problem K-Armed Bandit UCB Algorithms for K-Armed Bandit Problem 2 Classical Tree Search UCT Algorithm

More information

Monte Carlo Tree Search. Simon M. Lucas

Monte Carlo Tree Search. Simon M. Lucas Monte Carlo Tree Search Simon M. Lucas Outline MCTS: The Excitement! A tutorial: how it works Important heuristics: RAVE / AMAF Applications to video games and real-time control The Excitement Game playing

More information

Adversarial Search. CS 486/686: Introduction to Artificial Intelligence

Adversarial Search. CS 486/686: Introduction to Artificial Intelligence Adversarial Search CS 486/686: Introduction to Artificial Intelligence 1 AccessAbility Services Volunteer Notetaker Required Interested? Complete an online application using your WATIAM: https://york.accessiblelearning.com/uwaterloo/

More information

By David Anderson SZTAKI (Budapest, Hungary) WPI D2009

By David Anderson SZTAKI (Budapest, Hungary) WPI D2009 By David Anderson SZTAKI (Budapest, Hungary) WPI D2009 1997, Deep Blue won against Kasparov Average workstation can defeat best Chess players Computer Chess no longer interesting Go is much harder for

More information

Set 4: Game-Playing. ICS 271 Fall 2017 Kalev Kask

Set 4: Game-Playing. ICS 271 Fall 2017 Kalev Kask Set 4: Game-Playing ICS 271 Fall 2017 Kalev Kask Overview Computer programs that play 2-player games game-playing as search with the complication of an opponent General principles of game-playing and search

More information

43.1 Introduction. Foundations of Artificial Intelligence Introduction Monte-Carlo Methods Monte-Carlo Tree Search. 43.

43.1 Introduction. Foundations of Artificial Intelligence Introduction Monte-Carlo Methods Monte-Carlo Tree Search. 43. May 6, 20 3. : Introduction 3. : Introduction Malte Helmert University of Basel May 6, 20 3. Introduction 3.2 3.3 3. Summary May 6, 20 / 27 May 6, 20 2 / 27 Board Games: Overview 3. : Introduction Introduction

More information

Outline. Game Playing. Game Problems. Game Problems. Types of games Playing a perfect game. Playing an imperfect game

Outline. Game Playing. Game Problems. Game Problems. Types of games Playing a perfect game. Playing an imperfect game Outline Game Playing ECE457 Applied Artificial Intelligence Fall 2007 Lecture #5 Types of games Playing a perfect game Minimax search Alpha-beta pruning Playing an imperfect game Real-time Imperfect information

More information

BLUFF WITH AI. CS297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University. In Partial Fulfillment

BLUFF WITH AI. CS297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University. In Partial Fulfillment BLUFF WITH AI CS297 Report Presented to Dr. Chris Pollett Department of Computer Science San Jose State University In Partial Fulfillment Of the Requirements for the Class CS 297 By Tina Philip May 2017

More information

Game-playing: DeepBlue and AlphaGo

Game-playing: DeepBlue and AlphaGo Game-playing: DeepBlue and AlphaGo Brief history of gameplaying frontiers 1990s: Othello world champions refuse to play computers 1994: Chinook defeats Checkers world champion 1997: DeepBlue defeats world

More information

Monte Carlo Tree Search

Monte Carlo Tree Search Monte Carlo Tree Search 1 By the end, you will know Why we use Monte Carlo Search Trees The pros and cons of MCTS How it is applied to Super Mario Brothers and Alpha Go 2 Outline I. Pre-MCTS Algorithms

More information

TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS. Thomas Keller and Malte Helmert Presented by: Ryan Berryhill

TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS. Thomas Keller and Malte Helmert Presented by: Ryan Berryhill TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS Thomas Keller and Malte Helmert Presented by: Ryan Berryhill Outline Motivation Background THTS framework THTS algorithms Results Motivation Advances

More information

Using Fictitious Play to Find Pseudo-Optimal Solutions for Full-Scale Poker

Using Fictitious Play to Find Pseudo-Optimal Solutions for Full-Scale Poker Using Fictitious Play to Find Pseudo-Optimal Solutions for Full-Scale Poker William Dudziak Department of Computer Science, University of Akron Akron, Ohio 44325-4003 Abstract A pseudo-optimal solution

More information

SCRABBLE ARTIFICIAL INTELLIGENCE GAME. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University

SCRABBLE ARTIFICIAL INTELLIGENCE GAME. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University SCRABBLE AI GAME 1 SCRABBLE ARTIFICIAL INTELLIGENCE GAME CS 297 Report Presented to Dr. Chris Pollett Department of Computer Science San Jose State University In Partial Fulfillment Of the Requirements

More information

Small and large MCTS playouts applied to Chinese Dark Chess stochastic game

Small and large MCTS playouts applied to Chinese Dark Chess stochastic game Small and large MCTS playouts applied to Chinese Dark Chess stochastic game Nicolas Jouandeau 1 and Tristan Cazenave 2 1 LIASD, Université de Paris 8, France n@ai.univ-paris8.fr 2 LAMSADE, Université Paris-Dauphine,

More information

CS188: Artificial Intelligence, Fall 2011 Written 2: Games and MDP s

CS188: Artificial Intelligence, Fall 2011 Written 2: Games and MDP s CS88: Artificial Intelligence, Fall 20 Written 2: Games and MDP s Due: 0/5 submitted electronically by :59pm (no slip days) Policy: Can be solved in groups (acknowledge collaborators) but must be written

More information

Artificial Intelligence. Minimax and alpha-beta pruning

Artificial Intelligence. Minimax and alpha-beta pruning Artificial Intelligence Minimax and alpha-beta pruning In which we examine the problems that arise when we try to plan ahead to get the best result in a world that includes a hostile agent (other agent

More information

SEARCHING is both a method of solving problems and

SEARCHING is both a method of solving problems and 100 IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 3, NO. 2, JUNE 2011 Two-Stage Monte Carlo Tree Search for Connect6 Shi-Jim Yen, Member, IEEE, and Jung-Kuei Yang Abstract Recently,

More information

Unit-III Chap-II Adversarial Search. Created by: Ashish Shah 1

Unit-III Chap-II Adversarial Search. Created by: Ashish Shah 1 Unit-III Chap-II Adversarial Search Created by: Ashish Shah 1 Alpha beta Pruning In case of standard ALPHA BETA PRUNING minimax tree, it returns the same move as minimax would, but prunes away branches

More information

AI Plays Yun Nie (yunn), Wenqi Hou (wenqihou), Yicheng An (yicheng)

AI Plays Yun Nie (yunn), Wenqi Hou (wenqihou), Yicheng An (yicheng) AI Plays 2048 Yun Nie (yunn), Wenqi Hou (wenqihou), Yicheng An (yicheng) Abstract The strategy game 2048 gained great popularity quickly. Although it is easy to play, people cannot win the game easily,

More information

DeepStack: Expert-Level AI in Heads-Up No-Limit Poker. Surya Prakash Chembrolu

DeepStack: Expert-Level AI in Heads-Up No-Limit Poker. Surya Prakash Chembrolu DeepStack: Expert-Level AI in Heads-Up No-Limit Poker Surya Prakash Chembrolu AI and Games AlphaGo Go Watson Jeopardy! DeepBlue -Chess Chinook -Checkers TD-Gammon -Backgammon Perfect Information Games

More information

CS 387/680: GAME AI BOARD GAMES

CS 387/680: GAME AI BOARD GAMES CS 387/680: GAME AI BOARD GAMES 6/2/2014 Instructor: Santiago Ontañón santi@cs.drexel.edu TA: Alberto Uriarte office hours: Tuesday 4-6pm, Cyber Learning Center Class website: https://www.cs.drexel.edu/~santi/teaching/2014/cs387-680/intro.html

More information

Adversarial Search and Game- Playing C H A P T E R 6 C M P T : S P R I N G H A S S A N K H O S R A V I

Adversarial Search and Game- Playing C H A P T E R 6 C M P T : S P R I N G H A S S A N K H O S R A V I Adversarial Search and Game- Playing C H A P T E R 6 C M P T 3 1 0 : S P R I N G 2 0 1 1 H A S S A N K H O S R A V I Adversarial Search Examine the problems that arise when we try to plan ahead in a world

More information

CPS331 Lecture: Search in Games last revised 2/16/10

CPS331 Lecture: Search in Games last revised 2/16/10 CPS331 Lecture: Search in Games last revised 2/16/10 Objectives: 1. To introduce mini-max search 2. To introduce the use of static evaluation functions 3. To introduce alpha-beta pruning Materials: 1.

More information

CS188 Spring 2014 Section 3: Games

CS188 Spring 2014 Section 3: Games CS188 Spring 2014 Section 3: Games 1 Nearly Zero Sum Games The standard Minimax algorithm calculates worst-case values in a zero-sum two player game, i.e. a game in which for all terminal states s, the

More information

MONTE-CARLO TWIXT. Janik Steinhauer. Master Thesis 10-08

MONTE-CARLO TWIXT. Janik Steinhauer. Master Thesis 10-08 MONTE-CARLO TWIXT Janik Steinhauer Master Thesis 10-08 Thesis submitted in partial fulfilment of the requirements for the degree of Master of Science of Artificial Intelligence at the Faculty of Humanities

More information

The first topic I would like to explore is probabilistic reasoning with Bayesian

The first topic I would like to explore is probabilistic reasoning with Bayesian Michael Terry 16.412J/6.834J 2/16/05 Problem Set 1 A. Topics of Fascination The first topic I would like to explore is probabilistic reasoning with Bayesian nets. I see that reasoning under situations

More information

Advanced Game AI. Level 6 Search in Games. Prof Alexiei Dingli

Advanced Game AI. Level 6 Search in Games. Prof Alexiei Dingli Advanced Game AI Level 6 Search in Games Prof Alexiei Dingli MCTS? MCTS Based upon Selec=on Expansion Simula=on Back propaga=on Enhancements The Mul=- Armed Bandit Problem At each step pull one arm Noisy/random

More information

Monte-Carlo Tree Search for the Simultaneous Move Game Tron

Monte-Carlo Tree Search for the Simultaneous Move Game Tron Monte-Carlo Tree Search for the Simultaneous Move Game Tron N.G.P. Den Teuling June 27, 2011 Abstract Monte-Carlo Tree Search (MCTS) has been successfully applied to many games, particularly in Go. In

More information

Game-Playing & Adversarial Search

Game-Playing & Adversarial Search Game-Playing & Adversarial Search This lecture topic: Game-Playing & Adversarial Search (two lectures) Chapter 5.1-5.5 Next lecture topic: Constraint Satisfaction Problems (two lectures) Chapter 6.1-6.4,

More information

game tree complete all possible moves

game tree complete all possible moves Game Trees Game Tree A game tree is a tree the nodes of which are positions in a game and edges are moves. The complete game tree for a game is the game tree starting at the initial position and containing

More information

Game Tree Search. CSC384: Introduction to Artificial Intelligence. Generalizing Search Problem. General Games. What makes something a game?

Game Tree Search. CSC384: Introduction to Artificial Intelligence. Generalizing Search Problem. General Games. What makes something a game? CSC384: Introduction to Artificial Intelligence Generalizing Search Problem Game Tree Search Chapter 5.1, 5.2, 5.3, 5.6 cover some of the material we cover here. Section 5.6 has an interesting overview

More information

COMP219: Artificial Intelligence. Lecture 13: Game Playing

COMP219: Artificial Intelligence. Lecture 13: Game Playing CMP219: Artificial Intelligence Lecture 13: Game Playing 1 verview Last time Search with partial/no observations Belief states Incremental belief state search Determinism vs non-determinism Today We will

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Adversarial Search Vibhav Gogate The University of Texas at Dallas Some material courtesy of Rina Dechter, Alex Ihler and Stuart Russell, Luke Zettlemoyer, Dan Weld Adversarial

More information

Five-In-Row with Local Evaluation and Beam Search

Five-In-Row with Local Evaluation and Beam Search Five-In-Row with Local Evaluation and Beam Search Jiun-Hung Chen and Adrienne X. Wang jhchen@cs axwang@cs Abstract This report provides a brief overview of the game of five-in-row, also known as Go-Moku,

More information

Intuition Mini-Max 2

Intuition Mini-Max 2 Games Today Saying Deep Blue doesn t really think about chess is like saying an airplane doesn t really fly because it doesn t flap its wings. Drew McDermott I could feel I could smell a new kind of intelligence

More information

Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes

Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes Western Kentucky University TopSCHOLAR Honors College Capstone Experience/Thesis Projects Honors College at WKU 6-28-2017 Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes Jared Prince

More information

CS 771 Artificial Intelligence. Adversarial Search

CS 771 Artificial Intelligence. Adversarial Search CS 771 Artificial Intelligence Adversarial Search Typical assumptions Two agents whose actions alternate Utility values for each agent are the opposite of the other This creates the adversarial situation

More information

COMP219: COMP219: Artificial Intelligence Artificial Intelligence Dr. Annabel Latham Lecture 12: Game Playing Overview Games and Search

COMP219: COMP219: Artificial Intelligence Artificial Intelligence Dr. Annabel Latham Lecture 12: Game Playing Overview Games and Search COMP19: Artificial Intelligence COMP19: Artificial Intelligence Dr. Annabel Latham Room.05 Ashton Building Department of Computer Science University of Liverpool Lecture 1: Game Playing 1 Overview Last

More information

Implementation of Upper Confidence Bounds for Trees (UCT) on Gomoku

Implementation of Upper Confidence Bounds for Trees (UCT) on Gomoku Implementation of Upper Confidence Bounds for Trees (UCT) on Gomoku Guanlin Zhou (gz2250), Nan Yu (ny2263), Yanqing Dai (yd2369), Yingtao Zhong (yz3276) 1. Introduction: Reinforcement Learning for Gomoku

More information

5.4 Imperfect, Real-Time Decisions

5.4 Imperfect, Real-Time Decisions 5.4 Imperfect, Real-Time Decisions Searching through the whole (pruned) game tree is too inefficient for any realistic game Moves must be made in a reasonable amount of time One has to cut off the generation

More information

Learning from Hints: AI for Playing Threes

Learning from Hints: AI for Playing Threes Learning from Hints: AI for Playing Threes Hao Sheng (haosheng), Chen Guo (cguo2) December 17, 2016 1 Introduction The highly addictive stochastic puzzle game Threes by Sirvo LLC. is Apple Game of the

More information

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Tom Pepels Mark H.M. Winands Abstract In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man.

More information

Lecture 14. Questions? Friday, February 10 CS 430 Artificial Intelligence - Lecture 14 1

Lecture 14. Questions? Friday, February 10 CS 430 Artificial Intelligence - Lecture 14 1 Lecture 14 Questions? Friday, February 10 CS 430 Artificial Intelligence - Lecture 14 1 Outline Chapter 5 - Adversarial Search Alpha-Beta Pruning Imperfect Real-Time Decisions Stochastic Games Friday,

More information

Creating a Havannah Playing Agent

Creating a Havannah Playing Agent Creating a Havannah Playing Agent B. Joosten August 27, 2009 Abstract This paper delves into the complexities of Havannah, which is a 2-person zero-sum perfectinformation board game. After determining

More information

Ar#ficial)Intelligence!!

Ar#ficial)Intelligence!! Introduc*on! Ar#ficial)Intelligence!! Roman Barták Department of Theoretical Computer Science and Mathematical Logic So far we assumed a single-agent environment, but what if there are more agents and

More information

Lower Bounding Klondike Solitaire with Monte-Carlo Planning

Lower Bounding Klondike Solitaire with Monte-Carlo Planning Lower Bounding Klondike Solitaire with Monte-Carlo Planning Ronald Bjarnason and Alan Fern and Prasad Tadepalli {ronny, afern, tadepall}@eecs.oregonstate.edu Oregon State University Corvallis, OR, USA

More information

Playing Othello Using Monte Carlo

Playing Othello Using Monte Carlo June 22, 2007 Abstract This paper deals with the construction of an AI player to play the game Othello. A lot of techniques are already known to let AI players play the game Othello. Some of these techniques

More information

Information capture and reuse strategies in Monte Carlo Tree Search, with applications to games of hidden information

Information capture and reuse strategies in Monte Carlo Tree Search, with applications to games of hidden information Information capture and reuse strategies in Monte Carlo Tree Search, with applications to games of hidden information Edward J. Powley, Peter I. Cowling, Daniel Whitehouse Department of Computer Science,

More information

Artificial Intelligence Search III

Artificial Intelligence Search III Artificial Intelligence Search III Lecture 5 Content: Search III Quick Review on Lecture 4 Why Study Games? Game Playing as Search Special Characteristics of Game Playing Search Ingredients of 2-Person

More information

Creating a Poker Playing Program Using Evolutionary Computation

Creating a Poker Playing Program Using Evolutionary Computation Creating a Poker Playing Program Using Evolutionary Computation Simon Olsen and Rob LeGrand, Ph.D. Abstract Artificial intelligence is a rapidly expanding technology. We are surrounded by technology that

More information

46.1 Introduction. Foundations of Artificial Intelligence Introduction MCTS in AlphaGo Neural Networks. 46.

46.1 Introduction. Foundations of Artificial Intelligence Introduction MCTS in AlphaGo Neural Networks. 46. Foundations of Artificial Intelligence May 30, 2016 46. AlphaGo and Outlook Foundations of Artificial Intelligence 46. AlphaGo and Outlook Thomas Keller Universität Basel May 30, 2016 46.1 Introduction

More information

Towards Strategic Kriegspiel Play with Opponent Modeling

Towards Strategic Kriegspiel Play with Opponent Modeling Towards Strategic Kriegspiel Play with Opponent Modeling Antonio Del Giudice and Piotr Gmytrasiewicz Department of Computer Science, University of Illinois at Chicago Chicago, IL, 60607-7053, USA E-mail:

More information

This is a repository copy of Ensemble Determinization in Monte Carlo Tree Search for the Imperfect Information Card Game Magic: The Gathering.

This is a repository copy of Ensemble Determinization in Monte Carlo Tree Search for the Imperfect Information Card Game Magic: The Gathering. This is a repository copy of Ensemble Determinization in Monte Carlo Tree Search for the Imperfect Information Card Game Magic: The Gathering. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/75050/

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Paul Lewis for the degree of Master of Science in Computer Science presented on June 1, 2010. Title: Ensemble Monte-Carlo Planning: An Empirical Study Abstract approved: Alan

More information

CS221 Project Final Report Gomoku Game Agent

CS221 Project Final Report Gomoku Game Agent CS221 Project Final Report Gomoku Game Agent Qiao Tan qtan@stanford.edu Xiaoti Hu xiaotihu@stanford.edu 1 Introduction Gomoku, also know as five-in-a-row, is a strategy board game which is traditionally

More information

CS221 Final Project Report Learn to Play Texas hold em

CS221 Final Project Report Learn to Play Texas hold em CS221 Final Project Report Learn to Play Texas hold em Yixin Tang(yixint), Ruoyu Wang(rwang28), Chang Yue(changyue) 1 Introduction Texas hold em, one of the most popular poker games in casinos, is a variation

More information

ARTIFICIAL INTELLIGENCE (CS 370D)

ARTIFICIAL INTELLIGENCE (CS 370D) Princess Nora University Faculty of Computer & Information Systems ARTIFICIAL INTELLIGENCE (CS 370D) (CHAPTER-5) ADVERSARIAL SEARCH ADVERSARIAL SEARCH Optimal decisions Min algorithm α-β pruning Imperfect,

More information

2 person perfect information

2 person perfect information Why Study Games? Games offer: Intellectual Engagement Abstraction Representability Performance Measure Not all games are suitable for AI research. We will restrict ourselves to 2 person perfect information

More information

Adversarial Search Lecture 7

Adversarial Search Lecture 7 Lecture 7 How can we use search to plan ahead when other agents are planning against us? 1 Agenda Games: context, history Searching via Minimax Scaling α β pruning Depth-limiting Evaluation functions Handling

More information

Game Playing for a Variant of Mancala Board Game (Pallanguzhi)

Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Varsha Sankar (SUNet ID: svarsha) 1. INTRODUCTION Game playing is a very interesting area in the field of Artificial Intelligence presently.

More information

Optimal Rhode Island Hold em Poker

Optimal Rhode Island Hold em Poker Optimal Rhode Island Hold em Poker Andrew Gilpin and Tuomas Sandholm Computer Science Department Carnegie Mellon University Pittsburgh, PA 15213 {gilpin,sandholm}@cs.cmu.edu Abstract Rhode Island Hold

More information

Game Playing AI Class 8 Ch , 5.4.1, 5.5

Game Playing AI Class 8 Ch , 5.4.1, 5.5 Game Playing AI Class Ch. 5.-5., 5.4., 5.5 Bookkeeping HW Due 0/, :59pm Remaining CSP questions? Cynthia Matuszek CMSC 6 Based on slides by Marie desjardin, Francisco Iacobelli Today s Class Clear criteria

More information

6. Games. COMP9414/ 9814/ 3411: Artificial Intelligence. Outline. Mechanical Turk. Origins. origins. motivation. minimax search

6. Games. COMP9414/ 9814/ 3411: Artificial Intelligence. Outline. Mechanical Turk. Origins. origins. motivation. minimax search COMP9414/9814/3411 16s1 Games 1 COMP9414/ 9814/ 3411: Artificial Intelligence 6. Games Outline origins motivation Russell & Norvig, Chapter 5. minimax search resource limits and heuristic evaluation α-β

More information

Today. Types of Game. Games and Search 1/18/2010. COMP210: Artificial Intelligence. Lecture 10. Game playing

Today. Types of Game. Games and Search 1/18/2010. COMP210: Artificial Intelligence. Lecture 10. Game playing COMP10: Artificial Intelligence Lecture 10. Game playing Trevor Bench-Capon Room 15, Ashton Building Today We will look at how search can be applied to playing games Types of Games Perfect play minimax

More information

AI Approaches to Ultimate Tic-Tac-Toe

AI Approaches to Ultimate Tic-Tac-Toe AI Approaches to Ultimate Tic-Tac-Toe Eytan Lifshitz CS Department Hebrew University of Jerusalem, Israel David Tsurel CS Department Hebrew University of Jerusalem, Israel I. INTRODUCTION This report is

More information

Creating a New Angry Birds Competition Track

Creating a New Angry Birds Competition Track Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference Creating a New Angry Birds Competition Track Rohan Verma, Xiaoyu Ge, Jochen Renz Research School

More information

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Tom Pepels June 19, 2012 Abstract In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man.

More information

Drafting Territories in the Board Game Risk

Drafting Territories in the Board Game Risk Drafting Territories in the Board Game Risk Presenter: Richard Gibson Joint Work With: Neesha Desai and Richard Zhao AIIDE 2010 October 12, 2010 Outline Risk Drafting territories How to draft territories

More information

Available online at ScienceDirect. Procedia Computer Science 62 (2015 ) 31 38

Available online at  ScienceDirect. Procedia Computer Science 62 (2015 ) 31 38 Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 62 (2015 ) 31 38 The 2015 International Conference on Soft Computing and Software Engineering (SCSE 2015) Analysis of a

More information

Foundations of AI. 6. Adversarial Search. Search Strategies for Games, Games with Chance, State of the Art. Wolfram Burgard & Bernhard Nebel

Foundations of AI. 6. Adversarial Search. Search Strategies for Games, Games with Chance, State of the Art. Wolfram Burgard & Bernhard Nebel Foundations of AI 6. Adversarial Search Search Strategies for Games, Games with Chance, State of the Art Wolfram Burgard & Bernhard Nebel Contents Game Theory Board Games Minimax Search Alpha-Beta Search

More information

Texas Hold em Inference Bot Proposal. By: Brian Mihok & Michael Terry Date Due: Monday, April 11, 2005

Texas Hold em Inference Bot Proposal. By: Brian Mihok & Michael Terry Date Due: Monday, April 11, 2005 Texas Hold em Inference Bot Proposal By: Brian Mihok & Michael Terry Date Due: Monday, April 11, 2005 1 Introduction One of the key goals in Artificial Intelligence is to create cognitive systems that

More information

Automatic Game AI Design by the Use of UCT for Dead-End

Automatic Game AI Design by the Use of UCT for Dead-End Automatic Game AI Design by the Use of UCT for Dead-End Zhiyuan Shi, Yamin Wang, Suou He*, Junping Wang*, Jie Dong, Yuanwei Liu, Teng Jiang International School, School of Software Engineering* Beiing

More information

CSC384: Introduction to Artificial Intelligence. Game Tree Search

CSC384: Introduction to Artificial Intelligence. Game Tree Search CSC384: Introduction to Artificial Intelligence Game Tree Search Chapter 5.1, 5.2, 5.3, 5.6 cover some of the material we cover here. Section 5.6 has an interesting overview of State-of-the-Art game playing

More information

CS440/ECE448 Lecture 11: Stochastic Games, Stochastic Search, and Learned Evaluation Functions

CS440/ECE448 Lecture 11: Stochastic Games, Stochastic Search, and Learned Evaluation Functions CS440/ECE448 Lecture 11: Stochastic Games, Stochastic Search, and Learned Evaluation Functions Slides by Svetlana Lazebnik, 9/2016 Modified by Mark Hasegawa Johnson, 9/2017 Types of game environments Perfect

More information

CSE 573: Artificial Intelligence Autumn 2010

CSE 573: Artificial Intelligence Autumn 2010 CSE 573: Artificial Intelligence Autumn 2010 Lecture 4: Adversarial Search 10/12/2009 Luke Zettlemoyer Based on slides from Dan Klein Many slides over the course adapted from either Stuart Russell or Andrew

More information

AlphaGo and Artificial Intelligence GUEST LECTURE IN THE GAME OF GO AND SOCIETY

AlphaGo and Artificial Intelligence GUEST LECTURE IN THE GAME OF GO AND SOCIETY AlphaGo and Artificial Intelligence HUCK BENNET T (NORTHWESTERN UNIVERSITY) GUEST LECTURE IN THE GAME OF GO AND SOCIETY AT OCCIDENTAL COLLEGE, 10/29/2018 The Game of Go A game for aliens, presidents, and

More information

Emergent bluffing and inference with Monte Carlo Tree Search

Emergent bluffing and inference with Monte Carlo Tree Search Emergent bluffing and inference with Monte Carlo Tree Search Peter I. Cowling Department of Computer Science York Centre for Complex Systems Analysis University of York, UK Email: peter.cowling@york.ac.uk

More information

Last-Branch and Speculative Pruning Algorithms for Max"

Last-Branch and Speculative Pruning Algorithms for Max Last-Branch and Speculative Pruning Algorithms for Max" Nathan Sturtevant UCLA, Computer Science Department Los Angeles, CA 90024 nathanst@cs.ucla.edu Abstract Previous work in pruning algorithms for max"

More information

CS 229 Final Project: Using Reinforcement Learning to Play Othello

CS 229 Final Project: Using Reinforcement Learning to Play Othello CS 229 Final Project: Using Reinforcement Learning to Play Othello Kevin Fry Frank Zheng Xianming Li ID: kfry ID: fzheng ID: xmli 16 December 2016 Abstract We built an AI that learned to play Othello.

More information

Creating a Dominion AI Using Genetic Algorithms

Creating a Dominion AI Using Genetic Algorithms Creating a Dominion AI Using Genetic Algorithms Abstract Mok Ming Foong Dominion is a deck-building card game. It allows for complex strategies, has an aspect of randomness in card drawing, and no obvious

More information

CS 188: Artificial Intelligence

CS 188: Artificial Intelligence CS 188: Artificial Intelligence Adversarial Search Instructor: Stuart Russell University of California, Berkeley Game Playing State-of-the-Art Checkers: 1950: First computer player. 1959: Samuel s self-taught

More information

Playout Search for Monte-Carlo Tree Search in Multi-Player Games

Playout Search for Monte-Carlo Tree Search in Multi-Player Games Playout Search for Monte-Carlo Tree Search in Multi-Player Games J. (Pim) A.M. Nijssen and Mark H.M. Winands Games and AI Group, Department of Knowledge Engineering, Faculty of Humanities and Sciences,

More information

4. Games and search. Lecture Artificial Intelligence (4ov / 8op)

4. Games and search. Lecture Artificial Intelligence (4ov / 8op) 4. Games and search 4.1 Search problems State space search find a (shortest) path from the initial state to the goal state. Constraint satisfaction find a value assignment to a set of variables so that

More information

Analysis and Implementation of the Game OnTop

Analysis and Implementation of the Game OnTop Analysis and Implementation of the Game OnTop Master Thesis DKE 09-25 Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science of Artificial Intelligence at the Department

More information

CSC321 Lecture 23: Go

CSC321 Lecture 23: Go CSC321 Lecture 23: Go Roger Grosse Roger Grosse CSC321 Lecture 23: Go 1 / 21 Final Exam Friday, April 20, 9am-noon Last names A Y: Clara Benson Building (BN) 2N Last names Z: Clara Benson Building (BN)

More information

Programming Project 1: Pacman (Due )

Programming Project 1: Pacman (Due ) Programming Project 1: Pacman (Due 8.2.18) Registration to the exams 521495A: Artificial Intelligence Adversarial Search (Min-Max) Lectured by Abdenour Hadid Adjunct Professor, CMVS, University of Oulu

More information

CS 2710 Foundations of AI. Lecture 9. Adversarial search. CS 2710 Foundations of AI. Game search

CS 2710 Foundations of AI. Lecture 9. Adversarial search. CS 2710 Foundations of AI. Game search CS 2710 Foundations of AI Lecture 9 Adversarial search Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square CS 2710 Foundations of AI Game search Game-playing programs developed by AI researchers since

More information
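The introduction describes Monte-Carlo Tree Search as a framework in which randomized explorations of the search space are used to estimate the most promising game actions. A minimal sketch of the four canonical MCTS phases (selection with UCB1, expansion, random simulation, and backpropagation) is shown below on the toy game of Nim, where players alternately remove 1-3 stones and the player taking the last stone wins. This is an illustrative sketch, not the implementation used in the paper; all class and function names are chosen for this example.

```python
# Minimal MCTS/UCT sketch on Nim (take 1-3 stones; taking the last stone wins).
# Illustrative only -- not the paper's implementation.
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones    # stones remaining (the game state)
        self.player = player    # player to move: +1 or -1
        self.parent = parent
        self.move = move        # move that led to this node
        self.children = []
        self.visits = 0
        self.wins = 0.0         # wins counted from the parent's perspective
        self.untried = [m for m in (1, 2, 3) if m <= stones]

def uct_select(node, c=1.4):
    # Selection: pick the child maximizing the UCB1 value.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def rollout(stones, player):
    # Simulation: play uniformly random moves to the end; return the winner.
    while stones > 0:
        stones -= random.choice([m for m in (1, 2, 3) if m <= stones])
        player = -player
    return -player  # the player who just moved took the last stone

def mcts(root_stones, root_player, iterations=2000):
    root = Node(root_stones, root_player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded and non-terminal.
        while not node.untried and node.children:
            node = uct_select(node)
        # 2. Expansion: add one child for an untried move.
        if node.untried:
            m = node.untried.pop()
            node = Node(node.stones - m, -node.player, parent=node, move=m)
            node.parent.children.append(node)
        # 3. Simulation: random playout from the new state
        #    (a terminal state returns its exact winner).
        winner = rollout(node.stones, node.player)
        # 4. Backpropagation: credit each node from its parent's perspective.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most-visited move, the usual robust-child rule.
    return max(root.children, key=lambda ch: ch.visits).move
```

With 5 stones the winning move is to take 1 (leaving a multiple of 4 for the opponent), and the sketch converges to it quickly because the game tree is tiny.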