Parallel Go on CUDA with Monte Carlo Tree Search


Parallel Go on CUDA with Monte Carlo Tree Search

A thesis submitted to the Division of Research and Advanced Studies of the University of Cincinnati in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in the School of Computing Sciences and Informatics of the College of Engineering and Applied Sciences

October 2012

by

Jun Zhou
B.S., University of Michigan, 2008

Thesis Advisor and Committee Chair: Dr. Kenneth Berman

Abstract

Traditional Go AI uses minimax tree search with pruning optimizations and a board evaluation function to dictate moves in a sequential fashion. However, it is widely accepted that professional human players are far better than minimax-based Go AI, due to the lack of a decent evaluation function for Go and the game's astronomical search space. Recent development of Monte Carlo Tree Search (MCTS) based Go AI has produced a big surge in playing strength. With the emergence of CUDA, Nvidia's massively parallel GPU platform, it appears that the MCTS process can be parallelized at the simulation stage by the GPU to enhance its performance. This thesis takes on the challenge of building a Monte Carlo Tree Search algorithm to play Go on the manycore CUDA platform.


Preface

Ever since the invention of the computer, our scientist forefathers have dreamed of harnessing its tremendous computational power to conquer board games. In the last few decades we witnessed the fast-paced improvement in microprocessors predicted by the famous Moore's Law. As computers become much cheaper and faster, the programs which run on them benefit directly. The most famous Man vs. Machine story has to be the match between Garry Kasparov and IBM's Deep Blue in the late 1990s, which left a monumental mark in computing history because it was the first time a computer chess AI convincingly defeated the world's best grandmaster. The extraordinary feat of Deep Blue is easily repeated nowadays by home personal computers; it is estimated that a top-notch computer chess AI now plays at a strength well beyond that of the top human grandmasters. Consequently, AI researchers began shifting their attention to another board game named Go. Go is an ancient Chinese board game with relatively simple rules that one can learn in minutes but take a lifetime to master. Currently even the strongest Go AI is considerably weaker [13] than professional human Go players, for reasons that will be discussed in later chapters.

Acknowledgements

I dearly thank Dr. Berman for teaching me all these years, making me work hard, pushing me forward, and providing me with countless pieces of research advice. I also thank Dr. Annexstein for teaching me CUDA, which is what inspired this research in the first place. I also thank Dr. Bhatnagar for teaching me Artificial Intelligence; many of my ideas came from his AI class.

Contents

Abstract
Preface
Acknowledgements
1 Introduction
  1.1 Origin of Go
  1.2 Go Terminology
  1.3 Rules
  1.4 Thesis Organization
2 Monte Carlo Tree Search
  2.1 Challenges with Conventional Minimax Tree
    2.1.1 Board Size
    2.1.2 Additive Complexity
    2.1.3 Evaluation Function
  2.2 MCTS Overview
  2.3 MCTS Process
    2.3.1 Selection
    2.3.2 Expansion
    2.3.3 Simulation
    2.3.4 Backpropagation
3 CUDA Architecture
  3.1 Background
    3.1.1 Advantages
    3.1.2 Limitations
  3.2 Tesla C1060
4 Implementation
  4.1 High-Level Game Flow
  4.2 CUDA Configuration
  4.3 Running Simulation in Parallel
  4.4 Biased Monte Carlo Sampling via Evaluation Function
  4.5 GPU Parallel Optimization
    4.5.1 Shared Memory
    4.5.2 Memory Padding
    4.5.3 Result BackPropagate
5 Benchmark
  5.1 Thread Configurations
  5.2 Board Sizes
  5.3 Biased Monte Carlo Tree Search
  5.4 Performance Impact of Different Monte Carlo Policies
  5.5 Optimizations and Speedup
  5.6 CPU vs GPU
6 Conclusions
  6.1 Verdict
  6.2 Future Works
A Go Ranking Illustration
B Go Board Struct
C Go Board Intersect
Bibliography

List of Tables

3.1 Tesla C1060 Key Specifications
A.1 Go Ranking [Low-High]

List of Figures

1.1 Go Stones and Board
2.1 MCTS Iterations
3.1 CPU vs. GPU Computing
3.2 CUDA Processing Flow
3.3 CUDA Memory Hierarchy
3.4 Tesla C1060 Graphics Card
5.1 Thread Configurations Effects on 19x19 Board
5.2 Different Board Sizes
5.3 Playing Speed of Different MCTS Policies
5.4 Optimizations and Speedup
5.5 CPU vs GPU
C.1 Go Board Intersect

Chapter 1

Introduction

1.1 Origin of Go

Go might not be the most popular game in the world, but it is certainly one of the oldest and most time-tested throughout history. It originated in China more than 2500 years ago, and was introduced to the West through Japan. The Chinese name for Go is Weiqi, which translates to "encircling game." Indeed, the idea of encircling is a fundamental concept in the game. The game is noted for its simple physical composition: a 19x19 board, uniform pieces called stones, and easy playing rules.

1.2 Go Terminology

Figure 1.1: Go Stones and Board

To understand the grand concept of Go, here is some of the most basic terminology defined:

stone: a black or white colored piece that is played on the board.

group: a set of stones connected by the lines, also called a chain.

liberty: a vacant point immediately adjacent to a stone in a cardinal (horizontal or vertical, but not diagonal) direction; a single stone can have up to 4 liberties.

komi: compensation points awarded to offset the first-move advantage; a 19x19 board often uses a fractional komi¹.

There are many more Go-specific terms, but because this thesis focuses on MCTS parallelism and the CUDA platform they will only be explained as necessary.

¹ A fractional komi entails that there is always a winner: no ties are possible.

1.3 Rules

Two players, one using black stones and the other using white stones, alternate placing stones on the board. Each stone must be placed on an intersection of two lines in the 19x19 grid. If one side is known to be stronger, a handicap can be used. A handicap allows the weaker side to set extra stones on the board before the game starts. There are also 13x13 and 9x9 boards, which are commonly used by beginners. The goal of the game is to use stones to surround as much territory as possible. Once a piece is played, its location on the board is set, and it may not be moved again unless it is captured. A capture happens when a stone or group of stones has no liberties left. The game finishes when there are two consecutive passes or when one player resigns. The player with more territory at the end of the game wins.

1.4 Thesis Organization

Here is a brief overview of successive chapters:

Chapter 2: Monte Carlo Tree Search covers the shortcomings of applying conventional minimax tree search techniques to Go, and delves into a Monte Carlo Tree based search algorithm for Go.

Chapter 3: CUDA Architecture dives deep into the Nvidia CUDA architecture, and compares it with CPU architecture.

Chapter 4: Implementation discusses the actual implementation step-by-step, provides variants of Monte Carlo Tree Search, and highlights the differences from the CPU implementation. It also explains several unique challenges the SIMD architecture encounters on the GPU, and their impact on performance.

Chapter 5: Benchmark shows the testing environment and various benchmarking results from the aforementioned manycore implementation.

Chapter 6: Conclusions assesses overall project goals and limitations and offers thoughts on future work.

Chapter 2

Monte Carlo Tree Search

2.1 Challenges with Conventional Minimax Tree

Knowing that Go predates chess by more than a millennium, one might guess Go is a simpler game to play and therefore more easily solved by a computer. Moreover, one might also conjecture that the same minimax techniques which work well for chess can also be applied to Go. However, this is not true. One should not underestimate the simple pieces and rules that make up the game of Go. The following sections highlight why Go is a fundamentally different problem for computers to tackle compared to chess.

2.1.1 Board Size

The 19x19 Go board has 361 line intersections, while the 8x8 chess board has only 64. Due to this fact, the number of possible positions that can be played on the Go board is far greater than that of chess. More precisely, the average number of legal moves at any moment during the game is about 30 for chess; Go begins with 55 distinct possible legal moves (with symmetric duplicates removed). As the game progresses and once the symmetry is broken, the number of legal moves explodes to the point where most of the 361 positions require evaluation. Some intersects are more popular than others, but all are allowed.

2.1.2 Additive Complexity

As games such as chess, checkers, and backgammon progress, pieces get eliminated, which causes the game board to simplify, thereby decreasing the complexity. In Go, as the game continues, most pieces retain their positions on the board, with the exception of a few capture moves (extremely rare in professional-level games). This not only eliminates board symmetry, but also adds more tension and new possibilities to the board, sustaining a high complexity and branching factor throughout the game.

2.1.3 Evaluation Function

Any tree-based search heuristic applied to board games requires an evaluation function to count material and assess the positions on the board, in order to assign scores for both sides so that the AI engine is fully aware of the values of the game states. In other words, the evaluation function is the backbone of minimax tree search, helping to decide which of the possible search paths to steer towards. In chess, an accurate evaluation function can be derived from a number of considerations which not only count up the raw material advantage on the board, but also include many other examinations such as king position, queen position, doubled pawns, isolated pawns, passed pawns, rooks on open files, bishop pairs, etc. These observations can be made easily and formulated into rules to construct an accurate and strong evaluation function. But similar techniques cannot be applied efficiently to Go. An accurate evaluation of a Go position requires complex analysis such as determining the life and death of stones and groups, whether stones can be connected to avoid capture, and whether an attack should be played in place of a defense. Much of the time, more than one stone placement can be good, depending on the side's strategy and other delicate trade-offs. For example, it can be debatable whether one should kill a few enemy stones at the cost of allowing the enemy's stones elsewhere to strengthen. Also, sometimes the present placement of a stone might not seem influential, but after many moves it suddenly becomes crucial.

All in all, an accurate evaluation function for Go is simply non-existent. Currently, even the best evaluation-function-based Go AI still only performs at kyu¹ levels.

2.2 MCTS Overview

Monte Carlo Tree Search (MCTS) [7][8][9] is an innovative tree search method that has received the spotlight in recent years of Go artificial intelligence research. It utilizes Monte Carlo random sampling techniques to tackle the astronomical search space which previously haunted traditional minimax tree search methods. Mainly due to the high branching factor and the poor evaluation function in Go, traditional minimax tree search struggles to find the optimal decision. In comparison, MCTS cleverly uses statistics collected by randomized samplings to estimate the true value of a node, and builds the tree iteratively to help in finding an optimal decision. MCTS relies on an important concept: the true value of a node can be approximated by running enough Monte Carlo simulations, so the policy can adjust to a best-first strategy based on the statistics. MCTS has succeeded on difficult problems where other techniques come up short. It is a statistical sampling technique, which means more runtime and more processing power generally achieve better performance. In terms of flexibility, it can be used with no domain knowledge at all in some instances, or with a little domain knowledge² to help the decision process.

¹ Go rankings are illustrated in Appendix A.
² In Go, domain knowledge consists of simple rules to bias the node selection towards more advantageous nodes.

2.3 MCTS Process

Since MCTS is based on random sampling, it runs many iterations to ensure the accuracy and precision of the collected statistics. Each iteration has four stages: Selection, Expansion, Simulation, and Backpropagation. Figure 2.1 [6] illustrates this iterative process.

Figure 2.1: MCTS Iterations

2.3.1 Selection

Selection refers to the process of choosing the optimal child node among all of the expandable child nodes in the tree. A node is expandable if it is not a terminal state and it has unvisited children.

The node selection problem faces the dilemma of exploration versus exploitation [10]. Exploration-biased heuristics favor less examined nodes, while exploitation-biased heuristics favor well-established nodes. Thus a policy must be defined to resolve this conundrum. Auer et al. [1] proposed a policy called Upper Confidence Bounds (UCB)³, which has a great variant for tree search, named Upper Confidence Bounds for Trees (UCT). UCT is simple and efficient, and is guaranteed to be within a constant factor of the best possible bound on the growth of the regret [2], which is defined as the difference between the optimal selection and the UCT selection. Therefore it is a widely adopted method for balancing exploitation and exploration. The heuristic assigns each node a UCT score, and simply picks the node with the highest score.

UCT = \bar{X}_j + 2 C_p \sqrt{\frac{2 \ln n}{n_j}}

C_p is a constant, usually set to 1/\sqrt{2} to satisfy the Hoeffding inequality with rewards in the range [0,1]; this is examined in detail by Kocsis and Szepesvári [12].

\bar{X}_j is the ratio of the number of wins over the number of node visits, which translates to the winning percentage for the child node j; this value is never more than 1.

n is the number of times the parent node has been visited, while n_j is the number of times the child node j itself has been visited.

³ UCB addresses what is also known as the multi-armed bandit problem [14].

The UCT value can be intuitively understood as the sum of two parts: the first part, \bar{X}_j, represents the win ratio and encourages exploitation of promising nodes with a high win ratio, while the second part, 2 C_p \sqrt{2 \ln n / n_j}, encourages visiting nodes that have not been well sampled. As more simulations execute, the statistics get updated in all the nodes. \bar{X}_j captures the simple win ratio of each node state. However, this ratio alone is not good enough to decide which child node j to expand next. For instance, consider a node A and a node B having 1/2 and 20/100 as \bar{X}_j values respectively. If only \bar{X}_j is considered, node A seems a sure pick because of its 0.5 win ratio over node B's 0.2 win ratio. However, this overlooks the fact that node B's ratio has a much higher confidence level, and the statistics for node A could deteriorate quickly if more simulations were run. Vice versa, consider a node C and a node D with 1/2 and 70/100 as \bar{X}_j values. It is tempting to always favor node D because it has a high ratio and high confidence. But if node C isn't examined further, the algorithm could miss out on a potentially even higher win ratio from node C. Thus a balance must be achieved in choosing the child node j. The second part of the formula, 2 C_p \sqrt{2 \ln n / n_j}, gives an incentive to examine less explored nodes. n_j is 0 for never-visited nodes, whose UCT value is therefore infinite; this guarantees every node at least one visit. In summary, less explored nodes have small denominators n_j, which make the overall quotient large, thus contributing more to the UCT value.
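To make the formula concrete, here is a minimal C sketch of the UCT score computation, assuming per-node win and visit counters as described above; the function and parameter names are illustrative, not taken from the thesis code.

#include <math.h>

/* UCT score for child j per the formula above: win ratio plus an
 * exploration bonus. Unvisited children get +infinity, so every child
 * is guaranteed at least one visit. */
double uct_score(double wins_j, double visits_j, double parent_visits,
                 double cp)  /* cp: exploration constant, e.g. 1/sqrt(2) */
{
    if (visits_j == 0.0)
        return INFINITY;
    return wins_j / visits_j
         + 2.0 * cp * sqrt(2.0 * log(parent_visits) / visits_j);
}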

2.3.2 Expansion

Once a node is selected, child nodes are added to expand the selected node according to the available (legal) actions. The application can control the expansion policy, defining rules to determine which child nodes are possible. After expansion, a filtering function is sometimes also applied to prune the child nodes in order to eliminate obviously sub-optimal ones.

2.3.3 Simulation

Starting from the selected node, the simulation follows a simulation (also called playout) policy until game completion to produce an outcome. The simulation policy, a key ingredient, uses simple strategies to map game states into actions, and also contains rules that restrict the legality of child nodes; on the other hand, it should not be overly restrictive, as that can degrade the random sampling process. The simulation policy is not equivalent to a full strategy, which would be much more expansive and complicated; the policy is usually simpler and leaves enough room for randomness. It should be noted that a policy-less simulation can have a potentially large number of child nodes and render the collected statistics less useful.

2.3.4 Backpropagation

The outcomes from simulations get propagated backwards from the leaf nodes to the root node in order to update the statistics. Based on these statistics, using the UCT (multi-armed bandit) heuristic, the application loops back to the node Selection stage.
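As a rough illustration of this stage, the following C sketch walks the statistics from a simulated leaf back to the root. The node layout is an assumption for illustration, and a real engine would also flip the win perspective at alternate plies.

#include <stddef.h>

/* Hypothetical node layout: each node tracks the visit and win counts
 * that the UCT selection reads on the next iteration. */
typedef struct Node {
    struct Node *parent;   /* NULL at the root */
    int visits;
    int wins;
} Node;

/* Propagate a batch of simulation outcomes from the leaf to the root. */
void back_propagate(Node *leaf, int wins, int simulations)
{
    for (Node *n = leaf; n != NULL; n = n->parent) {
        n->visits += simulations;
        n->wins   += wins;  /* perspective handling omitted for brevity */
    }
}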

Chapter 3

CUDA Architecture

3.1 Background

Compute Unified Device Architecture (CUDA) is a recent parallel computing platform developed by Nvidia Corporation. It is neither the first parallel computing platform, nor the first GPU-enabled parallel programming construct. However, it is arguably the first popularized parallel GPU architecture that sees wide use in both commercial and academic fields. CUDA itself is a massively parallel architecture, often containing more than 100 cores; the newer models even have more than 1000 cores. This poses a sharp contrast to traditional CPU architecture, which often contains no more than 4 to 8 cores. While a CUDA GPU can have hundreds of cores executing instructions in parallel, each core is a lot slower than a CPU core. CUDA emphasizes

Figure 3.1: CPU vs. GPU Computing

running many threads concurrently as opposed to running a single thread very quickly. This architecture is also known as Single Instruction Multiple Data (SIMD). The underlying premise is that, if a task can be saturated with a high enough volume of input data, the overall performance can improve due to the large-scale batch processing. Figure 3.1 demonstrates the peak computing capability in GFLOPS¹ of CUDA graphics cards compared to Intel CPUs. Programming on the GPU is vastly different from doing so on the CPU, primarily due to the architectural differences and the supported instruction sets. Historically it has never been an easy task to migrate an application from the CPU to the GPU while also improving its performance. The beauty of the CUDA platform is that it allows the use of the C programming language to write programs largely in the same fashion with minimal modifications. Using C also favors portability, as a great number of programs written in C for the CPU can be modified relatively easily to run on the CUDA platform.

3.1.1 Advantages

Ease of programming: C is one of the most popular programming languages. Compared to OpenCL and DirectCompute, C is more straightforward and general-purpose, thus more user friendly.

¹ Giga floating-point operations per second.

Figure 3.2: CUDA Processing Flow

Figure 3.3: CUDA Memory Hierarchy

Computation capability: CUDA-enabled graphics cards can achieve more than 100 GFLOPS, making them suitable for computationally expensive tasks.

Shared memory: Each multiprocessor within the device is allocated a region of shared memory, which is extremely fast, has low latency, and can be shared among different threads within the same block.

User-managed cache: Flexible caching configurations help improve read performance.

3.1.2 Limitations

Divergence penalties: Any SIMD architecture is restricted by the inability to execute different logic branches simultaneously; this can impact applications of a heavily branching nature.

Incomplete C/C++ support: Though CUDA implements the basic constructs of C, not all features are supported, e.g., dynamic allocation, full class support, virtual functions, various libraries, etc.

Thread configuration: Threads need to be cautiously set up and configured to achieve reasonable performance. More specifically, threads run in warps², and each core should be saturated with threads in order to mask memory access latency.

² In the current CUDA implementation, a warp contains 32 threads.

Small shared memory: Each multiprocessor contains 16-48 KB of shared memory; this can pose a challenge when fitting a large indivisible graph with many nodes.

3.2 Tesla C1060

Figure 3.4: Tesla C1060 Graphics Card

The Tesla C1060 card is used for this thesis. It is a professional-grade CUDA-enabled Nvidia graphics card manufactured in 2008. Some of the key hardware specifications determine the optimal thread configuration in the experiment.

CUDA Capability Major revision number:          1
CUDA Capability Minor revision number:          3
Total amount of global memory:                  4 GB
Number of multiprocessors:                      30
Number of cores:                                240
Total amount of constant memory:                65536 bytes
Total amount of shared memory per block:        16384 bytes
Total number of registers available per block:  16384
Warp size:                                      32
Maximum number of threads per block:            512
Maximum sizes of each dimension of a block:     512 x 512 x 64
Maximum sizes of each dimension of a grid:      65535 x 65535 x 1
Maximum memory pitch:                           2147483647 bytes
Texture alignment:                              256 bytes
Clock rate:                                     1.30 GHz
Concurrent copy and execution:                  Yes
Run time limit on kernels:                      No
Integrated:                                     No
Support host page-locked memory mapping:        Yes

Table 3.1: Tesla C1060 Key Specifications
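To make the programming model described in Section 3.1 concrete, here is a minimal CUDA kernel unrelated to the thesis code: the body is ordinary C, and only the __global__ qualifier, the built-in thread indices, and the <<<...>>> launch syntax are CUDA-specific.

#include <cuda_runtime.h>

/* Scale an array in parallel: one thread per element. */
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              /* guard against overshooting the array */
        x[i] *= a;
}

/* Host-side launch: enough 256-thread blocks to cover n elements. */
void scale_on_gpu(float *d_x, float a, int n)
{
    scale<<<(n + 255) / 256, 256>>>(d_x, a, n);
    cudaDeviceSynchronize();
}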

Chapter 4

Implementation

In this implementation, the GNU-licensed, freely available Go infrastructure library Fuego is used in order to avoid re-inventing the basic building blocks of Go such as the user interface, basic logic flow, game rules, etc. The Fuego programming library grants researchers a foundation to work with, rather than having to deal with trivial programming details, thus allowing more emphasis to be put on the actual algorithms and heuristics. A portion of the source code also borrows from the GNU Go framework.

4.1 High-Level Game Flow

Using the four-stage Monte Carlo Tree Search and its random sampling process, the procedure to find the optimal next move is illustrated in the pseudocode below:

function MonteCarloSearch(state0)
    create root node node0 with state state0
    while nodeCount < maximumAllowedNodes
        chosenNode = TreePolicy(node0)
        result = SimulationPolicy(chosenNode)
        BackPropagate(result, chosenNode)
    return best scoring node

Monte Carlo Tree Search belongs to the family of tree search algorithms; thus a game state is represented by a node in the tree. Each node needs to keep track of the Go board information, which contains the black and white stone positions and captures; moreover, it needs to store the win and loss counts of the node. The engine starts off by constructing a root node which represents the current game state, and invokes the TreePolicy function, which uses the upper confidence bound method (UCT) to pick out the most opportune child node. Once the appropriate child is chosen and expanded, the SimulationPolicy function takes over and a predetermined number of Monte Carlo simulations are run to generate statistics on the chosen node. Afterwards the results (wins and losses) get propagated back all the way to the root node. From there the algorithm loops back to the

node selection stage, thus iteratively constructing a Monte Carlo search tree by selecting and expanding one node at a time. If a terminal condition such as the maximum allowed time or the maximum allowed number of nodes is met, the search is immediately interrupted. It is then easy to determine the best child based on the collected statistics and return it as the next Go move.

function TreePolicy(node0)
    leafNodes = getAllLeafNodes(node0)
    foreach i in leafNodes
        score_i = winRatio_i + 2 * Cp * sqrt(2 * ln(totalVisits) / nodeVisits_i)
    return node_i with max score_i

The TreePolicy function is fairly straightforward. It uses an iterator to loop through all the leaf nodes of the root node, and puts them in a set. The UCT formula is used to determine which leaf node SimulationPolicy should execute from. The gist of UCT is that, in addition to the win ratio, it adds a weight which favors less explored nodes. Under UCT, a balance between exploration and exploitation is achieved.

function SimulationPolicy(chosenNode)
    while !winningCondition
        tryNakadeMove()
        else tryAtariCaptureMove()
        else tryAtariDefenseMove()
        else tryLowLibertyMove()
        else tryPatternMove()
        else tryCaptureMove()
        else tryRandomMove()
        else tryPassMove()
    return move

SimulationPolicy takes a board state and tries to generate a move based on a number of Go-specific functions from a priority list. Nakade¹ heuristics are tried first, then Atari² moves, low-liberty moves, pattern matching moves, capture moves, random moves, and lastly pass moves. The ordering of this priority list is important, and any adjustment can result in performance fluctuation.

¹ It literally means "inside move" in Japanese, and refers to the situation where a group of stones can be made into two eyes, or prevented from doing so, by a single move.
² A group of stones with only one liberty.
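The priority list can be read as a short-circuiting fallback chain: the first generator that produces a move wins. The C sketch below expresses that idea with a function-pointer table; this is an editorial illustration, not the thesis's actual implementation, and the Board and Move types are placeholders.

typedef struct Board Board;              /* opaque board state */
typedef struct { int point; } Move;      /* placeholder move type */

/* Each generator writes a move and returns 1 on success, 0 if no such
 * move exists on the current board. */
typedef int (*MoveGen)(const Board *, Move *);

/* Walk the priority list in order; the first hit decides the move.
 * The list would be {tryNakadeMove, tryAtariCaptureMove, ...,
 * tryPassMove}, matching the pseudocode above. */
int generate_move(const Board *b, Move *m, MoveGen gens[], int ngens)
{
    for (int i = 0; i < ngens; i++)
        if (gens[i](b, m))
            return 1;
    return 0;   /* unreachable if a pass is always legal */
}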

4.2 CUDA Configuration

The thread configuration in the CUDA architecture is intricate and can impact performance heavily. If only a limited number of threads execute in parallel, the majority of the CUDA cores' computational power is under-utilized. On the contrary, if far too many threads are allocated, the resulting divergent execution paths and unmanaged memory access patterns can be disastrous. Therefore in this thesis a variety of configurations are examined to illustrate the repercussions of each setup. Based on these results, the application can find the optimal configuration. CUDA threads must be configured with a grid dimension³ and a block dimension⁴. The default setup is a grid consisting of 512 blocks where each block contains 512 threads. This configuration was found to be optimal after performance analysis with configurations ranging from 64 to 512 blocks and from 32 to 512 threads. The detailed thread configurations and their impact on performance are shown in the Benchmark chapter.

³ A grid can contain up to 65535 x 65535 blocks in the x and y dimensions.
⁴ A block can contain up to 512 x 512 x 64 threads in the x, y, and z dimensions, but only a maximum of 512 threads in total.
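In CUDA source, the default configuration above corresponds to a launch like the following sketch; the kernel and buffer names are assumptions.

#include <cuda_runtime.h>

struct Board;   /* opaque board type, sketched in Section 4.3 */

__global__ void simulationKernel(const struct Board *rootBoard, int *results);

/* Default configuration from this section: 512 blocks of 512 threads,
 * i.e., 262,144 playout threads per launch. */
void launch_playouts(const struct Board *d_rootBoard, int *d_results)
{
    dim3 grid(512);
    dim3 block(512);
    simulationKernel<<<grid, block>>>(d_rootBoard, d_results);
    cudaDeviceSynchronize();   /* block until all playouts complete */
}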

4.3 Running Simulation in Parallel

CUDA Go aims to parallelize the Monte Carlo simulations; in other words, parallelization happens at the leaf nodes. Here is pseudocode that first runs on the CPU before launching a CUDA kernel that runs purely on the GPU:

function CUDAMonteCarloSearch(state0)
    create root node node0 with state state0
    while nodeCount < maximumAllowedNodes
        chosenNode = TreePolicy(node0)
        result = CUDASimulationKernel(chosenNode)
        BackPropagate(result, chosenNode)
    return best scoring node

The CUDAMonteCarloSearch function first runs on the CPU, much like the MonteCarloSearch method in the sequential counterpart. The TreePolicy is handled sequentially on the CPU, since this part of the algorithm is inherently linear. As the application reaches CUDASimulationKernel, a CUDA kernel is launched with user-managed configurations, and the process copies the initial board state from the CPU memory to the GPU memory and executes thousands of threads in parallel. On the GPU, the kernel does not distinguish amongst threads; this is to minimize divergence. Inside the GPU, each thread manages its own Go board state in global memory⁵. After each thread terminates, its score is stored in a pre-allocated array in the global memory region; threads are synchronized to ensure termination. The scores are accumulated by a fast prefix sum function, powered by the CUDA Thrust library, which runs in log N time. The results are transferred back from the GPU to the CPU memory and back-propagated all the way to the root node. With the updated node statistics, the game engine utilizes the UCT formula

once again to decide which of the next nodes should be explored, and hands that node off to the GPU, thereby iteratively constructing a Monte Carlo search tree.

function CUDASimulationKernel(chosenNode)
    initialize local board state from chosenNode
    while !gameEndingConditions
        SimulationPolicy(chosenNode)
    resultArray[threadIdx] = result
    SynchronizeAllThreads()
    stats = InclusiveScan(resultArray)
    return stats

When the CPU hands off the work to the GPU, each spawned thread works independently by following the set of rules formally defined in the SimulationPolicy function. After each game is played to completion, the result is saved at the index corresponding to the unique thread ID in the global array. The global array is accumulated by a fast inclusive scan function to generate the total number of wins achieved from the initial game state, before these statistics are passed back to the CPU, which continues in the main loop.

⁵ Shared memory is also used to reduce memory access delay.
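The host/device split described above might look roughly like the following sketch; playoutToEnd and the board type are hypothetical stand-ins for the engine's playout code, while the Thrust call is the library's real inclusive-scan API.

#include <thrust/device_ptr.h>
#include <thrust/scan.h>

typedef struct { char cells[361]; } Board;   /* placeholder board type */

__device__ int playoutToEnd(Board *b, unsigned seed);  /* hypothetical */

/* Each thread copies the starting state, plays one game to completion,
 * and records a 0/1 win result at its own index: no write conflicts. */
__global__ void simulationKernel(const Board *root, int *results)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    Board local = *root;               /* thread-private board copy */
    results[tid] = playoutToEnd(&local, tid);
}

/* Host side: tally the results with Thrust's inclusive scan; the last
 * element then holds the total number of wins. */
void accumulate_results(int *d_results, int n)
{
    thrust::device_ptr<int> p(d_results);
    thrust::inclusive_scan(p, p + n, p);
}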

4.4 Biased Monte Carlo Sampling via Evaluation Function

The original Monte Carlo sampling is uniform, meaning it does not bias towards any of the generated moves. While this is the standard Monte Carlo approach, it can lead to rather bizarre positions that a human Go player is not likely to play. The root problem is that the engine makes no distinction between positions that can potentially influence the game differently. Intuitively, stronger moves should be played more frequently than weaker moves. It appears that, if these moves are given weights according to an evaluation function which assigns scores to all of the moves, a better simulation might be achieved. Let the evaluation function E assess all of the generated moves m_i in M, and save the results as r_i in r. Next, the result vector is normalized so that it sums to 1; a new vector containing the normalized weight of each move, R_i in R, is formed. Now the Monte Carlo simulation probabilistically selects the next move based on the weights associated with the moves. Note that the evaluation function in Go is far from perfect. The best Go programs that utilize minimax tree search and evaluation functions still cannot achieve dan-level strength [15]. In essence, the evaluation relies on many Go-specific tactics and rules to reason about strategies and examine territories on the board, which combine to a total score at the end. Of course the score is far from accurate in many cases, but the assessment can be useful in distinguishing positions which a standard MCTS does not care to. The results of the unbiased Monte Carlo process versus the biased Monte Carlo process are presented later in the Benchmark chapter.
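Here is a minimal host-side C sketch of this weighted selection, assuming the evaluation scores have already been collected into a non-negative weight array; a GPU version would use a per-thread random state instead of rand().

#include <stdlib.h>

/* Roulette-wheel selection: return index i with probability
 * weight[i] / sum(weight). Weights are the evaluation scores of the
 * candidate moves; sampling against the running sum has the same
 * effect as explicit normalization. */
int pick_weighted_move(const double *weight, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += weight[i];

    double r = ((double)rand() / RAND_MAX) * sum;
    for (int i = 0; i < n; i++) {
        r -= weight[i];
        if (r <= 0.0)
            return i;          /* r landed inside move i's slice */
    }
    return n - 1;              /* guard against floating-point rounding */
}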

4.5 GPU Parallel Optimization

4.5.1 Shared Memory

Shared memory is an important aspect of CUDA. It can be seen as a user-managed cache. In order to achieve speedup, the application needs to make use of the shared memory and serve as many memory loads from it as possible. The Tesla C1060 provides 16 KB of shared memory per multiprocessor. It has a total of 30 multiprocessors, giving 30 x 16 KB = 480 KB of shared memory. The struct⁶ that keeps a Go game board occupies about 400 bytes. Therefore about 480 KB / 400 bytes ≈ 1200 boards can be allocated in shared memory. Certainly this optimization has a positive impact on the overall performance, due to the much faster memory loads from on-chip shared memory as opposed to the distant global memory.

⁶ Refer to Appendix B for more details.
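A sketch of how boards might be staged in shared memory, using the 400-byte figure above; the kernel body and layout are assumptions for illustration.

#define BOARD_BYTES      400                      /* per the text above */
#define BOARDS_PER_BLOCK (16384 / BOARD_BYTES)    /* 40 boards per 16 KB */

/* Each block stages its boards on-chip so playout reads hit fast
 * shared memory rather than distant global memory. */
__global__ void playoutKernelShared(const char *globalBoards, int *results)
{
    __shared__ char boards[BOARDS_PER_BLOCK * BOARD_BYTES];

    /* ... copy this block's boards in, run playouts, write results ... */
    (void)globalBoards; (void)results; (void)boards;  /* sketch only */
}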

4.5.2 Memory Padding

Many of the simulation policy rules require LookUp, LookDown, LookLeft, and LookRight functions, which check the positions adjacent to the current intersect. The CUDA architecture provides 16 memory banks⁷; therefore if threads of the same half-warp access through the same memory bank, a conflict occurs. To combat this drawback, the application allocates extra columns in the memory region so that none of the adjacent-intersect lookups cause memory bank conflicts. This technique speeds up memory reads; a sketch of the indexing scheme appears at the end of this section.

4.5.3 Result BackPropagate

CUDA Go needs to be smart about how to transfer the game simulation results back. If it is done carelessly, by accumulating all of the results into a single global variable under an atomic lock, significant write contention occurs and the generated delays are non-trivial. To avoid the write conflicts, the threads of CUDA Go first write the results to a pre-allocated array in GPU global memory, at indices formed by their unique thread IDs. An inclusive scan accumulator function is then used to tally the results in log N time before transferring the total sum to CPU memory. Done this way, the application not only avoids the write conflict, but also speeds up the accumulation.

⁷ It is also known as the 16-stride memory bank.
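Here is the promised sketch of the padded indexing scheme; the padding width of one column is illustrative, as the right amount depends on the bank layout and access pattern.

#define SIZE  19
#define PAD   1                 /* illustrative padding width */
#define WIDTH (SIZE + PAD)      /* padded row stride */

/* Neighbor lookups on a padded 1-D board; with the extra column,
 * adjacent threads' row-offset accesses map to different banks.
 * Border checks are omitted for brevity. */
__device__ int lookUp(const int *board, int idx)    { return board[idx - WIDTH]; }
__device__ int lookDown(const int *board, int idx)  { return board[idx + WIDTH]; }
__device__ int lookLeft(const int *board, int idx)  { return board[idx - 1]; }
__device__ int lookRight(const int *board, int idx) { return board[idx + 1]; }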

Chapter 5

Benchmark

5.1 Thread Configurations

In CUDA there can be a variety of thread configurations, and different configurations can lead to drastic performance disparities. In general, the number of threads per block should be a multiple of the warp size (32 threads), and the total number of threads should be abundant enough to saturate the cores. As illustrated in Figure 5.1, there need to be at least 128 blocks in the grid to obtain reasonable performance; however, increasing the block count beyond 256 gains little to no additional performance due to hardware saturation. It is also observed that the number of threads per block tops out near 256, and 512 threads per block gains no significant additional performance. The slope of the 64-block line is sharper than the lines below it, which means that the performance increase from adding more threads to the 64-block configuration

is more evident than for the higher-block-count lines. This is because when there are fewer blocks, the hardware is less saturated, so each new thread can utilize more hardware power. In the 512-block configuration, the hardware is already heavily occupied, so the addition of new threads gains only limited performance, much like the effect of diminishing returns.

Figure 5.1: Thread Configurations Effects on 19x19 Board

5.2 Board Sizes

Though the 19x19 Go board is the standard, it is not uncommon for amateurs to play on 9x9 or 13x13 boards for learning purposes. Computer Go programs

also play on boards of different sizes for academic reasons. In Figure 5.2, it comes as no surprise that smaller boards allow faster simulations, due to the smaller memory footprint and the fewer stones to be played until completion. The fitted curve shows a relation close to that of a quadratic function. The underlying reason is that as the board gets smaller, the number of playable next moves shrinks at a non-linear pace, resulting in a much faster simulation per game.

Figure 5.2: Different Board Sizes

5.3 Biased Monte Carlo Tree Search

The black side uses the biased MCTS engine while the white side uses the original unbiased MCTS engine. Each side runs a fixed number of simulations to determine the next move, and time [11] is unlimited.

                                          9x9      13x13    19x19
Black wins (games)                        51/100   52/100   11/20
White wins (games)                        49/100   48/100   9/20
Avg game length (moves)                   -        -        -
Avg black total time per game (seconds)   -        -        -
Avg white total time per game (seconds)   -        -        -
Avg black time per move (seconds)         -        -        -
Avg white time per move (seconds)         -        -        -

As the results produced by having the two engines battle one another demonstrate, the biased MCTS Go engine achieves 51%, 52% and 55% win ratios for the 9x9, 13x13 and 19x19 boards respectively; however, the biased engine also uses 6.8%, 14.1% and 29.4% more time than the unbiased MCTS Go engine. This data hints that the biased algorithm, which uses weights to decide the next move, can increase the playing strength of the Go engine. Intuitively, as MCTS generates a string of possible moves, biased weights which favor the stronger moves predicted by the evaluation function help to play the moves more sharply, thus achieving a more accurate result which is one step closer to perfect play than the previously

unbiased MCTS engine. The overall statistics collected by the biased MCTS are more competitive, resulting in an increase in playing strength. It is also observed that the bigger boards benefit more from the biased MCTS, as the winning ratio jumps from 51% (9x9 board) to 55% (19x19 board). A logical explanation could be that when the board is smaller there are fewer available next moves to choose from, so the biasing effect on each move is not as significant as it is for a bigger board with many candidate moves to pick from. In effect, the evaluation function plays a more important role on bigger boards due to the larger number of possible next moves, since a larger board has a much higher probability of arriving at a permutation of moves that differs from what the biased weights would select. However, the lift in playing strength does not come for free. The computing time of the biased MCTS Go engine sees an increase of 6.8%, 14.1% and 29.4% going from small to big boards. This additional cost comes from the larger number of available moves that need to be assessed by the evaluation function, which is not cheap to invoke because it involves many Go-specific rules and knowledge such as strategic and territorial analysis.

5.4 Performance Impact of Different Monte Carlo Policies

Divergence is a big issue for any SIMD architecture, and the CUDA Go engine is therefore subject to this problem. As an indication of the level of impact divergence has on the CUDA Go engine, this experiment modifies the MCTS so that it generates the next move randomly instead of following a predefined policy (the only requirement is that the move must be legal). Doing so avoids executing the many different branches of rules within the original policy which cause significant divergence. Figure 5.3 shows approximately a sevenfold speed increase over the standard MCTS engine. In other words, the policy-less engine can play games seven times faster than the original. Nonetheless, the playing strength of the policy-less Go engine suffers heavily due to the inaccurate statistics generated from the simulations. It almost never wins a game against a normal engine that follows the policy.

Figure 5.3: Playing Speed of Different MCTS Policies

5.5 Optimizations and Speedup

Due to the underlying divergent execution path issues and heavy memory footprint, a straightforward implementation that works well on the CPU can hit severe performance roadblocks on the CUDA GPU architecture. Several optimizations are applied to the baseline implementation. As Figure 5.4 shows, the largest speedup comes from setting appropriate blocks-per-grid and threads-per-block parameters. Another sizable improvement comes from the use of shared memory. Memory padding and the inclusive scan also add observable performance increases to the baseline engine.

Figure 5.4: Optimizations and Speedup

5.6 CPU vs GPU

Given the architectural differences between CPUs and GPUs in general, it is interesting to benchmark the two head-to-head in order to understand whether the CUDA GPU architecture is fit for solving Go using Monte Carlo Tree Search methods. In Figure 5.5, the comparisons demonstrate that the performance of

CUDA definitely outshines that of a single-core Pentium 4 CPU, yet falls short of the much newer generation quad-core AMD Phenom processor. Although not benchmarked in this experiment, it is predictable that with the newest generation Fermi-based Nvidia graphics cards, the GPU performance should overtake the AMD Phenom processor convincingly.

Figure 5.5: CPU vs GPU

Chapter 6

Conclusions

6.1 Verdict

Evaluating the success of the parallelized CUDA Go using MCTS is an intricate matter. Parallel Monte Carlo Tree Search is certainly a sound methodology for tackling problems with large search spaces, where high branching factors lead to astronomical numbers of search nodes. The parallelism can essentially be exploited at the Monte Carlo simulation stage, where a massive number of threads can be spawned to help gather high-quality statistics. In Go's case, the engine is able to determine which of the next moves is the most suitable based on the tried paths and their simulated results. In this thesis CUDA is selected as the underlying platform for the implementation for two reasons. First, it is a novel idea and a contribution

to utilize the GPU to solve Go using MCTS. Secondly, CUDA has hundreds of cores and is architecturally different from CPUs. On a normal CPU, threads can execute independently and branch freely regardless of what other threads do. In CUDA, however, any divergence in the thread execution paths can take a heavy toll on performance and serialize the thread executions within a warp. A great portion of the Monte Carlo simulation process unavoidably relies on randomization, thus creating divergent executions. For instance, one thread may be looking at a stone to calculate its liberties while another thread is performing pattern-matching heuristics. In CUDA it is often the case that an application gets more speedup if it mostly performs arithmetic such as addition, subtraction, and multiplication. In MCTS Go's scenario, most operations are branching and memory loads; therefore the improvement is not as drastic as other applications might enjoy. Various optimizations are also applied to the baseline implementation to achieve more speedup. It is often the case that a straightforward implementation which works well on the CPU will not perform adequately on the GPU. Though the CUDA platform has hundreds of cores, a GPU core is more than two orders of magnitude slower than a CPU core. The proper thread configuration ensures that the graphics card keeps all the cores occupied and achieves maximum throughput, while the shared memory usage and the memory padding serve to accelerate memory read accesses. Many of these optimizations are GPU-only and hardware-specific. Ideally one should not

have to consider these limitations in order to design an efficient parallel system, as from a theoretical standpoint the hardware specifics are best abstracted away. The idea of using the evaluation function to build a biased MCTS engine is also a contribution. Interestingly, the overall playing strength sees an increase at the cost of consuming more computation time. This can become useful in long games in which the time requirement is more flexible: the Go engine can switch between the unbiased and the biased mode depending on the remaining time and the importance of the next move. This can be the first step towards making a stronger MCTS Go playing engine. Overall this implementation gives great insight into how a SIMD architecture can benefit from the parallelism as well as the limitations it can pose. In particular, the biased MCTS algorithm is able to produce stronger playouts during the simulation stage. Moreover, it can be foreseen that with a good MIMD parallel architecture, which is immune to divergence problems, the speedups can be achieved more easily. The newer generation Nvidia graphics cards should also deliver stronger results.

6.2 Future Works

Although this thesis limits its scope to the SIMD-architecture CUDA platform, it should be interesting to conduct similar research and experiments on MIMD platforms and compare the results. Many of the divergence issues

should go away, and the performance boost should be closer to the theoretical improvement bound due to the ability to execute many paths concurrently. The other difficulty with MCTS algorithms is that, while they deliver better results than traditional minimax tree search applied to Go, they are still far from top human professional players. Perhaps MCTS needs stronger simulation policy heuristics based on entirely new paradigms [4], not limited to Go-specific rules and strategies. Machine learning [16][5] could also be applied to train the engine. On one hand, the playing strength of the MCTS Go engine is largely dependent on its simulation policy, domain-specific knowledge, pattern matching, and so on; on the other hand, these factors can become limitations to improving the engine's playing strength, as the Go-specific material can get so complicated that only a Go expert could produce it. If machine learning is used to train the engine, this knowledge and these patterns could potentially be learned without a Go expert, thus making the development of policies easier. In this thesis only leaf-node parallelization is explored. It should be interesting to experiment with root-node parallelization; for example, several trees could run MCTS algorithms simultaneously [3] and merge their results to produce an overall improvement over single-tree MCTS.

Appendix A

Go Ranking Illustration

Rank Type          Rank     Stage
Double-digit kyu   30-20k   Beginner
Double-digit kyu   19-10k   Casual
Single-digit kyu   9-1k     Intermediate amateur
Amateur dan        1-7d     Advanced player
Professional dan   1-9p     Professional player

Table A.1: Go Ranking [Low-High]

Appendix B

Go Board Struct

struct GoBoard {
    int intersect[19*19]; /* positive = black, negative = white, zero = empty */
    int whitecapture;     /* capture count for white */
    int blackcapture;     /* capture count for black */
    int komi;
    int result;
};

Appendix C

Go Board Intersect

The Go board is represented as a one-dimensional array, with each offset mapping to an intersect. While the board could also be represented with a two-dimensional array, that has a slightly larger footprint and thus slightly slower performance. The one-dimensional array is functionally equivalent; the program just needs to be intelligent about switching rows.
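A small C sketch of the mapping, without the padding discussed in Section 4.5.2; the row-boundary check shows where the program must be intelligent about switching rows.

#define N 19   /* board dimension */

int offset_of(int row, int col) { return row * N + col; }
int row_of(int offset)          { return offset / N; }
int col_of(int offset)          { return offset % N; }

/* A left neighbor at offset-1 exists only if we are not in column 0;
 * without this check, offset-1 silently wraps to the previous row. */
int has_left(int offset)        { return offset % N != 0; }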

Figure C.1: Go Board Intersect

Bibliography

[1] P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235-256, 2002.

[2] P. Auer, N. Cesa-Bianchi, Y. Freund, and R.E. Schapire. Gambling in a rigged casino: The adversarial multi-armed bandit problem. In Foundations of Computer Science, 1995. Proceedings., 36th Annual Symposium on, pages 322-331. IEEE, 1995.

[3] D. Auger. Multiple tree for partially observable Monte-Carlo tree search. Applications of Evolutionary Computation, pages 53-62, 2011.

[4] V. Berthier, H. Doghmen, and O. Teytaud. Consistency modifications for automatically tuned Monte-Carlo tree search. Learning and Intelligent Optimization, 2010.

[5] B. Bouzy and G. Chaslot. Monte-Carlo Go reinforcement learning experiments. In Computational Intelligence and Games, 2006 IEEE Symposium on. IEEE, 2006.

[6] C. Browne, E. Powley, D. Whitehouse, S. Lucas, P. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1-43, 2012.

[7] G. Chaslot. Monte-Carlo Tree Search. PhD thesis, Maastricht University, 2010.

[8] G. Chaslot, M. Winands, and H. van den Herik. Parallel Monte-Carlo tree search. Computers and Games, pages 60-71, 2008.

[9] A. Fern and P. Lewis. Ensemble Monte-Carlo planning: An empirical study. In Proc. 21st Int. Conf. Automat. Plan. Sched., Freiburg, Germany, pages 58-65, 2011.

[10] S. Gelly and Y. Wang. Exploration exploitation in Go: UCT for Monte-Carlo Go, 2006.

[11] S.C. Huang, R. Coulom, and S.S. Lin. Time management for Monte-Carlo tree search applied to the game of Go. In Technologies and Applications of Artificial Intelligence (TAAI), 2010 International Conference on. IEEE, 2010.

[12] L. Kocsis, C. Szepesvári, and J. Willemson. Improved Monte-Carlo search. Univ. Tartu, Estonia, Tech. Rep. 1, 2006.

[13] S. Lopez. Rybka's Monte Carlo analysis.


More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 6. Board Games Search Strategies for Games, Games with Chance, State of the Art Joschka Boedecker and Wolfram Burgard and Frank Hutter and Bernhard Nebel Albert-Ludwigs-Universität

More information

Building Opening Books for 9 9 Go Without Relying on Human Go Expertise

Building Opening Books for 9 9 Go Without Relying on Human Go Expertise Journal of Computer Science 8 (10): 1594-1600, 2012 ISSN 1549-3636 2012 Science Publications Building Opening Books for 9 9 Go Without Relying on Human Go Expertise 1 Keh-Hsun Chen and 2 Peigang Zhang

More information

Creating a Havannah Playing Agent

Creating a Havannah Playing Agent Creating a Havannah Playing Agent B. Joosten August 27, 2009 Abstract This paper delves into the complexities of Havannah, which is a 2-person zero-sum perfectinformation board game. After determining

More information

Contents. Foundations of Artificial Intelligence. Problems. Why Board Games?

Contents. Foundations of Artificial Intelligence. Problems. Why Board Games? Contents Foundations of Artificial Intelligence 6. Board Games Search Strategies for Games, Games with Chance, State of the Art Wolfram Burgard, Bernhard Nebel, and Martin Riedmiller Albert-Ludwigs-Universität

More information

46.1 Introduction. Foundations of Artificial Intelligence Introduction MCTS in AlphaGo Neural Networks. 46.

46.1 Introduction. Foundations of Artificial Intelligence Introduction MCTS in AlphaGo Neural Networks. 46. Foundations of Artificial Intelligence May 30, 2016 46. AlphaGo and Outlook Foundations of Artificial Intelligence 46. AlphaGo and Outlook Thomas Keller Universität Basel May 30, 2016 46.1 Introduction

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Paul Lewis for the degree of Master of Science in Computer Science presented on June 1, 2010. Title: Ensemble Monte-Carlo Planning: An Empirical Study Abstract approved: Alan

More information

Andrei Behel AC-43И 1

Andrei Behel AC-43И 1 Andrei Behel AC-43И 1 History The game of Go originated in China more than 2,500 years ago. The rules of the game are simple: Players take turns to place black or white stones on a board, trying to capture

More information

Move Evaluation Tree System

Move Evaluation Tree System Move Evaluation Tree System Hiroto Yoshii hiroto-yoshii@mrj.biglobe.ne.jp Abstract This paper discloses a system that evaluates moves in Go. The system Move Evaluation Tree System (METS) introduces a tree

More information

Heuristic Move Pruning in Monte Carlo Tree Search for the Strategic Card Game Lords of War

Heuristic Move Pruning in Monte Carlo Tree Search for the Strategic Card Game Lords of War Heuristic Move Pruning in Monte Carlo Tree Search for the Strategic Card Game Lords of War Nick Sephton, Peter I. Cowling, Edward Powley, and Nicholas H. Slaven York Centre for Complex Systems Analysis,

More information

The game of Paco Ŝako

The game of Paco Ŝako The game of Paco Ŝako Created to be an expression of peace, friendship and collaboration, Paco Ŝako is a new and dynamic chess game, with a mindful touch, and a mind-blowing gameplay. Two players sitting

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 6. Board Games Search Strategies for Games, Games with Chance, State of the Art Joschka Boedecker and Wolfram Burgard and Bernhard Nebel Albert-Ludwigs-Universität

More information

Improving MCTS and Neural Network Communication in Computer Go

Improving MCTS and Neural Network Communication in Computer Go Improving MCTS and Neural Network Communication in Computer Go Joshua Keller Oscar Perez Worcester Polytechnic Institute a Major Qualifying Project Report submitted to the faculty of Worcester Polytechnic

More information

Small and large MCTS playouts applied to Chinese Dark Chess stochastic game

Small and large MCTS playouts applied to Chinese Dark Chess stochastic game Small and large MCTS playouts applied to Chinese Dark Chess stochastic game Nicolas Jouandeau 1 and Tristan Cazenave 2 1 LIASD, Université de Paris 8, France n@ai.univ-paris8.fr 2 LAMSADE, Université Paris-Dauphine,

More information

An AI for Dominion Based on Monte-Carlo Methods

An AI for Dominion Based on Monte-Carlo Methods An AI for Dominion Based on Monte-Carlo Methods by Jon Vegard Jansen and Robin Tollisen Supervisors: Morten Goodwin, Associate Professor, Ph.D Sondre Glimsdal, Ph.D Fellow June 2, 2014 Abstract To the

More information

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Enhancements for Monte-Carlo Tree Search in Ms Pac-Man Tom Pepels Mark H.M. Winands Abstract In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man.

More information

Foundations of AI. 6. Board Games. Search Strategies for Games, Games with Chance, State of the Art

Foundations of AI. 6. Board Games. Search Strategies for Games, Games with Chance, State of the Art Foundations of AI 6. Board Games Search Strategies for Games, Games with Chance, State of the Art Wolfram Burgard, Andreas Karwath, Bernhard Nebel, and Martin Riedmiller SA-1 Contents Board Games Minimax

More information

Sokoban: Reversed Solving

Sokoban: Reversed Solving Sokoban: Reversed Solving Frank Takes (ftakes@liacs.nl) Leiden Institute of Advanced Computer Science (LIACS), Leiden University June 20, 2008 Abstract This article describes a new method for attempting

More information

FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms

FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms Felix Arnold, Bryan Horvat, Albert Sacks Department of Computer Science Georgia Institute of Technology Atlanta, GA 30318 farnold3@gatech.edu

More information

CS 229 Final Project: Using Reinforcement Learning to Play Othello

CS 229 Final Project: Using Reinforcement Learning to Play Othello CS 229 Final Project: Using Reinforcement Learning to Play Othello Kevin Fry Frank Zheng Xianming Li ID: kfry ID: fzheng ID: xmli 16 December 2016 Abstract We built an AI that learned to play Othello.

More information

CSC321 Lecture 23: Go

CSC321 Lecture 23: Go CSC321 Lecture 23: Go Roger Grosse Roger Grosse CSC321 Lecture 23: Go 1 / 21 Final Exam Friday, April 20, 9am-noon Last names A Y: Clara Benson Building (BN) 2N Last names Z: Clara Benson Building (BN)

More information

Game-Playing & Adversarial Search

Game-Playing & Adversarial Search Game-Playing & Adversarial Search This lecture topic: Game-Playing & Adversarial Search (two lectures) Chapter 5.1-5.5 Next lecture topic: Constraint Satisfaction Problems (two lectures) Chapter 6.1-6.4,

More information

Feature Learning Using State Differences

Feature Learning Using State Differences Feature Learning Using State Differences Mesut Kirci and Jonathan Schaeffer and Nathan Sturtevant Department of Computing Science University of Alberta Edmonton, Alberta, Canada {kirci,nathanst,jonathan}@cs.ualberta.ca

More information

Procedural Play Generation According to Play Arcs Using Monte-Carlo Tree Search

Procedural Play Generation According to Play Arcs Using Monte-Carlo Tree Search Proc. of the 18th International Conference on Intelligent Games and Simulation (GAME-ON'2017), Carlow, Ireland, pp. 67-71, Sep. 6-8, 2017. Procedural Play Generation According to Play Arcs Using Monte-Carlo

More information

Learning to Play like an Othello Master CS 229 Project Report. Shir Aharon, Amanda Chang, Kent Koyanagi

Learning to Play like an Othello Master CS 229 Project Report. Shir Aharon, Amanda Chang, Kent Koyanagi Learning to Play like an Othello Master CS 229 Project Report December 13, 213 1 Abstract This project aims to train a machine to strategically play the game of Othello using machine learning. Prior to

More information

CPS 570: Artificial Intelligence Two-player, zero-sum, perfect-information Games

CPS 570: Artificial Intelligence Two-player, zero-sum, perfect-information Games CPS 57: Artificial Intelligence Two-player, zero-sum, perfect-information Games Instructor: Vincent Conitzer Game playing Rich tradition of creating game-playing programs in AI Many similarities to search

More information

Learning from Hints: AI for Playing Threes

Learning from Hints: AI for Playing Threes Learning from Hints: AI for Playing Threes Hao Sheng (haosheng), Chen Guo (cguo2) December 17, 2016 1 Introduction The highly addictive stochastic puzzle game Threes by Sirvo LLC. is Apple Game of the

More information

AN MCTS AGENT FOR EINSTEIN WÜRFELT NICHT! Emanuel Oster. Master Thesis DKE 15-19

AN MCTS AGENT FOR EINSTEIN WÜRFELT NICHT! Emanuel Oster. Master Thesis DKE 15-19 AN MCTS AGENT FOR EINSTEIN WÜRFELT NICHT! Emanuel Oster Master Thesis DKE 15-19 Thesis submitted in partial fulfilment of the requirements for the degree of Master of Science of Artificial Intelligence

More information

Monte-Carlo Tree Search and Minimax Hybrids with Heuristic Evaluation Functions

Monte-Carlo Tree Search and Minimax Hybrids with Heuristic Evaluation Functions Monte-Carlo Tree Search and Minimax Hybrids with Heuristic Evaluation Functions Hendrik Baier and Mark H.M. Winands Games and AI Group, Department of Knowledge Engineering Faculty of Humanities and Sciences,

More information

Approximate matching for Go board positions

Approximate matching for Go board positions Approximate matching for Go board positions Alonso GRAGERA The University of Tokyo, JAPAN alonso@is.s.u-tokyo.ac.jp Abstract. Knowledge is crucial for being successful in playing Go, and this remains true

More information

Artificial Intelligence. Minimax and alpha-beta pruning

Artificial Intelligence. Minimax and alpha-beta pruning Artificial Intelligence Minimax and alpha-beta pruning In which we examine the problems that arise when we try to plan ahead to get the best result in a world that includes a hostile agent (other agent

More information

Virtual Global Search: Application to 9x9 Go

Virtual Global Search: Application to 9x9 Go Virtual Global Search: Application to 9x9 Go Tristan Cazenave LIASD Dept. Informatique Université Paris 8, 93526, Saint-Denis, France cazenave@ai.univ-paris8.fr Abstract. Monte-Carlo simulations can be

More information

Red Shadow. FPGA Trax Design Competition

Red Shadow. FPGA Trax Design Competition Design Competition placing: Red Shadow (Qing Lu, Bruce Chiu-Wing Sham, Francis C.M. Lau) for coming third equal place in the FPGA Trax Design Competition International Conference on Field Programmable

More information

TD-Leaf(λ) Giraffe: Using Deep Reinforcement Learning to Play Chess. Stefan Lüttgen

TD-Leaf(λ) Giraffe: Using Deep Reinforcement Learning to Play Chess. Stefan Lüttgen TD-Leaf(λ) Giraffe: Using Deep Reinforcement Learning to Play Chess Stefan Lüttgen Motivation Learn to play chess Computer approach different than human one Humans search more selective: Kasparov (3-5

More information

Opleiding Informatica

Opleiding Informatica Opleiding Informatica Agents for the card game of Hearts Joris Teunisse Supervisors: Walter Kosters, Jeanette de Graaf BACHELOR THESIS Leiden Institute of Advanced Computer Science (LIACS) www.liacs.leidenuniv.nl

More information

Adversarial Search and Game- Playing C H A P T E R 6 C M P T : S P R I N G H A S S A N K H O S R A V I

Adversarial Search and Game- Playing C H A P T E R 6 C M P T : S P R I N G H A S S A N K H O S R A V I Adversarial Search and Game- Playing C H A P T E R 6 C M P T 3 1 0 : S P R I N G 2 0 1 1 H A S S A N K H O S R A V I Adversarial Search Examine the problems that arise when we try to plan ahead in a world

More information

Adversarial Search: Game Playing. Reading: Chapter

Adversarial Search: Game Playing. Reading: Chapter Adversarial Search: Game Playing Reading: Chapter 6.5-6.8 1 Games and AI Easy to represent, abstract, precise rules One of the first tasks undertaken by AI (since 1950) Better than humans in Othello and

More information

A Study of UCT and its Enhancements in an Artificial Game

A Study of UCT and its Enhancements in an Artificial Game A Study of UCT and its Enhancements in an Artificial Game David Tom and Martin Müller Department of Computing Science, University of Alberta, Edmonton, Canada, T6G 2E8 {dtom, mmueller}@cs.ualberta.ca Abstract.

More information

Probability of Potential Model Pruning in Monte-Carlo Go

Probability of Potential Model Pruning in Monte-Carlo Go Available online at www.sciencedirect.com Procedia Computer Science 6 (211) 237 242 Complex Adaptive Systems, Volume 1 Cihan H. Dagli, Editor in Chief Conference Organized by Missouri University of Science

More information

CUDA Threads. Terminology. How it works. Terminology. Streaming Multiprocessor (SM) A SM processes block of threads

CUDA Threads. Terminology. How it works. Terminology. Streaming Multiprocessor (SM) A SM processes block of threads Terminology CUDA Threads Bedrich Benes, Ph.D. Purdue University Department of Computer Graphics Streaming Multiprocessor (SM) A SM processes block of threads Streaming Processors (SP) also called CUDA

More information

SCRABBLE ARTIFICIAL INTELLIGENCE GAME. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University

SCRABBLE ARTIFICIAL INTELLIGENCE GAME. CS 297 Report. Presented to. Dr. Chris Pollett. Department of Computer Science. San Jose State University SCRABBLE AI GAME 1 SCRABBLE ARTIFICIAL INTELLIGENCE GAME CS 297 Report Presented to Dr. Chris Pollett Department of Computer Science San Jose State University In Partial Fulfillment Of the Requirements

More information

Third year Project School of Computer Science University of Manchester Chess Game

Third year Project School of Computer Science University of Manchester Chess Game Third year Project School of Computer Science University of Manchester Chess Game Author: Adrian Moldovan Supervisor: Milan Mihajlovic Degree: MenG Computer Science with IE Date of submission: 28.04.2015

More information

Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes

Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes Western Kentucky University TopSCHOLAR Honors College Capstone Experience/Thesis Projects Honors College at WKU 6-28-2017 Game Specific Approaches to Monte Carlo Tree Search for Dots and Boxes Jared Prince

More information

a b c d e f g h i j k l m n

a b c d e f g h i j k l m n Shoebox, page 1 In his book Chess Variants & Games, A. V. Murali suggests playing chess on the exterior surface of a cube. This playing surface has intriguing properties: We can think of it as three interlocked

More information

COMP3211 Project. Artificial Intelligence for Tron game. Group 7. Chiu Ka Wa ( ) Chun Wai Wong ( ) Ku Chun Kit ( )

COMP3211 Project. Artificial Intelligence for Tron game. Group 7. Chiu Ka Wa ( ) Chun Wai Wong ( ) Ku Chun Kit ( ) COMP3211 Project Artificial Intelligence for Tron game Group 7 Chiu Ka Wa (20369737) Chun Wai Wong (20265022) Ku Chun Kit (20123470) Abstract Tron is an old and popular game based on a movie of the same

More information

Artificial Intelligence. Topic 5. Game playing

Artificial Intelligence. Topic 5. Game playing Artificial Intelligence Topic 5 Game playing broadening our world view dealing with incompleteness why play games? perfect decisions the Minimax algorithm dealing with resource limits evaluation functions

More information

A Complex Systems Introduction to Go

A Complex Systems Introduction to Go A Complex Systems Introduction to Go Eric Jankowski CSAAW 10-22-2007 Background image by Juha Nieminen Wei Chi, Go, Baduk... Oldest board game in the world (maybe) Developed by Chinese monks Spread to

More information

The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games

The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games Proceedings of the Ninth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games Santiago

More information

TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS

TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS Thong B. Trinh, Anwer S. Bashi, Nikhil Deshpande Department of Electrical Engineering University of New Orleans New Orleans, LA 70148 Tel: (504) 280-7383 Fax:

More information

Games and Adversarial Search II

Games and Adversarial Search II Games and Adversarial Search II Alpha-Beta Pruning (AIMA 5.3) Some slides adapted from Richard Lathrop, USC/ISI, CS 271 Review: The Minimax Rule Idea: Make the best move for MAX assuming that MIN always

More information

Generalized Game Trees

Generalized Game Trees Generalized Game Trees Richard E. Korf Computer Science Department University of California, Los Angeles Los Angeles, Ca. 90024 Abstract We consider two generalizations of the standard two-player game

More information

ON THE TACTICAL AND STRATEGIC BEHAVIOUR OF MCTS WHEN BIASING RANDOM SIMULATIONS

ON THE TACTICAL AND STRATEGIC BEHAVIOUR OF MCTS WHEN BIASING RANDOM SIMULATIONS On the tactical and strategic behaviour of MCTS when biasing random simulations 67 ON THE TACTICAL AND STATEGIC BEHAVIOU OF MCTS WHEN BIASING ANDOM SIMULATIONS Fabien Teytaud 1 Julien Dehos 2 Université

More information

SEARCHING is both a method of solving problems and

SEARCHING is both a method of solving problems and 100 IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 3, NO. 2, JUNE 2011 Two-Stage Monte Carlo Tree Search for Connect6 Shi-Jim Yen, Member, IEEE, and Jung-Kuei Yang Abstract Recently,

More information

Mind Ninja The Game of Boundless Forms

Mind Ninja The Game of Boundless Forms Mind Ninja The Game of Boundless Forms Nick Bentley 2007-2008. email: nickobento@gmail.com Overview Mind Ninja is a deep board game for two players. It is 2007 winner of the prestigious international board

More information

Game Playing for a Variant of Mancala Board Game (Pallanguzhi)

Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Varsha Sankar (SUNet ID: svarsha) 1. INTRODUCTION Game playing is a very interesting area in the field of Artificial Intelligence presently.

More information

5.4 Imperfect, Real-Time Decisions

5.4 Imperfect, Real-Time Decisions 5.4 Imperfect, Real-Time Decisions Searching through the whole (pruned) game tree is too inefficient for any realistic game Moves must be made in a reasonable amount of time One has to cut off the generation

More information

43.1 Introduction. Foundations of Artificial Intelligence Introduction Monte-Carlo Methods Monte-Carlo Tree Search. 43.

43.1 Introduction. Foundations of Artificial Intelligence Introduction Monte-Carlo Methods Monte-Carlo Tree Search. 43. May 6, 20 3. : Introduction 3. : Introduction Malte Helmert University of Basel May 6, 20 3. Introduction 3.2 3.3 3. Summary May 6, 20 / 27 May 6, 20 2 / 27 Board Games: Overview 3. : Introduction Introduction

More information

UNIT 13A AI: Games & Search Strategies. Announcements

UNIT 13A AI: Games & Search Strategies. Announcements UNIT 13A AI: Games & Search Strategies 1 Announcements Do not forget to nominate your favorite CA bu emailing gkesden@gmail.com, No lecture on Friday, no recitation on Thursday No office hours Wednesday,

More information

CPS331 Lecture: Search in Games last revised 2/16/10

CPS331 Lecture: Search in Games last revised 2/16/10 CPS331 Lecture: Search in Games last revised 2/16/10 Objectives: 1. To introduce mini-max search 2. To introduce the use of static evaluation functions 3. To introduce alpha-beta pruning Materials: 1.

More information

How AI Won at Go and So What? Garry Kasparov vs. Deep Blue (1997)

How AI Won at Go and So What? Garry Kasparov vs. Deep Blue (1997) How AI Won at Go and So What? Garry Kasparov vs. Deep Blue (1997) Alan Fern School of Electrical Engineering and Computer Science Oregon State University Deep Mind s vs. Lee Sedol (2016) Watson vs. Ken

More information

CS 771 Artificial Intelligence. Adversarial Search

CS 771 Artificial Intelligence. Adversarial Search CS 771 Artificial Intelligence Adversarial Search Typical assumptions Two agents whose actions alternate Utility values for each agent are the opposite of the other This creates the adversarial situation

More information

Adversarial Search. Soleymani. Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 5

Adversarial Search. Soleymani. Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 5 Adversarial Search CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 5 Outline Game

More information

Artificial Intelligence Search III

Artificial Intelligence Search III Artificial Intelligence Search III Lecture 5 Content: Search III Quick Review on Lecture 4 Why Study Games? Game Playing as Search Special Characteristics of Game Playing Search Ingredients of 2-Person

More information

Ar#ficial)Intelligence!!

Ar#ficial)Intelligence!! Introduc*on! Ar#ficial)Intelligence!! Roman Barták Department of Theoretical Computer Science and Mathematical Logic So far we assumed a single-agent environment, but what if there are more agents and

More information

CS 387: GAME AI BOARD GAMES

CS 387: GAME AI BOARD GAMES CS 387: GAME AI BOARD GAMES 5/28/2015 Instructor: Santiago Ontañón santi@cs.drexel.edu Class website: https://www.cs.drexel.edu/~santi/teaching/2015/cs387/intro.html Reminders Check BBVista site for the

More information

CS 4700: Foundations of Artificial Intelligence

CS 4700: Foundations of Artificial Intelligence CS 4700: Foundations of Artificial Intelligence selman@cs.cornell.edu Module: Adversarial Search R&N: Chapter 5 Part II 1 Outline Game Playing Optimal decisions Minimax α-β pruning Case study: Deep Blue

More information

UNIT 13A AI: Games & Search Strategies

UNIT 13A AI: Games & Search Strategies UNIT 13A AI: Games & Search Strategies 1 Artificial Intelligence Branch of computer science that studies the use of computers to perform computational processes normally associated with human intellect

More information