A Bandit Approach for Tree Search

1 A Bandit Approach for Tree Search: An Example in Computer-Go. Department of Statistics, University of Michigan, March 27th, 2008

2 Outline
- 1 Bandit Problem: K-Armed Bandit; UCB Algorithms for the K-Armed Bandit Problem
- 2 Classical Tree Search: UCT Algorithm
- 3 Quick Introduction to the Go Game and Computer-Go

3 Bandit Problem (K-Armed Bandit; UCB Algorithms for the K-Armed Bandit Problem)

4-9 K-Armed Bandit
- A machine with K arms.
- Playing each arm leads to a random reward.
- The mean reward of each arm is fixed but unknown.
- The rewards at each round are independent of the others.
- Objective: maximize the total wins over n rounds.

Example of six rounds (the observed reward is shown under the arm that was played):

Round | Left | Right | Reward
  1   |  1   |       | $X_{1,1}$
  2   |      |  4    | $X_{2,1}$
  3   |      |  3    | $X_{2,2}$
  4   |      |  4    | $X_{2,3}$
  5   |      |  5    | $X_{2,4}$
  6   |  1   |       | $X_{1,2}$

10-11 Formalization of the K-Armed Bandit Problem
- A machine with K arms.
- The i-th play of arm k brings reward $X_{k,i}$.
- $X_{k,1}, X_{k,2}, \dots$ are i.i.d. with $\mathbb{E}X_{k,i} = \mu_k$; $\mu_k$ is unknown; $X_{k,i} \in [0, a]$.
- $X_{k_1,i_1}$ and $X_{k_2,i_2}$ are independent when $k_1 \neq k_2$.
- $T_k(n)$: the number of plays of arm k up to round n.
- Objective: maximize the reward over n rounds, the choice at each round n being based only on the rewards of previous plays.

Exploitation-Exploration Dilemma
- Exploitation: play arms with high average rewards, ensuring promising rewards in the future: $\hat{k} \in \arg\max_k \bar{X}_{k,n-1}$, where $\bar{X}_{k,n} = \frac{1}{T_k(n)} \sum_{i=1}^{T_k(n)} X_{k,i}$.
- Exploration: play arms with few plays, in order to get more information: a k such that $T_k(n-1)$ is small.
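
The two pure strategies of the dilemma can be written directly in terms of the play counts and empirical means defined above; a minimal sketch (illustrative code, the function names are not from the talk):

```python
# T[k] is T_k(n-1), the number of plays of arm k so far; sums[k] is the
# running reward sum of arm k, so sums[k] / T[k] is the empirical mean
# X_bar_{k, n-1} used on the slide.

def exploit(T, sums):
    # Play the arm with the highest empirical mean reward
    # (assumes every arm has already been played at least once).
    return max(range(len(T)), key=lambda k: sums[k] / T[k])

def explore(T, sums):
    # Play the arm we know least about, i.e. the least-played one.
    return min(range(len(T)), key=lambda k: T[k])
```

UCB (next slides) combines the two criteria into a single score instead of choosing between them.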

12 Formalization of the K-Armed Bandit Problem (Cont'd)
- $T_k(n)$: the number of plays of arm k up to round n.
- $k^* := \arg\max_k \mu_k$, $\mu^* := \max_k \mu_k$, $\Delta_k := \mu^* - \mu_k$.
- Definition of the regret up to round n:
$$R_n := \underbrace{\sum_{i=1}^{n} X_{k^*,i}}_{\text{reward of optimal plays}} - \underbrace{\sum_{k=1}^{K} \sum_{i=1}^{T_k(n)} X_{k,i}}_{\text{reward of actual plays}}$$
- Objective: minimize $\mathbb{E}R_n = n\mu^* - \sum_k \mu_k \, \mathbb{E}T_k(n) = \sum_{k \neq k^*} \Delta_k \, \mathbb{E}T_k(n)$.

13-14 UCB1 (Upper Confidence Bound) Algorithm (Auer et al.)
Suppose $X_k \in [0, b]$.
- First: play each arm once.
- After: at round n, play the arm k maximizing
$$\bar{X}_{k,n} + b\sqrt{\frac{2\log n}{T_k(n-1)}}.$$

Remarks
- Every arm is ultimately played infinitely many times: otherwise, if arm k were played only finitely many times, then $b\sqrt{\frac{2\log n}{T_k(n-1)}} \to \infty$ as $n \to \infty$.
- At round n, any arm $k \neq k^*$ satisfies $T_k(n) = O(\log n)$.

15-16 Regret Bound for UCB1
$$\mathbb{E}R_n \leq \sum_{k \neq k^*} \left( \frac{256\, V_k}{\Delta_k} + 8\,\Delta_k \right) \log n + O(1),$$
where $V_k$ denotes the variance of arm k.

UCB-V Algorithm (Audibert et al.): a refined version using empirical variances.
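
A minimal sketch of the UCB1 rule stated above, assuming Bernoulli rewards in [0, 1] (so b = 1); the function and variable names are illustrative, not from the talk:

```python
import math
import random

def ucb1(arm_means, n_rounds, b=1.0):
    """Play each arm once, then at round n play the arm maximizing
    X_bar_k + b * sqrt(2 * log(n) / T_k(n-1)), as on the slide above.
    Rewards are drawn Bernoulli(mu_k) here, one simple choice within [0, b]."""
    K = len(arm_means)
    T = [0] * K            # T_k: number of plays of arm k so far
    S = [0.0] * K          # running reward sums, so S[k] / T[k] is X_bar_k

    def pull(k):
        return 1.0 if random.random() < arm_means[k] else 0.0

    for k in range(K):                              # initialization: each arm once
        T[k] += 1
        S[k] += pull(k)
    for n in range(K + 1, n_rounds + 1):
        k = max(range(K),
                key=lambda i: S[i] / T[i] + b * math.sqrt(2 * math.log(n) / T[i]))
        T[k] += 1
        S[k] += pull(k)
    return T

# Example: with means [0.4, 0.6], almost all plays should go to the second arm,
# the first one receiving only O(log n) plays.
# print(ucb1([0.4, 0.6], n_rounds=10000))
```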

17 Classical Tree Search and the UCT Algorithm

18-19 Tree Search Settings
- Structure of the tree: one root, internal nodes, leaves.
- The value is only known at the leaves.
- The value of any node is a function of the values of its child-nodes (max tree / minimax tree).
- The value of any node can be computed in an iterative way.
- The problem arises when the iterative search cannot be completed.
(Figures: an example of a minimax tree; the case when the search cannot be completed.)
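
To make the bottom-up value computation concrete, here is a generic minimax sketch on a tiny hand-built tree (illustrative code, not from the talk):

```python
# Minimal minimax sketch: leaves carry known values, internal nodes alternate
# between maximizing and minimizing over their children.

def minimax(node, maximizing=True):
    if not node.get("children"):              # leaf: value is known
        return node["value"]
    child_values = [minimax(c, not maximizing) for c in node["children"]]
    return max(child_values) if maximizing else min(child_values)

# Tiny example: the root's value is determined entirely by the leaf values.
tree = {"children": [
    {"children": [{"value": 3}, {"value": 5}]},   # min node -> 3
    {"children": [{"value": 2}, {"value": 9}]},   # min node -> 2
]}
# print(minimax(tree))  # -> 3
```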

20 Instead of exhaustively searching each branch to get the exact value of each node, we estimate the values:
- Model each node as a bandit machine.
- At each node, decide between exploration and exploitation with a bandit algorithm.
- The more a node is exploited, the more precise its (estimated) value becomes.

Exploration and Exploitation at Each Node
- Exploitation: descend to a node with a promising value.
- Exploration: descend to a node to get more information.


25 UCT (UCB for Trees) Algorithm (Kocsis and Szepesvári, 2006)
- Start from the root.
- Loop until arriving at a leaf: choose a child-node according to UCB and descend.
- Get the value of the leaf.
- Update all visited nodes with that value.
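
A compact sketch of one such simulation, assuming a simple node structure with visit counts and value sums (illustrative code; the exploration constant is omitted, as in the formula on the next slide):

```python
import math

class Node:
    def __init__(self, children=None, leaf_value=None):
        self.children = children or []        # empty for a leaf
        self.leaf_value = leaf_value          # callable returning a sampled value at a leaf
        self.visits = 0                       # n_i
        self.total = 0.0                      # running sum, so total / visits is X_bar_i

def uct_iteration(root):
    """One simulation: descend by UCB until a leaf, read its value,
    then update every node on the visited path with that value."""
    path, node = [root], root
    while node.children:                      # descend until arriving at a leaf
        n_i = max(node.visits, 1)
        node = max(node.children,
                   key=lambda c: float("inf") if c.visits == 0 else
                   c.total / c.visits + math.sqrt(math.log(n_i) / c.visits))
        path.append(node)
    value = node.leaf_value()                 # e.g. a Monte-Carlo playout result
    for n in path:                            # back up the value to all visited nodes
        n.visits += 1
        n.total += value
    return value
```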

26-27 Formalization of UCT
- The number of visits of node i: $n_i$.
- The value of leaf j: $\bar{X}_{j,n_j} = \frac{1}{n_j} \sum_{k=1}^{n_j} X_{j,k}$.
- The set of child-nodes of node i: $C(i)$.
- The set of leaves of the branch starting at node i: $L(i)$.
- The value of each node (an estimate of its true value): $\bar{X}_{i,n_i} = \frac{1}{n_i} \sum_{j \in L(i)} n_j \bar{X}_{j,n_j}$.
- UCB at node i: play $\hat{j}$ such that $\hat{j} \in \arg\max_{j \in C(i)} \left\{ \bar{X}_{j,n_j} + \sqrt{\frac{\log n_i}{n_j}} \right\}$.

28 Remarks
- The value of each node estimates and converges to its true value.
- Under a smoothness assumption, a fast convergence rate is expected.
- No cut (pruning): every node will ultimately be visited.
- The tree is explored in an asymmetric way.
- The order of exploration is always the key point.

29 Quick Introduction to the Go Game and Computer-Go

30 A Quick Introduction to the Game of Go

31 A Quick Introduction to the Go Game
- Go board (goban): 19 x 19 (pedagogical Go board: 9 x 9).
- Black and White play alternately; Black starts the game.
- Adjacent stones of the same color form a string. Liberties are the empty intersections next to the string.
- Stones do not move; they are only added to and removed from the board. A string is removed iff its number of liberties is 0.
- Score: territory (number of occupied or surrounded intersections).
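
The capture rule is easy to state in code: a string is the connected group of same-colored stones, and it is removed exactly when it has no liberties. A small flood-fill sketch, with the board encoded as a dictionary from intersections to colors (an assumed representation, not from the talk):

```python
# `board` maps (row, col) -> 'B' or 'W'; empty intersections are absent.

def neighbors(p, size=9):
    r, c = p
    return [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= r + dr < size and 0 <= c + dc < size]

def string_and_liberties(board, p, size=9):
    """Return the string containing stone p and its liberties (flood fill)."""
    color, string, liberties, stack = board[p], {p}, set(), [p]
    while stack:
        for q in neighbors(stack.pop(), size):
            if q not in board:
                liberties.add(q)              # empty neighbor: a liberty
            elif board[q] == color and q not in string:
                string.add(q)
                stack.append(q)
    return string, liberties                  # the string is captured iff liberties is empty
```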

32 History of Computer-Go
- Beginning of Computer-Go: 1970s.
- Classical methods: expert-knowledge-based evaluation function; minimax tree search.
- Comparison with chess. Chess: Deep Blue won against Kasparov, 1997. Go: the strongest programs were about 10 kyu until 2006 (amateurs of good level can win with a 9-stone handicap).
- 2006: UCT introduced in Computer-Go. Today the best programs are around 1 dan.

33 Difficulties in Computer-Go
- Techniques developed for Computer-Chess do not work for Computer-Go.
- Huge branching factor: about 200 (chess: about 40); large depth: > 300.
- Number of legal positions (J. Tromp and G. Farnebäck, 2006): about $10^{170}$ on 19 x 19, about $10^{38}$ on 9 x 9.
- A good evaluation function is difficult to build.

34-35 Minimax tree structure (figure). Given the huge tree size (depth > 300), what to do with the nodes that are rarely visited?

36 Memory Management (R. Coulom, 2006)
- Idea: no need to save the nodes that are rarely visited.
- The tree (in memory) starts with only the root node.
- In each simulation (path), save the first node not yet in the tree.
- The rest of the path is not saved and is chosen randomly.
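
A sketch of this memory rule, assuming a hypothetical game interface (legal_moves, play, playout) and hashable states; the in-tree move selection is simplified to a random choice here, where a real program would use UCB:

```python
import random

def run_simulation(tree, root_state, legal_moves, play, playout):
    """Descend through states already stored in `tree`, add only the first
    state met that is not yet stored, then finish the game with an unstored
    random playout (one new node saved per simulation)."""
    state, path = root_state, [root_state]
    while state in tree and legal_moves(state):
        state = play(state, random.choice(legal_moves(state)))
        path.append(state)
    tree.setdefault(state, {"visits": 0, "wins": 0.0})   # save one new node
    result = playout(state)                              # rest of the path is not saved
    for s in path:                                       # update stored nodes on the path
        if s in tree:
            tree[s]["visits"] += 1
            tree[s]["wins"] += result
    return result
```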

37-38 Memory Management (R. Coulom, 2006) (cont'd)
- Monte-Carlo evaluation function for each node (B. Brügmann, 1993): score (0 or 1).
- A random move order works better than a fixed order. How to improve it (to find an intelligent order)?
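
A minimal sketch of such a Monte-Carlo evaluation, again with the game mechanics (legal_moves, play, is_over, black_wins) left as an assumed interface:

```python
import random

def monte_carlo_value(state, legal_moves, play, is_over, black_wins, n_playouts=100):
    """Estimate a position's value as the fraction of random playouts won,
    each playout scored 0 or 1 as on the slide."""
    wins = 0
    for _ in range(n_playouts):
        s = state
        while not is_over(s):
            s = play(s, random.choice(legal_moves(s)))   # uniformly random move order
        wins += 1 if black_wins(s) else 0                # score of one playout: 0 or 1
    return wins / n_playouts                             # estimated winning probability
```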

39 One trend of Computer-Go since 2006, motivated by Crazy Stone (R. Coulom, 2006) and MoGo (Y. Wang and S. Gelly, 2007). MoGo won the gold medal in Go at the Computer Olympiad 2007.

40-41 Why is it efficient compared to Alpha-Beta?
- Alpha-Beta never reconsiders a cut: dangerous with random rewards and no accurate evaluation function.
- Estimation vs. computation.
- Efficient tree exploration: breadth-first search; move ordering efficiently managed (for often-visited nodes); asymmetric growth.
- Anytime.

Remark: UCT is far from the final solution. The tree search algorithms currently used in the best programs differ from the one presented today.

42-43 Improvement of Random Plays by Patterns (examples of patterns shown as figures).

44-48
- How to improve the quality of the random simulations? Using hand-made patterns (Y. Wang and S. Gelly, 2007).
- How to learn patterns off-line? Collecting and ranking patterns (R. Coulom, 2007).
- How to get information on-line? Learning from previous simulations (S. Gelly and D. Silver, 2007). The order of search is important!
- How to find an evaluation function of high quality?
- Other approaches: parallelization (MPI MoGo).

49 References on the Bandit Problem
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 47(2/3):235-256, 2002.
- J.-Y. Audibert, R. Munos, and C. Szepesvári. Tuning Bandit Algorithms in Stochastic Environments. In M. Hutter, R. A. Servedio, and E. Takimoto, editors, ALT, volume 4754 of Lecture Notes in Computer Science. Springer, 2007.
- L. Kocsis and C. Szepesvári. Bandit Based Monte-Carlo Planning. In European Conference on Machine Learning (ECML), 2006.
- P.-A. Coquelin and R. Munos. Bandit Algorithms for Tree Search. 23rd Conference on Uncertainty in Artificial Intelligence (UAI), 2007.

50 References on Computer-Go
- Y. Wang and S. Gelly. Modification of UCT with Patterns for Monte-Carlo Go. IEEE Symposium on Computational Intelligence and Games, 2007.
- S. Gelly and D. Silver. Combining Online and Offline Knowledge in UCT. ICML 2007, Corvallis, Oregon, USA.
- R. Coulom. Computing Elo Ratings of Move Patterns in the Game of Go. In H. Jaap van den Herik, Jos W. H. M. Uiterwijk, Mark Winands, and Maarten Schadd, editors, Computer Games Workshop, Amsterdam, The Netherlands, 2007.

51 Acknowledgment
Thanks to Rémi Munos, Sylvain Gelly and Rémi Coulom.

52 Thank you!
