Advanced Game AI. Level 6 Search in Games. Prof Alexiei Dingli

1 Advanced Game AI Level 6 Search in Games Prof Alexiei Dingli

2 MCTS?

4 MCTS Based upon: Selection; Expansion; Simulation; Back propagation; Enhancements

6 The Multi-Armed Bandit Problem: At each step pull one arm; Noisy/random reward signal; In order to: find the best arm, minimise regret, maximise expected return

7 Which Arm to Pull? Flat Monte Carlo: Pull each arm a set number of times; Give them equal probability; Assume they are not interconnected
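The flat Monte Carlo idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the Bernoulli arms, their payout probabilities and the helper names (`make_bernoulli_arm`, `flat_monte_carlo`) are hypothetical and do not come from the lecture.

```python
import random

def make_bernoulli_arm(p):
    """Hypothetical arm: pays 1 with probability p, otherwise 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0

def flat_monte_carlo(arms, pulls_per_arm, rng):
    """Pull every arm the same number of times; return the index of the best mean."""
    means = []
    for arm in arms:
        total = sum(arm(rng) for _ in range(pulls_per_arm))
        means.append(total / pulls_per_arm)
    best = max(range(len(arms)), key=lambda j: means[j])
    return best, means

rng = random.Random(0)
arms = [make_bernoulli_arm(p) for p in (0.2, 0.5, 0.8)]  # assumed payout rates
best, means = flat_monte_carlo(arms, pulls_per_arm=1000, rng=rng)
print(best, means)
```

Every arm receives the same budget, so pulls spent on clearly bad arms are wasted; the next two slides address that.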

8 Which Arm to Pull? ε-Greedy: With probability 1 - ε pull the best arm so far; With probability ε pull a random arm
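A minimal ε-greedy sketch, assuming the same kind of hypothetical Bernoulli arms as a test bed (the arm probabilities and function names are illustrative, not from the slides):

```python
import random

def make_bernoulli_arm(p):
    """Hypothetical arm: pays 1 with probability p, otherwise 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0

def epsilon_greedy(arms, steps, epsilon, rng):
    """With probability epsilon pull a random arm, otherwise the best arm so far."""
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for _ in range(steps):
        if rng.random() < epsilon:
            j = rng.randrange(len(arms))                       # explore
        else:
            j = max(range(len(arms)), key=lambda i: means[i])  # exploit
        reward = arms[j](rng)
        counts[j] += 1
        means[j] += (reward - means[j]) / counts[j]            # incremental mean
    return counts, means

rng = random.Random(0)
arms = [make_bernoulli_arm(p) for p in (0.2, 0.5, 0.8)]  # assumed payout rates
counts, means = epsilon_greedy(arms, steps=5000, epsilon=0.1, rng=rng)
print(counts, means)
```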

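9 Which Arm to Pull? UCB1: Choose arm j so as to maximise $\bar{X}_j + \sqrt{\frac{2 \ln n}{n_j}}$, where $\bar{X}_j$ is the mean reward of arm j so far and the square-root term is the upper bound on variance ($n$ is the total number of pulls, $n_j$ the number of pulls of arm j)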
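The sketch below uses the standard UCB1 rule, which scores each arm by its mean reward so far plus the exploration bonus sqrt(2 ln n / n_j). The Bernoulli arms and all function names are hypothetical test scaffolding, not part of the lecture.

```python
import math
import random

def make_bernoulli_arm(p):
    """Hypothetical arm: pays 1 with probability p, otherwise 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0

def ucb1(arms, steps, rng):
    """Pull each arm once, then always pull the arm with the highest UCB1 score."""
    k = len(arms)
    counts = [0] * k
    means = [0.0] * k
    for _ in range(steps):
        n = sum(counts)                  # total pulls so far
        if n < k:
            j = n                        # initialisation: try every arm once
        else:
            j = max(range(k),
                    key=lambda i: means[i] + math.sqrt(2 * math.log(n) / counts[i]))
        reward = arms[j](rng)
        counts[j] += 1
        means[j] += (reward - means[j]) / counts[j]
    return counts, means

rng = random.Random(0)
arms = [make_bernoulli_arm(p) for p in (0.2, 0.5, 0.8)]  # assumed payout rates
counts, means = ucb1(arms, steps=5000, rng=rng)
print(counts, means)
```

Unlike ε-greedy, the exploration here is not uniform: arms that have been pulled rarely get a larger bonus, and the bonus shrinks as evidence accumulates.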

10 Game Decisions: The current position, with Move A, Move B and Move C, is treated as a Multi-Armed Bandit. Position after move A: Arm A, Simulation Result Loss (0). Position after move B: Arm B, Simulation Result Win (+1). Position after move C: Arm C, Simulation Result Loss (0)

11 Monte Carlo Tree Search (MCTS): 1. Build a tree 2. Recursively treat each node as a multi-armed bandit

12 Attractive Features: Easy to implement; Anytime: stop whenever you like; Difficulty scaling by simply adjusting CPU time; Needs only game rules: move generation and terminal state evaluation; No need for a heuristic function, but can be enhanced with domain knowledge

13 MCTS: the main idea. Tree policy: choose which node to expand (not necessarily a leaf). Default policy: random play out until end of game

14 The Algorithm

15 The Tree Policy

16 The Tree Expansion

17 The Best Child

18 The Default Policy

19 Backup
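Slides 14 to 19 name the components of the algorithm (tree policy, expansion, best child, default policy, backup) but their bodies did not survive in this transcript. The following is a minimal UCT-style sketch of how those pieces commonly fit together, not the lecturer's own pseudocode. The toy game (a Nim variant: take 1 to 3 stones, whoever takes the last stone wins), the exploration constant and every function name are assumptions made for illustration.

```python
import math
import random

# Toy game (assumption): state = (stones_left, player_to_move).
def legal_moves(state):
    return [m for m in (1, 2, 3) if m <= state[0]]

def apply_move(state, move):
    return (state[0] - move, 1 - state[1])

def is_terminal(state):
    return state[0] == 0

def winner(state):
    return 1 - state[1]          # the player who just took the last stone

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.untried = legal_moves(state)
        self.visits = 0
        self.wins = 0.0          # wins for the player who moved INTO this node

def expand(node, rng):
    move = node.untried.pop(rng.randrange(len(node.untried)))
    child = Node(apply_move(node.state, move), parent=node, move=move)
    node.children.append(child)
    return child

def best_child(node, c):
    # UCB1 applied at this node: mean value plus exploration bonus.
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def tree_policy(node, rng, c):
    while not is_terminal(node.state):
        if node.untried:
            return expand(node, rng)
        node = best_child(node, c)
    return node

def default_policy(state, rng):
    while not is_terminal(state):            # uniformly random playout
        state = apply_move(state, rng.choice(legal_moves(state)))
    return winner(state)

def backup(node, winning_player):
    while node is not None:
        node.visits += 1
        if winning_player == 1 - node.state[1]:
            node.wins += 1
        node = node.parent

def uct_search(root_state, iterations, rng, c=math.sqrt(2)):
    root = Node(root_state)
    for _ in range(iterations):
        leaf = tree_policy(root, rng, c)
        backup(leaf, default_policy(leaf.state, rng))
    return max(root.children, key=lambda ch: ch.visits).move

print(uct_search((5, 0), 3000, random.Random(1)))
```

Returning the most visited root child is one common final-move rule; choosing the child with the highest mean value is another. Because the loop can stop after any iteration, the iteration count acts as the CPU-time knob mentioned under difficulty scaling.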

21 Enhancements include: Selection/Expansion: All-moves-as-first (AMAF / RAVE); First Play Urgency; Machine learning; Simulation: Move-Average Sampling Technique (MAST); Last Good Reply (LGR); Parallelization; Domain Knowledge: Heuristic value functions; Patterns

22 How to handle uncertain and incomplete information?

23 How to handle uncertain and incomplete information? Information set: Actual state: {, Observation:,,...}

24 Effects of Uncertainty and Hidden Information on the Game Tree: 4 possible plays by me; 48 possible card draws; C 3 = different opponent plays

25 Reduced Branching Through Determinization... 4 possible plays by me; 1 possible card draw; C(4, 1) = 4 different opponent plays

26 Determinization: Sample states from the information set; Analyse the individual perfect information games; Combine the results at the end. Successes: Bridge (Ginsberg); Scrabble (Sheppard); Klondike solitaire (Bjarnason, Fern and Tadepalli); Probabilistic planning (Yoon, Fern and Givan). Problems: Never tries to gather or hide information; Suffers from strategy fusion and non-locality (Frank and Basin)
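The three determinization steps (sample, analyse, combine) can be sketched as follows. The toy card game, its payoffs and the function names are hypothetical, and the one-line `evaluate` function stands in for what would normally be a full perfect-information search such as minimax or UCT.

```python
import random

def evaluate(my_card, opp_card, action):
    """Stand-in for a perfect-information search (assumed toy payoffs)."""
    if action == "fold":
        return 0.0
    return 1.0 if my_card > opp_card else -1.0   # "bet": win or lose one unit

def determinized_choice(my_card, unseen_cards, actions, samples, rng):
    totals = {a: 0.0 for a in actions}
    for _ in range(samples):
        opp_card = rng.choice(unseen_cards)      # 1. sample a state from the information set
        for a in actions:
            totals[a] += evaluate(my_card, opp_card, a)  # 2. analyse it with full information
    best = max(actions, key=lambda a: totals[a])  # 3. combine the results at the end
    return best, totals

deck = list(range(1, 11))                         # assumed deck: cards 1..10
unseen = [c for c in deck if c != 9]
choice, totals = determinized_choice(9, unseen, ["bet", "fold"], 500, random.Random(0))
print(choice, totals)
```

Each sampled world is evaluated as if the hidden card were known, which is why this approach never values actions that gather or hide information.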

27 Cheating: The easiest approach to AI for games with imperfect information is to cheat and look at the hidden information and the outcomes of future chance events. This gives a deterministic game of perfect information, which can be searched with standard techniques (minimax, UCT, ...). Not cheating often results in better gameplay: it rewards the player for gathering and hiding information

28 Information Set MCTS

29 MCTS for Real-Time Decision-Making: Limited roll-out budget: heuristic knowledge becomes important. Action space is fine-grained: take macro-actions, otherwise planning will be very short-term. May be no terminal node in sight: use a heuristic; tune simulation depth
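A depth-limited rollout that combines the three ideas on this slide (macro-actions, a cut-off depth, and a heuristic score when no terminal node is reached) might look like the sketch below. The one-dimensional toy world, the heuristic and all names are assumptions for illustration only.

```python
import random

def limited_rollout(state, step, heuristic, is_terminal, actions,
                    max_depth, repeat, rng):
    """Random rollout of macro-actions, cut off at max_depth, scored by a heuristic."""
    for _ in range(max_depth):
        if is_terminal(state):
            break
        action = rng.choice(actions)
        for _ in range(repeat):          # macro-action: repeat one primitive action
            if is_terminal(state):
                break
            state = step(state, action)
    return heuristic(state)              # used whether or not a terminal was reached

# Hypothetical toy world: walk along a line towards a goal at position 10.
GOAL = 10
step = lambda x, a: x + a
is_terminal = lambda x: x == GOAL
heuristic = lambda x: -abs(GOAL - x)     # closer to the goal is better

value = limited_rollout(0, step, heuristic, is_terminal,
                        actions=[-1, 1], max_depth=5, repeat=3,
                        rng=random.Random(0))
print(value)
```

`max_depth` and `repeat` are the two knobs to tune: a larger `repeat` looks further ahead per decision at the cost of coarser control.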

30 Benefits of MCTS: Aheuristic; Asymmetric; Anytime; Elegant

31 Drawbacks of MCTS: Playing Strength; Speed

33 MCTS in Rome Total War 2. Random By Design: Unpredictability is sometimes welcome from a game design perspective, for replayability; the player doesn't always want to face the same army compositions. Avoid Bad Decisions: The brute-force stochastic search allows the AI to avoid mistakes more effectively, since many different options are tried. Computation Budget: MCTS is capable of making great use of computation to find a balance between "exploring" new solutions and "exploiting" its known best solutions

34 Premise: The Campaign AI of TOTAL WAR: ROME II is built around the observation that the problem is unsolvable if all inter-dependencies are considered. It involves hundreds of regions, units, dozens of buildings and coordinating the diplomacy, technologies, skills, legacies, edicts of each faction...

35 The 3 main modules. Task Generation: High-level goals, each with required resources, are created as the collective result of multiple simple "generators". Resource Allocation: Matching resources (few) to tasks (many), taking into account diplomacy, strategy and the previous allocation (using MCTS). Resource Coordination: An MCTS-based planner determines the best set of actions given resources and their actions

36 Optimizations

37 Domain Knowledge

38 Conclusions: MCTS is an exciting area of research; Many impressive achievements already, with many more to come; Difficulty scaling by simply adjusting CPU time; Future applications: Board/card game AI, especially hidden info; Video game strategy; Agent decision-making; Optimization

40 Exercise How will you use MCTS to solve this problem? G

41 Exercise. Goal: Get the player avatar to the goal while avoiding the ghost. Player Princess: P. Ghost: G. Actions: UP = 0, RIGHT = 1, DOWN = 2, LEFT = 3. Reward: +1 if goal is reached, +0 if goal is not reached
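As a possible starting point for the exercise, the action codes and reward can be wrapped in a small environment that a random default policy can play. Only the action encoding and the reward come from the slide. The map did not survive in this transcript, so the grid size, the goal and ghost positions, the stationary ghost and the step limit below are all hypothetical placeholders.

```python
import random

UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3            # action codes from the exercise
DELTAS = {UP: (-1, 0), RIGHT: (0, 1), DOWN: (1, 0), LEFT: (0, -1)}

ROWS, COLS = 4, 4                             # hypothetical grid size
GOAL = (0, 3)                                 # hypothetical goal cell
GHOST = (1, 1)                                # hypothetical, assumed stationary

def step(pos, action):
    """Move one cell, staying inside the grid."""
    dr, dc = DELTAS[action]
    row = min(max(pos[0] + dr, 0), ROWS - 1)
    col = min(max(pos[1] + dc, 0), COLS - 1)
    return (row, col)

def rollout(pos, rng, max_steps=30):
    """Random default policy: reward 1 if the goal is reached, otherwise 0."""
    for _ in range(max_steps):
        if pos == GOAL:
            return 1
        if pos == GHOST:
            return 0
        pos = step(pos, rng.randrange(4))
    return 1 if pos == GOAL else 0

rng = random.Random(0)
wins = sum(rollout((3, 0), rng) for _ in range(1000))
print(wins / 1000)                            # estimated value of the start cell
```

An MCTS solution would use `step` for expansion and `rollout` as the default policy, with the player position (and the ghost position, if it moves) as the node state.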

42 Questions?
