Learning Artificial Intelligence in Large-Scale Video Games

Learning Artificial Intelligence in Large-Scale Video Games
A First Case Study with Hearthstone: Heroes of WarCraft
Master Thesis submitted for the degree of MSc in Computer Science & Engineering
Supervisor: Prof. Damien Ernst
Academic year 2014-2015

Video Games, Then and Now
Then, the problems to solve were easy to represent
Example: Pac-Man
- Fully observable maze
- Limited number of agents
- Small, well-defined action space
Now, the problems feature numerous variables
Example: StarCraft
- Vast, partially observable map
- Complex state representation
- Prohibitively large action space, difficult to represent
2 / 23

Video Games, Then and Now
Games continue to feature richer environments...
... but designing robust AIs becomes increasingly difficult!
Making AI learn instead of being taught: a better solution?
3 / 23

Objectives of this Thesis
1. Design & study of a theory for creating autonomous agents in large-scale video games
- Study applied to the game Hearthstone: Heroes of WarCraft
2. Develop a modular and extensible clone of the game Hearthstone: HoW
- Enables us to test the theory in practice
4 / 23

Problem Statement
1. State Vectors
The world vector w ∈ W contains all information available in a given state
Not everything is relevant
If σ(·) is the projection operator such that, for all w ∈ W, s = σ(w) is the relevant part of w for the targeted application, we define the set of all state vectors:
S := {σ(w) | w ∈ W}
5 / 23
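As a purely illustrative sketch of what a projection operator σ might look like in code, the Python snippet below maps a raw world dictionary to a fixed-length state vector; the feature names (own_hero_health, etc.) are hypothetical and not taken from the thesis.

```python
# Minimal sketch of a projection operator sigma : W -> S.
# All feature names below are hypothetical, for illustration only.
from typing import Dict, List

def sigma(world: Dict) -> List[float]:
    """Keep only the part of the world vector assumed relevant for the application."""
    return [
        float(world["own_hero_health"]),           # hypothetical feature
        float(world["enemy_hero_health"]),         # hypothetical feature
        float(len(world["own_board_minions"])),    # friendly minions on the board
        float(len(world["enemy_board_minions"])),  # enemy minions on the board
        float(world["own_mana"]),                  # mana available this turn
    ]

# Toy usage
w = {"own_hero_health": 30, "enemy_hero_health": 27,
     "own_board_minions": ["m1", "m2"], "enemy_board_minions": [],
     "own_mana": 4}
s = sigma(w)
```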

Problem Statement
2. Action Vectors
Available actions have unknown consequences
Let A be the set of available actions in the game
Let A_s be the set of actions that can be taken in state s ∈ S
6 / 23

Problem Statement
3. State Scoring Function
There should exist a bounded function ρ : S → ℝ with the following properties:
- ρ(s) < 0 if, from the information in s, the player is considered likely to lose,
- ρ(s) > 0 if, from the information in s, the player is considered likely to win,
- ρ(s) = 0 otherwise.
Based on expert knowledge
7 / 23
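As a hedged example only (the thesis's expert scoring function is not given on this slide), a hand-crafted ρ over the toy state layout sketched above could weigh health and board differences and squash the result to keep it bounded:

```python
import math

def rho(s):
    """Toy bounded scoring function rho : S -> R, with hypothetical weights.
    Positive when the player looks likely to win, negative when likely to lose."""
    own_hp, enemy_hp, own_minions, enemy_minions, _mana = s
    raw = 1.0 * (own_hp - enemy_hp) + 2.0 * (own_minions - enemy_minions)
    return math.tanh(raw / 10.0)  # tanh keeps rho(s) in (-1, 1)
```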

Problem Statement
4. Problem Formalization
Games follow discrete-time dynamics:
τ : S × A → S : (s_t, a) ↦ s_{t+1}, for a ∈ A_{s_t}, t = 0, 1, ...
Let R_ρ be an objective function whose analytical expression depends on ρ:
R_ρ : S × A → ℝ : (s, a) ↦ R_ρ(s, a), for a ∈ A_s.
8 / 23

Problem Statement
4. Problem Formalization
R_ρ(s, a) is considered uncomputable from state s
It is difficult to simulate side-trajectories in large-scale games
Find an action selection policy h such that
h : S → A : s ↦ argmax_{a ∈ A_s} R_ρ(s, a).
9 / 23
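A minimal sketch of such a greedy policy, assuming a hypothetical legal_actions(s) enumerator for A_s and a confidence(s, a) score standing in for an estimate of whether R_ρ(s, a) is positive (both interfaces are illustrative, not the thesis's code):

```python
def h(s, legal_actions, confidence):
    """Greedy policy: pick the legal action with the highest estimated score.

    legal_actions(s) -> iterable over A_s (hypothetical interface).
    confidence(s, a) -> estimated probability that R_rho(s, a) > 0 (hypothetical).
    """
    return max(legal_actions(s), key=lambda a: confidence(s, a))
```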

Getting Intuition on Actions from State Scoring Differences
Our analytical expression for R_ρ:
R_ρ(s, a) := ρ(τ(s, a)) - ρ(s).
Report erratum: In Figure 3.2, the classifier is asked to predict the sign of R_ρ, and not ρ.
10 / 23
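Since the classifiers predict the sign of R_ρ (per the erratum), training labels can be derived from simulated transitions along the lines of the sketch below, which assumes a transition(s, a) step function and some scoring function rho; this is an illustration, not the thesis's pipeline:

```python
def label_transition(s, a, transition, rho):
    """Binary label for (s, a): 1 if R_rho(s, a) = rho(tau(s, a)) - rho(s) is positive, else 0."""
    r = rho(transition(s, a)) - rho(s)
    return 1 if r > 0 else 0
```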

Nora: Design & Results 11 / 23

Action Selection Process
Report erratum: In Figure 4.5, the classifiers are asked to predict the sign of R_ρ, and not ρ.
12 / 23

Caveats
Memory usage
- Approx. 14 GB is needed to keep the models in RAM
- Fix: tree pruning and parameter tuning
Play actions classifier underestimates the value of some actions
- Random target selection is assumed after playing an action that needs a target
- Fix: two-step training
13 / 23

Results

Matchup                Win rate
Nora vs. Random        93%
Nora vs. Scripted      10%
Random vs. Scripted    < 1%

Nora applies some strategy that the random player does not
- Qualitatively, this translates into board control behavior
- She never targets her own allies with harmful actions, even though it is allowed
- Accurate understanding of the Fireblast special power
15 / 23

Conclusion
Any questions? Thank you for your attention.
16 / 23

Appendix
Why Extremely Randomized Trees?
Ensemble methods can often surpass single classifiers
- From a statistical, computational and representational point of view
Decision trees are particularly well suited for ensemble methods
- Low computational cost of the standard tree-growing algorithm
- But be careful about memory...
Random trees are suited for problems with many features
- Each node can be built with a random subset of features
Feature importances
- Useful for designing the projection operator σ : W → S
17 / 23
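For reference, scikit-learn's ExtraTreesClassifier implements exactly this kind of randomized ensemble and exposes feature importances; the sketch below uses toy random data, not the thesis's dataset:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# Toy data: 1000 feature vectors with binary labels (e.g. the sign of R_rho).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)

forest = ExtraTreesClassifier(
    n_estimators=100,     # number of trees in the ensemble
    max_features="sqrt",  # each split looks at a random subset of features
    random_state=0,
)
forest.fit(X, y)

# Feature importances can guide the design of the projection operator sigma.
print(forest.feature_importances_)
```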

Appendix
Computation of the ExtraTrees Classifier Confidence
It is the predicted positive-class probability of the classifier
Computed as the mean predicted positive-class probability of the trees in the forest
Predicted positive-class probability of a sample s in a tree:
#{s′ ∈ leaf in which s falls : s′ is labelled positive} / #{s′ ∈ leaf in which s falls}
18 / 23
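With a fitted scikit-learn forest this quantity is what predict_proba returns; the sketch below makes the per-tree averaging explicit, each tree's prediction being the positive-sample fraction of the leaf into which s falls (reusing the forest fitted in the previous sketch):

```python
import numpy as np

def confidence(forest, s):
    """Mean predicted positive-class probability over the trees of a fitted forest."""
    x = np.asarray(s, dtype=float).reshape(1, -1)
    per_tree = [tree.predict_proba(x)[0, 1] for tree in forest.estimators_]
    return float(np.mean(per_tree))

# For binary labels {0, 1} this equals forest.predict_proba(x)[0, 1].
```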

Appendix
Basics of Hearthstone: Heroes of WarCraft
Stylized combat game
- Cards are obtained by drawing from your deck
- Your hand is hidden from your opponent
Goal: make the enemy player's hero's health reach zero.
19 / 23

Appendix
Basics of Hearthstone: Heroes of WarCraft
Cards are played using a resource: Mana
- Minions, which join the battle
- Spells
Rules are objects in the game
- The game is based on creating new rules and breaking/modifying existing ones
20 / 23

Appendix
Basics of Hearthstone: Heroes of WarCraft
21 / 23

Appendix
Basics of Hearthstone: Heroes of WarCraft
Things might get tricky...!
22 / 23

Appendix
The Simulator
Hearthstone: HoW simulator created with C++/Qt 5
- Modular, extensible
- Cards are loaded from an external file
- Quite a challenge!
Definition of JARS for describing cards in a user-friendly way
- Just Another Representation Syntax
- Context-aware, JSON-based language
- Makes it easy to create and edit cards without coding
23 / 23
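The slides do not show JARS's concrete syntax, so the snippet below is only a hypothetical illustration of what a JSON-based, code-free card description and its loading from Python might look like; none of the field names are taken from the thesis:

```python
import json

# Hypothetical card description: NOT the actual JARS syntax, just an
# illustration of describing a card declaratively in a JSON-based format.
card_source = """
{
  "name": "Example Minion",
  "type": "minion",
  "cost": 2,
  "attack": 2,
  "health": 3,
  "effects": [
    {"trigger": "battlecry", "action": "draw_card", "amount": 1}
  ]
}
"""

card = json.loads(card_source)
print(card["name"], "costs", card["cost"], "mana")
```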