Integrating Learning in a Multi-Scale Agent


Integrating Learning in a Multi-Scale Agent Ben Weber Dissertation Defense May 18, 2012

Introduction AI has a long history of using games to advance the state of the field [Shannon 1950]

Real-Time Strategy Games Building human-level AI for RTS games remains an open research challenge StarCraft II, Blizzard Entertainment

Task Environment Properties [Russell & Norvig 2009]

Property                       | Chess         | StarCraft      | Taxi Driving
Fully vs. partially observable | Fully         | Partially      | Partially
Deterministic vs. stochastic   | Deterministic | Deterministic* | Stochastic
Episodic vs. sequential        | Sequential    | Sequential     | Sequential
Static vs. dynamic             | Static        | Dynamic        | Dynamic
Discrete vs. continuous        | Discrete      | Continuous     | Continuous
Single vs. multiagent          | Multi         | Multi          | Multi

Motivation RTS games present complex environments and complex tasks Professional players demonstrate a broad range of reasoning capabilities Human behavior can be observed, emulated, and evaluated [Langley 2011, Mateas 2002]

Hypothesis Reproducing expert-level StarCraft gameplay involves integrating heterogeneous reasoning capabilities

Research Questions What competencies are necessary for expert StarCraft gameplay? Which competencies can be learned from demonstrations? How can these competencies be integrated in a real-time agent?

Overview StarCraft Multi-Scale AI Learning from Demonstration Integrating Learning Evaluation

StarCraft Expert gameplay: 300+ APM, an evolving meta-game. Exhibited capabilities: estimation, anticipation, adaptation. [Flash, Pro-gamer]

StarCraft Gameplay Expand Tech Tree Attack Opponent Manage Economy Produce Units

Gameplay Scales in StarCraft: Individual, Squad, Global. Examples: worker harassment, aggressive mine placement, supporting a siege line.

State Space Considering only unit type and location, the number of possible states is (Type * X * Y)^Units. On a 256x256 tile map with 100 unit types and 1,700 units: (100 * 256 * 256)^1700 > 10^11,500
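To make the arithmetic concrete, the bound can be checked in a few lines of Python (a minimal sketch; the 100 unit types and 1,700 units are the slide's figures):

```python
import math

# State-space bound: (types * width * height) ^ units
types, width, height, units = 100, 256, 256, 1700

# Work in log10 so the number stays representable
log10_states = units * math.log10(types * width * height)
print(f"states > 10^{log10_states:.0f}")  # -> states > 10^11588
```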

Decision Complexity The set of possible actions that can be executed at a particular moment: O(2^W (A * P) + 2^T (D + S) + B(R + C)), where W = number of workers, A = number of types of worker assignments, P = average number of workplaces, T = number of troops, D = number of movement directions [Aha et al. 2005]

Decision Complexity Assumption: unit actions can be selected independently. Resulting complexity: O(W * A * P + T * D * S + B(R + C)). Assuming 50 worker units on a 256x256 tile map still results in more than 1,000,000 possible actions.
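A sketch of the simplified count (the symbol values below are illustrative assumptions, except the 50 workers from the slide; the point is only that the independent-action total still exceeds 1,000,000):

```python
# Simplified decision complexity under the independence assumption:
#   O(W*A*P + T*D*S + B*(R + C))
def action_count(W, A, P, T, D, S, B, R, C):
    return W * A * P + T * D * S + B * (R + C)

# Illustrative values (assumptions): 50 workers, a handful of assignment
# types, and workplaces spread across a 256x256 map dominate the total.
print(action_count(W=50, A=5, P=5000, T=50, D=8, S=4, B=20, R=5, C=5))
# -> 1251800, i.e. more than 1,000,000 actions
```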

StarCraft Complex gameplay Real-world properties Highly-competitive Sources of expert gameplay

Research Question #1 What competencies are necessary for expert StarCraft gameplay?

Multi-Scale AI Multiple scales: actions are performed across multiple levels of coordination. Interrelated tasks: performance in each task impacts other tasks. Real-time: actions are performed in real time.

Reactive Planning Provides useful mechanisms for building multi-scale agents. Advantages: efficient behavior selection, interleaved plan expansion and execution. Disadvantage: lacks deliberative capabilities. [Loyall 1997, Mateas 2002]

Agent Design Implemented in the ABL reactive planning language. Architecture: extends McCoy & Mateas's integrated agent framework, partitions gameplay into distinct competencies, and uses a blackboard for coordination. [McCoy & Mateas 2008]

EISBot Managers: Strategy Manager, Income Manager, Production Manager, Tactics Manager, Recon Manager. Associated tasks: gather resources, construct buildings, attack the opponent, scout the opponent.

Multi-Scale Idioms Design patterns for authoring multi-scale AI. Idioms: message passing, daemon behaviors, managers, unit subtasks, behavior locking.
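ABL syntax aside, the message-passing and daemon-behavior idioms can be sketched in Python (a rough analogy only; the Blackboard class and WME fields are illustrative, not EISBot's actual interface):

```python
import threading, time

class Blackboard:
    """Shared working memory used for message passing between managers."""
    def __init__(self):
        self.wmes = []          # working memory elements (WMEs)
        self.lock = threading.Lock()

    def post(self, wme):
        with self.lock:
            self.wmes.append(wme)

    def take(self, predicate):
        """Remove and return the first WME matching predicate, if any."""
        with self.lock:
            for wme in self.wmes:
                if predicate(wme):
                    self.wmes.remove(wme)
                    return wme
        return None

def timing_attack_daemon(board):
    """Daemon behavior: wait for a 'timing attack' message, then act."""
    while True:
        wme = board.take(lambda w: w.get("type") == "TimingAttack")
        if wme:
            print("Tactics manager: forming squad to attack", wme["target"])
        time.sleep(0.1)  # daemons poll working memory each decision cycle

board = Blackboard()
threading.Thread(target=timing_attack_daemon, args=(board,), daemon=True).start()
board.post({"type": "TimingAttack", "target": "enemy base"})  # strategy manager
time.sleep(0.5)
```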

Idioms in EISBot: [Diagram: the initial_tree subgoals to the Strategy, Income, and Tactics Managers; daemon behaviors (Timing Attack WME, Probe Stop WME, Squad Monitor) and message passing link behaviors such as Form Squad, Attack Enemy, Pump Probes, Squad Attack, Squad Retreat, and Dragoon Dance.]

Multi-Scale AI StarCraft gameplay is multi-scale Reactive planning provides mechanisms for multi-scale reasoning Idioms are applied in EISBot to support StarCraft gameplay

Research Question #2 Which competencies can be learned from demonstrations?

Learning from Demonstration Objective: emulate capabilities exhibited by expert players by harnessing gameplay demonstrations. Methods: classification and regression model training, case-based goal formulation, parameter selection for model optimization.

Strategy Prediction Tasks: identify opponent build orders; predict when buildings will be constructed. [Chart: Spawning Pool timing vs. game time in minutes] [Hsieh & Sun 2008]

Approach Feature encoding: each player's actions are encoded in a single vector, and vectors are labeled using a build-order rule set. Features describe the game cycle when a unit or building type is first produced by a player: f(x) = t, the time when x is first produced by P, or 0 if x has not (yet) been produced by P.
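A minimal sketch of this encoding, assuming replay events arrive as (game cycle, unit type) pairs (the names are illustrative):

```python
def encode_replay(events, unit_types):
    """Map each unit type to the game cycle it was first produced,
    or 0 if it has not (yet) been produced."""
    first_produced = {}
    for t, unit_type in events:           # events: (game_cycle, unit_type)
        first_produced.setdefault(unit_type, t)
    return [first_produced.get(u, 0) for u in unit_types]

# Example: Pylon at cycle 1200, Gateway at cycle 2400, no Cybernetics Core yet
events = [(1200, "Pylon"), (2400, "Gateway")]
print(encode_replay(events, ["Pylon", "Gateway", "Cybernetics Core"]))
# -> [1200, 2400, 0]
```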

Strategy Prediction Results: [Chart: precision and recall of the NNge, Boosting, Rule Set, and State Lattice classifiers over game time, minutes 0-12]

Strategy Learning Task: learn build orders from demonstration. Trace Algorithm: converts replays to a trace representation and formulates goals based on the most similar situation: q = argmin_{c ∈ L} distance(s, c), g = s + (q' − q), where q' is the trace entry one planning window after q. [Ontañón et al. 2010]

Trace Retrieval: Example Consider a planning window of size 2:
S  = <3, 0, 1, 1>
T1 = <2, 0, 0.5, 1>
T2 = <3, 0, 0.7, 1>
T3 = <4, 1, 0.9, 1>
T4 = <4, 1, 1.1, 2>

Step 1: The system retrieves the case most similar to S: q = T2.

Step 2: The case one planning window ahead of q is retrieved: q' = T4.

Step 3: The difference is computed: T4 − T2 = <1, 1, 0.4, 1>.

Step 4: The goal is computed: g = S + (T4 − T2) = <4, 1, 1.4, 2>.
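The retrieval and adaptation steps above can be sketched in Python (a minimal sketch assuming states are fixed-length numeric vectors and squared Euclidean distance; the dissertation's actual distance metric may differ):

```python
def formulate_goal(s, trace, window):
    """Case-based goal formulation: find the trace entry most similar to
    the current state s, then add the change observed `window` steps later."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Steps 1-2: retrieve the most similar case q and its successor q'
    i = min(range(len(trace) - window), key=lambda i: distance(s, trace[i]))
    q, q_next = trace[i], trace[i + window]

    # Steps 3-4: g = s + (q' - q)
    return [si + (qn - qi) for si, qi, qn in zip(s, q, q_next)]

trace = [[2, 0, 0.5, 1], [3, 0, 0.7, 1], [4, 1, 0.9, 1], [4, 1, 1.1, 2]]
print(formulate_goal([3, 0, 1, 1], trace, window=2))  # -> [4, 1, 1.4, 2]
```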

Strategy Learning Results: [Chart: opponent-modeling prediction error (RMSE) with a window size of 20 for the Null, IB1, Trace, and MultiTrace methods, over 0-100 actions performed by the player]

State Estimation Task: estimate enemy positions given prior observations. Particle Model: apply a movement model, remove visible particles, reweight particles. [Thrun 2002, Bererton 2004]
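A rough sketch of one particle-model update (the movement model, decay rate, and data layout are assumptions, not EISBot's implementation):

```python
import random

def particle_step(particles, visible, decay=0.95, jitter=1.0):
    """One update of the particle model for enemy-position estimation:
    apply a movement model, drop particles in currently visible tiles,
    and reweight the rest."""
    updated = []
    for (x, y), w in particles:
        # Movement model: diffuse each particle with random jitter
        x += random.uniform(-jitter, jitter)
        y += random.uniform(-jitter, jitter)
        # Remove particles the agent can currently see (no unit observed there)
        if (round(x), round(y)) in visible:
            continue
        # Reweight: confidence decays the longer a unit goes unobserved
        updated.append(((x, y), w * decay))
    return updated

particles = [((10.0, 12.0), 1.0), ((40.0, 40.0), 1.0)]
visible = {(40, 40)}                      # tiles in the agent's vision
print(particle_step(particles, visible))  # particle near (40, 40) usually removed
```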

Parameter Selection Free parameters: trajectory weights, decay rates. State estimation is represented as an optimization problem. Input: parameter weights. Output: particle model error. Replays are used to implement a particle model error function.
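As a sketch, the optimization loop might look like the following random search (the parameter ranges, stand-in error function, and optimizer choice are assumptions; the slide does not prescribe them):

```python
import random

def optimize_particle_model(error_fn, n_trials=200, seed=0):
    """Random search over the particle model's free parameters
    (trajectory weight, decay rate), scored by a replay-derived error."""
    rng = random.Random(seed)
    best_params, best_err = None, float("inf")
    for _ in range(n_trials):
        params = {
            "trajectory_weight": rng.uniform(0.0, 2.0),
            "decay_rate": rng.uniform(0.8, 1.0),
        }
        err = error_fn(params)   # replays supply ground-truth positions
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err

# Example with a stand-in error function whose minimum sits at
# trajectory_weight=1.0, decay_rate=0.95:
err = lambda p: (p["trajectory_weight"] - 1.0) ** 2 + (p["decay_rate"] - 0.95) ** 2
print(optimize_particle_model(err))
```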

State Estimation Results: [Chart: threat prediction error over game time (minutes 0-18) for the Null Model, Perfect Tracker, Default Model, and Optimized Model]

Learning from Demonstration Anticipation: classification and regression models. Adaptation: case-based goal formulation. Estimation: model optimization.

Research Question #3 How can these competencies be integrated in a real-time agent?

Agent Architecture

Integration Approaches: augmenting working memory, external plan generation, external goal formulation. [Diagram: external components connected to the agent's working memory]

Augmenting Working Memory Supplementing working memory with additional beliefs

External Plan Generation Generating plans outside the scope of ABL

External Goal Formulation Formulating goals outside the scope of ABL
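Roughly, the three approaches plug into the agent at different points; the sketch below uses a hypothetical WorkingMemory class standing in for ABL's working memory:

```python
# Hypothetical stand-in for ABL's working memory (a sketch, not ABL's API):
class WorkingMemory:
    def __init__(self):
        self.wmes = []
    def add(self, wme):
        self.wmes.append(wme)

memory = WorkingMemory()

# 1. Augmenting working memory: an external component (e.g., the particle
#    model) posts additional beliefs for behaviors to match against.
memory.add({"type": "EnemyPositionBelief", "x": 42, "y": 17, "weight": 0.8})

# 2. External plan generation: a planner built outside ABL posts a plan;
#    behaviors read and execute its steps.
memory.add({"type": "PlanWME", "steps": ["build Gateway", "train Zealot"]})

# 3. External goal formulation: a component such as the GDA model posts a
#    goal, and behaviors subgoal on it when it appears.
memory.add({"type": "GoalWME", "goal": "expand to natural"})
```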

Goal-Driven Autonomy A framework for building self-introspective agents. GDA agents monitor plan execution, detect discrepancies, and explain failures. Implementations: hand-authored rules, case-based reasoning. [Molineaux et al. 2010, Muñoz-Avila et al. 2010]

GDA Subtasks: expectation generation, discrepancy detection, explanation generation, goal formulation.
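These four subtasks compose a monitoring loop; a conceptual sketch (the callables are stand-ins, not EISBot's implementation):

```python
def gda_cycle(state, plan, expectations, explain, formulate):
    """One conceptual GDA cycle over the four subtasks."""
    # 1. Expectation generation: what should the world look like now?
    expected = expectations(plan)
    # 2. Discrepancy detection: compare expectations against observation
    discrepancies = [e for e in expected if not e(state)]
    if not discrepancies:
        return None                      # plan is proceeding as expected
    # 3. Explanation generation: why did execution diverge?
    explanation = explain(state, discrepancies)
    # 4. Goal formulation: choose a new goal that responds to the explanation
    return formulate(explanation)

new_goal = gda_cycle(
    state={"enemy_army": "large"},
    plan="timing attack",
    expectations=lambda plan: [lambda s: s["enemy_army"] == "small"],
    explain=lambda s, d: "opponent grew their army faster than expected",
    formulate=lambda ex: "switch to defensive posture",
)
print(new_goal)  # -> "switch to defensive posture"
```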

Implementation

Integrating Learning ABL agents can be interfaced with external learning components Applying the GDA model enabled tighter coordination across capabilities EISBot incorporates ABL behaviors, a particle model, and a GDA implementation

Evaluation Claim Reproducing expert-level StarCraft gameplay involves integrating heterogeneous reasoning capabilities Experiments Ablation studies User study

GDA Ablation Study Agent configurations: Base, Formulator, Predictor, GDA. Free parameters: planning window size, look-ahead window size, discrepancy period. [Diagram: discrepancies, explanations, and goals flowing through the Discrepancy Detector, Explanation Generator, Goal Formulator, and Goal Manager]

GDA Results Overall results from the GDA experiments:

Agent      | Win Ratio
Base       | 0.73
Formulator | 0.77
Predictor  | 0.81
GDA        | 0.92

User Study Experiment setup: matches hosted on ICCup, 3 trials. Testing script: 1. Launch StarCraft. 2. Connect to server. 3. Host match. 4. Announce experiment. [Dennis Fong, Pro-gamer]

Performance on Tau Cross: [Chart: ICCup score vs. number of games played (0-50) for the Base, Formulator, Predictor, and GDA agents]

ICCup Results

Agent      | Longinus | Python | Tau Cross | Overall
Base       | 942      | 599    | 669       | 737
Formulator | 980      | 718    | 1078      | 925
Predictor  | 1111     | 555    | 1145      | 937
GDA        | 952      | 860    | 1293      | 1035

EISBot Ranking Rankings achieved by the complete GDA agent:

Trial              | Longinus | Python | Tau Cross | Average
Percentile Ranking | 32nd     | 8th    | 66th      | 48th

Evaluation Ablation studies: optimized particle model, complete GDA model. Integrating additional capabilities into EISBot improved performance. EISBot performed at the level of a competitive amateur StarCraft player.

Conclusion Objective Identify and realize capabilities necessary for expert-level StarCraft gameplay in an agent Approach Decompose gameplay Learn capabilities from demonstrations Integrate learned gameplay models Evaluate versus humans and agents

Contributions Idioms for authoring multi-scale agents Methods for learning from demonstration Integration approaches for ABL agents

Integrating Learning in a Multi-Scale Agent Ben G. Weber Ph.D. Candidate Expressive Intelligence Studio bweber@soe.ucsc.edu Funding NSF Grant IIS 1018954

References Aha, Molineaux, & Ponsen. 2005. Learning to Win: Case-Based Plan Selection in a Real-Time Strategy Game. Proceedings of ICCBR. Bererton. 2004. State Estimation for Game AI Using Particle Filters. Proceedings of the AAAI Workshop on Challenges in Game AI. Hsieh & Sun. 2008. Building a Player Strategy Model by Analyzing Replays of Real-Time Strategy Games. Proceedings of IJCNN. Langley. 2011. Artificial Intelligence and Cognitive Systems. AISB Quarterly. Loyall. 1997. Believable Agents: Building Interactive Personalities. Ph.D. thesis, CMU. Mateas. 2002. Interactive Drama, Art and Artificial Intelligence. Ph.D. thesis, CMU.

References McCoy & Mateas. 2008. An Integrated Agent for Playing Real-Time Strategy Games. Proceedings of AAAI. Molineaux, Klenk, & Aha. 2010. Goal-Driven Autonomy in a Navy Strategy Simulation. Proceedings of AAAI. Muñoz-Avila, Aha, Jaidee, Klenk, & Molineaux. 2010. Applying Goal Driven Autonomy to a Team Shooter Game. Proceedings of FLAIRS. Ontañón, Mishra, Sugandh, & Ram. 2010. On-line Case-Based Planning. Computational Intelligence. Russell & Norvig. 2009. Artificial Intelligence: A Modern Approach. Shannon. 1950. Programming a Computer for Playing Chess. Philosophical Magazine. Thrun. 2002. Particle Filters in Robotics. Proceedings of UAI.