Creating Human-like AI Movement in Games Using Imitation Learning


DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2017

Creating Human-like AI Movement in Games Using Imitation Learning

CASPER RENMAN

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF COMPUTER SCIENCE AND COMMUNICATION

Creating Human-like AI Movement in Games Using Imitation Learning

May 31, 2017

CASPER RENMAN

Master's Thesis in Computer Science
School of Computer Science and Communication (CSC)
Royal Institute of Technology, Stockholm

Swedish Title: Imitation Learning som verktyg för att skapa människolik rörelse för AI-karaktärer i spel
Principal: Kristoffer Benjaminsson, Fast Travel Games
Supervisor: Christopher Peters
Examiner: Olov Engwall

Abstract

The way characters move and behave in computer and video games is an important factor in their believability, which in turn has an impact on the player's experience. This project explores Imitation Learning using limited amounts of data as an approach to creating human-like AI behaviour in games, and through a user study investigates what factors determine whether a character is human-like when observed through the character's first-person perspective. The idea is to create or shape AI behaviour by recording one's own actions. The implemented framework uses a Nearest Neighbour algorithm with a KD-tree as the policy which maps a state to an action. Results showed that the chosen approach was able to create human-like AI behaviour while respecting the performance constraints of a modern 3D game.

Sammanfattning

The way characters move and behave in computer and video games is an important factor in their believability, which in turn has an impact on the player's experience. This project explores Imitation Learning with limited amounts of data as an approach to creating human-like movement for AI characters in games, and through a user study investigates which factors determine whether a character is human-like when it is observed through its first-person perspective. The idea is to create or shape AI behaviour by recording one's own actions. The implemented framework uses a Nearest Neighbour algorithm with a KD-tree as the policy that maps a state to an action. The results showed that the chosen approach succeeded in creating human-like AI behaviour while respecting the computational constraints of a modern 3D game.

Contents

1 Introduction
   1.1 Artificial Intelligence in games
       Imitation Learning
       Human-likeness
   1.2 Objective
   1.3 Limitations
   1.4 Report outline
2 Background
   2.1 Imitation Learning
       Policy
       Demonstration
       State representation
       Policy creation
       Data collection
       Demonstration dataset limitations
   2.2 Related work
       Summary and state of the art
   2.3 Performance in games
   2.4 Measuring believability of AI
       Turing test-approach
       Automated similarity test
   2.5 Conclusion
3 Implementation
   3.1 Setting
   3.2 Method motivation
   3.3 Implementation
       Summary
       Recording movement and state representation
       Playing back movement
       Policy
       Feature extraction
       Avoiding static obstacles
       Avoiding dynamic obstacles
       KD-tree
       Discretizing the environment
       Additional details
       Storing data
       Optimization and measuring performance
   3.4 Overall implementation
4 Evaluation
   4.1 User study
       The set-up
       Participants
       Stimuli
       Procedure
       Hypothesis
   Results
       User study
       Imitation agent performance
   Discussion
       The imitation agent
       The user study
       Creating non-human-like behaviour
       Performance in relation to games
       Ethical aspects
5 Conclusions
   Future work
       Use outside of games
Bibliography

Chapter 1

Introduction

This chapter gives a brief overview of Artificial Intelligence in games, Imitation Learning and human-likeness. It also presents the objective, limitations and the outline of the project.

1.1 Artificial Intelligence in games

Computer and video games produce more and more complex virtual worlds. This introduces new challenges for the characters controlled by Artificial Intelligence (AI), also known as agents [20] or NPCs (Non-Player Characters), meaning characters that are not controlled by a human player. The way characters move and behave in computer and video games is an important factor in their believability, which has an impact on the player's experience. Being able to interact with NPCs in meaningful ways and feel that they belong in the world is important [4]. In Virtual Reality (VR) this is even more important, as the gaming experience is even more immersive. The goal of much game AI is more or less the same as that of attempts to beat the Turing test: to create believable intelligence [12].

A popular genre in computer and video games is the First-person shooter (FPS). In an FPS game the player experiences the game through the eyes of the character the player is controlling, also known as a first-person perspective. Typically a player is at most able to see the hands and arms of the character the player is controlling. The player can, however, see the whole bodies of other players' characters and of NPCs. This is visualized in Figure 1.1.

Figure 1.1: An example first-person perspective game scenario, seen from the eyes of the character that the player controls. The blue and red characters are NPCs.

AI in games is traditionally based on Finite State Machines (FSM), Behaviour Trees (BT) or other hand-coded techniques [27]. In these techniques, a programmer needs to explicitly define rules for what an agent should do in different situations. An example of such a rule could be: "if the character's health is low and the character sees a hostile character, the character should flee". These techniques work in the sense that the agent is able to execute tasks and adapt its behaviour to its situation, but the result is predictable and static [11]. For example, if a player sees an NPC react to a situation the same way it did in an earlier similar situation, the player can be quite sure that the NPC will always react like that given a similar situation. In 2006, Orkin [17] said: "in the early generations of shooters, such as Shogo (1998), players were happy if the A.I. noticed them at all and started attacking. ... Today, players expect more realism, to complement the realism of the physics and lighting in the environments." To add realism and unpredictability, and thereby increase the entertainment for the player, a promising approach is to have agents imitate human behaviour.

Imitation Learning

Imitation Learning (IL) is a technique where the agent learns from examples, or demonstrations, provided by a teacher [1]. IL is a form of Machine Learning (ML). ML has been defined as the field of study that gives computers the ability to learn without being explicitly programmed [14]. Unlike Reinforcement Learning algorithms, IL does not require a reward function to be specified. Instead, an IL algorithm observes a teacher perform a task and learns a policy that imitates the teacher, with the purpose of generalizing to unseen data [28]. IL is regarded as a promising technique for creating human-like artificial agents [3]. Some approaches have been shown to develop agents with good performance in non-trivial tasks using limited amounts of data and computational resources [3]. It is also a technique

which can be used to dynamically change gameplay to adapt to different players based on their play style and skill [7].

Human-likeness

Shaker et al. [24] distinguish character believability, where an agent is believable if someone who observes it believes that the agent is a human being, from player believability, where the agent is believable if someone observing it believes that a human is controlling it. It is player believability that is meant by human-like in this project.

1.2 Objective

The primary goal of this project is to describe a method for creating human-like agent movement using IL with limited amounts of data. The idea is to create an agent by recording one's own actions, shaping it with desired behaviours. Most related works in the field of IL in games aim to create competitive AI, meaning AI that is good at beating the game. This is not the case in this project. The goal is to create AI that lets an agent imitate a demonstrating human well, while respecting the performance requirements of a modern 3D game. The hope is that this will lead to a more unpredictable and human-like agent, which in turn could lead to better entertainment for a player playing the game. Lee et al. [9] say that human-like agent behaviour leads to raised emotional involvement of the player, which increases the player's immersion in the game. Whether it is more fun or not to play with a human-like agent will not be explored.

This project aims to answer the following question:

Q1: How can IL be used to create human-like agent behaviour, using limited amounts of data?

This question is further split into two sub-questions:

Q1.1: How can an agent that imitates demonstrated behaviour be created, using IL with limited amounts of data?

Q1.2: What determines whether a character is human-like, when observed through the character's first-person perspective?

The human-likeness of the agent will depend on how human-like the human is when recording itself. This means that non-human-like behaviour can also be created. Suppose that it is desired to create a behaviour for a dog in a game. A human would then record itself playing the game, role-playing a dog and behaving as it wants the dog to behave. If the intended behaviour is that the dog should flee when it sees something hostile, then so should the human when recording itself. The outcome should then be an agent that behaves like a dog.

1.3 Limitations

Agent movement means that the actions the agent can execute are limited to movement, including rotation, i.e. moving from one position to another. By contrast, actions that are not considered movement in this project include, for example, shooting, jumping or picking up items. The simulations will be done in a 3D environment, but the movement of the implemented agent will be limited to a 2D plane. This means that the agent will not be able to walk up a ramp or climb stairs, for example. The movement behaviour of the agent will be limited by the feature extractors implemented, as described in the implementation chapter. In theory, any behaviour which only requires the agent to be able to move could be implemented, such as path-finding and obstacle avoidance.

The project will use limited amounts of data, meaning that it should be possible to create agent behaviour using the framework created in this project by recording one's own actions for a couple of minutes. The motivation for this is that if game developers are to design their own agent behaviour for a game, there will not exist data for them to use. Some works listed in the related work section perform their experiments in big games such as Quake III, a first-person shooter video game, where a lot of saved data is available. This allows them to use complex algorithms which perform better with more data. Not requiring a lot of data is also thought to make the contributions of this work more attractive to the gaming industry, as it will require less time and effort to utilize.

1.4 Report outline

The report starts by presenting background information about the areas of Imitation Learning and measuring believability of AI, and related work. Following is the implementation chapter, which motivates the choice of methods and describes the implementation process. The evaluation chapter describes the user study which was conducted in order to evaluate the human-likeness of the resulting imitation agent. It also presents the results of the user study and a brief performance measurement of the imitation agent, as well as summarizes what was done in the project and discusses the results. Finally, conclusions are made in the conclusions chapter.

Chapter 2

Background

This chapter presents background knowledge and related works about Imitation Learning and measuring believability of AI-controlled characters. It also presents why heavy computations with long computational times are particularly bad in games.

2.1 Imitation Learning

The work by Argall et al. [1] is frequently cited and is a comprehensive survey of IL. The survey is the biggest source of background knowledge in the area of IL in this project. They describe IL as a subset of Supervised Learning, where an agent learns an approximation of the function that produced the provided labeled training data, called a policy. The dataset is made up of demonstrations of a given task.

Policy

A policy π is a function that maps a state x to an action u. A policy allows an agent to select an action based on its current state. Developing a policy by hand is often difficult. Therefore, machine learning algorithms have been used for policy development [1].

Demonstration

A demonstration is a sequence of state-action pairs that are recorded at the time of the demonstration of the desired behaviour [1]. This way of learning a policy through examples differs from learning it based on data collected through exploration, as in Reinforcement Learning [25]. A feature of IL is that it focuses the dataset on areas of the state space that are actually encountered during the execution of the behaviour [1]. This is a good thing in games, where computation time is very limited, as the search space of appropriate solutions is reduced.
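
To make the π : x → u terminology concrete, the sketch below expresses a policy in C# (the implementation language of this project) as a naive nearest-neighbour lookup over recorded demonstrations. All types and names are illustrative assumptions, not code from the thesis:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // A state x is a feature vector; an action u is whatever the agent can
    // execute (a string label is used here purely for illustration).
    class NearestNeighbourPolicy
    {
        private readonly List<(float[] State, string Action)> demonstrations = new();

        public void AddDemonstration(float[] state, string action) =>
            demonstrations.Add((state, action));

        // The policy pi: given a state x, return the action u whose recorded
        // state is closest to x (Euclidean distance over the feature vector).
        public string SelectAction(float[] x) =>
            demonstrations.OrderBy(d => Distance(d.State, x)).First().Action;

        private static float Distance(float[] a, float[] b) =>
            (float)Math.Sqrt(a.Zip(b, (p, q) => (p - q) * (p - q)).Sum());
    }

This linear scan costs O(n) per query; the implementation chapter later replaces it with a KD-tree-backed search.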

State representation

A state can be represented as either discrete, e.g. "can see enemy" or "cannot see enemy", or continuous, e.g. the 3D position and rotation of the agent.

Policy creation

Creating a policy can be done in different ways. A mapping function uses the demonstrated data to directly approximate the function mapping from the agent's state observations to actions (f : Z → A) [1]. This can be done using either classification, where the output is class labels, or regression, where the output consists of continuous values. A system model uses the demonstrated data to create a model; a policy is then derived from that model [1]. Plans use the demonstrated data together with user intention information to learn rules that associate pre- and post-conditions with each action. A sequence of actions is then planned using that information [1].

Data collection

The correspondence problem [16] has to do with the mapping between the teacher and the learner (see Figure 2.1). For example, a player playing an FPS game using a mouse and keyboard sends inputs which are processed by the game and translated into actions. An NPC in the same game is controlled by AI which sends commands to control the character several times per second, which is not directly equivalent to the keystrokes and mouse movements of a human player.

Figure 2.1: Visualization of the record mapping and embodiment mapping.

The record mapping is the extent to which the exact states/actions experienced by the teacher during demonstration are recorded in the dataset [1]. If there is no record mapping or a direct record mapping, the exact states/actions are recorded in the dataset. Otherwise some encoding function is applied to the data before storing it. The embodiment mapping is the extent to which the states/actions recorded within the dataset are exactly those that the learner would observe/execute [1]. If there is no embodiment mapping or a direct embodiment mapping, the recorded states/actions are exactly those that the learner will observe/execute. Otherwise there is a function which maps the recorded states/actions to actions to be executed.

Two data collection approaches are demonstration and imitation [1]. In demonstration, the teacher can operate the learner through teleoperation, where the record

mapping is direct. There is also shadowing, where the agent tries to mimic the teacher's motions by using its own sensors; here the record mapping is non-direct. Within imitation the embodiment mapping is non-direct, and the teacher execution can be recorded either with sensors on the teacher, where the record mapping is direct, or through external observation, where the record mapping is non-direct.

Demonstration dataset limitations

In IL, the performance of an agent is heavily dependent on the demonstration dataset. Low learner performance can be due to areas of the state space that have not been demonstrated. This can be solved either by improving upon the existing demonstrations by generalizing them or through acquisition of new demonstrations [1]. As mentioned, low performance can also be caused by low quality of the demonstration dataset [1]. Dealing with this involves eliminating parts of the teacher's executions that are suboptimal. Another solution is to let the learner learn from experience: if feedback is provided on the learner's actions, this can be used to update the policy [1]. The demonstration dataset limitations are not dealt with in this project, as they are considered out of scope. They are however mentioned as a possible extension in the Future work chapter.

2.2 Related work

This section gives an overview of the related work in the field of Imitation Learning in games, in chronological order.

Thurau et al. [26] in "Imitation In All Levels of Game AI" create bots for the game Quake II. Different algorithms are presented that learn from human-generated data. They create behaviours on different levels: strategic behaviour used to achieve long-term goals, tactical behaviour used for localized situation handling such as anticipating enemy movement, and reactive behaviour like jumping, aiming and shooting. The generated bots are compared to the existing Quake II bots. It is shown that Machine Learning can be applied on different behavioural layers, and it is concluded that Imitation Learning is well suited for generating behaviour for artificial game characters. The bots created with Imitation Learning outperformed the Quake II bots. It should however be taken into consideration that these results are thirteen years old at the time of writing this report.

Priesterjahn et al. [20] in "Evolution of Reactive Rules in Multi Player Computer Games Based on Imitation" propose a system in which the behaviour of artificial opponents is created by learning rules from observing human players. The rules are selected using an evolutionary algorithm, with the goal of choosing the best and most important rules and optimizing the behaviour of the agent.

The paper shows that limited learning effort is needed to create behaviour which is competitive in reactive situations in the game Quake III. After a few generations of the algorithm, the agent was able to behave in the same way as the original players. In the conducted experiments, the generated agent outperformed the built-in game agents. The world is simplified to a single plane. The plane is divided into cells in a grid, with the agent centered in the grid; the grid moves relative to the agent. Each frame, the agent checks whether each cell is empty or not and scores it accordingly. They limit the commands to moving and attacking or not attacking. A rule is a mapping from a grid to a command. Human players are recorded and a basic rule set is generated by recording the grid-to-command matches every frame of the game. An evolutionary algorithm is then used to learn the best rules and thus the best competitive behaviour.

Saunders et al. [21] in "Teaching Robots by Moulding Behavior and Scaffolding the Environment" teach behaviour to robots by moulding their actions within a scaffolded environment. A scaffolded environment is an environment which is modified to make it easier for the robot to complete a task while the robot is at a developmental stage. Robot behaviour is created by teaching state-action memory maps in a hierarchical manner, which during execution are polled using a k-Nearest Neighbour based algorithm. Their goal was to reproduce all observable movement behaviours. Their results show that the Bayesian framework leads to human-like behaviour.

Priesterjahn [19] in "Imitation-Based Evolution of Artificial Players in Modern Computer Games", which builds on [20], proposes the use of imitation techniques to generate more human-like behaviours in an action game. Players are recorded, and the recordings are used as the basis of an evolutionary learning approach. The approach is motivated by stating that to behave human-like, an agent should base its behaviour on how human players play the game and try to imitate them, as opposed to a pure learning approach based on the optimization of behaviour, which only optimizes the raw performance of the game agent. The author presents the results of the conducted experiments and explains that the imitation-based initialization has a big effect on the performance and behaviour of the evolved agents. The generated agents showed a much higher level of sophistication in their behaviour and appeared much more human-like than the agents evolved using plain evolution, though performing worse.

Cardamone et al. [3] in "Learning Drivers for TORCS through Imitation Using Supervised Methods" develop drivers for The Open Racing Car Simulator (TORCS) using a direct method, meaning the method uses supervised learning to learn driving behaviour from data collected from other drivers. They show that by using high-level information about the environment and high-level actions to be performed, the developed drivers can achieve good performance. High-level actions mean that they learn trajectories and speeds along the track, and let controllers achieve the target

values, as opposed to predicting or learning low-level actions such as pressing the gas pedal by an amount or rotating the wheel by an amount of degrees. It is also stated that the performance can be achieved with limited amounts of data and limited computational power. The learning methods used are k-Nearest Neighbour and Neural Networks with Neuroevolution. The performance is measured by how fast a driver completes a race, which means they want to create an AI that is good at playing the game. It is compared to the best AI driver.

Munoz et al. [15] in "Controller for TORCS Created by Imitation" create a controller for the game TORCS using Imitation Learning. They use three types of drivers to imitate: a human player, an AI controller created with Machine Learning and a hand-coded controller which performs a complete lap. The imitation is done on each of the drivers separately, and then a mix of the data is combined into new controllers. The aim of the work is to create competitive NPCs that imitate human behaviour. The learning method is feed-forward Neural Networks with Backpropagation. The performance of the driver is measured by how fast it completes a race. It is compared to other AI and human drivers. They conclude that it is difficult to learn from human behaviour, as humans do not always perform the same actions given the same situation. Humans also make mistakes, which is not good behaviour to learn if the goal is to create a driver that is good at playing the game.

Mehta et al. [13] in "Authoring Behaviors for Games using Learning from Demonstration" is similar to [21] in that behaviour is taught by demonstrating actions and annotating the actions with a goal. Here, the learning involves four steps:

Demonstration: Playing the game.

Annotation: Specifying the goals the teacher was pursuing for each action.

Behaviour learning: Using a temporal reasoning framework.

Behaviour execution: Done through a case-based reasoning (CBR) technique, case-based planning.

The goal of this project was to create a framework in which people without programming skills can create game AI behaviour by demonstration. The authors conclude that by using case-based planning techniques, concrete behaviours demonstrated in concrete game situations can be reused by the system in a range of other game situations, providing an easy way to author general behaviours.

Karpov et al. [8] in "UT^2: Believable Bot Navigation via Playback of Human Traces" create the UT^2 bot for the BotPrize competition, a Turing-like test where

computer game bots compete by attempting to fool human judges into thinking they are just another human player. UT^2 broke the 50% humanness threshold and won the grand prize in 2012. The bot has a component called the Human Trace Controller, which is inspired by the idea of direct imitation. The controller uses a database of recorded human games in order to retrieve and play back segments of human behaviour. The results show that using direct imitation allows the bot to solve navigation problems while moving in a human-like fashion. Two types of data are recorded, pose data and event data. The pose includes position, orientation, velocity and acceleration. An event is, for example, switching weapons, firing weapons or jumping. All of the pose and event data for a player in a particular game form a sequence. Sequences are stored so that preceding and succeeding event and pose data can be retrieved from any given pose or event. In order to be able to quickly retrieve the relevant human traces, they implemented an efficient indexing scheme for the data. The two most effective indexing schemes used were Octree-based indexing and Navigation Graph-based indexing using a KD-tree.

Ortega et al. [18] in "Imitating Human Playing Styles in Super Mario Bros" describe and compare different methods for generating game AI based on Imitation Learning. Three different methods for imitating human behaviour are compared: Backpropagation, Neuroevolution and Dynamic scripting. The game is in 2D. Similarity in playing style is measured by comparing the play trace of one or several human players with the play trace of an AI player. The methods compared are hand-coded, direct (based on supervised learning) or indirect (based on maximizing a similarity measure). The conclusion is that a method based on Neuroevolution performs best, both when evaluated by the similarity measure and by human spectators. Inputs were the game state, e.g. enemies, obstacles and distance to gaps, and outputs were actions.

Summary and state of the art

In 2006, Gorman et al. [6] stated that every particular game is different from the others, and claimed that it thus probably is impossible to suggest an ultimate approach. They said that "Currently, there are no generally preferred knowledge representation data structures and machine learning algorithms for the task of creating believable behaviour". They claim that believable characters should possess certain features that can hardly be achieved without observing and/or simulating human behaviour. Imitation Learning is listed as a proven human behaviour acquisition method.

Few of the works listed here have the sole aim of creating an agent that imitates demonstrated behaviour as well as possible, and no such works could be found. Most

have another aim, such as performing as well as a human, or performing well after being inspired by human behaviour. The most popular and successful approach in these works is Neural Networks with Neuroevolution, a form of Machine Learning that uses evolutionary algorithms to train Neural Networks [18]. The Human Trace Controller in the work by Karpov et al. [8], however, is the most recent and successful work found which aims to imitate demonstrated behaviour without doing it in a "beating the game" manner.

2.3 Performance in games

In games it is important to keep computation times low and the frame rate high and stable, usually measured in frames per second (FPS). The frame rate is the frequency at which frames (images) in a game (or video) are displayed. A high frame rate typically means about 60 FPS for normal computer games and about 90 FPS for VR games, in order to have objects on the screen appear to move smoothly. Games usually contain a function called the update or tick function, which runs once every frame. The game will wait for the update function to finish before processing the next frame. If the calculations made in the update function take longer than the time slot for one frame (in order to keep 90 FPS, one frame has 1000/90 ≈ 11 ms to run its calculations), the game will not be able to stay at its target FPS and will not run as smoothly.
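
As a concrete illustration of this budget (a sketch, not code from the thesis), the per-frame time slot and an overrun check can be written as:

    using System;
    using System.Diagnostics;

    class FrameBudget
    {
        static void Main()
        {
            const double targetFps = 90.0;         // VR target; 60 for normal games
            double budgetMs = 1000.0 / targetFps;  // ~11.1 ms per frame at 90 FPS
            Console.WriteLine($"Budget per frame: {budgetMs:F1} ms");

            var stopwatch = Stopwatch.StartNew();
            // ... all per-frame work (game logic, AI, rendering) would run here ...
            stopwatch.Stop();

            if (stopwatch.Elapsed.TotalMilliseconds > budgetMs)
                Console.WriteLine("Frame overran its budget; the frame rate drops below target.");
        }
    }

Everything the AI does each frame, including the nearest-neighbour classification described later, has to fit inside this slot together with all other game systems.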

2.4 Measuring believability of AI

Umarov and Mozgovoy [27] study current approaches to believability and effectiveness of AI behaviour in virtual worlds and give a good overview of different approaches. They discuss both measuring believability and various implementations for achieving it in games. It is stated that believability is not the only feature that makes AI-controlled characters fun to play with. A game should be challenging, so the agent should also be skilled or effective. However, they explain that the goals of believability and effectiveness are not always the same: a skilled agent is not necessarily believable, and a believable agent might be a weak opponent.

Turing test-approach

To evaluate the believability of an AI-controlled character, Umarov and Mozgovoy [27] refer to a Turing test-approach, where a human player (the judge) plays a game against two opponents, one controlled by a human and one controlled by an AI. The judge's task is to determine which one is human. A simplification of this test is also mentioned, where the judge instead watches a game between two players, each of which can be controlled either by a human or an AI. The judge's task is then to identify the game participants. Lee et al. [9] learn human-like behaviour via Markov decision processes in the 2D game Super Mario. They also evaluate the human-likeness by performing a modified Turing test [22].

Gorman et al. [6] performed an experiment to which [27] refers. Quake II agents were evaluated by showing a number of people a series of video clips as seen from the characters' first-person cameras. The task was to identify whether the active character was human. The different characters were controlled by a real human player, a Quake agent and a specifically designed imitation agent that tried to reproduce human behaviour using Bayesian motion modeling. The imitation agent was misidentified as a human 69% of the time and the Quake agent was mistaken for a human 36% of the time. "Sample evaluators' comments, quoted in (Gorman et al., 2006), indicate that quite simple clues were used to guess human players ('fires gun for no reason, so must be human', 'stand and wait, AI wouldn't do this', 'unnecessary jumping')".

Automated similarity test

One way to compare human player actions and agent actions is to compare velocity direction angle changes and frequencies of angles between player direction and velocity direction. Another is to compare pre-recorded trajectories of human players with those of agents [27].

2.5 Conclusion

This chapter presented Imitation Learning and the different challenges that it involves. Then related works were listed and the state of the art was determined. A direct imitation method seems like a good approach, as used by Karpov et al. [8]. Since no learning is done, the approach should give a lot of control, which is good as the computational performance of AI in games is important. The choice of method is described in detail in the next chapter. In order to evaluate the believability of the agent, a Turing test-approach is described as an option. The evaluation is described in Chapter 4.

Chapter 3

Implementation

This chapter describes the implementation of the Imitation Learning framework and thereby aims to answer Q1.1. Section 3.3 provides a summary of what was implemented. Throughout an iterative implementation process it was determined what to implement in order to create an agent with behaviour which could be evaluated. The agent created in this process will be referred to as the agent when no other type of AI-controlled character is in the same context; otherwise it will be referred to as the imitation agent.

3.1 Setting

The implementation was carried out in the Unity Pro game engine. Unity is a cross-platform game engine developed by Unity Technologies and used to develop video games for PC, consoles, mobile devices and websites.

3.2 Method motivation

To keep the complexity of the framework low, and to allow for quick evaluation and iteration, it was decided to go with a Nearest Neighbour (NN) classification approach as used by Cardamone et al. [3]. Policy creation is thus done through a mapping function. No learning is done, and the collected data represents the model. Argall et al. [1] state that regardless of learning technique, "minimal parameter tuning and fast learning times requiring few training examples are desirable". This speaks against more sophisticated algorithms such as Neural Networks, which require a lot of data to perform well. Cardamone et al. [3] claim that it is desirable to have the output of the agent be high-level actions, such as a target position and velocity, as opposed to low-level actions such as a certain key press for a certain

amount of time. Other classification techniques may perform as well as or better than Nearest Neighbour algorithms, but the focus of the thesis is not to compare or find the best classification algorithm. It is however important that the algorithm is fast, as there is not much time for heavy calculations in a game.

Karpov et al. [8] show that direct imitation, i.e. playing back recorded segments of human gameplay as they were recorded, allows a bot to solve navigation problems while moving in a human-like fashion. Their work passes the test of a structured and recognized competition aimed at measuring human-likeness, which gives it high credibility. It is also one of the most recent works. This project was therefore inspired by their solution. The implementation used imitation as the data collection approach, where the record mapping is direct and the embodiment mapping is indirect. This is described in more detail in the next section.

3.3 Implementation

Summary

An Imitation Learning framework was created which allows a human to create human-like agent behaviour by recording their own actions. Below is a summary of the implementation of the imitation agent. Details are described in the subsections following this summary.

Recording movement: The human is in control of the agent and the agent's state is continuously recorded.

Playing back movement: The agent moves by executing actions. An action is a set of states. An action is chosen by classifying the agent's state and weighing actions. Classification is done using a Nearest Neighbour algorithm.

Feature extraction: The agent uses sensors to sense the environment. Reading the sensors results in a feature vector that is a representation of the environment.

Avoiding static obstacles: If there is recorded data which corresponds to the agent's current state, the agent will be able to avoid obstacles by executing the nearest neighbour action. If that is not the case, static obstacles are avoided by checking in the Nearest Neighbour algorithm whether an action goes through a static obstacle. If it does, the action is not considered a near neighbour and is not chosen.

Avoiding dynamic obstacles: Dynamic obstacles are avoided like static obstacles, but a different feature extractor is utilized, which extracts different features. The dynamic obstacle avoidance was the last part of the implementation process.

KD-tree: A KD-tree is used to speed up the Nearest Neighbour algorithm.

Grid: The environment is discretized into a grid of cells. The grid is used in weighing actions: an action is weighted with the score of a cell. The grid can be manipulated to make the agent move to a destination.

Recording movement and state representation

Figure 3.1: Flowchart visualizing the record mode.

The agent can be in either Record or Playback mode. During recording, a human is in control of the agent from the agent's first-person perspective, using a mouse and keyboard. The record mapping was direct, meaning that the exact states/actions were recorded in the dataset. Data was recorded when the direction vector of the agent changed and the distance between the agent's current position and the last recorded position was bigger than a set threshold.

The policy that an IL algorithm is meant to learn maps a state x to an action u. Adopting this terminology, one record of data was structured as a state, and several states make up an action. A state consists of two parts. The first part is the agent's position, rotation and direction (i.e. the agent's forward vector), called the pose state; the state representation is thus continuous. The pose state also contains the time passed between the previous state and the current state. The second part is a feature vector of floats corresponding to a representation of the environment at the current pose state. This second part is called the sensor state. How the sensor state is created is explained in further detail in the section Avoiding static obstacles. Karpov et al. [8] similarly use sequences of states for representing the stored human traces, separating them into a pose state and an event state. The data is stored by writing all recorded states as binary data to a file. When more data is recorded, it is appended to the existing file.

Figure 3.2 shows one environment, or scene, used at an early stage of the implementation process. The aim here was to play back recorded data by having the agent move to the closest position in the recorded data.

Figure 3.2: The scene. Recorded trajectory data in black and the agent's trajectory in blue.
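
A minimal sketch of the recording rule and state record described above follows. The threshold value and the types are illustrative assumptions, and the rotation quaternion of the pose state is omitted for brevity; the thesis does not list its code:

    using System;
    using System.Collections.Generic;

    struct Vec3
    {
        public float X, Y, Z;
        public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }
        public static float Distance(Vec3 a, Vec3 b) =>
            (float)Math.Sqrt((a.X - b.X) * (a.X - b.X) +
                             (a.Y - b.Y) * (a.Y - b.Y) +
                             (a.Z - b.Z) * (a.Z - b.Z));
    }

    // One record of data: the pose part plus the sensor part.
    class State
    {
        public Vec3 Position, Direction;
        public float DeltaTime;   // time since the previous recorded state
        public float[] Features;  // sensor state, e.g. distances to walls
    }

    class Recorder
    {
        private const float DistanceThreshold = 0.5f;  // assumed value
        private Vec3 lastRecordedPosition, lastDirection;
        public List<State> Recorded = new();

        // Called every frame while the human demonstrates. A state is stored
        // only when the direction vector has changed and the agent has moved
        // further than the threshold from the last recorded position.
        public void Tick(Vec3 position, Vec3 direction, float deltaTime, float[] features)
        {
            bool directionChanged = !direction.Equals(lastDirection);
            bool movedEnough = Vec3.Distance(position, lastRecordedPosition) > DistanceThreshold;
            if (directionChanged && movedEnough)
            {
                Recorded.Add(new State { Position = position, Direction = direction,
                                         DeltaTime = deltaTime, Features = features });
                lastRecordedPosition = position;
                lastDirection = direction;
            }
        }
    }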

Playing back movement

During Playback the agent moves on its own by executing actions. Executing an action means moving from one recorded pose state to the next, interpolating between states to achieve a position and rotation that approximate the recorded data. This interpolation/approximation is a form of embodiment mapping, as the agent maps the recorded data into movement. The embodiment mapping was therefore non-direct, meaning that the recorded states/actions were not exactly those that the agent would execute. To find an action to execute, the agent's sensor state is classified using a NN algorithm. The algorithm returns the nearest recorded action to the agent's current sensor state. This action is then applied relative to the agent's current pose state, so that the action's first state matches the agent's current rotation.

To create smooth rotation between states, the following was done. Suppose that the agent is at the first state a, where it has the correct rotation r1, and the next pose state is b, containing rotation r2. When moving from a to b, the rotation of the agent is set to the value of the interpolation between r1 and r2 by the distance traveled from a to b. Upon reaching b, the rotation is therefore r2. Slight errors in the imitation occur here, since the human most likely did not rotate at a constant speed when demonstrating. However, making the distance between states short made it hard to tell a difference when observing the agent. When the agent has finished executing an action, meaning it has reached the final pose state position of the action, the process is repeated by classifying the sensor state again.
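
The distance-based rotation interpolation described above can be sketched as follows. For brevity this version interpolates a 2D heading in degrees rather than the quaternions an actual implementation would use, and all names are illustrative:

    using System;

    class PlaybackInterpolation
    {
        // Interpolate the heading between two pose states by the fraction of
        // the distance already travelled from a to b, so that the rotation is
        // r1 at a and exactly r2 upon reaching b.
        static float InterpolateRotation(float r1, float r2, float travelled, float total)
        {
            float t = total <= 0f ? 1f : Math.Clamp(travelled / total, 0f, 1f);
            // Take the shortest way around the circle, as a quaternion slerp would.
            float delta = ((r2 - r1 + 540f) % 360f) - 180f;
            return r1 + t * delta;
        }

        static void Main()
        {
            // Halfway between a state with heading 350 degrees and one with 10 degrees:
            Console.WriteLine(InterpolateRotation(350f, 10f, 1f, 2f)); // prints 360, i.e. 0 degrees
        }
    }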

Policy

A policy is a function that maps a state to an action. The NN algorithm receives a state as input, efficiently finds the best action with the KD-tree data structure and returns it. Thus the NN algorithm together with the KD-tree can be said to be the policy.

Feature extraction

An IL algorithm learns a policy that imitates the teacher, with the purpose of generalizing to unseen data. In order to generalize, the agent had to sense its environment and represent it in a way which allows for recognizing similar states. The feature extraction process uses sensors on the teacher to sense the environment and represents it as a vector of floats, called the feature vector or simply the features. When recording or classifying a state, the sensor state is created by extracting features for the agent's current pose state.

Avoiding static obstacles

In many games, a desirable skill for an agent to have is to be able to avoid obstacles, so-called obstacle avoidance. In order to be able to avoid static (non-moving) obstacles such as walls, sensors were implemented similar to the ones used by the authors of [8] in [23]. They show a figure similar to Figure 3.3a, which represents the sensors they use on their Quake III bot. Their motivation was that there are more sensors near the front, so that the agent can better distinguish locations in front of it.

Figure 3.3: Sensors similar to those used by Schrum et al. [23] (a) were added to the agent (b).

The feature extractor creates the sensor state by ray casting in all sensor directions using Unity's function Physics.Raycast. The function returns information about what was hit, including the distance to the hit obstacle/collider. This results in a feature vector v containing the distances x1, ..., x6 to obstacles in the different directions.
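
In Unity this amounts to one raycast per sensor direction, roughly as in the sketch below. The exact sensor angles and maximum range are assumptions, not values from the thesis:

    using UnityEngine;

    public class StaticObstacleSensor : MonoBehaviour
    {
        // More sensors near the front, as motivated by Schrum et al.; the
        // angles (degrees relative to the agent's forward vector) are assumed.
        private static readonly float[] SensorAngles = { -90f, -45f, -15f, 15f, 45f, 90f };
        private const float MaxRange = 20f;  // assumed maximum sensor range

        // Returns the feature vector v = (x1, ..., x6): the distance to the
        // nearest obstacle in each sensor direction, or MaxRange if nothing is hit.
        public float[] ExtractFeatures()
        {
            var features = new float[SensorAngles.Length];
            for (int i = 0; i < SensorAngles.Length; i++)
            {
                Vector3 dir = Quaternion.AngleAxis(SensorAngles[i], Vector3.up) * transform.forward;
                features[i] = Physics.Raycast(transform.position, dir, out RaycastHit hit, MaxRange)
                    ? hit.distance
                    : MaxRange;
            }
            return features;
        }
    }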

Figure 3.4 shows how data could be recorded in one environment (Figure 3.4a) and played back in another (Figure 3.4b), showing that the approach generalizes to new environments.

Figure 3.4: Recorded traces in black, the chosen action in green, the chosen action applied to the agent in blue and sensors in white.

Figure 3.4b shows the agent currently in the top right corner. When classifying its state, it is determined that an action should be chosen as if the agent currently were in the lower left corner (the action is highlighted in green). This makes sense, as it is a similar situation. If there is recorded data which corresponds to the agent's current state, the agent will be able to avoid obstacles by executing the nearest neighbour action. However, that may not always be the case, as there probably will not be recorded data for every possible state. Therefore the NN algorithm checks whether actions go through a static obstacle; if so, they are not considered near neighbours and will not be chosen.

Avoiding dynamic obstacles

Another common task for game AI is to be able to avoid moving (dynamic) obstacles. A new feature extractor was created which sensed the environment in a different way. The area within a certain radius around the agent was sensed with the purpose of detecting moving obstacles, visualized in Figure 3.5a. To be able to recognize a state correctly, it was necessary to differentiate between obstacles moving in different directions. For example, if an obstacle is close and headed straight towards the agent, the agent should probably dodge the obstacle somehow. If the obstacle is headed away from the agent, however, no particular action needs to be taken. Intuitively, when an agent should avoid an obstacle, it is important to know:

How close is the obstacle to the agent?

Is the obstacle moving towards or away from the agent?

Will the obstacle hit the agent if the agent does not move?

What matters is being able to distinguish one state from another. The resulting extractor extracts three features per moving obstacle within the sensor. This is described in Algorithm 1 and visualized in Figure 3.5b.

Algorithm 1 Dynamic obstacle extractor

function ExtractFeatures(agent)
    sort obstacles in sensor by distance
    for each moving obstacle at index i in sensor do
        velocitySimilarity ← dot(agent.velocity, obstacle.velocity)
        sqrDist ← sqrDist(agent, obstacle)
        diffVector ← obstacle.position - agent.position
        velPosSimilarity ← dot(diffVector, agent.velocity)
        features[3i] ← velocitySimilarity
        features[3i + 1] ← sqrDist
        features[3i + 2] ← velPosSimilarity
    return features

Figure 3.5: The new sensor (a) and visualization of the vectors used in calculating the features for the dynamic obstacle extractor (b).

The velocitySimilarity is the dot product of the agent's velocity and the obstacle's velocity; it tells whether the obstacle is heading in the same direction as the agent or not. velPosSimilarity is the dot product of the diffVector and the agent's velocity; this value says whether the obstacle lies in the agent's current path or not. If this value is 1, the two vectors point in the same direction, which means that the agent is headed straight towards the obstacle. sqrDist can act as a weight for how critical the situation is. The proposed approach is by no means the correct or the best solution; different similar approaches were tried, but these values were able to distinguish the agent's state best out of the values tried.
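
A C# counterpart of Algorithm 1 could look like the sketch below (illustrative types, not code from the thesis). Since the thesis reads a velPosSimilarity of 1 as "same direction", the velocity and difference vectors are presumably normalized; this sketch leaves that to the caller:

    using System;
    using System.Linq;

    struct Vec2
    {
        public float X, Y;
        public Vec2(float x, float y) { X = x; Y = y; }
        public static Vec2 operator -(Vec2 a, Vec2 b) => new Vec2(a.X - b.X, a.Y - b.Y);
        public static float Dot(Vec2 a, Vec2 b) => a.X * b.X + a.Y * b.Y;
        public static float SqrDist(Vec2 a, Vec2 b) { var d = a - b; return Dot(d, d); }
    }

    class Body { public Vec2 Position, Velocity; }

    static class DynamicObstacleExtractor
    {
        // Three features per moving obstacle in the sensor, as in Algorithm 1:
        // how the obstacle moves relative to the agent, how close it is, and
        // whether it lies in the agent's current path.
        public static float[] ExtractFeatures(Body agent, Body[] obstaclesInSensor)
        {
            Body[] sorted = obstaclesInSensor
                .OrderBy(o => Vec2.SqrDist(agent.Position, o.Position))
                .ToArray();

            var features = new float[3 * sorted.Length];
            for (int i = 0; i < sorted.Length; i++)
            {
                Body obstacle = sorted[i];
                features[3 * i]     = Vec2.Dot(agent.Velocity, obstacle.Velocity);     // velocitySimilarity
                features[3 * i + 1] = Vec2.SqrDist(agent.Position, obstacle.Position); // sqrDist
                features[3 * i + 2] = Vec2.Dot(obstacle.Position - agent.Position,
                                               agent.Velocity);                        // velPosSimilarity
            }
            return features;
        }
    }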

Using this with recorded data containing around 100 actions demonstrating how to avoid a single obstacle, the agent was able to avoid a single obstacle efficiently. Attempts were also made with more obstacles at the same time. In many situations the agent would avoid obstacles well, but in some it would not. In theory, as with static obstacle avoidance, if there is data for every situation, the feature extractor separates different situations well and the quality of the data is good, then the agent should always be able to avoid obstacles. Good data is meant in the sense of the current goal behaviour: if the goal behaviour is obstacle avoidance, the data is good if the recorded human performed good, avoiding actions and did not walk into an obstacle while recording.

Figure 3.6: The agent avoiding an obstacle (blue square) moving in the opposite direction. The blue curve is the action trajectory the agent chose at t = 1 when it sensed the obstacle. At t = 2 the agent has moved further along the trajectory and the obstacle has moved further to the right.

KD-tree

It was decided to implement a data structure to make the NN algorithm more efficient. Karpov et al. [8] use a KD-tree as one of their approaches to efficiently retrieve recorded data, and KD-trees are a common way to make NN algorithms more efficient. Weber et al. [29] showed that if a nearest neighbour approach is used in a space of more than about ten dimensions, it is better to use a naive exhaustive search; the reason is that the work of partitioning the space becomes more expensive than the similarity measure. The number of features here was six (the distances to walls in six directions), which is less than ten, so a KD-tree should speed up the NN algorithm.

A KD-tree is a space-partitioning data structure for organizing points in k-dimensional space. During construction, as one moves down the tree, one cycles through the axes used to select the splitting planes that divide the space. In the case of a two-dimensional space, this could be the x and y coordinates (Figure 3.7). Points are inserted by selecting the median point from the list of points being inserted, with

respect to the coordinates in the axis being used. If one starts with the x axis, the points are divided into the median point with respect to the x coordinate and two sets: the points with an x coordinate less than the median's and the points with an x coordinate bigger than the median's. Then each of the two sets recursively does the same thing, cycling on to the next axis (y). In this project, this corresponds to cycling through the features representing distances to walls in different directions.

Figure 3.7: The points (7, 2), (5, 4), (2, 3), (4, 7), (9, 6), (8, 1), (2, 7) inserted in the KD-tree, with (7, 2) at the root.

Algorithm 2 describes the construction of the KD-tree.

Algorithm 2 Construction of the KD-tree

function BuildTree(actions, depth = 0)
    dimensions ← numFeatures(actions)
    axis ← depth % dimensions
    sort(actions) by comparing feature[axis] of the actions
    median ← median element in sorted actions
    if median is the only element then
        return TreeNode(median, null, null, axis)
    a ← actions before median
    b ← actions after median
    return TreeNode(median, BuildTree(a, depth + 1), BuildTree(b, depth + 1), axis)

The nearest neighbour algorithm using the KD-tree is described in Algorithm 3. The search time is on average O(log n).

Algorithm 3 The Nearest Neighbour algorithm

function NN(node, inputState, ref nearestNeighbour, ref nearestDist)
    if node is null then
        return
    searchPointAxisValue ← inputState[node.axis]
    dist ← ∞
    nodeAxisValue ← 0, index ← 0

    // Determine how near the current action is to the input
    for state s at index i in node.action do
        if dist(inputState, s) < dist then
            dist ← dist(inputState, s)
            nodeAxisValue ← node.action.state(i)[node.axis]
            index ← i
    if node.leftChild is null and node.rightChild is null then
        return

    // Apply the action on the current state
    appliedAction ← applyActionOnState(inputState, node.action)

    // Let the calling model weigh the action (it may e.g. go through an obstacle)
    weight ← weighAction(callingModel, appliedAction)
    dist ← weight

    // Determine the nearest side, to search it first
    if searchPointAxisValue < nodeAxisValue then
        nearestSide ← node.leftChild; furthestSide ← node.rightChild
    else
        nearestSide ← node.rightChild; furthestSide ← node.leftChild
    NN(nearestSide, inputState, nearestNeighbour, nearestDist)
    if dist < nearestDist then
        // Update the nearest neighbour as the recursion unwinds
        nearestNeighbour ← node.action
        nearestDist ← dist

    // Check whether it is worth searching the other side
    nearestAxisValue ← nearestNeighbour.state(index)[node.axis]
    splittingPlaneDist ← dist(inputState, splittingPlane)
    nearestNeighbourDist ← dist(inputState, nearestNeighbour)
    if splittingPlaneDist < nearestNeighbourDist then
        NN(furthestSide, inputState, nearestNeighbour, nearestDist)
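
Before the prose walk-through of Algorithm 3 below, here is a self-contained C# sketch of both construction and nearest-neighbour search for plain k-dimensional points. It deliberately leaves out the action weighting and obstacle checks of the thesis version, which are application-specific:

    using System;
    using System.Linq;

    class KdNode
    {
        public float[] Point;
        public KdNode Left, Right;
        public int Axis;
    }

    static class KdTree
    {
        // Algorithm 2: split on the median along the current axis, cycling axes.
        public static KdNode Build(float[][] points, int depth = 0)
        {
            if (points.Length == 0) return null;
            int axis = depth % points[0].Length;
            float[][] sorted = points.OrderBy(p => p[axis]).ToArray();
            int median = sorted.Length / 2;
            return new KdNode
            {
                Point = sorted[median],
                Axis = axis,
                Left = Build(sorted.Take(median).ToArray(), depth + 1),
                Right = Build(sorted.Skip(median + 1).ToArray(), depth + 1),
            };
        }

        // Algorithm 3, simplified: descend towards the query first, then unwind
        // and search the far side only if the splitting plane is closer than
        // the best match found so far.
        public static void Nearest(KdNode node, float[] query, ref float[] best, ref float bestDist)
        {
            if (node == null) return;
            float dist = Distance(node.Point, query);
            if (dist < bestDist) { bestDist = dist; best = node.Point; }

            float planeDist = query[node.Axis] - node.Point[node.Axis];
            KdNode nearSide = planeDist < 0 ? node.Left : node.Right;
            KdNode farSide = planeDist < 0 ? node.Right : node.Left;

            Nearest(nearSide, query, ref best, ref bestDist);
            if (Math.Abs(planeDist) < bestDist)  // worth searching the other side?
                Nearest(farSide, query, ref best, ref bestDist);
        }

        static float Distance(float[] a, float[] b) =>
            (float)Math.Sqrt(a.Zip(b, (p, q) => (p - q) * (p - q)).Sum());

        static void Main()
        {
            float[][] points =
            {
                new float[] { 7, 2 }, new float[] { 5, 4 }, new float[] { 2, 3 },
                new float[] { 4, 7 }, new float[] { 9, 6 }, new float[] { 8, 1 },
                new float[] { 2, 7 },
            };
            KdNode root = Build(points);
            float[] best = null;
            float bestDist = float.MaxValue;
            Nearest(root, new float[] { 9, 2 }, ref best, ref bestDist);
            Console.WriteLine($"({best[0]}, {best[1]})");  // prints (8, 1)
        }
    }

Note that which point ends up at the root depends on how the median of an even split is chosen, so this sketch does not necessarily reproduce Figure 3.7 exactly.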

Following is a short and slightly simplified explanation of the algorithm; an extended description can be found, for example, in the Wikipedia article on KD-trees. The algorithm recursively moves down the tree, starting from the root. When it reaches a leaf, that leaf is set as the current best. As the recursion unwinds, each node compares its distance to the input with the current best; if the distance is smaller, that node becomes the current best. Each node also checks whether a nearer neighbour could be on its other side: if the distance from the input search point to the node's splitting plane is smaller than the distance from the input search point to the current best, there might be a nearer neighbour on the other side, so that side is searched too. When the search reaches the root node again, the search is done.

Discretizing the environment

In games it is desirable to be able to tell an AI to go to a position. This diverges from Imitation Learning, as the sensor state is not used to decide what action to execute; instead an external input says what position to go to. It was decided to implement this anyway, for the sake of practical usability. One could argue that the agent still moves in a human-like fashion, as it executes actions the same way the actions were recorded, and the only way for the agent to move is by executing actions.

A first approach to making the agent go to a goal was to weigh the actions by how close an action would take the agent towards the goal. This worked to some extent, but the agent did not register where it had been or whether it had walked into a dead end. This resulted in it sometimes walking around in the same area for a long time without realising that it did not get closer to the goal. The phenomenon is shown in Figure 3.8. It was therefore concluded that some sort of path finding was needed and that it would help to be able to say whether a position on the map was good or bad, or close to the goal or not.

Figure 3.8: Problem with getting stuck. The blue lines show traces of the agent trying to get to the white goal.

Priesterjahn et al. [20] used a grid to represent a state in their Neuroevolution approach. Inspired by them, the map was discretized into a grid of cells where each cell has a score representing the distance from the cell to the goal. Actions were then weighted by the score of the cell that the action ended up in; a lower score means closer to the goal (greener in Figure 3.9a). As the agent moved around the map, the scores of the nine cells adjacent to the agent were increased, decreasing the chance of picking an action which ended up in one of those cells again. Spending time in a corner would result in those cells getting a higher score, which would lead to the agent not going there again. This is visualized in Figure 3.9.

Figure 3.9: The grid at t = 1, t = 2 and t = 3: as cells are visited, their scores are increased.

This approach solved the problem of the agent getting stuck in corners, or close to the goal but on the wrong side of a wall. However, this was more of an exploring approach, which could be used if the agent does not know where the goal is. Unless the agent is meant to be blind, the strategy would need to be improved by scoring cells which the agent can see. Telling the agent to go to a position means that the agent knows where the goal is. Therefore a better path-finding strategy was implemented: using the classic A* algorithm, the grid calculates the shortest path from the agent to the goal and scores each cell the shortest path touches with its path distance to the goal. All other cells are given a bad score. This is visualized in Figure 3.10.

Figure 3.10: Cells that touch the A* path from the agent to the goal are scored with a low score (green).

The grid is the tool a programmer/user would use to influence what the NN algorithm should consider a good action. In the NN algorithm, actions are weighted according to the cell score at the action's last pose state position.
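
The cell-scoring idea can be sketched as follows (illustrative, not the thesis code). For brevity it scores every reachable cell with its shortest-path distance to the goal, a slight generalization of scoring only the cells on the single A* path; on a uniform-cost grid, breadth-first search finds the same shortest paths that A* would:

    using System;
    using System.Collections.Generic;

    class ScoredGrid
    {
        private readonly int width, height;
        private readonly bool[,] blocked;
        public float[,] Score;
        private const float BadScore = 1000f;  // assumed penalty for unreachable cells

        public ScoredGrid(int width, int height, bool[,] blocked)
        {
            this.width = width; this.height = height; this.blocked = blocked;
            Score = new float[width, height];
        }

        // Score every cell with its path distance to the goal; lower is better
        // (greener in Figure 3.10). Unreachable cells get a bad score.
        public void ScoreTowards(int goalX, int goalY)
        {
            var dist = new Dictionary<(int, int), int> { [(goalX, goalY)] = 0 };
            var queue = new Queue<(int x, int y)>();
            queue.Enqueue((goalX, goalY));
            while (queue.Count > 0)
            {
                var (x, y) = queue.Dequeue();
                foreach (var (nx, ny) in new[] { (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1) })
                {
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (blocked[nx, ny] || dist.ContainsKey((nx, ny))) continue;
                    dist[(nx, ny)] = dist[(x, y)] + 1;
                    queue.Enqueue((nx, ny));
                }
            }
            for (int x = 0; x < width; x++)
                for (int y = 0; y < height; y++)
                    Score[x, y] = dist.TryGetValue((x, y), out int d) ? d : BadScore;
        }

        // The weight the NN algorithm would give an action: the score of the
        // cell containing the action's last pose state position.
        public float WeighAction(int lastPoseCellX, int lastPoseCellY) =>
            Score[lastPoseCellX, lastPoseCellY];
    }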

Additional details

The length of an action could be chosen, which would split up the recorded data into actions of the given length. States in an action were recorded in sequence after each other, so while executing an action, the agent moves like the human who recorded themselves did. Choosing a big action length results in long actions, and thus longer continuous segments of the agent behaving human-like. The downside of long actions is that they might not be able to get the agent out of certain situations without hitting an obstacle. They may also take the agent to worse locations: if there is no recorded data similar to the agent's current state, the returned action probably does not suit the situation well. A longer action then results in a bigger bad investment, whereas a shorter action allows the state to be re-classified sooner, hopefully getting a better-suited action. Short actions, however, result in shorter continuous segments of the agent behaving human-like. They also require the state to be classified more often, which has an impact on performance; on the other hand, classifying often increases the chance of choosing a correct action for the situation. An action length somewhere between long and short was chosen at first. Later, support was implemented for splitting up the data into several action lengths at the same time. This helps by making long actions available for areas without obstacles and short actions available for trickier situations.

In practice, for an AI to be useful in a game, it should be possible to define different types of behaviour and to switch between them depending on the situation. The implementation was structured to allow for several types of actions and models, resulting in the loop described in Algorithm 4. Data was recorded separately for each behaviour.

Algorithm 4 The agent loop

function Update
    if recording then
        // Recording
        features ← featureExtractor.extractFeatures(agent)
        recorder.record(agent, features)
    else
        // Playback
        if action is done executing or was aborted then
            features ← featureExtractor.extractFeatures(agent)
            action ← model.classify(agent, features)
        else
            action.execute(agent, destination)

The agent used a controller for deciding which feature extractor to use. When a dynamic obstacle came within a certain distance, the agent would switch to the feature extractor for dynamic obstacle avoidance, with the corresponding recorded actions. Otherwise it would use the static obstacle avoidance model.

Storing data

The recorded data was stored as raw binary data. A file containing data for 1000 recorded actions of 50 states per action, corresponding to about 25 minutes of recording, has a size of approximately 3 MB. The data stored per state (pose state + sensor state) is described in Table 3.1.

Table 3.1: The data stored for one state.

Pose state:
    Vector3 position: float pos_x, pos_y, pos_z
    Quaternion rotation: float rot_x, rot_y, rot_z, rot_w
    Vector3 direction: float dir_x, dir_y, dir_z
    Delta time (the time between the previous state and this state): float time

Sensor state:
    Feature vector: float n_0, ..., n_numFeatures
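
A sketch of writing one such state with .NET's BinaryWriter follows (not the thesis code; the field order follows Table 3.1):

    using System;
    using System.IO;

    class StateWriter
    {
        // One state in the layout of Table 3.1: 11 pose floats followed by the
        // feature vector. With 6 features that is 17 floats = 68 bytes per
        // state, so 1000 actions of 50 states each come to roughly 3.4 MB,
        // consistent with the approximately 3 MB reported above.
        public static void WriteState(BinaryWriter w,
                                      float[] position,   // pos_x, pos_y, pos_z
                                      float[] rotation,   // rot_x, rot_y, rot_z, rot_w
                                      float[] direction,  // dir_x, dir_y, dir_z
                                      float deltaTime,
                                      float[] features)
        {
            foreach (float f in position) w.Write(f);
            foreach (float f in rotation) w.Write(f);
            foreach (float f in direction) w.Write(f);
            w.Write(deltaTime);
            foreach (float f in features) w.Write(f);
        }

        static void Main()
        {
            // Append to the existing file, as described above for newly recorded data.
            using var stream = new FileStream("demonstrations.bin", FileMode.Append);
            using var writer = new BinaryWriter(stream);
            WriteState(writer,
                       new float[] { 1f, 0f, 2f },
                       new float[] { 0f, 0f, 0f, 1f },
                       new float[] { 0f, 0f, 1f },
                       0.016f,
                       new float[] { 5f, 3.2f, 8f, 20f, 20f, 11f });
        }
    }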

Optimization and measuring performance

For usage in a proper game, the computational time of the AI should be as low as possible. The bottleneck was applying an action to the agent's current state in the NN algorithm, since each traversed action was checked for whether it would pass through an obstacle if applied to the agent's current state. This was improved by approximating the check: instead of checking for collision between every pair of consecutive states in an action, collision was only checked between the first state and the middle state, and between the middle state and the last state (a sketch of this check is given below). To ensure the agent did not get stuck by picking an invalid action, it was forced to update its current action at a certain time interval. The performance of the imitation agent was measured as the average computational time per game frame for different amounts of data: 100, 200, 500 and 1000 recorded actions, with an action length of 50 states; 1000 recorded actions correspond to about 25 minutes of recording.
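The following is a minimal C# sketch of the approximated collision check above, assuming Unity's Physics.Linecast; the class and method names are hypothetical. Only the first-to-middle and middle-to-last segments are tested instead of every consecutive pair of states.

// Minimal sketch of the first->middle, middle->last approximation;
// Physics.Linecast is Unity's segment-collision test, the rest is
// hypothetical.
using UnityEngine;

public static class ActionValidator
{
    // Approximate check of whether an action, applied from the agent's
    // current state, would pass through an obstacle. `statePositions` are
    // the world positions the action's states would visit.
    public static bool IsClear(Vector3[] statePositions, LayerMask obstacleMask)
    {
        Vector3 first  = statePositions[0];
        Vector3 middle = statePositions[statePositions.Length / 2];
        Vector3 last   = statePositions[statePositions.Length - 1];

        // Linecast returns true when the segment hits a collider, so the
        // action is considered clear when both segments hit nothing.
        return !Physics.Linecast(first, middle, obstacleMask)
            && !Physics.Linecast(middle, last, obstacleMask);
    }
}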

3.4 Overall implementation

The framework allows a user to create an agent which imitates demonstrated movement behaviour. To create a behaviour, a user creates a feature extractor which defines what environmental features should be classified. The user then chooses when the behaviour should be activated. The user collects data for the behaviour by recording themselves. Finally, the behaviour can be played back. An agent can possess several behaviours at once, and it is up to the user to define which behaviour should be activated when. This chapter described how IL can be used to create an agent that imitates human demonstrations using a direct imitation approach and limited amounts of data. In the next chapter, the evaluation of the imitation agent is described.

Chapter 4

Evaluation

This chapter presents the user study that was conducted in order to answer the project's stated questions. The results of the study are presented thereafter, along with a performance measure of the imitation agent. Following that is a discussion section which presents and discusses what was done in the project, what the study found to be important for looking human-like, and the performance of the imitation agent in relation to games. Finally, some ethical aspects are discussed.

4.1 User study

Recall that the objective of the project (see Section 1.2) is to answer the following:

Q1.1: How to create an agent that imitates demonstrated behaviour, using IL with limited amounts of data?

Q1.2: What determines if a character is human-like, when observed through the character's first-person perspective?

A user study was conducted in order to answer Q1.2, and to contribute to the answer to Q1.1 by asking humans how well the imitation agent imitates demonstrations. The method chapter describes how IL can be used to create behaviour by imitating recorded human behaviour, but provides no evaluation of whether the behaviour is human-like or not. The user study aimed to evaluate the human-likeness of the agent, and to evaluate in a qualitative manner how well the agent imitates the recorded human. As a reminder, an agent is said to be human-like if it looks like it is being controlled by a human. The layout of the study was inspired by [27], which, as presented in the background chapter, describes a simplification of a Turing test-approach. It was also inspired by [9], which gave users statements to agree or disagree with.

The set-up

The study consisted of videos of three different character controllers: the imitation agent, a human, and Unity's built-in NavMeshAgent. These controllers will be labeled Imitation Controller (IC), Human Controller (HC) and NavMesh Controller (NC), respectively. The human provided the demonstrations for the imitation agent to imitate. The NC was intended to act as a sanity check: a person with a lot of gaming experience would easily be able to tell that the NC was not being controlled by a human, as it moves very statically, makes no unexpected movements, and turns with a set speed. Three different settings were set up:

Setting 1. A simple environment like the one used during development (Figure 4.1). When the character reaches the goal, the goal gets randomly positioned somewhere on the map.

Figure 4.1: Setting 1.

Setting 2. An even simpler environment, but with a single moving obstacle (Figure 4.2).

Figure 4.2: Setting 2 with a moving obstacle (blue) and the goal (white).

Setting 3. Same concept as Setting 1, but a different map (Figure 4.3). Here, the goal positions were deterministic, meaning that when the character reaches the goal, the goal gets positioned at the next index in the list of goal positions. This means that all characters take the same path.

Figure 4.3: Setting 3 from a top-down view (a) with the corresponding first-person perspective (b).

One video was recorded for each of the settings and for each character controller, resulting in a total of nine videos. The videos were recordings of the controllers moving around in the three different settings, from a first-person perspective (Figure 4.3b). In most games, a player would observe an NPC from a third-person perspective. Using a third-person perspective requires the observed character to be modeled and potentially animated. Whether a user wants it to or not, these things will most likely affect the user's thoughts on how the character should behave. It is also more difficult to spot detailed movement and rotation from a third-person perspective. In a first-person perspective, however, a user does not need to know or
