Multi-Agent Simulation & Kinect Game


Multi-Agent Simulation & Kinect Game
Actual Intelligence
Eric Clymer, Beth Neilsen, Jake Piccolo, Geoffry Sumter

Abstract

This study compares the effectiveness of a greedy multi-agent system against random agent movement for maximizing resources. To do this, we created a Microsoft Kinect game in which multiple autonomous agents attempt to maximize their energy resources by attacking a human player. The agents lose energy when moving and greedily try to maximize their total energy, either by resting when no body point is nearby or by attacking and destroying a user's body points to gain energy. To make the experiment into a fun game, the player attempts to avoid the agents for as long as possible to earn a high score. We measured the effects of making agents capable of "seeing" these body points and of introducing the ability to communicate among agents, by testing three generations of increasingly complex agents. We found that agents that could self-direct and move toward a player's body points based on what they see performed significantly better than agents that moved about randomly. However, communication between agents to alert one another to sighted body points did not greatly improve agent performance. We believe this was largely due to limitations in what type of information the agents communicate. For future studies, we recommend expanding the capabilities of agent communication.

1. Introduction

1.1 Problem

We strive to identify the benefits of a greedy multi-agent approach to attack compared to a random approach. The two main problems analyzed are the benefits of the greedy multi-agent approach and the drawbacks of the random-walk approach. Generally, the random-walk approach is slower, requires more resources to complete, and is uncoordinated between agents. The greedy approach provides a faster, more distributed, and more responsive result. This analysis is important because it demonstrates how a greedy multi-agent system can be used in a real-time system, with distributed knowledge and resources, to allocate and distribute those resources not necessarily in the best way, but in a way that is good enough and much better overall than the random-walk approach.

1.2 Proposed Solution Approach

Our application is a multi-agent attack survival game. The object of the game is to avoid the agents. The player is not eliminated immediately upon failing to avoid an agent, but instead loses the touched point on their body. We integrate this analysis with a Kinect game to increase interest and make it easier to demonstrate our results. The game continues until the player has lost all of the points identified by the Kinect. The player stands in front of the Microsoft Kinect and sees themselves on a computer screen or on a projector screen. The agents appear on the screen when the game begins and roam the environment in different ways depending on which generation of agent is selected. Ideally, the game is so simple to set up that the user needs no training to play; they should be able to simply push a button and begin playing.

The agents attempt to attack the user and destroy all of the body points as quickly as possible with the most benefit to the agent in terms of energy. The agents lose a small amount of energy for each pixel they move across the screen, gain energy for every frame they remain in the same position, and receive a large increase in energy when they destroy a body point.

This application provides users with a fun game they can pick up and begin to play with very little, if any, instruction. The game also promotes exercise by requiring the player to actually move in order to play, and it is educational, giving users insight into the benefits of intelligent agents and multi-agent systems. From an artificial intelligence perspective, the game is valuable because it easily demonstrates the advantages of multi-agent systems by allowing users to compare the first generation of agents (without intelligent agent properties) to the later generations (with intelligent agent properties). The effect of changes to agent properties can also be evaluated by using the same generation of agent and adjusting the agent's attributes (speed and vision distance). The game allows users to easily visualize the effects of adjusting the agents' communication abilities and the resulting effects on the agents' group cohesiveness. The production of a visual, game-like simulation is feasible with the Kinect and our knowledge of greedy multi-agent systems.

1.3 Contributions

One main contribution we provide is a game-like application that can be used to identify the benefits of vision and communication among agents in an attack simulation.
The application can also be used to identify how changes to these variables affect the overall effectiveness of the agents. A user can easily

switch between generations of agents and adjust agent attribute variables. Another contribution we provide is an analysis of, and implications identified from, how the generations of agents compare to each other when attacking a human game user. We also analyzed how adjustments to communication affect the agent attacks given a specified set of agent attributes.

In the next section, Section 2, we discuss related work in the area of gaming artificial intelligence and multi-agent systems. Section 3, Methodology and Design, details how the application was designed (both agent and environment) and what methodologies were used to construct our basic assumptions. The implementation section, Section 4, outlines how our design was implemented and how the application was built. Section 5, Hypotheses, details our original hypotheses regarding how the different generations of agents would compare and how adjusting the communication distance affects the agents' attack. Section 6 contains the results from all of the experiments that test our hypotheses. Section 7 details the project challenges and how we overcame or avoided them. Possible extensions and future work are described in Section 8. Section 9, Conclusions, presents the major conclusions we have drawn from our work. Section 10, the appendices, contains our source code and instructions on how to run the program using the Kinect.

2. Related Work and Background

Academic applications of artificial intelligence, such as neural networks and genetic algorithms, are rarely used in game AI due to the high level of computational, tuning, and testing resources required. Instead, the field of game AI is primarily focused on agents, path finding, finite state machines, decision trees, and influence mapping (Nareyek, 2002). With few exceptions, games do not strictly follow academic models for AI, favoring shortcuts for performance. AI "cheating" is also typically allowed to maximize player enjoyment, as long as the characters do not appear to possess special abilities. Agents cheat when they are given special capabilities not available to the player (such as being able to see the entire game map). Despite most games taking this liberty, the team did not utilize such cheating, as the project is not simply a game meant for enjoyment.

According to Nareyek (2002), AI applications in games typically involve characters, frequently referred to as non-player characters (NPCs), that may cooperate with or compete against the human players. It is common to view these NPCs as autonomous agents. Agents in games have goals, can sense parts of their environment, and can perform one or more actions. In some agent models, such as multi-agent systems, one of the actions performed by agents is communicating with other agents. Game agents may exhibit the following properties:

Resources - When planning their next actions, NPC agents may need to account for limited resources, such as health, energy, or magic points. The costs associated with a given action in terms of resources are therefore weighed against the benefits.

Incomplete knowledge - To avoid cheating in a game, an NPC agent should only be aware of things its sensors have detected. These sensors may typically be vision or hearing in a game like a first-person shooter, or anything not in the fog of war in a real-time strategy game.

Temporal planning - It is important that actions take place in a timely manner in a real-time game.

Adapting to a changing environment - Games typically are not static environments, so it is important that agents are capable of adapting their actions to changes in the game world or to the actions of human players or other NPCs.

Nareyek (2002) also identifies several different types of agents that may be implemented in games. The simplest of these, reactive agents, map every single sensory input to a specific output action. This is similar to a jump table or switch statement in a programming language, where every known input is mapped to a specific output or function. In this case, the entire agent behavior can be implemented as simple if-then heuristics. The benefit of this is fast computation, much like a switch statement. However, the downside is that every possible situation must be considered by the programmer in advance. This can lead to forgetting unusual situations, or to an extremely large list of rules that requires a lot of time and tedium to implement. A slightly more complex type of agent is a triggering agent. These agents include internal states, allowing past information to be factored into the rules. This way, longer-term goals are more easily achievable, as the agents can be modeled as finite state machines. Another benefit of this type of agent is that they can be just as fast as purely reactive agents. However, it is still necessary to account for every type of state when using triggering agents. The agents used in our game can be considered triggering agents, as they use states and relatively simple heuristics to determine their next course of action.

Finite state machines (FSMs) are often used in games to transition between the various states an AI character can be in. After transitioning to a state, different methods are then used for the actual behavior. To minimize complexity, FSMs are typically implemented using simple conditionals (Nareyek, 2004). As an example, the agents in our game can transition between rest, search, and attack states. To transition between these states, the agents evaluate their energy level, vision, and communication information to determine the best course of action (Nareyek, 2004).

3. Methodology and Design

3.1 Basic Assumptions

The main artificial intelligence technique that we employ is a multi-agent system. Agents must perform flexible autonomous actions that act on the environment. For an agent to be flexible, it must demonstrate reactivity, pro-activeness, and social ability.

Autonomy - Each agent will be able to move around in the search space until it either sees or touches a user's body point. If a body point is seen, the agent will move toward the point (Generation II and Generation III) and will communicate its location (Generation III) to all other nearby agents. If the body point is contacted, the point will be destroyed.

Pro-Activeness (goal-directed behavior) - Each agent will have the goal of attacking the user. The agents will have a limited amount of energy and will be required to rest or destroy a body point to regain energy. This provides the agents with a motivation to attack, yet restricts their behavior. This

will cause the agents to make a decision: whether to move and use energy to potentially receive a reward, or to stay put and continue resting.

Reactivity - The agents will adjust when the body points move. If a body point is seen by an agent, then moves and is still within the agent's vision distance, the agent will re-evaluate whether or not it should attack and then act.

Social Ability (communication) - Initially, each agent will determine whether or not to attack based solely on its own status. In later generations, we integrated communication. The agents will be able to communicate with other nearby agents; agents are considered nearby if they are within the designated communication distance. Once the agents were able to communicate, we forced the agents to consider the benefit of actions from a group perspective. This requires an agent to value a collective benefit along with the individual benefit gained from different actions. Often the collective's benefit is in conflict with what appears to be most beneficial for a single agent with no outside perspective.

3.2 Agent Design

We plan to create an initial generation of agents, which will be modified and enhanced to increase its effectiveness in the next two generations of agents. Generation I agents will not display any agent attributes; they will merely roam the environment randomly. Generation II agents will display individual agent behaviors, but will not utilize any communication or group behaviors. Generation III agents will individually perform similarly to the Generation II agents, but will also communicate and take into consideration what is best for the collective when determining what action to take.

Autonomy - The default agent behavior is to roam the environment while its energy level is greater than ten percent of its given starting energy level. We chose to base this threshold on the given initial energy so that the agents will be able to move initially no matter what the user sets as the initial energy value. Once the agent reaches its lower threshold and needs to rest, it will continue to rest until its energy level has again reached thirty percent of its given starting energy level.

Pro-Activeness (goal-directed behavior) - Each Generation II and Generation III agent will exhibit goal-directed behavior, as demonstrated in Diagram 1 and Diagram 2. The agent's basic goal is to attack the user's body points. The reward an agent receives for destroying a body point is an increase in energy as specified by the user on the Settings panel. The algorithm for determining whether or not an agent should attack a visualized body point is as follows:

    For the closest visualized body point:
        Cost of attacking = energy required for the agent to travel to the visualized body point
        Benefit of attacking = energy reward for destroying a body point
        Available agent energy = agent's current level of energy
        Probability of obtaining reward = (100 - distance to body point / agent speed) / 100
        if (Benefit of attacking * Probability of obtaining reward > Cost of attacking
            && Cost of attacking < Available agent energy)
            attack the visualized body point
        else
            rest
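As a rough illustration only, this decision could be written in C# along the following lines. The class shape and the identifiers used here (CurrentEnergy, EnergyPerPixel, EnergyBonus, Speed) are assumptions made for this sketch, not names taken from the project's actual source code; the class is declared partial only so that later sketches can extend it.

    using System;

    public enum AgentAction { Attack, Rest }

    public partial class Agent
    {
        public double X, Y;            // agent position in pixels
        public double Speed;           // pixels moved per frame
        public double CurrentEnergy;   // remaining energy
        public double EnergyPerPixel;  // energy lost per pixel travelled (assumed parameter)
        public double EnergyBonus;     // energy reward for destroying a body point

        // Cost-benefit decision for the closest visualized body point.
        public AgentAction DecideAction(double pointX, double pointY)
        {
            double distance = Math.Sqrt((pointX - X) * (pointX - X) +
                                        (pointY - Y) * (pointY - Y));

            double cost = distance * EnergyPerPixel;   // energy needed to reach the point
            double benefit = EnergyBonus;              // reward for destroying it

            // Probability of obtaining the reward, as in the formula above,
            // clamped at zero for very distant points.
            double probability = Math.Max(0.0, (100.0 - distance / Speed) / 100.0);

            if (benefit * probability > cost && cost < CurrentEnergy)
                return AgentAction.Attack;
            return AgentAction.Rest;
        }
    }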

Diagram 1: Generation II Agent Decision Diagram (states: agent visualizes a point, cost-benefit analysis, agent attacks, agent rests). The agent rests until it visualizes a point, then performs the cost-benefit analysis. If the benefit is greater than the cost, it attacks; otherwise it goes back to resting.

Reactivity - Once an agent visualizes body points, the closest of these points is designated as that agent's target. The agent then evaluates whether or not to attack its target based on the cost-benefit algorithm discussed previously. If the body point moves and is still within the agent's vision distance, the agent again determines which body point is closest, sets that body point as its new target, and re-evaluates whether or not to attack. If the agent can no longer see the point, it removes its current target and looks to see if it can visualize any other points; the closest of the new points is set as the new target and is evaluated. If the agent can no longer see any body points and its target should be within its vision distance, the agent clears its target, because whatever set that point as the target has since moved and there is no longer any reason to move to that point. If the agent chose to pursue a target that was communicated to it by another agent, but it cannot yet visualize the location of that target, the agent will continue to pursue the target as long as the cost-benefit analysis indicates it should, or until it gets close enough to visualize that location and can determine for itself whether a body point is still at the designated target.

Social Ability (Communication) - One communication method between agents is evaluated in our work: Generation III agents broadcast information about the body points they visualize to every agent within the communication distance, as shown in Diagram 2. To reduce complexity, our Generation III agents still decide which body points to attack based on an individualistic cost-benefit analysis. The communication improves the agents' attacks by allowing the agents to tell each other where they visualize points. That way, if one agent does not see a body point, it can still begin attacking the point closest to itself based on the information provided by the agents that could see points. This also allows every agent to re-evaluate the target it is attacking: if a communicated target point is closer to the agent than its current target, the current target is removed and the new target is set and evaluated to determine whether or not the agent should attack. This attacking strategy is more focused on causing the agents to swarm toward the concentrated targets than on being efficient. A side effect of the swarming is that the agents will generally find other points to attack because the targets are close together.
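Continuing the Agent sketch above, the target re-evaluation and the Generation III broadcast could look roughly as follows. BodyPoint, Target, KnownPoints, VisionDistance, CommunicationDistance, MoveToward, and Rest are names introduced only for this illustration and are not taken from the project's code.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class BodyPoint
    {
        public double X, Y;
        public bool Destroyed;
    }

    public partial class Agent
    {
        public BodyPoint Target;
        public double VisionDistance;
        public double CommunicationDistance;
        public List<BodyPoint> KnownPoints = new List<BodyPoint>();   // points communicated by others

        double DistanceTo(BodyPoint p) =>
            Math.Sqrt((p.X - X) * (p.X - X) + (p.Y - Y) * (p.Y - Y));

        // Per-frame re-evaluation: the closest visible point becomes the target;
        // a target that should be visible but is not gets cleared.
        public void ReevaluateTarget(IEnumerable<BodyPoint> visiblePoints)
        {
            var candidates = visiblePoints.Where(p => !p.Destroyed).ToList();
            if (candidates.Count > 0)
                Target = candidates.OrderBy(DistanceTo).First();
            else if (Target != null && DistanceTo(Target) <= VisionDistance)
                Target = null;   // whatever was there has moved away

            if (Target != null && DecideAction(Target.X, Target.Y) == AgentAction.Attack)
                MoveToward(Target);
            else
                Rest();
        }

        // Generation III: tell every agent within the communication distance about a
        // visualized point; each receiver still runs its own cost-benefit analysis.
        public void BroadcastVisualizedPoint(BodyPoint point, IEnumerable<Agent> allAgents)
        {
            foreach (Agent other in allAgents)
            {
                if (other == this) continue;
                double dx = other.X - X, dy = other.Y - Y;
                if (Math.Sqrt(dx * dx + dy * dy) <= CommunicationDistance)
                    other.KnownPoints.Add(point);
            }
        }

        void MoveToward(BodyPoint p) { /* step Speed pixels toward p, paying EnergyPerPixel */ }
        void Rest() { /* regain a small amount of energy this frame */ }
    }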

Diagram 2: Generation III Agent Decision Diagram (states: agent visualizes or is told about a point, cost-benefit analysis, agent broadcasts, agent attacks, agent rests). The agent rests until it visualizes a point, broadcasts that it has visualized that point to all agents within the communication distance, and then performs the cost-benefit analysis. If the benefit is greater than the cost, it attacks; otherwise it goes back to resting.

3.3 Environment Design

The agent's environment includes the Kinect-provided frame of reference, the user and his or her body points, and the other agents. The environment changes as the user and the agents move. In regards to the five general environment aspects, this means there is an incomplete set of information regarding the environment before the game begins. Throughout the game the environment can change, making it dynamic, not static. Our environment is also stochastic, or random: the initial locations of the agents are always random, and the locations of the body points depend on the user, so they are not deterministic either; however, if an agent hits a body point at any time, the body point is destroyed, which makes this aspect of the environment deterministic. The environment is episodic between games, which means that each game is independent of every other game and the result of one game does not influence any future games. However, within a single game, the environment is non-episodic: the positions of the body points and the agents at one point in time are highly related to their positions just before and just after that point in time. Last, the environment is continuous within a game. There are too many potential states to count. At any point in time the agents and body points can all move in any direction; while the agents have a set speed, the body points do not, and move as fast as the user makes them.

4. Implementation

4.1 Basic Implementation

The team used WPF and C# to implement the game. The drivers used were OpenNI and Prime Sensor, which allow users to interface with a Kinect using C++ code and Visual Studio. Fortunately, the team was able to leverage connections with people in industry who have dealt with the complexity of interfacing with the Kinect. They found it simpler to write a wrapper in C# that would allow them to utilize the .NET framework, specifically WPF and C#, to interface with the Kinect and write their own programs more quickly. They offered us the .dll file for their wrapper class, which enabled us to skip the large technological

barrier of interfacing with the Kinect and focus more on the artificial intelligence aspects of the program. This wrapper class allowed us to use simple APIs to get video and depth data, track users, and create and use recordings. We were also able to obtain a small sample project from a source that wishes to remain anonymous that allowed us to easily integrate with the Kinect. This project put video from the Kinect onto a blank window, and we were able to adapt it for our purposes. It is set up with two windows, a settings window and a game window, each with logic behind them, and an Agent class. The settings window and corresponding class allow the user to choose between the different agent generations and configure additional settings such as the number of agents, agent speed, vision distance, communication distance, and game pace. The game window and corresponding class display the agents and video of the user participating in the simulation, and provide feedback about points seen, points eaten, total time elapsed, and the total score upon game completion. The Agent class provides all of the details for each individual agent, including the agent's position and movement information. It also provides methods for evaluating what the agent should do next in a number of different situations.

4.2 Agent Tests and Evaluation

We plan to create an initial generation of agents, which will be modified and enhanced to increase its effectiveness. We will run a set of tests to evaluate our initial set of hypotheses with consistent, specified agent attributes to determine how the agent behaviors and other modifications affected the agents' ability to attack. The first and second hypotheses will be tested by running the game five times for each agent generation with a user just standing in the middle of the screen. The number of agents, agent speed, agent vision distance (when applicable), and communication distance (when applicable) will be consistent across the different agent generation tests. The third hypothesis will be evaluated by adjusting the communication distance in two-hundred-pixel increments from zero to one thousand pixels (allowing the agents to communicate with almost all other agents). This will allow us to evaluate how well the agents were able to attack based on the user score and the overall time until all body points are destroyed.

5. Hypotheses

Hypothesis I: The Generation II agents, which demonstrate agent-like behavior, will attack better than the Generation I agents, which randomly move around the environment with no goal-directed behavior. This will be determined based on user score and the amount of time a game takes when the user simply stands.

Hypothesis II: The Generation III agents, which not only demonstrate agent goal-directed behavior but also communicate and attack cohesively, will attack better than the Generation II agents discussed above. This will be determined based on user score and the amount of time a game takes when the user simply stands.

Hypothesis III: When the Generation III agents have a larger communication distance, they will attack better than when they have a smaller communication distance. This will again be determined based on the user score and the time it takes for the agents to destroy all body points.
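As a rough sketch of how the Hypothesis III sweep described in Section 4.2 could be scripted against the configurable settings listed above, the communication distance would simply be stepped in 200-pixel increments. The GameSettings type, its property names, and RunTrial are assumptions for illustration, not the project's actual API.

    // Illustrative sweep over communication distances for the Hypothesis III tests.
    public static class CommunicationSweep
    {
        public static void Run()
        {
            for (int commDistance = 0; commDistance <= 1000; commDistance += 200)
            {
                var settings = new GameSettings
                {
                    NumberOfAgents = 10,
                    AgentSpeed = 10,
                    VisionDistance = 150,
                    CommunicationDistance = commDistance,
                    Generation = 3
                };
                RunTrial(settings);   // record the user's score and the elapsed time
            }
        }

        static void RunTrial(GameSettings settings) { /* start a game with these settings */ }
    }

    public class GameSettings
    {
        public int NumberOfAgents, AgentSpeed, VisionDistance, CommunicationDistance, Generation;
    }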

6. Results

For our comparative experiments, we compared, among the generations of agents, the average score (calculated from the time in milliseconds, the number of agents, and their speed) and the time played when a user does not move from the middle of the screen. From this comparison, we determined how the agent behaviors affect the quality of their attacks. We also assessed the effects of changing the communication distance in Generation III agents to evaluate how communication distance affects the quality of cohesive attacks. In addition, Generation III agents with a communication distance of zero were compared to the completely individualistic attack demonstrated by Generation II agents.

6.1 Solution Evaluation

Our solution met our functional goals. First, the user can easily understand how to play the game with minimal instruction, and the game provides video of the player and shows the agents that need to be avoided. Also, the user is provided with quantitative feedback indicating his or her overall performance (score), so that he or she can evaluate and compare performances between games. In addition, immediate feedback is given to indicate when a body point has been destroyed. Finally, the game takes less than one minute to start and should not lag at any time while a user is playing. The solution also utilizes several artificial intelligence applications. The number of agents, agent speed, communication distance, and vision distance can be easily adjusted, and the effects of these changes can be easily observed by comparing a user's score across multiple games. The modified generations of agents also display better decision-making abilities and behaviors than the first-generation agents. Perhaps most importantly, the team has developed a better understanding of actual implementations of the algorithms and theories that we have studied in class.

6.2 Benchmarking Generation I Agents

The Generation I agents were created primarily to provide us with a benchmark to evaluate the quality of successive generations. These agents randomly explore their environment and do not seek out body points. If a point is run into, such an agent will destroy it. However, these agents do not know anything about their environment. They act as though they are blind and do not use the cost-benefit algorithm to determine whether they should move, because they never have a target or goal to evaluate. The Generation I agents' attributes are shown in Table 1. In each of our five trials, all of the body points were destroyed in less than seven minutes. Table 2 displays the results from the five trials. With the Generation I agents, many of the user's body points were quickly destroyed, but it took a significant amount of time for the last point to be randomly run into. The agents tend to travel on similar paths and will continue moving in a straight line until they are forced to randomly choose another direction; therefore, if a point is not on an agent's direct path, it can take a significant amount of time for the agents to adjust their paths enough to come into contact with the remaining point. We did not expect this generation of agents to perform especially well. These agents took a significant amount of time to destroy all of the user's body points.
It was not fun to wait for all of the body points to be destroyed; therefore, a large number of agents with high speed and high energy would be required for this generation of agents to be used in a fun, interactive game.
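The Generation I movement just described (keep the current heading until forced to pick a new random direction, for example at a screen edge) might be sketched as another piece of the Agent class from the earlier sketch. The heading field, the Random instance, and the edge test are assumptions made for illustration.

    using System;

    public partial class Agent
    {
        static readonly Random Rng = new Random();
        double headingRadians = Rng.NextDouble() * 2 * Math.PI;   // initial random direction

        // One frame of Generation I movement: continue in a straight line and only
        // pick a new random heading when the next step would leave the screen.
        public void RandomWalkStep(double screenWidth, double screenHeight)
        {
            double nextX = X + Speed * Math.Cos(headingRadians);
            double nextY = Y + Speed * Math.Sin(headingRadians);

            if (nextX < 0 || nextX > screenWidth || nextY < 0 || nextY > screenHeight)
            {
                headingRadians = Rng.NextDouble() * 2 * Math.PI;   // forced to change direction
                return;
            }

            X = nextX;
            Y = nextY;
            CurrentEnergy -= Speed * EnergyPerPixel;   // energy lost for the pixels moved
        }
    }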

Table 1: Generation I Agent Attributes

    Agent Attribute                          Value
    Number of Agents                         10
    Agent Speed                              10
    Agent Vision Distance (pixels)           0
    Agent Communication Distance (pixels)    0
    Agent Energy                             1000
    Energy Bonus                             250
    Frames per Second                        30
    Switch Target Factor                     1.2

Table 1 lists the Generation I agents' specific attribute values, which are used to set a benchmark for the Generation II and Generation III agents.

Table 2: Generation I Experimental Data

    Trial            User      User Movement   Time    User Score
    1                Eric      None            6:37    3,971,400
    2                Beth      None            5:41    3,413,100
    3                Geoffry   None            5:10    3,107,800
    4                Eric      None            2:03    1,234,700
    5                Jacob     None            5:52    3,525,800
    Average          N/A       N/A             5:04    3,050,560
    Std. Deviation   N/A       N/A             1:46    1,061,350

Table 2 indicates the experimental values found for score and time when testing the Generation I agents; the average and standard deviation of these values are shown. The agents' performance is largely determined by their random starting points, as indicated by the large standard deviation. This randomness could produce very bad or much better results depending on the starting points and the initial paths taken. We ran five tests with the Generation I agents to establish a

robust benchmark we could use to evaluate the following generations. One redeeming attribute of these agents is that they are not so greedy that they decide it is better not to move at all. Instead, these agents always continue to roam randomly, covering the entire environmental area. This set of data is primarily used as a benchmark for the Generation II and Generation III agents that actually demonstrate agent behaviors. We expect the Generation II and Generation III agents to show marked improvement over the Generation I agents. The Generation I agents do perform better than blind (vision distance of 0) Generation II or Generation III agents, because the Generation II and Generation III agents will never choose to move if they do not see any body points. The implementation of the cost-benefit decision for the Generation II and Generation III agents does not allow the agents to choose to use energy without any expectation of a benefit. Therefore, the Generation II and Generation III agents cannot move without first visualizing a target themselves or receiving a message from another agent that had visualized a body point. This alludes to adaptation depicted in nature when the attributes of a species change. Even though the Generation II and Generation III agents perform much better with a reasonable vision distance, without this added ability their capacity to perform the cost-benefit algorithm is worthless. The Generation I agents do not provide significant implications by themselves; however, comparisons against this benchmark data allow us to draw reasonable conclusions and present implications of this work in the following sections.

6.3 Comparing Generation II Agents against the Benchmark: Do Vision and Cost-Benefit Decision Making Help?

The Generation II agents were created to be completely autonomous agents with no way to communicate or work as a cohesive unit. These agents were given the goal-directed behavior to maximize energy and destroy a user's body points. This led to the development of the cost-benefit analysis algorithm described above in Section 3.2, Agent Design. This algorithm is only used once an agent has visualized a body point and is deciding whether or not to attack it. The only attribute difference between the Generation I agents and the Generation II agents is an increase in the vision distance from 0 to 150 pixels. The screen is 640 pixels across, so the agent can see about one fourth of the width of the screen. The attributes for the Generation II agents are listed in Table 3. In each of our five trials for the Generation II agents, all of the body points were destroyed in significantly less time than the benchmark times from the Generation I agents. The best time posted by the Generation I agents was two minutes and three seconds; the worst time for the Generation II agents was nine seconds. Table 4, shown below, depicts the experimental results from the five Generation II agent trials. With the Generation II agents, all of the body points were quickly destroyed. Occasionally, a couple of agents would randomly spawn too far away from the body points to move at all: the benefit gained from destroying a point was not worth the energy it would cost to get there, and the probability of destroying the point decreases substantially as the distance between the agent and the point increases.

Table 3: Generation II Agent Attributes

    Agent Attribute                          Value
    Number of Agents                         10
    Agent Speed                              10
    Agent Vision Distance (pixels)           150
    Agent Communication Distance (pixels)    0
    Agent Energy                             1000
    Energy Bonus                             250
    Frames per Second                        30
    Switch Target Factor                     1.2

Table 3 lists the Generation II agents' specific attribute values. Only the vision distance has been increased from the Generation I agent attributes.

Table 4: Generation II Experimental Data

    Trial            User      User Movement   Time   User Score
    1                Eric      None            0:
    2                Jake      None            0:
    3                Geoffry   None            0:
    4                Jake      None            0:
    5                Geoffry   None            0:
    Average          N/A       N/A             0:
    Std. Deviation   N/A       N/A             0:

Table 4 indicates the experimental values found for score and time when testing the Generation II agents; the average and standard deviation of these values are shown. We expected the agents that had vision to perform better than the agents that randomly roamed, but we were surprised by how much better the Generation II agents performed compared to our benchmark Generation I agents. By providing the agents with vision, and having them set and attack specified targets, these agents demonstrated proactive behaviors. These agents were also able to react by updating and

adjusting their target points as the body points moved, provided the points were still within the agents' vision distance. These changes to the agents made the game much more difficult and more exciting for a user. An agent's individual performance is still related to its randomly generated starting point. At times, some agents never attack or move, because the cost of attacking points that are far away from the agent is too high compared to the benefit they might receive, or because the agents never visualized the points and therefore never moved. For the Generation II agents, we ran five tests to establish a fair comparison value to analyze against the established benchmark Generation I agent values. The Generation II agents outperformed the Generation I agents substantially in every test and in the averages, based on both time and score (a lower user score means the agents attacked and destroyed all the body points more quickly). On average, the Generation II agents took 2.83 percent of the time the Generation I agents took. These results make sense to us because agents that have a vision distance of zero and cannot see any body points never really attack any points; sometimes they get lucky and run into body points, but they do not pursue them. Generation II agents have a huge advantage over the Generation I agents for two main reasons. First, they have a vision distance and are able to see and then attack the body point closest to themselves. By first visualizing and then pursuing body points, the agent is always either moving toward the body point closest to itself, which it has the best chance of destroying, or resting and saving up energy. Second, even when an agent can see a body point, it performs a cost-benefit analysis on that point to determine whether the cost of reaching it is less than the benefit the agent would receive for destroying it, discounted by the likelihood of the agent actually reaching and destroying the point. The agent performs this analysis with each step it takes so that it can immediately stop and not waste any energy if the point begins to move too far or too fast away from it. These results have significant implications for research in this area. First, this project has demonstrated that agent-like behavior outperforms the random-walk approach; providing the agents with goal-directed behavior and allowing them to be reactive to the environment appears to be especially important. Second, it demonstrates the merits of a cost-benefit analysis in decision making. This section also outlines an approach to designing attack agents, which could be useful in many industries to simulate or control attacks.

6.4 Comparing Generation III Agents against the Benchmark: Does Communication Help?

The Generation III agents are based on the Generation II agents. The Generation III agents perform all of the same functionality as the Generation II agents with the addition of the ability to communicate. This communication is implemented so that the agents can share information regarding the location of body points. The agents in this generation do not form teams or discuss how to attack together. We allow each individual agent to take all of the information it is provided, based on what it can see and what it is told by other agents, and make a greedy, individualistic decision about what it should do. We chose this approach because if each individual does what is best for itself, the best overall results will most likely occur.
The communication distance for these agents was changed from 0 to 400 pixels. The

screen is 640 pixels across, so the agents can communicate across most of the screen. The attributes for the Generation III agents are listed below in Table 5.

Table 5: Generation III Agent Attributes

    Agent Attribute                          Value
    Number of Agents                         10
    Agent Speed                              10
    Agent Vision Distance (pixels)           150
    Agent Communication Distance (pixels)    400
    Agent Energy                             1000
    Energy Bonus                             250
    Frames per Second                        30
    Switch Target Factor                     1.2

Table 5 lists the Generation III agents' specific attribute values. From the Generation II agents, only the communication distance has been adjusted.

We completed five trials for the Generation III agents. In all of these trials, the time and score were much better than in the benchmark Generation I agent trials. The worst time posted by the Generation III agents was seventeen seconds, compared to the best time of two minutes and three seconds from the Generation I agents. However, there was little, if any, benefit from the communication compared to the Generation II agents. Table 6 depicts the experimental results from the five Generation III agent trials. It appears as though the last test may be an outlier due to the random initial placement of the agents. Even excluding this point, the Generation III agents performed only minutely better than the Generation II agents. There was a visible change in the agents' attack patterns, but some limiting factors may have diluted the potential benefit from the communication. We expected to see a marked improvement from the Generation III agents compared to the Generation II agents; however, the data did not support this expectation. We believe the following limitations in our specific experiment caused the communicating agents not to outperform the non-communicating agents. First, our cost-benefit analysis will not allow an agent to attack if it is so far away from the body point that the probability of reaching the point multiplied by the benefit of destroying it is less than the cost of reaching it. This is especially evident in our experiment, where all of the target points are close together in the center of the screen: any agents that spawn close to the edge, away from the points, will never choose to attack, even when they know where the body points are located. Additionally, since all of the body points are close together, one agent can quickly attack one body point right after another until all of the points are destroyed. One other potential issue is that we did not test each

generation of agents with a moving user attempting to avoid the agents. Without motion, the game goes very quickly. For the Generation II and Generation III agents, the amount of time it takes to destroy all of the body points is essentially the time required for the agents to move directly to each point and destroy it. There is not much room for improvement when there is a lower limit set by the time the agents need to move. The benefit of communication is largely realized when the user moves and the body points adjust: the communication allows the agents collectively to attack more precisely and to follow the movement of the body points.

Table 6: Generation III Experimental Data

    Trial            User      User Movement   Time   User Score
    1                Geoffry   None            0:
    2                Geoffry   None            0:
    3                Jake      None            0:
    4                Eric      None            0:
    5                Eric      None            0:
    Average          N/A       N/A             0:
    Std. Deviation   N/A       N/A             0:

Table 6 indicates the experimental values found for score and time when testing the Generation III agents; the average and standard deviation of these values are shown.

We were surprised that the Generation III agents did not outperform the Generation II agents, but after careful consideration found these results reasonable. The communication is only beneficial if an agent should be attacking a body point that it cannot see. This case is not very common, especially in our experiment, because the agent would need to be far enough away that it cannot visualize a body point, yet close enough that the cost-benefit analysis still indicates it is beneficial for the agent to attack. One issue we recognize that led our agents to be less cohesive is that the agents broadcast points to attack, but do not broadcast information when they destroy a point. This can lead to agents pursuing points that no longer exist, based on information communicated to them by other agents. The act of communicating information about visualized points is altruistic: it benefits the group as a whole and increases cohesiveness. The act of not communicating information after a point is destroyed is selfish and does not benefit the collective. We allowed this because, on a person, the points are often close together; if an agent begins pursuing any point, doing so should bring it close to all the other points. The implications from this set of experiments are unclear. We are still optimistic that the addition of communication can improve the agent attacks, even though the data outlined above does not prove that

this is the case with the attributes given. In Section 8, Possible Extensions and Future Work, we discuss several potential future experiments that could identify benefits of the communication laid out in the Generation III agents.

6.5 Comparing the Benefits of Communication Distances with the Generation III Agents

In this section, we worked to identify the benefit provided purely by the communication distance. We examined how changes to the communication distance affected the time and score when a user stood and did not move during the game. Our results were mixed, but generally we were unable to find a correlation between the time or score and the communication distance. Keeping in mind the conclusions and issues we faced when testing the Generation III agents themselves against the benchmark Generation I agents, we believe that repeating this experiment with different baseline agent attributes or with a different dispersal of target points (not on a person, but farther apart) would produce very different experimental data. The data we compiled is shown in Graph 1 and Graph 2.

We found that the middle communication distances had the most problems with clumping of the agents. This makes sense because agents that have no communication can only attack targets they can see. Therefore, unless the agents all start together and can only see the same target, which is very unlikely with the randomly generated starting points, clumping will be fairly rare; the agents that never visualize any points themselves will just keep resting and will not move toward the other agents or clump together to attack. Additionally, with very large communication distances, every agent will know where almost all, if not all, of the targets are initially, and each agent can then be greedy and attack the point closest to itself. In this situation you could see clumping if several agents start close together, since the best point for all of them to attack could be the same point, but generally the random starting points minimize this. Unlike with very small communication distances, all of the agents will either see or be told where a target is that they could pursue. This does not mean that all of the agents will choose to attack, but all of them have a potential target on which to perform the cost-benefit analysis. Just like the implications for the Generation III agents, the results here are very mixed. It appears as though the smaller communication distances and the very large communication distances were better, with the middle communication distances causing the agents to be too eager to attack. Additionally, since the agents do not communicate after they have destroyed a point, with the middle and upper distances an agent may choose to attack a point it will not be able to see for a while. This means the point could move or be destroyed long before the attacking agent is close enough even to see it, which could lead to the agent expending energy unnecessarily. This wasted energy could partially account for the poor results at the middle communication distances, where the agent is far enough away that it cannot see the target, but close enough that the cost-benefit analysis may indicate the agent should attack.

Graph 1: Communication Distance vs. Time (x-axis: communication distance in pixels; y-axis: time in seconds). Graph 1 demonstrates the scattered game-time results that arise from adjusting the communication distance of Generation III agents. There does not appear to be a significant correlation between communication distance and time.

Graph 2: Communication Distance vs. Score (x-axis: communication distance in pixels; y-axis: score). Graph 2 demonstrates the scattered score results that arise from adjusting the communication distance of Generation III agents. There is no significant correlation between communication distance and score.

6.6 Overall Results

In our experiments it is easy to see the benefits of adding vision and agent goal-directed behavior; however, the results from the addition of communication were not as promising. As shown below in Graph 3, both the Generation II and Generation III agents greatly outperformed the Generation I agents that were built as a random-walk benchmark for comparison. The Generation II and Generation III agents both had an average below eleven seconds, while the Generation I agents had an average of more than five minutes.

Graph 3: Experimental Scores from Each Trial (series: Generation I, Generation II, Generation III; x-axis: trials; y-axis: score). Graph 3 displays the results from all the experimental trials. Clearly, the Generation II and Generation III agents outperformed the Generation I agents.

Our first hypothesis stated that we believed the Generation II agents would outperform the benchmark Generation I agents. This was proven through our experiments and can be easily visualized in the data. Our second hypothesis stated that we believed agents that communicate (Generation III) would outperform agents that do not communicate (Generation I and Generation II). This was not borne out by our data. We are optimistic there are situations where this relationship would appear; however, in our experiment the Generation II and Generation III agents performed very similarly, as shown in Graph 4 below. This hypothesis, our results, and potential causes for our findings are discussed further in Section 6.4 above.

Graph 4: Experimental Scores from Trials (series: Generation II, Generation III; x-axis: trials; y-axis: score). Graph 4 displays the results from the Generation II and Generation III experimental trials. Clearly, the Generation II and Generation III agents performed fairly similarly.

Our third hypothesis stated that we believed an increase in the communication distance would result in an increase in agent attacking performance. This was also not clearly shown in our data. This hypothesis, our results, and potential causes for our findings are discussed further in Section 6.5 above.

7. Project Challenges

7.1 Software Development Challenges

There were several challenges our team faced while creating this game. First, the project includes a significant level of technical difficulty. We utilize the Kinect, a Microsoft webcam-style add-on for the Xbox 360, which allows the user to see him or herself and use his or her body as the controller. Therefore, we needed to integrate the Kinect's capabilities, and the data it provides on the user's movements, with our program. This challenge was overcome relatively quickly; we were able to integrate our project with the Kinect by mid-February. Another challenge was constructing a user-friendly interface that integrates with the Kinect-provided video of the user and can be easily interpreted. The user should immediately understand the purpose of the game from the user interface and should be given feedback on his or her performance throughout the game.

One issue our group initially faced was determining how to identify whether or not a ball had hit a player. The Kinect provides several body points that identify where a person is, but does not indicate where the plane of the body is. This issue caused us to change our scope: instead of creating a dodgeball game, we created a multi-agent attack simulation/game, which only needs to determine whether a specific point has been attacked and destroyed.
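A point-based hit test of that kind reduces to a distance check between the agent and each remaining body point. A minimal sketch, using the assumed BodyPoint type from the earlier sketches and an assumed hit radius, is shown below.

    using System;
    using System.Collections.Generic;

    public partial class Agent
    {
        const double HitRadius = 15.0;   // pixels; the actual threshold is an assumption

        // Returns the first remaining body point this agent is touching, or null.
        public BodyPoint CheckForHit(IEnumerable<BodyPoint> remainingPoints)
        {
            foreach (BodyPoint point in remainingPoints)
            {
                if (point.Destroyed) continue;
                double dx = point.X - X, dy = point.Y - Y;
                if (Math.Sqrt(dx * dx + dy * dy) <= HitRadius)
                    return point;   // this point is destroyed and the agent gains the energy bonus
            }
            return null;
        }
    }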

7.2 Artificial Intelligence Challenges

Different generations of agents presented their own specific challenges. Generation I agents move randomly through the environment and did not present any significant artificial intelligence challenges. Generation II agents incorporate a cost-benefit analysis that allows the agents to demonstrate proactive behaviors. This algorithm was the source of significant debate due to its highly reactive nature as an agent's speed and vision are adjusted. Analyzing the benefit from an attack was difficult, especially when several agents begin attacking the same point. The Generation III agents still use the greedy cost-benefit analysis from the Generation II agents, but also implement the ability to communicate. These agents communicate with each other through a master or control class. This class knows the location of each agent and is used to evaluate how far the agents can see and communicate. To reduce complexity, our Generation III agents still make decisions about which body points to attack based on the cost-benefit analysis. The communication increases the agents' abilities by allowing them to tell each other where they visualize points. That way, if one agent does not see a single body point, it can still begin attacking the point closest to itself based on the information provided by the agents that could see body points. The implementation of communication was very difficult: it was hard to establish what should be communicated, when, and to whom. Additionally, we could not decide whether communication should include the ability for multiple agents to coordinate a group attack. To reduce threading challenges, we did not use separate threads for each agent. Another challenge that partially led to this scope change was the difficulty of evaluating future rewards; a reinforcement learning algorithm was difficult to implement in this situation. This adjustment to the scope minimizes software engineering issues and allows us to focus on evaluating how changes to the agents' attributes affect their ability to attack.

Other artificial intelligence challenges include testing our system and potential biases we have regarding the agents we built. Obviously, we wanted the Generation II and Generation III agents to perform better than the Generation I agents; however, we found that the most bias came when we were testing the Generation I agents. It was very tedious to stand in one position while the agents kept narrowly missing a couple of remaining body points. After six minutes in each Generation I test, we found it was very easy to move very slightly so that an agent might run into the point sooner. If anything, we biased the results in favor of the Generation I agents. We are not too concerned about this, because the Generation II and Generation III agents' averages outperformed the Generation I agents' average case by more than ninety-five percent. To test our agents, we ran five tests with human subjects standing as still as possible while the agents attacked. It would have been more accurate to perform more tests for each generation without moving and to also conduct tests with a moving subject. We found it was difficult to design fair tests with a moving subject, and the standard deviation in those situations would be very high, even more significant than it was in our documented tests.

8. Possible Extensions and Future Work

8.1 Possible Improvements

Based on our work, we recognize several areas for possible improvements.
First, implementing the cost-benefit analysis differently or using a different algorithm could greatly affect results. These changes could include analyzing both the cost and benefit gained based on energy gain and loss and also by analyzing


A Comparative Study of Structured Light and Laser Range Finding Devices A Comparative Study of Structured Light and Laser Range Finding Devices Todd Bernhard todd.bernhard@colorado.edu Anuraag Chintalapally anuraag.chintalapally@colorado.edu Daniel Zukowski daniel.zukowski@colorado.edu

More information

IMGD 1001: Programming Practices; Artificial Intelligence

IMGD 1001: Programming Practices; Artificial Intelligence IMGD 1001: Programming Practices; Artificial Intelligence Robert W. Lindeman Associate Professor Department of Computer Science Worcester Polytechnic Institute gogo@wpi.edu Outline Common Practices Artificial

More information

Reinforcement Learning Agent for Scrolling Shooter Game

Reinforcement Learning Agent for Scrolling Shooter Game Reinforcement Learning Agent for Scrolling Shooter Game Peng Yuan (pengy@stanford.edu) Yangxin Zhong (yangxin@stanford.edu) Zibo Gong (zibo@stanford.edu) 1 Introduction and Task Definition 1.1 Game Agent

More information

Evolutionary Neural Networks for Non-Player Characters in Quake III

Evolutionary Neural Networks for Non-Player Characters in Quake III Evolutionary Neural Networks for Non-Player Characters in Quake III Joost Westra and Frank Dignum Abstract Designing and implementing the decisions of Non- Player Characters in first person shooter games

More information

Comp 3211 Final Project - Poker AI

Comp 3211 Final Project - Poker AI Comp 3211 Final Project - Poker AI Introduction Poker is a game played with a standard 52 card deck, usually with 4 to 8 players per game. During each hand of poker, players are dealt two cards and must

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Lecture 01 - Introduction Edirlei Soares de Lima What is Artificial Intelligence? Artificial intelligence is about making computers able to perform the

More information

LESSON 6. Finding Key Cards. General Concepts. General Introduction. Group Activities. Sample Deals

LESSON 6. Finding Key Cards. General Concepts. General Introduction. Group Activities. Sample Deals LESSON 6 Finding Key Cards General Concepts General Introduction Group Activities Sample Deals 282 More Commonly Used Conventions in the 21st Century General Concepts Finding Key Cards This is the second

More information

LESSON 2. Opening Leads Against Suit Contracts. General Concepts. General Introduction. Group Activities. Sample Deals

LESSON 2. Opening Leads Against Suit Contracts. General Concepts. General Introduction. Group Activities. Sample Deals LESSON 2 Opening Leads Against Suit Contracts General Concepts General Introduction Group Activities Sample Deals 40 Defense in the 21st Century General Concepts Defense The opening lead against trump

More information

Free Cell Solver. Copyright 2001 Kevin Atkinson Shari Holstege December 11, 2001

Free Cell Solver. Copyright 2001 Kevin Atkinson Shari Holstege December 11, 2001 Free Cell Solver Copyright 2001 Kevin Atkinson Shari Holstege December 11, 2001 Abstract We created an agent that plays the Free Cell version of Solitaire by searching through the space of possible sequences

More information

IMGD 1001: Programming Practices; Artificial Intelligence

IMGD 1001: Programming Practices; Artificial Intelligence IMGD 1001: Programming Practices; Artificial Intelligence by Mark Claypool (claypool@cs.wpi.edu) Robert W. Lindeman (gogo@wpi.edu) Outline Common Practices Artificial Intelligence Claypool and Lindeman,

More information

Chapter 6. Doing the Maths. Premises and Assumptions

Chapter 6. Doing the Maths. Premises and Assumptions Chapter 6 Doing the Maths Premises and Assumptions In my experience maths is a subject that invokes strong passions in people. A great many people love maths and find it intriguing and a great many people

More information

Variance Decomposition and Replication In Scrabble: When You Can Blame Your Tiles?

Variance Decomposition and Replication In Scrabble: When You Can Blame Your Tiles? Variance Decomposition and Replication In Scrabble: When You Can Blame Your Tiles? Andrew C. Thomas December 7, 2017 arxiv:1107.2456v1 [stat.ap] 13 Jul 2011 Abstract In the game of Scrabble, letter tiles

More information

Webinar Module Eight: Companion Guide Putting Referrals Into Action

Webinar Module Eight: Companion Guide Putting Referrals Into Action Webinar Putting Referrals Into Action Welcome back to No More Cold Calling OnDemand TM. Thank you for investing in yourself and building a referral business. This is the companion guide to Module #8. Take

More information

Playing CHIP-8 Games with Reinforcement Learning

Playing CHIP-8 Games with Reinforcement Learning Playing CHIP-8 Games with Reinforcement Learning Niven Achenjang, Patrick DeMichele, Sam Rogers Stanford University Abstract We begin with some background in the history of CHIP-8 games and the use of

More information

Multi-Robot Coordination. Chapter 11

Multi-Robot Coordination. Chapter 11 Multi-Robot Coordination Chapter 11 Objectives To understand some of the problems being studied with multiple robots To understand the challenges involved with coordinating robots To investigate a simple

More information

CS221 Project Final Report Gomoku Game Agent

CS221 Project Final Report Gomoku Game Agent CS221 Project Final Report Gomoku Game Agent Qiao Tan qtan@stanford.edu Xiaoti Hu xiaotihu@stanford.edu 1 Introduction Gomoku, also know as five-in-a-row, is a strategy board game which is traditionally

More information

An Empirical Evaluation of Policy Rollout for Clue

An Empirical Evaluation of Policy Rollout for Clue An Empirical Evaluation of Policy Rollout for Clue Eric Marshall Oregon State University M.S. Final Project marshaer@oregonstate.edu Adviser: Professor Alan Fern Abstract We model the popular board game

More information

LESSON 4. Second-Hand Play. General Concepts. General Introduction. Group Activities. Sample Deals

LESSON 4. Second-Hand Play. General Concepts. General Introduction. Group Activities. Sample Deals LESSON 4 Second-Hand Play General Concepts General Introduction Group Activities Sample Deals 110 Defense in the 21st Century General Concepts Defense Second-hand play Second hand plays low to: Conserve

More information

When placed on Towers, Player Marker L-Hexes show ownership of that Tower and indicate the Level of that Tower. At Level 1, orient the L-Hex

When placed on Towers, Player Marker L-Hexes show ownership of that Tower and indicate the Level of that Tower. At Level 1, orient the L-Hex Tower Defense Players: 1-4. Playtime: 60-90 Minutes (approximately 10 minutes per Wave). Recommended Age: 10+ Genre: Turn-based strategy. Resource management. Tile-based. Campaign scenarios. Sandbox mode.

More information

Measure simulated forces of impact on a human head, and test if forces are reduced by wearing a protective headgear.

Measure simulated forces of impact on a human head, and test if forces are reduced by wearing a protective headgear. PocketLab Science Fair Kit: Preventing Concussions and Head Injuries This STEM Science Fair Kit lets you be a scientist and simulate real world accidents and injuries with a crash test style dummy head.

More information

Hierarchical Controller for Robotic Soccer

Hierarchical Controller for Robotic Soccer Hierarchical Controller for Robotic Soccer Byron Knoll Cognitive Systems 402 April 13, 2008 ABSTRACT RoboCup is an initiative aimed at advancing Artificial Intelligence (AI) and robotics research. This

More information

Swarm Robotics. Clustering and Sorting

Swarm Robotics. Clustering and Sorting Swarm Robotics Clustering and Sorting By Andrew Vardy Associate Professor Computer Science / Engineering Memorial University of Newfoundland St. John s, Canada Deneubourg JL, Goss S, Franks N, Sendova-Franks

More information

TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS. Thomas Keller and Malte Helmert Presented by: Ryan Berryhill

TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS. Thomas Keller and Malte Helmert Presented by: Ryan Berryhill TRIAL-BASED HEURISTIC TREE SEARCH FOR FINITE HORIZON MDPS Thomas Keller and Malte Helmert Presented by: Ryan Berryhill Outline Motivation Background THTS framework THTS algorithms Results Motivation Advances

More information

No Cost Online Marketing

No Cost Online Marketing No Cost Online Marketing No matter what type of Internet business you have, you need to be promoting it at all times. If you don t make the effort to tell the right people about it (i.e. those people who

More information

Experiments on Alternatives to Minimax

Experiments on Alternatives to Minimax Experiments on Alternatives to Minimax Dana Nau University of Maryland Paul Purdom Indiana University April 23, 1993 Chun-Hung Tzeng Ball State University Abstract In the field of Artificial Intelligence,

More information

Approaching The Royal Game of Ur with Genetic Algorithms and ExpectiMax

Approaching The Royal Game of Ur with Genetic Algorithms and ExpectiMax Approaching The Royal Game of Ur with Genetic Algorithms and ExpectiMax Tang, Marco Kwan Ho (20306981) Tse, Wai Ho (20355528) Zhao, Vincent Ruidong (20233835) Yap, Alistair Yun Hee (20306450) Introduction

More information

Optimal Yahtzee A COMPARISON BETWEEN DIFFERENT ALGORITHMS FOR PLAYING YAHTZEE DANIEL JENDEBERG, LOUISE WIKSTÉN STOCKHOLM, SWEDEN 2015

Optimal Yahtzee A COMPARISON BETWEEN DIFFERENT ALGORITHMS FOR PLAYING YAHTZEE DANIEL JENDEBERG, LOUISE WIKSTÉN STOCKHOLM, SWEDEN 2015 DEGREE PROJECT, IN COMPUTER SCIENCE, FIRST LEVEL STOCKHOLM, SWEDEN 2015 Optimal Yahtzee A COMPARISON BETWEEN DIFFERENT ALGORITHMS FOR PLAYING YAHTZEE DANIEL JENDEBERG, LOUISE WIKSTÉN KTH ROYAL INSTITUTE

More information

Infrastructure for Systematic Innovation Enterprise

Infrastructure for Systematic Innovation Enterprise Valeri Souchkov ICG www.xtriz.com This article discusses why automation still fails to increase innovative capabilities of organizations and proposes a systematic innovation infrastructure to improve innovation

More information

Evolving robots to play dodgeball

Evolving robots to play dodgeball Evolving robots to play dodgeball Uriel Mandujano and Daniel Redelmeier Abstract In nearly all videogames, creating smart and complex artificial agents helps ensure an enjoyable and challenging player

More information

INTRODUCTION TO GAME AI

INTRODUCTION TO GAME AI CS 387: GAME AI INTRODUCTION TO GAME AI 3/31/2016 Instructor: Santiago Ontañón santi@cs.drexel.edu Class website: https://www.cs.drexel.edu/~santi/teaching/2016/cs387/intro.html Outline Game Engines Perception

More information

Agent Smith: An Application of Neural Networks to Directing Intelligent Agents in a Game Environment

Agent Smith: An Application of Neural Networks to Directing Intelligent Agents in a Game Environment Agent Smith: An Application of Neural Networks to Directing Intelligent Agents in a Game Environment Jonathan Wolf Tyler Haugen Dr. Antonette Logar South Dakota School of Mines and Technology Math and

More information

Introduction to Game Design. Truong Tuan Anh CSE-HCMUT

Introduction to Game Design. Truong Tuan Anh CSE-HCMUT Introduction to Game Design Truong Tuan Anh CSE-HCMUT Games Games are actually complex applications: interactive real-time simulations of complicated worlds multiple agents and interactions game entities

More information

Artificial Intelligence for Games

Artificial Intelligence for Games Artificial Intelligence for Games CSC404: Video Game Design Elias Adum Let s talk about AI Artificial Intelligence AI is the field of creating intelligent behaviour in machines. Intelligence understood

More information

ECON 312: Games and Strategy 1. Industrial Organization Games and Strategy

ECON 312: Games and Strategy 1. Industrial Organization Games and Strategy ECON 312: Games and Strategy 1 Industrial Organization Games and Strategy A Game is a stylized model that depicts situation of strategic behavior, where the payoff for one agent depends on its own actions

More information

Strategic and Tactical Reasoning with Waypoints Lars Lidén Valve Software

Strategic and Tactical Reasoning with Waypoints Lars Lidén Valve Software Strategic and Tactical Reasoning with Waypoints Lars Lidén Valve Software lars@valvesoftware.com For the behavior of computer controlled characters to become more sophisticated, efficient algorithms are

More information

Game Playing for a Variant of Mancala Board Game (Pallanguzhi)

Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Game Playing for a Variant of Mancala Board Game (Pallanguzhi) Varsha Sankar (SUNet ID: svarsha) 1. INTRODUCTION Game playing is a very interesting area in the field of Artificial Intelligence presently.

More information

CRYPTOSHOOTER MULTI AGENT BASED SECRET COMMUNICATION IN AUGMENTED VIRTUALITY

CRYPTOSHOOTER MULTI AGENT BASED SECRET COMMUNICATION IN AUGMENTED VIRTUALITY CRYPTOSHOOTER MULTI AGENT BASED SECRET COMMUNICATION IN AUGMENTED VIRTUALITY Submitted By: Sahil Narang, Sarah J Andrabi PROJECT IDEA The main idea for the project is to create a pursuit and evade crowd

More information

Outline. Agents and environments Rationality PEAS (Performance measure, Environment, Actuators, Sensors) Environment types Agent types

Outline. Agents and environments Rationality PEAS (Performance measure, Environment, Actuators, Sensors) Environment types Agent types Intelligent Agents Outline Agents and environments Rationality PEAS (Performance measure, Environment, Actuators, Sensors) Environment types Agent types Agents An agent is anything that can be viewed as

More information

Discussion of Emergent Strategy

Discussion of Emergent Strategy Discussion of Emergent Strategy When Ants Play Chess Mark Jenne and David Pick Presentation Overview Introduction to strategy Previous work on emergent strategies Pengi N-puzzle Sociogenesis in MANTA colonies

More information

CYCLIC GENETIC ALGORITHMS FOR EVOLVING MULTI-LOOP CONTROL PROGRAMS

CYCLIC GENETIC ALGORITHMS FOR EVOLVING MULTI-LOOP CONTROL PROGRAMS CYCLIC GENETIC ALGORITHMS FOR EVOLVING MULTI-LOOP CONTROL PROGRAMS GARY B. PARKER, CONNECTICUT COLLEGE, USA, parker@conncoll.edu IVO I. PARASHKEVOV, CONNECTICUT COLLEGE, USA, iipar@conncoll.edu H. JOSEPH

More information

TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS

TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS TEMPORAL DIFFERENCE LEARNING IN CHINESE CHESS Thong B. Trinh, Anwer S. Bashi, Nikhil Deshpande Department of Electrical Engineering University of New Orleans New Orleans, LA 70148 Tel: (504) 280-7383 Fax:

More information

Playing Othello Using Monte Carlo

Playing Othello Using Monte Carlo June 22, 2007 Abstract This paper deals with the construction of an AI player to play the game Othello. A lot of techniques are already known to let AI players play the game Othello. Some of these techniques

More information

Optimal Yahtzee performance in multi-player games

Optimal Yahtzee performance in multi-player games Optimal Yahtzee performance in multi-player games Andreas Serra aserra@kth.se Kai Widell Niigata kaiwn@kth.se April 12, 2013 Abstract Yahtzee is a game with a moderately large search space, dependent on

More information

Five-In-Row with Local Evaluation and Beam Search

Five-In-Row with Local Evaluation and Beam Search Five-In-Row with Local Evaluation and Beam Search Jiun-Hung Chen and Adrienne X. Wang jhchen@cs axwang@cs Abstract This report provides a brief overview of the game of five-in-row, also known as Go-Moku,

More information

The five possible actions from which a player may choose on every turn are:

The five possible actions from which a player may choose on every turn are: How To Play The Object: The aim of Wyoming Cowboy is to reach 500 points. The first player to do so wins. If multiple players pass the 500 point mark in the same hand, the player with the highest score

More information

How Representation of Game Information Affects Player Performance

How Representation of Game Information Affects Player Performance How Representation of Game Information Affects Player Performance Matthew Paul Bryan June 2018 Senior Project Computer Science Department California Polytechnic State University Table of Contents Abstract

More information

An Experimental Comparison of Path Planning Techniques for Teams of Mobile Robots

An Experimental Comparison of Path Planning Techniques for Teams of Mobile Robots An Experimental Comparison of Path Planning Techniques for Teams of Mobile Robots Maren Bennewitz Wolfram Burgard Department of Computer Science, University of Freiburg, 7911 Freiburg, Germany maren,burgard

More information

A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures

A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures A Robust Neural Robot Navigation Using a Combination of Deliberative and Reactive Control Architectures D.M. Rojas Castro, A. Revel and M. Ménard * Laboratory of Informatics, Image and Interaction (L3I)

More information

Comparison of Monte Carlo Tree Search Methods in the Imperfect Information Card Game Cribbage

Comparison of Monte Carlo Tree Search Methods in the Imperfect Information Card Game Cribbage Comparison of Monte Carlo Tree Search Methods in the Imperfect Information Card Game Cribbage Richard Kelly and David Churchill Computer Science Faculty of Science Memorial University {richard.kelly, dchurchill}@mun.ca

More information

Artificial Intelligence. Cameron Jett, William Kentris, Arthur Mo, Juan Roman

Artificial Intelligence. Cameron Jett, William Kentris, Arthur Mo, Juan Roman Artificial Intelligence Cameron Jett, William Kentris, Arthur Mo, Juan Roman AI Outline Handicap for AI Machine Learning Monte Carlo Methods Group Intelligence Incorporating stupidity into game AI overview

More information

LESSON 8. Putting It All Together. General Concepts. General Introduction. Group Activities. Sample Deals

LESSON 8. Putting It All Together. General Concepts. General Introduction. Group Activities. Sample Deals LESSON 8 Putting It All Together General Concepts General Introduction Group Activities Sample Deals 198 Lesson 8 Putting it all Together GENERAL CONCEPTS Play of the Hand Combining techniques Promotion,

More information

Lesson 2. Overcalls and Advances

Lesson 2. Overcalls and Advances Lesson 2 Overcalls and Advances Lesson Two: Overcalls and Advances Preparation On Each Table: At Registration Desk: Class Organization: Teacher Tools: BETTER BRIDGE GUIDE CARD (see Appendix); Bidding Boxes;

More information

GMAT Timing Strategy Guide

GMAT Timing Strategy Guide GMAT Timing Strategy Guide Don t Let Timing Issues Keep You from Scoring 700+ on the GMAT! By GMAT tutor Jeff Yin, Ph.D. Why Focus on Timing Strategy? Have you already put a ton of hours into your GMAT

More information

Implicit Fitness Functions for Evolving a Drawing Robot

Implicit Fitness Functions for Evolving a Drawing Robot Implicit Fitness Functions for Evolving a Drawing Robot Jon Bird, Phil Husbands, Martin Perris, Bill Bigge and Paul Brown Centre for Computational Neuroscience and Robotics University of Sussex, Brighton,

More information

IMPROVING TOWER DEFENSE GAME AI (DIFFERENTIAL EVOLUTION VS EVOLUTIONARY PROGRAMMING) CHEAH KEEI YUAN

IMPROVING TOWER DEFENSE GAME AI (DIFFERENTIAL EVOLUTION VS EVOLUTIONARY PROGRAMMING) CHEAH KEEI YUAN IMPROVING TOWER DEFENSE GAME AI (DIFFERENTIAL EVOLUTION VS EVOLUTIONARY PROGRAMMING) CHEAH KEEI YUAN FACULTY OF COMPUTING AND INFORMATICS UNIVERSITY MALAYSIA SABAH 2014 ABSTRACT The use of Artificial Intelligence

More information

Reinforcement Learning Applied to a Game of Deceit

Reinforcement Learning Applied to a Game of Deceit Reinforcement Learning Applied to a Game of Deceit Theory and Reinforcement Learning Hana Lee leehana@stanford.edu December 15, 2017 Figure 1: Skull and flower tiles from the game of Skull. 1 Introduction

More information

Online Interactive Neuro-evolution

Online Interactive Neuro-evolution Appears in Neural Processing Letters, 1999. Online Interactive Neuro-evolution Adrian Agogino (agogino@ece.utexas.edu) Kenneth Stanley (kstanley@cs.utexas.edu) Risto Miikkulainen (risto@cs.utexas.edu)

More information

1

1 http://www.songwriting-secrets.net/letter.html 1 Praise for How To Write Your Best Album In One Month Or Less I wrote and recorded my first album of 8 songs in about six weeks. Keep in mind I'm including

More information

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts

Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Traffic Control for a Swarm of Robots: Avoiding Group Conflicts Leandro Soriano Marcolino and Luiz Chaimowicz Abstract A very common problem in the navigation of robotic swarms is when groups of robots

More information

Informatics 2D: Tutorial 1 (Solutions)

Informatics 2D: Tutorial 1 (Solutions) Informatics 2D: Tutorial 1 (Solutions) Agents, Environment, Search Week 2 1 Agents and Environments Consider the following agents: A robot vacuum cleaner which follows a pre-set route around a house and

More information

AN ABSTRACT OF THE THESIS OF

AN ABSTRACT OF THE THESIS OF AN ABSTRACT OF THE THESIS OF Jason Aaron Greco for the degree of Honors Baccalaureate of Science in Computer Science presented on August 19, 2010. Title: Automatically Generating Solutions for Sokoban

More information

Neural Networks for Real-time Pathfinding in Computer Games

Neural Networks for Real-time Pathfinding in Computer Games Neural Networks for Real-time Pathfinding in Computer Games Ross Graham 1, Hugh McCabe 1 & Stephen Sheridan 1 1 School of Informatics and Engineering, Institute of Technology at Blanchardstown, Dublin

More information

Traffic Control for a Swarm of Robots: Avoiding Target Congestion

Traffic Control for a Swarm of Robots: Avoiding Target Congestion Traffic Control for a Swarm of Robots: Avoiding Target Congestion Leandro Soriano Marcolino and Luiz Chaimowicz Abstract One of the main problems in the navigation of robotic swarms is when several robots

More information

Creating a Dominion AI Using Genetic Algorithms

Creating a Dominion AI Using Genetic Algorithms Creating a Dominion AI Using Genetic Algorithms Abstract Mok Ming Foong Dominion is a deck-building card game. It allows for complex strategies, has an aspect of randomness in card drawing, and no obvious

More information

Appendix A A Primer in Game Theory

Appendix A A Primer in Game Theory Appendix A A Primer in Game Theory This presentation of the main ideas and concepts of game theory required to understand the discussion in this book is intended for readers without previous exposure to

More information

Visually Directing the Player Joshua Nuernberger

Visually Directing the Player Joshua Nuernberger Visually Directing the Player Joshua Nuernberger Joshua Nuernberger is a Design Media Arts student at UCLA who is interested in illustration, narrative, film, and gaming. His work has been featured in

More information

CSCI 445 Laurent Itti. Group Robotics. Introduction to Robotics L. Itti & M. J. Mataric 1

CSCI 445 Laurent Itti. Group Robotics. Introduction to Robotics L. Itti & M. J. Mataric 1 Introduction to Robotics CSCI 445 Laurent Itti Group Robotics Introduction to Robotics L. Itti & M. J. Mataric 1 Today s Lecture Outline Defining group behavior Why group behavior is useful Why group behavior

More information

-opoly cash simulation

-opoly cash simulation DETERMINING THE PATTERNS AND IMPACT OF NATURAL PROPERTY GROUP DEVELOPMENT IN -OPOLY TYPE GAMES THROUGH COMPUTER SIMULATION Chuck Leska, Department of Computer Science, cleska@rmc.edu, (804) 752-3158 Edward

More information

Federico Forti, Erdi Izgi, Varalika Rathore, Francesco Forti

Federico Forti, Erdi Izgi, Varalika Rathore, Francesco Forti Basic Information Project Name Supervisor Kung-fu Plants Jakub Gemrot Annotation Kung-fu plants is a game where you can create your characters, train them and fight against the other chemical plants which

More information

CONCEPTS EXPLAINED CONCEPTS (IN ORDER)

CONCEPTS EXPLAINED CONCEPTS (IN ORDER) CONCEPTS EXPLAINED This reference is a companion to the Tutorials for the purpose of providing deeper explanations of concepts related to game designing and building. This reference will be updated with

More information

Alternation in the repeated Battle of the Sexes

Alternation in the repeated Battle of the Sexes Alternation in the repeated Battle of the Sexes Aaron Andalman & Charles Kemp 9.29, Spring 2004 MIT Abstract Traditional game-theoretic models consider only stage-game strategies. Alternation in the repeated

More information

FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms

FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms FreeCiv Learner: A Machine Learning Project Utilizing Genetic Algorithms Felix Arnold, Bryan Horvat, Albert Sacks Department of Computer Science Georgia Institute of Technology Atlanta, GA 30318 farnold3@gatech.edu

More information

Curiosity as a Survival Technique

Curiosity as a Survival Technique Curiosity as a Survival Technique Amber Viescas Department of Computer Science Swarthmore College Swarthmore, PA 19081 aviesca1@cs.swarthmore.edu Anne-Marie Frassica Department of Computer Science Swarthmore

More information

Dan Heisman. Is Your Move Safe? Boston

Dan Heisman. Is Your Move Safe? Boston Dan Heisman Is Your Move Safe? Boston Contents Acknowledgements 7 Symbols 8 Introduction 9 Chapter 1: Basic Safety Issues 25 Answers for Chapter 1 33 Chapter 2: Openings 51 Answers for Chapter 2 73 Chapter

More information

A Numerical Approach to Understanding Oscillator Neural Networks

A Numerical Approach to Understanding Oscillator Neural Networks A Numerical Approach to Understanding Oscillator Neural Networks Natalie Klein Mentored by Jon Wilkins Networks of coupled oscillators are a form of dynamical network originally inspired by various biological

More information

Player Speed vs. Wild Pokémon Encounter Frequency in Pokémon SoulSilver Joshua and AP Statistics, pd. 3B

Player Speed vs. Wild Pokémon Encounter Frequency in Pokémon SoulSilver Joshua and AP Statistics, pd. 3B Player Speed vs. Wild Pokémon Encounter Frequency in Pokémon SoulSilver Joshua and AP Statistics, pd. 3B In the newest iterations of Nintendo s famous Pokémon franchise, Pokémon HeartGold and SoulSilver

More information

! The architecture of the robot control system! Also maybe some aspects of its body/motors/sensors

! The architecture of the robot control system! Also maybe some aspects of its body/motors/sensors Towards the more concrete end of the Alife spectrum is robotics. Alife -- because it is the attempt to synthesise -- at some level -- 'lifelike behaviour. AI is often associated with a particular style

More information

The Settlers of Catan Strategy Guide

The Settlers of Catan Strategy Guide The Settlers of Catan Strategy Guide DISCLAIMER: Use of the strategies described in this guide do not guarantee winning any specific game of The Settlers of Catan. The dice may not obey the law of averages

More information

the gamedesigninitiative at cornell university Lecture 23 Strategic AI

the gamedesigninitiative at cornell university Lecture 23 Strategic AI Lecture 23 Role of AI in Games Autonomous Characters (NPCs) Mimics personality of character May be opponent or support character Strategic Opponents AI at player level Closest to classical AI Character

More information