An AI for Slither.io

Jackie Yang (jackiey)

Introduction

Game playing is a very interesting topic in artificial intelligence today. Most recently emerging game-playing AIs target turn-based games, like the very popular AlphaGo (Silver et al. 2016), or solo and few-player games, like the Breakout AI (Mnih et al. 2015). However, many other interesting games do not fall into these categories, such as StarCraft. These games offer real-time gameplay with a potentially massive number of participating players, which makes them much harder to solve. In this project, I focus on a very confined and easily modeled game that still has all of these under-explored features.

Slither.io is a massively multiplayer browser game developed by Steve Howse ("Slither.io on the App Store"). The player controls a snake by changing its moving direction and eats food, represented by colored dots on the map, to grow longer, which is the goal of the game. When one player's snake runs into another player's snake, the snake dies and is converted into food dots. The game creates intense competition between players, who try to make other players' snakes run into their bodies.

Task Definition

For this project, the objective is to create an AI that controls the snake (shown at the center of Figure 1), keeps eating, and avoids danger. Specifically, the input is all the food around the snake avatar (shown as colored dots around the user's snake in Figure 1) and all the danger (nearby snakes) around it, while the output is the direction the snake should move. The metric of success is the average maximum length reached by the AI-controlled snake.

Baseline and Future Improvement

I consider the baseline for this project to be a snake that wanders around in random directions. The oracle is quite straightforward: become the longest player in the arena.
Figure 1: Slither.io

There is a real-time leaderboard in the game that lists the top 10 longest players currently playing in the arena.

Related Work

I have found some related work on creating a Slither.io AI. Slither.io-bot is a very popular implementation. It uses rules to pick an action for the snake, and those rules can be divided into two parts: food finding and collision avoidance. Food finding simply finds the nearest food and heads toward it. Collision avoidance triggers once the bot detects any other snake entering a circle defined by a static parameter; the bot then turns the other way to avoid the collision.

Currently, this rule-based AI is nowhere near human players. The major reason is that it can easily be killed by other snakes that surround it with their bodies: as long as the surrounding circle is larger than the rule-based AI's threshold, the AI will not react. My solution is to use a machine learning algorithm to dynamically tune that parameter, avoiding this situation without becoming too timid. The training could be done with a reinforcement learning algorithm such as Q-learning, and I would use a simple predictor such as a linear classifier to keep the AI fast. A sketch of the underlying rule-based policy follows.
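The following is a minimal sketch, under my own assumptions about coordinates and the threshold value, of the two rules described above (nearest-food seeking plus a fixed avoidance radius); it is not the actual Slither.io-bot code.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) relative to our snake's head

def rule_based_heading(foods: List[Point], dangers: List[Point],
                       avoid_radius: float = 150.0) -> float:
    """Return a steering angle (radians): flee the nearest danger if it is
    inside the avoidance circle, otherwise head toward the nearest food."""
    if dangers:
        nearest_danger = min(dangers, key=lambda p: math.hypot(*p))
        if math.hypot(*nearest_danger) < avoid_radius:
            # Turn the other way: head directly away from the danger.
            return math.atan2(-nearest_danger[1], -nearest_danger[0])
    if foods:
        nearest_food = min(foods, key=lambda p: math.hypot(*p))
        return math.atan2(nearest_food[1], nearest_food[0])
    return 0.0  # nothing nearby: keep going straight
```

The static avoid_radius here is exactly the parameter the project proposes to tune dynamically with learning.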
Infrastructure

The infrastructure running this AI consists of a browser, a browser-automation script, and the AI itself, and it is already built. The input data looks like this:

collisionpoints = '0' : { 'xx' : , 'yy' : , 'snake' :

which shows all the obstacles and also the length of the current snake. The output is just the parameters for the rule-based AI.

Approach (new)

As I worked on this project throughout the semester, I gradually realized that this is a rather difficult problem to tackle. I have tried quite a few approaches. I will first discuss the general infrastructure shared by all of these designs and then describe each AI that I tried.

General

The general approach I planned for this problem is Q-learning with a neural network, as in (Mnih et al. 2015). The Q-learning update is:

$$\hat{Q}_{\mathrm{opt}}(s, a) \leftarrow (1 - \eta)\,\underbrace{\hat{Q}_{\mathrm{opt}}(s, a)}_{\text{prediction}} + \eta\,\underbrace{\bigl(r + \gamma \hat{V}_{\mathrm{opt}}(s')\bigr)}_{\text{target}}$$

where

$$\hat{V}_{\mathrm{opt}}(s') = \max_{a' \in \mathrm{Actions}(s')} \hat{Q}_{\mathrm{opt}}(s', a')$$

Because s has a much larger space here than in most Q-learning settings, I want to replace the table Q_opt(s, a) with a neural network. Q_opt(s, a) is then updated with stochastic gradient descent, so the algorithm becomes: train Q_opt on the input (s, a) toward the target value

$$(1 - \eta)\,\underbrace{\hat{Q}_{\mathrm{opt}}(s, a)}_{\text{prediction}} + \eta\,\underbrace{\bigl(r + \gamma \hat{V}_{\mathrm{opt}}(s')\bigr)}_{\text{target}}$$

where \hat{V}_{\mathrm{opt}}(s') is defined as above. A small sketch of this training step is given below.
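As an illustration of the update above, here is a minimal PyTorch-style sketch of training a neural-network Q toward that target value; the network shape, learning rate, and feature size are assumptions for illustration, not the project's actual configuration.

```python
import torch
import torch.nn as nn

# Small fully connected network approximating Q(s, a) for a fixed set of actions.
# STATE_DIM and N_ACTIONS are illustrative; the real feature vectors are described later.
STATE_DIM, N_ACTIONS = 112, 16
q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, N_ACTIONS))
optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-3)

def q_update(s, a, r, s_next, eta=0.5, gamma=0.95):
    """One SGD step toward (1 - eta) * Q(s, a) + eta * (r + gamma * max_a' Q(s', a')).
    s and s_next are 1-D float tensors of length STATE_DIM; a is an action index."""
    with torch.no_grad():
        prediction = q_net(s)[a]          # current estimate Q(s, a)
        v_next = q_net(s_next).max()      # V(s') = max_a' Q(s', a')
        target = (1 - eta) * prediction + eta * (r + gamma * v_next)
    loss = (q_net(s)[a] - target) ** 2    # squared error against the fixed target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```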
However, there are still several difficulties to solve:

1. The competing-with-humans nature of this game makes gameplay really slow; how can training be sped up?
2. How can a continuous action be produced from the discrete policy given by Q-learning?
3. How should a proper feature vector be selected?

The first problem is shared by all of the AIs I designed. The human part of this game makes gameplay slow and significantly slows down training. For example, if we treat 0.5 seconds of in-game time as one turn for the AI, we can only generate a total of 24 * 60 * 60 / 0.5 = 172,800 samples per day. I solve this problem by letting the AI play several Slither.io games simultaneously with a shared predictor. All of those AI instances evaluate the situation and choose actions using a single predictor, and they all feed their feedback back into that same predictor, improving the policy (actually, the function estimate of Q) collaboratively.

Another problem is that training is slow, and because of the online, multiplayer nature of Slither.io, we cannot pause the game while the predictor is training. I solve this by building two identical models: the game uses one of them to predict the best action while the other receives feedback in a separate thread. After each epoch, the program swaps the two models atomically to avoid threading hazards and then copies the new parameters into the old model asynchronously, so that the most recently trained model produces actions while the just-swapped-out model accepts new feedback. Using this parallel method, the human-player bottleneck can be overcome without building a self-play system, which could be inaccurate and would not represent how human players actually play.

For the remaining problems, I designed several AIs:

Rule-based AI with Q-learning parameter tuning (Method 1)

This idea emerged while I was thinking about the second problem: how to play a continuous game when Q-learning is turn-based. I figured I could use a rule-based AI for continuous control and Q-learning to tune the parameters of that rule-based AI. So I built an AI in which the rule-based part handles short-term strategy while the reinforcement-learning part gives high-level strategic guidance. Specifically, I adopt the same rule-based AI described in Related Work and extract two parameters from it as the high-level instruction: radiusavoidsize and fastresponsesize, which control how close danger must be before the snake avoids it and how close it must be before the snake performs an emergency maneuver to avoid prominent danger. A sketch of how these two parameters could serve as the Q-learning action is shown below.
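A minimal sketch of this two-level design follows, under my own assumptions about the candidate parameter values and using epsilon-greedy exploration as an illustrative choice: the Q-learning layer picks one of a small discrete set of (radiusavoidsize, fastresponsesize) pairs each turn, and the rule-based controller then steers continuously with those values.

```python
import itertools
import random

# Hypothetical candidate values; the report does not list the actual grid.
AVOID_RADII = [100.0, 150.0, 200.0, 300.0]
FAST_RESPONSE_RADII = [40.0, 60.0, 80.0]
ACTIONS = list(itertools.product(AVOID_RADII, FAST_RESPONSE_RADII))  # 12 discrete actions

def choose_parameters(q_values, epsilon=0.1):
    """Epsilon-greedy choice of (radiusavoidsize, fastresponsesize) for the next turn;
    q_values[i] is the predictor's estimate for ACTIONS[i]."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
    return ACTIONS[best]

# Each turn, the chosen pair is handed to the rule-based controller,
# which uses it for every frame until the next turn.
```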
Notably, I divided gameplay into 2-second turns for the Q-learning AI.

To design a proper feature vector, I focused on representing the game state as well as possible so the AI can decide what those parameters should be. I selected the most important factors for that decision: the distances of the 10 nearest dangers and their directions. As the snake grows larger, it makes sense to avoid danger at a greater range, since a large snake is harder to turn around. So I also include the snake's length as an input, both as a feature for the neural-network predictor and as part of the reward calculation.

Q-learning AI with hand-extracted features and neural-network function approximation (Method 2)

After a few days of training and exploring, I found that the improvement was not very prominent, and it is hard to distinguish improvement because of the many uncertainties in each game. I chose to build an AI from the ground up using Q-learning. To make the AI as responsive as possible, I reduced its turn time to 0.5 seconds. With the experience from the previous AI and another few tens of days of training, I came up with a new feature vector. It not only takes food locations into consideration but also greatly improves interpretability. I realized that in the previous implementation the function approximator would have a hard time discovering the relationship between the dangers' directions and their distances. In the new design, I divide the map into 16 directions around the snake and build an array of the distance to the nearest threat in each of those 16 directions. To further improve the AI, I build a 16 x 16 matrix describing the relationship between those directions: whether the nearest danger in each pair of directions belongs to the same snake. In this way, the AI has an idea of where the danger is and where each snake is. During training, I also noticed that the AI sometimes tries to kill other snakes but, for lack of information, goes for their tail instead of their head. So I added another 16-element Boolean vector indicating whether the danger in each direction is a snake head. For the food vector, I further divide each of the 16 directions into 4 distance bands, giving a 64-element vector of the amount of food in each direction and distance range. A sketch of this feature vector is shown below.
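The following is a minimal sketch of how such a feature vector could be assembled; the layout (16 + 256 + 16 + 64 = 352 elements) follows the description above, while the helper names, normalization, and the per-danger snake-id bookkeeping are my own assumptions.

```python
import math
from typing import List, Optional, Tuple

N_DIR = 16          # map divided into 16 directions around the snake's head
N_FOOD_BANDS = 4    # each direction further split into 4 distance bands

# A danger point: (x, y) relative to the head, the id of the snake it belongs to,
# and whether it is that snake's head. These fields are assumptions for illustration.
Danger = Tuple[float, float, int, bool]
Food = Tuple[float, float]

def direction_bin(x: float, y: float) -> int:
    angle = math.atan2(y, x) % (2 * math.pi)
    return int(angle / (2 * math.pi) * N_DIR) % N_DIR

def build_features(dangers: List[Danger], foods: List[Food],
                   max_dist: float = 2000.0) -> List[float]:
    nearest: List[Optional[Tuple[float, int, bool]]] = [None] * N_DIR
    for x, y, snake_id, is_head in dangers:
        b, d = direction_bin(x, y), math.hypot(x, y)
        if nearest[b] is None or d < nearest[b][0]:
            nearest[b] = (d, snake_id, is_head)

    dist_vec = [(n[0] / max_dist if n else 1.0) for n in nearest]           # 16
    same_snake = [1.0 if nearest[i] and nearest[j] and nearest[i][1] == nearest[j][1]
                  else 0.0 for i in range(N_DIR) for j in range(N_DIR)]     # 256
    head_vec = [1.0 if n and n[2] else 0.0 for n in nearest]                # 16

    food_vec = [0.0] * (N_DIR * N_FOOD_BANDS)                               # 64
    for x, y in foods:
        band = min(int(math.hypot(x, y) / max_dist * N_FOOD_BANDS), N_FOOD_BANDS - 1)
        food_vec[direction_bin(x, y) * N_FOOD_BANDS + band] += 1.0

    return dist_vec + same_snake + head_vec + food_vec                      # 352 elements
```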
Q-learning AI with raw image input and convolutional neural networks (Method 3)

Although I tried hard to hand-pick the feature vector in the previous design, the AI is still not as informed as a human player. For example, the picture shown during gameplay reveals whether the player is in a snake-crowded area or not, and this piece of information is very useful to human players when deciding whether to rush wildly or play cautiously. To tackle this, I followed the method of the Breakout AI paper (Mnih et al. 2015), directly feeding raw images to the Q-learning algorithm. As in that paper, I use a convolutional neural network to take advantage of the 2-D matrix-shaped data, and I stack 4 consecutive frames and feed them to the learning algorithm to give it a better sense of velocity. (A sketch of this setup appears after the results below.)

Results

The results are shown below; I use length and turns survived as metrics (the numeric values did not survive the transcription):

                  Baseline   Rule-based AI   Method 1   Method 2   Method 3
Length average
Length stdev
Turns average
Turns stdev

Note that the Rule-based AI and Method 1 use longer turns of 2 seconds; all other turns are 0.5 seconds. It seems that Method 1 did improve on the Rule-based AI; however, it is quite surprising that Methods 2 and 3 performed poorly. I assume this is due to the complexity of the game. A Q-learning-based AI is very unlikely to discover that rushing toward a cluster of food is a good idea, because going into a cluster of food usually gets a poorly performing AI killed, which yields a very high penalty. Meanwhile, a well-trained Q-learning AI is very timid: I observed that the trained AIs in Methods 2 and 3 often rush to a corner of the map and stay there. They never get the chance to move into a crowded area and fight for food with other snakes, simply because of the low probability of random exploration.
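As referenced in the Method 3 description above, here is a minimal sketch of the frame-stacking convolutional Q-network idea; the 84x84 grayscale input and layer sizes follow the general DQN recipe from Mnih et al. (2015) and are assumptions rather than the exact network used in this project.

```python
import torch
import torch.nn as nn

N_ACTIONS = 16  # e.g., one action per steering direction (illustrative)

class ConvQNet(nn.Module):
    """Q-network over a stack of the last 4 grayscale frames (4 x 84 x 84)."""
    def __init__(self, n_actions: int = N_ACTIONS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64 * 7 * 7, 512),
                                  nn.ReLU(), nn.Linear(512, n_actions))

    def forward(self, stacked_frames: torch.Tensor) -> torch.Tensor:
        # stacked_frames: (batch, 4, 84, 84) -> (batch, n_actions) Q-values
        return self.head(self.features(stacked_frames))
```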
Discussion

During training I found two figures very intriguing. Figure 2 shows the loss as a function of training epoch, while Figure 3 shows the snake's length at death during Method 3 training, smoothed with a 100-element sliding window (a small sketch of this smoothing appears at the end of this section).

Figure 2: Loss versus training epoch

Figure 3: Length at each death

These results show that although the loss keeps decreasing, the average score does not keep increasing. I believe this supports my point above: the snake gains better awareness of the situation and knows that crowded places are dangerous, but it never gets enough chances to try different eating tactics and simply gives up on all the crowded places. I think this explains the bad performance of both Method 2 and Method 3.
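For reference, the smoothed curve in Figure 3 could be computed roughly as follows; this is a minimal sketch with the 100-death window stated above, not the project's actual plotting code.

```python
from collections import deque
from typing import Iterable, List

def sliding_average(values: Iterable[float], window: int = 100) -> List[float]:
    """Moving average over the last `window` deaths, as used for Figure 3."""
    buf, out = deque(maxlen=window), []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out

# e.g. smoothed = sliding_average(length_at_each_death)
```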
I can think of a few very preliminary ideas to address this problem in the future:

1. Learn from human players: let the AI watch human players play for a couple of rounds. Hopefully, the AI would learn from the organized tactics of human players and recognize that crowded areas are quite profitable.
2. Learn from opponents: override the webpage so that it renders the scene not only from the snake the bot controls but also from its opponents, showing the situation and movements of other players, and train the function approximation of Q with that. Hopefully, the bot can learn from others' experience.
3. Learn from itself: build a private Slither.io server with only bots in it. Since they all have very bad tactics, the crowded areas might not be as dangerous, and the AI might be able to try out more tactics instead of hiding.

References

Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, et al. 2015. "Human-Level Control Through Deep Reinforcement Learning." Nature 518 (7540): 529-533. Nature Publishing Group.

Silver, David, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, et al. 2016. "Mastering the Game of Go with Deep Neural Networks and Tree Search." Nature 529 (7587): 484-489. Nature Publishing Group.

"Slither.io on the App Store." id ?mt=8.