Project 2: Searching and Learning in Pac-Man

December 3, 2009

1 Quick Facts

- In this project you have to code A* and Q-learning in the game of Pac-Man and answer some questions about your implementation.
- You will have to work in groups of 2 people.
- You have to turn in the source code and the documentation of your project. The documentation you turn in should answer the questions asked in each of the different parts of the project.
- The project is designed to be coded in Java, but you can use other languages [1].
- For any questions, email me at: santi@iiia.csic.es

[1] If you are willing to code your own implementation of Pac-Man (or to look for an alternative one), you can use any other programming language. Another option is to write code connecting the Java version of Pac-Man we provide to your favorite language.

2 Pac-Man

Pac-Man is a Japanese arcade game developed by Namco (now Namco Bandai) and licensed for distribution in the U.S. by Midway, first released in Japan on May 22, 1980. Immensely popular in the United States from its original release to the present day, Pac-Man is universally considered one of the classics of the medium, virtually synonymous with video games, and an icon of 1980s popular culture. Figure 1 shows a screenshot of the original Pac-Man.

[Figure 1: A screenshot of the original Pac-Man.]

In the game, the player controls Pac-Man through a maze, eating dots. When all dots are eaten, Pac-Man is taken to the next stage. Four ghosts (known to most gamers as Blinky, Pinky, Inky, and Clyde) roam the maze, trying to catch Pac-Man. If a ghost touches Pac-Man, a life is lost. When all lives have been lost, the game ends.

Near the corners of the maze are four larger, flashing dots known as energizers or power pellets, which give Pac-Man the temporary ability to eat the ghosts. The ghosts turn deep blue, reverse direction, and usually move more slowly when Pac-Man eats an energizer. When a ghost is eaten, its eyes return to the ghost pen, where it is regenerated in its normal color. Blue ghosts flash white before they become dangerous again, and the amount of time the ghosts remain vulnerable varies from one board to the next, generally becoming shorter as the game progresses. In later stages, the ghosts do not change color at all, but they still reverse direction when an energizer is eaten.

In addition to dots and energizers, bonus items, usually referred to as fruits (though not all of them are fruits), appear near the center of the maze twice per level. These items score extra bonus points when eaten. The items change, and their bonus values increase, throughout the game.

The AI of the original Pac-Man was limited to the movement of the ghosts, and it was very well thought out. However, it used very basic techniques, for two main reasons: first, the hardware on which the game ran could not support any computationally intensive algorithm; and second, if the AI of the ghosts were improved, Pac-Man would become impossible to win. In this project you will implement more advanced AI techniques for Pac-Man, but from the reverse point of view: you will code the AI for Pac-Man himself instead of for the ghosts.

In this project you will have to implement and experiment with a collection of search and learning algorithms. The project is divided into two major parts: a first one where you will experiment with search algorithms, and a second where you will do the same with reinforcement learning algorithms.

3 Preliminaries: Setting Up Your Environment

The file PacManSrc.zip provided to you contains the source code of an open-source Java implementation of Pac-Man designed specifically for testing AI algorithms. Using your favorite Java development environment (Eclipse or NetBeans are recommended), create a project containing the source code we provided. The main class is pacman.Game; running it starts the game. Make sure you can run the game before proceeding with the project.

By default, a very simple Pac-Man AI is coded into the game. You can experiment with the values of the static variables at the top of the pacman.Game file to change the speed of the game (moveTime), the number of ghosts (defaultNumberGhosts), and other parameters that will come in handy for testing your algorithms later in the project.

3.1 Creating a Pac-Man AI

Before starting with the main parts of this project, let us see how to create a simple Pac-Man AI: one so simple that it will just ask Pac-Man to move in a random direction.

1. Create a new class in the package player (name it as you prefer; I will call it RandomPacManPlayer) and make it implement the PacManPlayer interface.

2. Define the method public Move chooseMove(Game game) to look like this (remember to import java.util.Random):

       public Move chooseMove(Game game) {
           Random r = new Random();
           Move[] moves = {Move.LEFT, Move.RIGHT, Move.UP, Move.DOWN};
           return moves[r.nextInt(4)];
       }

3. Now go to the main method in the pacman.Game class and replace the line:

       PacManPlayer pacman = new SimplePacManPlayer();

   with:

       PacManPlayer pacman = new RandomPacManPlayer();

4. Run the game; your random Pac-Man player should now be controlling the game.

4 Part A: Pac-Man Searches Shortest Paths

In this first part, we want you to implement an A* algorithm, which Pac-Man will use to find the optimal path for eating the dots. To keep things simple, we will initially remove the ghosts from the game.

- Implement a first version of the PacManPlayer which at each step uses A* to find the shortest path to a dot in the map, and starts moving in that direction (a sketch of a basic A* loop appears at the end of this document). Design an appropriate heuristic, and make sure it is admissible.
- Implement a second version which at each step uses A* to find the shortest path that eats all the dots in the map (notice that the search space is much, much larger this time). Design an appropriate heuristic, and make sure it is admissible.
- For each PacManPlayer, evaluate the number of nodes it explores during the search (min, max, and average), and compare these figures to what you get when no heuristic is used (just run your PacManPlayer with a heuristic which always returns 0). Notice that the second implementation might take too long to be usable in a real game.
- Optional improvements: why did we decide not to use ghosts? What happens when we add ghosts to the game? Can you describe, or even implement, a strategy which would still use A* and take into account that there are ghosts in the game?

5 Part B: Pac-Man Learns to Play the Complete Game

In this second part, we want you to implement a Q-learning algorithm to help Pac-Man learn how to play the complete game. This time, we will add the ghosts back to the game.

In order to implement Q-learning, the very first thing you need to do is decide on a suitable state space. Notice that if you try to have one state for each possible configuration of the game, you end up with an enormous number of states, and Q-learning would never converge (assuming you could even hold the state table in memory). A possible way to reduce it is to consider just a window around Pac-Man. For example, say we only consider the cells immediately north, south, east, and west of Pac-Man, and assume each cell can be one of: empty, a wall, a dot, or a ghost (there might be a ghost on top of a dot, but let us ignore that for now). With that representation we would only have 4^4 = 256 states. This is better, but it might be too little information to lead to any interesting strategy. Can you come up with a state representation which captures the information useful for playing a good game of Pac-Man, but which generates a small number of states? (Try to stay in the thousands range.) Define a representation where each state is identified by a unique integer.
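To make "identified by a unique integer" concrete, here is a hypothetical encoding for the four-cell window described above; the cell-content codes are assumptions of this sketch, and mapping the provided State class onto them is up to you:

    public class StateEncoding {
        // Illustrative codes for what a window cell may contain.
        public static final int EMPTY = 0, WALL = 1, DOT = 2, GHOST = 3;

        // Packs the contents of the four cells around Pac-Man (each a
        // code in 0..3) into a single base-4 integer in 0..255, so every
        // window configuration gets a unique state id.
        public static int encodeState(int north, int south, int east, int west) {
            return ((north * 4 + south) * 4 + east) * 4 + west;
        }
    }

A richer representation (for example, the direction of the nearest dot, or the distance to the nearest ghost) can be packed into an integer the same way, as long as the total number of combinations stays small.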

Create a new PacManPlayer which will use reinforcement learning to learn how to play the game. Once you have decided on a suitable state representation, you need to define a function which, given an instance of the State class, returns the integer of the state it corresponds to.

Decide on a proper reinforcement signal: for instance, +1 for eating a dot and -1 for being killed by a ghost (design your own; we are sure you can do better than this).

Implement the learning operations of reinforcement learning over your state representation using Q-learning (a sketch of the tabular update appears at the end of this document). Notice that initially Pac-Man will play really badly, since it has not learned anything yet.

Modify the game so that it lets your new PacManPlayer play multiple games in a row, and make sure that the Q-table learned in one game is transferred to the next (so that Pac-Man keeps learning with every game it plays). Make Pac-Man play a sequence of games, and record how many points it manages to score in each of them before being killed. Notice that Pac-Man might need hundreds (or even thousands) of games to learn, depending on how good the state representation you selected is.

Optional improvements: do you think the ghosts are smart? Why not apply Q-learning to the control of the ghosts as well? What happens when both Pac-Man and the ghosts use reinforcement learning at the same time? Do they converge to a stable behavior? Does Pac-Man manage to complete the game, or do the ghosts manage to always eat Pac-Man?

6 Part C: Final Questions

Compare the A* approach with the reinforcement learning approach. What are the benefits and drawbacks of each?

Pac-Man is an apparently simple domain. Do you see these techniques as applicable to larger domains (either other computer games or real-life applications)?

7 Bibliography

For the A* and Q-learning algorithms, you can use Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig; it is the reference book in artificial intelligence. For Java, I would recommend Effective Java by Joshua Bloch, but most Java books will do. Additionally, feel free to use any language you like (although you will have to code your own Pac-Man game!).
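As a starting point for the first version of the Part A player, here is a minimal sketch of A* over the maze grid. It assumes the maze is available as a boolean wall grid, that every move costs 1, and that Manhattan distance to the target dot is the heuristic (admissible on a grid, since it never overestimates the number of remaining steps). The class and method names are illustrative, not part of the provided pacman package; adapt them to its actual classes.

    import java.util.*;

    public class AStarSketch {
        // The four possible moves: up, down, left, right.
        static final int[] DR = {-1, 1, 0, 0};
        static final int[] DC = {0, 0, -1, 1};

        // Returns the length of a shortest path from (sr, sc) to (tr, tc),
        // or -1 if the target is unreachable. wall[r][c] == true is a wall.
        public static int shortestPath(boolean[][] wall, int sr, int sc, int tr, int tc) {
            int rows = wall.length, cols = wall[0].length;
            int[][] g = new int[rows][cols]; // best known cost from the start
            for (int[] row : g) Arrays.fill(row, Integer.MAX_VALUE);
            g[sr][sc] = 0;
            // Open list ordered by f = g + h; entries are {f, row, col}.
            PriorityQueue<int[]> open =
                new PriorityQueue<>(Comparator.comparingInt(n -> n[0]));
            open.add(new int[]{Math.abs(sr - tr) + Math.abs(sc - tc), sr, sc});
            while (!open.isEmpty()) {
                int[] node = open.poll(); // count these polls for the Part A statistics
                int r = node[1], c = node[2];
                if (r == tr && c == tc) return g[r][c];
                for (int d = 0; d < 4; d++) {
                    int nr = r + DR[d], nc = c + DC[d];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols || wall[nr][nc]) continue;
                    int ng = g[r][c] + 1;
                    if (ng < g[nr][nc]) { // found a cheaper path to (nr, nc)
                        g[nr][nc] = ng;
                        int h = Math.abs(nr - tr) + Math.abs(nc - tc); // Manhattan heuristic
                        open.add(new int[]{ng + h, nr, nc});
                    }
                }
            }
            return -1;
        }
    }

Setting h to 0 in this loop gives the no-heuristic baseline asked for in Part A; to recover the actual path rather than its length, also store each cell's predecessor whenever g is improved.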
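Similarly, for Part B, here is a minimal sketch of the tabular Q-learning update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)), together with epsilon-greedy action selection, for integer-encoded states such as those produced by the encodeState sketch in Part B. The parameter values are illustrative assumptions, and wiring the rewards and state observations to the provided game classes is left to you.

    import java.util.Random;

    public class QLearningSketch {
        static final int NUM_STATES = 256; // e.g. the 4^4 window states from Part B
        static final int NUM_ACTIONS = 4;  // LEFT, RIGHT, UP, DOWN

        final double[][] q = new double[NUM_STATES][NUM_ACTIONS]; // the Q-table
        final double alpha = 0.1;   // learning rate (illustrative value)
        final double gamma = 0.9;   // discount factor (illustrative value)
        final double epsilon = 0.1; // exploration rate (illustrative value)
        final Random rnd = new Random();

        // Epsilon-greedy: explore with probability epsilon, otherwise
        // pick the action with the highest current Q estimate.
        public int chooseAction(int state) {
            if (rnd.nextDouble() < epsilon) return rnd.nextInt(NUM_ACTIONS);
            int best = 0;
            for (int a = 1; a < NUM_ACTIONS; a++)
                if (q[state][a] > q[state][best]) best = a;
            return best;
        }

        // One learning step after taking action a in state s, receiving
        // reward, and observing the next state sNext.
        public void update(int s, int a, double reward, int sNext) {
            double maxNext = q[sNext][0];
            for (int a2 = 1; a2 < NUM_ACTIONS; a2++)
                maxNext = Math.max(maxNext, q[sNext][a2]);
            q[s][a] += alpha * (reward + gamma * maxNext - q[s][a]);
        }
    }

Because the Q-table lives in a field, keeping one QLearningSketch instance alive across games is exactly the transfer of learned values between games that Part B asks for.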