Prof. Sameer Singh CS 175: PROJECTS IN AI (IN MINECRAFT) WINTER April 6, 2017


Prof. Sameer Singh CS 175: PROJECTS IN AI (IN MINECRAFT) WINTER 2017 April 6, 2017

Upcoming
Misc.
- Check out the course webpage and schedule
- Check out Canvas, especially for deadlines
- Do the survey by tomorrow, April 7, 2017
Homework
- Homework 1 will be up soon
- Meanwhile, install and get Malmo working
- Due: April 14, 2017
Project
- Teams are due April 17, 2017; proposals are due April 21, 2017
- Start assembling teams now! (use Piazza)
- Start thinking of project ideas
CS 175: PROJECTS IN AI (SPRING 2017)

Projects in AI in Minecraft
- Project Overview
- Some Project Ideas
- Introduction to Reinforcement Learning


What is AI?
"Artificial intelligence is anything computers can't do yet." - Douglas Hofstadter
https://en.wikipedia.org/wiki/ai_effect

What can a project be?
- Research: do difficult things automatically; Minecraft is just a testbed
- Practical Tool: help players do things that are otherwise time-consuming
- Art: just cool! Use AI/ML to create stuff in the world

Technical Solution
Use Artificial Intelligence or Machine Learning algorithms
Artificial Intelligence / Machine Learning topics:
- Heuristic/Adversarial/Local Search
- Supervised Learning
- Logic
- Planning
- Bayesian Networks
- Unsupervised Learning
- Reinforcement Learning
- Natural Language Processing
- Computer Vision
- Recommendation Systems
- Constraint Satisfaction
- Time Series Modeling
- Deep Learning

Evaluation
How would YOU define that your project was a success?
Quantitative Evaluation
- Numerical metrics: accuracy, F1, AUC, time to run, time to train
- Baselines: What would currently be used? What are reasonable simpler methods?
- By how much?
- "We hope to improve the METRIC by AMOUNT over BASELINE!" (I won't hold you to it, I just want you to think about it)
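The "improve the METRIC by AMOUNT over BASELINE" statement can be made concrete with a small helper. The following is only a sketch: the function name `relative_improvement` and the accuracy numbers are made up for illustration and do not come from the course.

```python
def relative_improvement(metric: float, baseline: float,
                         higher_is_better: bool = True) -> float:
    """Return the fractional improvement of `metric` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a relative improvement")
    change = (metric - baseline) / abs(baseline)
    # For metrics such as running time, a decrease counts as an improvement.
    return change if higher_is_better else -change


# Hypothetical numbers, for illustration only.
baseline_accuracy = 0.60   # e.g. a simple baseline method
model_accuracy = 0.75      # e.g. the proposed method
gain = relative_improvement(model_accuracy, baseline_accuracy)
print(f"Accuracy improved by {gain:.0%} over the baseline")
```

For metrics where lower is better (time to run, time to train), pass `higher_is_better=False` so that a reduction is reported as a positive improvement.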

Evaluation
How would YOU define that your project was a success?
Qualitative Evaluation
- Simple example cases: What are examples that your idea will definitely work on? What is the expected output on these?
- Error analysis and introspection: Are there plots/figures to verify the behavior? If it doesn't work, how will you improve it?
- The super-impressive example: What is the best example? Awesome if it works, e.g. something that perfectly captures your idea!

You will have doubts!
- Is it too simple?
- Is there data to train my classifier?
- Is it too ambitious?
- Is there a different algorithm I should use?
- Is my evaluation inappropriate?
- Can I only use off-the-shelf code?
Every team has to meet me during Week 4.
Use Piazza!
Discussion will cover many simple situations.
Both the TA and I are available for appointments.

Projects in AI in Minecraft
- Project Overview
- Some Project Ideas
- Introduction to Reinforcement Learning


Reinforcement Learning
Agent learns to do things by trying things, and succeeding/failing
Navigation
- Explore the map without dying
- Solve mazes
- Learn the best way home from anywhere
- Get to the highest hill in the map
Learn Recipes
- Figure out the best way to make items
- Without any knowledge of the recipes
Combat
- Learn to hide/find shelter
- Learn to fight; example paper: http://alekhagarwal.net/arxiv_geql.pdf

Reinforcement Learning
Agent learns to do things by trying things, and succeeding/failing
- Observation: what the agent sees
- Action: what the agent can do
- Reward: what the agent likes/dislikes
Example rewards: New Item ++, No Item -, Goal ++, Died ---
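The reward annotations on this slide (New Item ++, No Item -, Goal ++, Died ---) can be turned into a numeric reward function. This is a minimal sketch: the event names and the magnitudes below are assumptions chosen only to respect the signs and relative sizes on the slide.

```python
# Assumed reward magnitudes; the slide only gives relative signs
# (New Item ++, No Item -, Goal ++, Died ---).
REWARDS = {
    "new_item": 2.0,
    "no_item": -1.0,
    "goal": 2.0,
    "died": -3.0,
}


def reward_for(events):
    """Sum the rewards of all events observed in one time step.

    Events that are not in the table contribute nothing.
    """
    return sum(REWARDS.get(event, 0.0) for event in events)


print(reward_for(["new_item"]))
print(reward_for(["no_item", "died"]))
```

Picking these numbers is a design decision: the agent only ever sees the reward, so the relative sizes decide what it ends up caring about.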

Reinforcement Learning
- The next few lectures will go into details (and more ideas)
- For now, let's look at non-RL ideas

Describe the Scene
- Houses and a pig on a grassy field during the day.
- Pig staring at me in a village.

Live Commentator
- Hit a rabbit

How is this even possible?
- World specified through Malmo: 3 blocks in a line, grass blocks as floor, daylight, clear weather
- Training signal: "3 blocks in a line"
- Machine learning: deep learning, CNN + LSTM

Many Variations of These
Pipeline: your code -> agent/world in Malmo -> render + label -> machine learning (slide annotations: x1000, x100000, x100000)
Label -> task:
- object -> object detection
- objects -> ~caption generation
- action -> ~action detection, commentary
- depth of pixel -> ~stereoscopy, depth/distance prediction

Captions to Speech
Why are you making me read?
- Pig staring at me in a village.

Natural Language Navigation
Quite difficult! Example commands, from trivial to hardest:
> Go forward till you hit a wall
> Go to the pig
> Go to the house on the right
> Go behind the house

Natural Language Interface
Quite difficult! Example commands, from trivial to hardest:
> Choose steel pickaxe and dig
> Go and destroy that window
> Put the blue block on the closest wall
> Find a tree and chop it

SHRDLU (from 1970!)
http://hci.stanford.edu/winograd/shrdlu/

Natural Speech to Commands
Why are you making me type?
- Off-the-shelf speech-to-text systems
- Online speech-to-text APIs

Photo to Minecraft Character
Photo of a person -> your project -> Minecraft skin
- Need to label data?
- Can you use existing classifiers, like Visual QA?

Recipe Planners
Inventory + Need(s) -> Steps
> Get 2 wood planks
> Make a stick
> Get 2 diamonds
> Make diamond sword
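One way to produce such a list of steps is to work backwards from the needed item through a recipe table. The sketch below assumes a toy, simplified recipe table (`RECIPES`) that mirrors the slide's example rather than the real Minecraft recipes; the function names `plan` and `make_plan` are likewise hypothetical.

```python
# Toy recipe table (hypothetical and simplified, not the real Minecraft recipes):
# item -> list of (ingredient, quantity) needed to make ONE unit of the item.
RECIPES = {
    "stick": [("wood_plank", 2)],
    "diamond_sword": [("stick", 1), ("diamond", 2)],
}


def plan(item, quantity, inventory, steps):
    """Append to `steps` the actions needed to obtain `quantity` of `item`.

    `inventory` maps item -> count; items taken from it are used up in place.
    """
    have = inventory.get(item, 0)
    used = min(have, quantity)
    inventory[item] = have - used
    missing = quantity - used
    if missing == 0:
        return
    if item not in RECIPES:
        # Raw material: it has to be gathered from the world.
        steps.append(f"Get {missing} {item}")
        return
    for _ in range(missing):
        for ingredient, count in RECIPES[item]:
            plan(ingredient, count, inventory, steps)
        steps.append(f"Make {item}")


def make_plan(goal, inventory):
    """Return the ordered steps to make one `goal`, starting from `inventory`."""
    steps = []
    plan(goal, 1, dict(inventory), steps)  # copy so the caller's inventory is untouched
    return steps


for step in make_plan("diamond_sword", {}):
    print(">", step)
```

With an empty inventory this reproduces the four steps on the slide; items already in the inventory shorten the plan. A project would need to go further, for example by handling recipes that yield several units or by choosing among alternative recipes.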

Lots of other possibilities
- Many other games in Minecraft
- Create AI for those?
- One AI that works for all of those?
http://www.planetminecraft.com/


Projects in AI in Minecraft
- Course Information
- Some Project Ideas
- Introduction to Reinforcement Learning (based on slides by David Silver)

Reinforcement Learning

What makes it different?
- No direct supervision, only rewards
- Feedback is delayed, not instantaneous
- Time really matters, i.e. data is sequential
- Agent's actions affect what data it will receive
Examples
- Fly stunt maneuvers in a helicopter
- Defeat the world champion at Backgammon
- Manage an investment portfolio
- Control a power station
- Make a humanoid robot walk
- Play many different Atari games better than humans
- Beat the world champion in Go

Agent-Environment Interface
Agent
- decides on an action
- receives next observation
- receives next reward
Environment
- executes the action
- computes next observation
- computes next reward
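The division of labor between agent and environment can be sketched as a simple loop. Everything here is a toy under stated assumptions: `CorridorEnvironment`, `RandomAgent`, `run_episode`, and the reward values are invented for illustration and are not part of Malmo.

```python
import random


class CorridorEnvironment:
    """Toy 1-D corridor (hypothetical): start at cell 0, goal at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # first observation

    def step(self, action):
        """Execute the action, then compute the next observation and reward."""
        move = 1 if action == "right" else -1
        self.position = max(0, min(self.length - 1, self.position + move))
        done = self.position == self.length - 1
        reward = 10.0 if done else -1.0  # assumed values: goal bonus, per-step cost
        return self.position, reward, done


class RandomAgent:
    """Agent that ignores its observation and reward and acts at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, observation, reward):
        return self.rng.choice(["left", "right"])


def run_episode(env, agent, max_steps=1000):
    """Run one episode and return the cumulative reward."""
    observation, reward, total = env.reset(), 0.0, 0.0
    for _ in range(max_steps):
        action = agent.act(observation, reward)        # agent decides on an action
        observation, reward, done = env.step(action)   # environment executes it
        total += reward
        if done:
            break
    return total


print(run_episode(CorridorEnvironment(), RandomAgent()))
```

A learning agent would replace `RandomAgent.act` with something that uses the observations and rewards it receives to choose better actions over time.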

Reward, $R_t$
- How well the agent is doing: +, positive (good); -, negative (bad)
- Nothing about WHY it is doing well; could have little to do with $A_{t-1}$
- Agent is trying to maximize its cumulative reward

Example of Rewards
- Fly stunt maneuvers in a helicopter: +ve reward for following desired trajectory, -ve reward for crashing
- Defeat the world champion at Backgammon: +/-ve reward for winning/losing a game
- Manage an investment portfolio: +ve reward for each $ in bank
- Control a power station: +ve reward for producing power, -ve reward for exceeding safety thresholds
- Make a humanoid robot walk: +ve reward for forward motion, -ve reward for falling over
- Play many different Atari games better than humans: +/-ve reward for increasing/decreasing score

Sequential Decision Making
- Actions have long-term consequences
- Rewards may be delayed
- It may be better to sacrifice short-term reward for long-term benefit
Examples
- A financial investment (may take months to mature)
- Refuelling a helicopter (might prevent a crash later)
- Blocking opponent moves (might eventually help win)
- Spend a lot of money and go to college (earn more later)
- Don't commit crimes (rewarded by not going to jail)
- Get started on Malmo/project soon (make it an easy quarter)
A key aspect of intelligence: how far ahead are you able to plan?
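The trade-off between short-term and long-term reward can be illustrated with two made-up reward sequences. The numbers and the names `greedy` and `patient` are hypothetical; the point is only that the strategy with the best first reward need not have the best cumulative reward.

```python
def cumulative_reward(rewards):
    """Total reward collected over an episode."""
    return sum(rewards)


# Hypothetical reward sequences for two strategies over five time steps.
strategies = {
    "greedy": [1, 1, 1, 1, 1],      # take a small reward at every step
    "patient": [-2, -2, 0, 0, 20],  # pay a cost early for a large delayed reward
}

# Strategy that looks best after one step versus over the whole episode.
best_first_step = max(strategies, key=lambda name: strategies[name][0])
best_overall = max(strategies, key=lambda name: cumulative_reward(strategies[name]))

print("Best immediate reward:", best_first_step)
print("Best cumulative reward:", best_overall)
```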

Reinforcement Learning
- Given: an environment (produces observations and rewards)
- Reinforcement learning: an automated agent that selects actions to maximize total rewards in the environment

Let's look at the Agent
What does the choice of action depend on?
- Can you ignore $O_t$ completely?
- Is just $O_t$ enough? Or $(O_t, A_t)$?
- Is it the last few observations?
- Is it all observations so far?

Agent State, $S_t$
History: everything that happened so far
$H_t = O_1 R_1 A_1 \; O_2 R_2 A_2 \; O_3 R_3 \ldots A_{t-1} \; O_t R_t$
State $S_t$ can be:
- $O_t$
- $O_t R_t$
- $A_{t-1} O_t R_t$
- $O_{t-3} O_{t-2} O_{t-1} O_t$
In general, $S_t = f(H_t)$
You, as the AI designer, specify this function.
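Each of the state choices listed above is just a different function $f$ applied to the same history. The sketch below assumes one possible representation of $H_t$, a list of `Step` records; the record type and the function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    """One time step of the history: O_i, R_i, and the action A_i taken afterwards."""
    observation: str
    reward: float
    action: Optional[str]  # None for the current step, whose action is not chosen yet


def last_observation(history: List[Step]):
    """S_t = O_t"""
    return history[-1].observation


def last_observation_and_reward(history: List[Step]):
    """S_t = O_t R_t"""
    return (history[-1].observation, history[-1].reward)


def previous_action_observation_reward(history: List[Step]):
    """S_t = A_{t-1} O_t R_t"""
    previous_action = history[-2].action if len(history) >= 2 else None
    return (previous_action, history[-1].observation, history[-1].reward)


def last_k_observations(history: List[Step], k: int = 4):
    """S_t = O_{t-k+1} ... O_t (k = 4 gives the four-observation case above)."""
    return tuple(step.observation for step in history[-k:])


# Made-up history for illustration.
history = [Step("o1", 0.0, "a1"), Step("o2", 1.0, "a2"), Step("o3", -1.0, None)]
print(last_observation(history))
print(previous_action_observation_reward(history))
print(last_k_observations(history, k=2))
```

Stacking the last few observations is useful when a single observation hides something important, such as the direction in which an object is moving.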

Agent Policy, $\pi$
Current state $S_t$ -> $\pi$ -> next action $A_t$
- Deterministic policy: $A_t = \pi(S_t)$
- Stochastic policy: $\pi(a \mid s) = P(A_t = a \mid S_t = s)$
- Good policy: leads to larger cumulative reward
- Bad policy: leads to worse cumulative reward
(we will explore this more in the next week)
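The two kinds of policy can be contrasted in a few lines: a deterministic policy is a lookup from state to action, while a stochastic policy stores $\pi(a \mid s)$ and samples from it. The states, actions, and probabilities below are invented purely for illustration.

```python
import random

# Hypothetical two-state, two-action example.
deterministic_policy = {"hungry": "eat", "full": "explore"}

# pi(a | s): for each state, a probability for every action (each row sums to 1).
stochastic_policy = {
    "hungry": {"eat": 0.9, "explore": 0.1},
    "full": {"eat": 0.2, "explore": 0.8},
}


def act_deterministic(state):
    """A_t = pi(S_t): the same state always gives the same action."""
    return deterministic_policy[state]


def act_stochastic(state, rng):
    """Sample A_t from pi(. | S_t = state)."""
    actions = list(stochastic_policy[state])
    weights = [stochastic_policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


rng = random.Random(0)
print(act_deterministic("hungry"))
print([act_stochastic("hungry", rng) for _ in range(5)])
```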

Example: Atari
- Rules are unknown: what makes the score increase?
- Dynamics are unknown: how do actions change pixels?

Video Time!
https://www.youtube.com/watch?v=v1eynij0rnk

Example: Robotic Soccer
https://www.youtube.com/watch?v=cif2sbvy-j0

AlphaGo
https://www.youtube.com/watch?v=i2wfvgl4y8c
