CS 4700: Foundations of Artificial Intelligence


Bart Selman
Reinforcement Learning (R&N Chapter 21)

Note: in the next two parts of the RL material, some of the figure/section numbers refer to an earlier edition of R&N with a more basic description of the techniques. The slides provide a self-contained description.

Reinforcement Learning

In our discussion of search methods (developed for problem solving), we assumed a given state space and operators that lead from one state to one or more successor states, possibly with an operator cost. The state space can be exponentially large but is in principle known. The difficulty was finding the right path (sequence of moves). That problem was solved by searching through the various alternative sequences of moves; in tough spaces, this leads to exponential searches. Can we do something totally different and avoid search altogether?

Why don't we just learn how to make the right move in each possible state? In principle, we need to know very little about the environment at the start: simply observe another agent (human or program) go from state to state and mimic it! Reinforcement learning goes back to some of the earliest AI research (1960s). It works, and its principles and ideas are still applicable today.

The environment we consider is a basic game, the simplest non-trivial one: tic-tac-toe. The question: can you write a program that learns to play tic-tac-toe? Let's try to re-discover what Donald Michie did in 1962. He did not even use a computer: he hand-simulated one. It was the first non-trivial machine learning program!

Tic-tac-toe (also called noughts and crosses, or Xs and Os). [Slide figure: a fragment of the game tree starting from a position with three moves per player; it is X's turn, and some continuations lead to a loss for X. Note: we are not after optimal play for 3x3 tic-tac-toe itself; the game serves as a simple testbed for learning.]

What else can we think of? Two basic ingredients are needed: 1) a way to represent board states, and 2) a way to decide what moves to make in different states. It may help to think a bit probabilistically: pick moves with some probability and adjust the probabilities through a learning procedure.

Learn from a human opponent? We could try to learn directly from a human which moves to make. But there are some issues: 1) the human may be a weak player, and we want to learn how to beat him/her! 2) The human may play nought (second player) while the computer wants to learn how to play cross (first player). Answer: let's just play human against machine and learn something from wins and losses.

To start, some basics of the machine: for each board state where cross is on-move, we have a matchbox labeled with that state. This requires a few hundred matchboxes.

Each matchbox holds a number of colored beads; each color represents a valid move for cross on that board. E.g., start with ten beads of each color for each valid move. 1) To make a move, pick up the box labeled with the current state, shake it, and draw a random bead; check its color and make that move. 2) In the new state, wait for the human's counter-move; from the resulting state, repeat the above.
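The matchbox-and-beads procedure above can be sketched in code. This is a minimal illustration, not Michie's actual construction: the 9-character row-major board encoding and all names (MatchboxMachine, valid_moves, INITIAL_BEADS) are assumptions made for the sketch.

```python
import random

INITIAL_BEADS = 10  # e.g. ten beads per valid move to start with

def valid_moves(state):
    """Indices of empty squares, i.e. the moves cross may play.
    A state is a 9-character row-major string over {'X', 'O', '.'}."""
    return [i for i, c in enumerate(state) if c == '.']

class MatchboxMachine:
    def __init__(self):
        self.boxes = {}  # state -> {move: bead count}

    def box_for(self, state):
        # Lazily create the matchbox for a state the first time we see it.
        if state not in self.boxes:
            self.boxes[state] = {m: INITIAL_BEADS for m in valid_moves(state)}
        return self.boxes[state]

    def choose_move(self, state):
        # "Shake the box and draw a random bead": sample a move with
        # probability proportional to its bead count.
        box = self.box_for(state)
        moves, counts = zip(*box.items())
        return random.choices(moves, weights=counts)[0]

machine = MatchboxMachine()
empty = '.' * 9
move = machine.choose_move(empty)  # one of the 9 squares
```

With equal bead counts this samples moves uniformly; learning will later shift the counts, and hence the sampling probabilities.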

The game ends when one of the parties wins or there are no more open spaces. This is how the machine plays. How well will it play? What is it doing initially? With equal bead counts in every box, it simply plays at random. The machine needs to learn! How? Can you think of a strategy? This became the first successful machine learning program in history (one not involving search). Let's try to come up with a strategy. What do we need to do?

Reinforcement Learning

It works! And it does not need that many games, which is quite surprising.
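The learning strategy can be sketched as a bead-update rule applied once per game: reward every box used in a won game, punish every box used in a lost one. This is in the spirit of Michie's scheme rather than a reproduction of it; the exact bonuses (+3 win, +1 draw, -1 loss) and all names are illustrative assumptions.

```python
import random

INITIAL_BEADS = 10

class LearningMachine:
    def __init__(self):
        self.boxes = {}    # state -> {move: bead count}
        self.history = []  # (state, move) pairs played in the current game

    def choose_move(self, state, moves):
        box = self.boxes.setdefault(state, {m: INITIAL_BEADS for m in moves})
        ms, counts = zip(*box.items())
        move = random.choices(ms, weights=counts)[0]
        self.history.append((state, move))  # remember for end-of-game update
        return move

    def learn(self, outcome):
        """outcome: 'win', 'draw', or 'loss' for the machine."""
        delta = {'win': 3, 'draw': 1, 'loss': -1}[outcome]
        for state, move in self.history:
            # Never empty a slot completely: keep at least one bead so
            # every valid move remains possible.
            self.boxes[state][move] = max(1, self.boxes[state][move] + delta)
        self.history = []  # start the next game fresh

m = LearningMachine()
mv = m.choose_move('start', ['a', 'b'])
m.learn('loss')  # the chosen move drops to 9 beads; the other stays at 10
```

Over many games, moves that tend to lead to wins accumulate beads and get picked more often, which is exactly the reinforcement effect described above.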

Comments

Learning in this case took advantage of:

1) The state space is manageable. It was further reduced by using one state to represent all isomorphic states (obtained through board rotations and reflections). We quietly encoded some knowledge about tic-tac-toe!
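The symmetry reduction can be made concrete: map every board to a canonical representative of its equivalence class under the 8 rotations and reflections of the square, and label the matchboxes by that representative. The row-major string encoding below is an assumption for illustration.

```python
def rotate(s):
    """Rotate a 3x3 board (row-major 9-character string) 90 degrees
    clockwise: new cell (r, c) comes from old cell (2 - c, r)."""
    return ''.join(s[(2 - c) * 3 + r] for r in range(3) for c in range(3))

def reflect(s):
    """Mirror the board left-right: new cell (r, c) from old (r, 2 - c)."""
    return ''.join(s[r * 3 + (2 - c)] for r in range(3) for c in range(3))

def canonical(s):
    """Lexicographically smallest of the 8 symmetric variants of s.
    All isomorphic boards share one matchbox keyed by this string."""
    variants = []
    t = s
    for _ in range(4):          # the 4 rotations...
        variants.append(t)
        variants.append(reflect(t))  # ...each with its mirror image
        t = rotate(t)
    return min(variants)

# All four corner openings collapse into one equivalence class:
# canonical('X........') == canonical('........X')
```

For the opening move this cuts 9 distinct boards down to 3 classes (corner, edge, center), and similar savings apply throughout the game tree.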

2) What if the state space is MUCH larger, as it is for any interesting game? Options: a) Represent the board by features, e.g., the number of each kind of piece on a chess board but not their positions. It is like having each matchbox represent a large collection of states; the notion of valid moves becomes a bit trickier. b) Don't store matchboxes/states explicitly; instead, learn a function (e.g., a neural net) that computes the right move directly when given some representation of the state as input. c) A combination of a) and b). d) Combine a), b), and c) with some form of look-ahead search.
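Option a) can be illustrated with a tiny sketch: key the "boxes" by a coarse feature vector instead of the exact state, so one box stands in for many states. The material-count features and the toy chess-like string encoding below are purely illustrative assumptions.

```python
from collections import Counter

def features(board):
    """Collapse a board (string of piece letters, '.' for empty) into a
    hashable feature tuple: counts of each piece type, ignoring their
    positions entirely. Many states map to one feature vector."""
    counts = Counter(c for c in board if c != '.')
    return tuple(sorted(counts.items()))

# Two different positions with the same material share one "matchbox"
# of learned statistics:
a = features('K..P.p..k')
b = features('..KPp..k.')
# a == b, even though the piece placements differ.
```

The price of this compression is exactly the issue noted above: two states in the same bucket can have different sets of valid moves, so the move-selection machinery needs extra care.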