How AI Won at Go and So What? Garry Kasparov vs. Deep Blue (1997)


Transcription:

How AI Won at Go and So What?
Alan Fern, School of Electrical Engineering and Computer Science, Oregon State University

Milestone matches:
- Garry Kasparov vs. Deep Blue (1997)
- DeepMind's AlphaGo vs. Lee Sedol (2016)
- Watson vs. Ken Jennings (2011)

A Brief History of Computer Go
- 1997: superhuman chess with alpha-beta search plus fast computers.
- 2005: "Computer Go is impossible!" Why? Go is played on a 19x19 board (9x9 is the smallest board), and the game had long been singled out: the "task par excellence for AI" (Hans Berliner), the "new Drosophila of AI" (John McCarthy), a "grand challenge task" (David Mechner).

Why Go is hard:
- Branching factor: about 35 in chess, about 250 in Go.
- Required search depth: about 14 in chess, much larger in Go.
- Lookahead: build a minimax tree and apply an evaluation function at the leaves. Chess has good hand-coded evaluation functions; Go has none.
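The minimax lookahead described above can be sketched in a few lines. This is a generic alpha-beta pruning sketch over an abstract game tree; the toy tree, `evaluate`, and `children` below are placeholders, not anything from the lecture:

```python
def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    """Minimax search with alpha-beta pruning over an abstract game tree."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)  # leaf evaluation function (hand-coded in chess)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False,
                                         evaluate, children))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # prune: the minimizing opponent will avoid this branch
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True,
                                     evaluate, children))
        beta = min(beta, value)
        if alpha >= beta:
            break  # prune: the maximizing player will avoid this branch
    return value

# Toy 2-ply tree: internal nodes are lists, leaves are evaluation scores.
tree = [[3, 5], [2, 9], [0, 7]]
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n
print(alphabeta(tree, 2, float("-inf"), float("inf"), True, evaluate, children))  # -> 3
```

With a branching factor near 250 and a required depth much larger than in chess, even this pruning cannot make exhaustive Go search feasible, which is the point of the slide.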

A Brief History of Computer Go (continued)
- 1997: superhuman chess with alpha-beta search plus fast computers.
- 2005: "Computer Go is impossible!"
- 2006: Monte-Carlo Tree Search applied to 9x9 Go (a bit of learning).
- 2007: human master level achieved at 9x9 Go (a bit more learning).
- 2008: human grandmaster level achieved at 9x9 Go (even more learning).
- Computer Go Server ratings over this period rose from 1800 ELO to 2600 ELO.
- 2012: the Zen program beats former international champion Takemiya Masaki with only a 4-stone handicap on 19x19.
- 2015: DeepMind's AlphaGo defeats the European champion 5-0 (lots of learning).

AlphaGo = deep learning + Monte-Carlo Tree Search + high-performance computing. It learned from 30 million expert moves and from self-play, using a highly parallel search implementation: 48 CPUs and 8 GPUs, scaling to 1,202 CPUs and 176 GPUs. In March 2016, AlphaGo beat Lee Sedol 4-1.

The Arsenal of AlphaGo: Monte-Carlo Tree Search
- Idea #1: a board evaluation function via random rollouts. Play many random games from the position; the evaluation is the fraction of games won by the current player. Surprisingly effective, and even better if the rollouts select better-than-random moves.
- Idea #2: selective tree expansion, growing the tree non-uniformly.
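Idea #1 can be illustrated with a minimal sketch. Single-pile Nim (take 1 to 3 stones; whoever takes the last stone wins) stands in for Go here purely so the rollout fits in a few lines; the game and the rollout count are my own toy choices:

```python
import random

def rollout(pile, to_move):
    """One random playout of single-pile Nim; returns the winning player."""
    player = to_move
    while True:
        pile -= random.randint(1, min(3, pile))  # a uniformly random legal move
        if pile == 0:
            return player  # this player took the last stone and wins
        player = 1 - player

def mc_evaluate(pile, to_move, n_rollouts=5000):
    """Idea #1: a position's value is the fraction of random playouts
    won by the player to move."""
    wins = sum(rollout(pile, to_move) == to_move for _ in range(n_rollouts))
    return wins / n_rollouts

random.seed(0)
print(mc_evaluate(1, 0))  # a trivially won position -> 1.0
print(mc_evaluate(4, 0))  # under random play, worth roughly 1/3 to the mover
```

The same estimator applies unchanged to Go: only `rollout` needs to know the game's rules, which is what made the idea so attractive for a game with no good hand-coded evaluation function.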

Idea #2: selective, non-uniform tree expansion. Rather than growing the lookahead tree uniformly, use rollout results to expand the most promising branches more deeply. How can we do better?

The Arsenal of AlphaGo: Learning to Predict Good Moves
Idea: treat the Go board as an image and apply modern computer vision. How can you write a program to distinguish cats from dogs in images? Machine learning: show the computer example cats and dogs and let it decide how to distinguish them. Deep neural networks are the state of the art here: very fast GPU implementations allow training giant networks (millions of parameters) on massive data sets.
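The non-uniform growth in Idea #2 is usually driven by a bandit rule; UCB1 is the classic choice in MCTS (the exploration constant c = 1.4 below is a conventional value, not one from the lecture):

```python
import math

def ucb1(wins, visits, parent_visits, c=1.4):
    """UCB1 score for choosing which child of a tree node to expand next:
    a win-rate (exploitation) term plus an exploration bonus that is
    large for rarely visited children."""
    if visits == 0:
        return float("inf")  # always try an unvisited child first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

# Three children of a node whose playouts total 100: (wins, visits).
stats = {"a": (60, 80), "b": (10, 15), "c": (2, 5)}
best = max(stats, key=lambda m: ucb1(*stats[m], parent_visits=100))
print(best)  # -> "c": the barely-explored child wins on the exploration term
```

Repeating select-rollout-update with this rule is exactly what makes the tree grow non-uniformly toward promising lines.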

The Arsenal of AlphaGo: A Deep Network for Go
Could a deep neural network learn to predict expert Go moves by looking at the board position? Yes!
- Input: the board position.
- Internal layers: a deep neural network.
- Output: the probability of each move being played by an expert (or of each move leading to a win).
Trained for 3 weeks on 30 million expert moves, the network reached 57% prediction accuracy!
At this point the network has still not played a game of Go. Could it improve further by playing?

The Arsenal of AlphaGo: Reinforcement Learning
Reinforcement learning: learn to act well in an environment via trial and error, where actions result in positive and negative rewards. The agent repeatedly receives observations and a reward from its practice environment and responds with an action.
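At slide scale, the supervised training loop can be sketched with a linear "policy network" in place of the real deep convolutional one. The 3x3 board, the two synthetic "expert" examples, and the learning rate below are all invented for illustration; only the objective (a cross-entropy step toward the expert's move) matches the lecture's description:

```python
import math

N = 9  # a tiny 3x3 "board", flattened to 9 features / 9 candidate moves

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def predict(W, board):
    """Toy 'policy network': board features -> a probability per move."""
    return softmax([sum(w * x for w, x in zip(row, board)) for row in W])

def train_step(W, board, expert_move, lr=0.5):
    """One cross-entropy gradient step toward the expert's move."""
    p = predict(W, board)
    for move, row in enumerate(W):
        grad = p[move] - (1.0 if move == expert_move else 0.0)
        for i in range(N):
            row[i] -= lr * grad * board[i]

W = [[0.0] * N for _ in range(N)]
# Synthetic "expert" data: on these two boards the expert plays moves 8 and 6.
data = [([1, 0, 0, 0, 1, 0, 0, 0, 0], 8), ([0, 0, 1, 0, 1, 0, 0, 0, 0], 6)]
for _ in range(200):
    for board, move in data:
        train_step(W, board, move)
print([max(range(N), key=lambda m: predict(W, b)[m]) for b, _ in data])  # -> [8, 6]
```

AlphaGo's actual network was a deep convolutional model trained for weeks on 30 million positions; this sketch only shows the shape of the update.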

TD-Gammon (1992): Learning from Self-Play
A backgammon program built on a neural network with 80 hidden units (one layer), trained on 1.5 million games of self-play. It became one of the top two or three players in the world!

Reinforcement learning for Go: learn from positive and negative rewards (win = +1 and loss = -1).

Self-play for the Go network (input: board position; output: probability of each move):
- Start with the deep neural network from supervised learning and continue to train it via self-play; AlphaGo did this for months.
- Result: an 80% win rate against the original supervised network, and an 85% win rate against the best prior tree-search method!
- Still not close to professional level.
Problem: the network takes too long to evaluate (milliseconds per board).
Solution: use smaller networks (less accurate, but fast).
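The self-play improvement step (rewards only at the end of a game: win = +1, loss = -1) can be sketched with a REINFORCE-style policy-gradient update. The "game" here is a made-up two-move toy in which move 0 wins 80% of the time; the win probabilities, learning rate, and game count are invented for illustration:

```python
import math, random

def selfplay_reinforce(n_games=3000, lr=0.1, seed=1):
    """Policy-gradient learning from win/loss rewards alone: sample a move
    from a softmax policy, observe +1 (win) or -1 (loss), and nudge the
    policy's logits toward rewarded moves."""
    random.seed(seed)
    theta = [0.0, 0.0]        # one logit per move
    win_prob = [0.8, 0.2]     # toy stand-in for playing out a real game

    def probs():
        m = max(theta)
        exps = [math.exp(t - m) for t in theta]
        z = sum(exps)
        return [e / z for e in exps]

    for _ in range(n_games):
        p = probs()
        move = 0 if random.random() < p[0] else 1
        reward = 1.0 if random.random() < win_prob[move] else -1.0
        for a in range(2):    # REINFORCE update on the logits
            theta[a] += lr * reward * ((1.0 if a == move else 0.0) - p[a])
    return probs()[0]         # final probability of the better move

print(selfplay_reinforce())   # approaches 1.0: the policy learns the winning move
```

The same update shape, scaled up to a deep network and real games against earlier versions of itself, is what the months of AlphaGo self-play amount to.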

AlphaGo = deep learning + Monte-Carlo Tree Search + high-performance computing. Learn from 30 million expert moves and from self-play; run a highly parallel search implementation (48 CPUs and 8 GPUs, scaling to 1,202 CPUs and 176 GPUs); use the smaller, fast networks in the rollouts and the expensive network to guide tree expansion.
- 2015: AlphaGo beats the European champion 5-0, after lots of self-play.
- March 2016: AlphaGo beats Lee Sedol 4-1.

Computers are good at Go now. So what?
The idea of combining search with learning is very general and widely applicable: emergency response, forest fire management, species conservation, smart grids, and more. The ingredients are multi-domain simulators, optimization and search, and high-performance machine learning.
Deep networks are also leading to advances in many other areas of AI: computer vision, speech processing, natural language processing, bioinformatics, robotics, human-computer interaction, and rational decision making.
It is a very exciting time to be working in AI.