AlphaGo and Artificial Intelligence GUEST LECTURE IN THE GAME OF GO AND SOCIETY

AlphaGo and Artificial Intelligence
HUCK BENNETT (NORTHWESTERN UNIVERSITY)
GUEST LECTURE IN THE GAME OF GO AND SOCIETY AT OCCIDENTAL COLLEGE, 10/29/2018

The Game of Go
A game for aliens, presidents, and gods.
"... if intelligent life forms exist elsewhere in the universe they almost certainly play Go." Chess master Edward Lasker.
"I learned to play Go in college... It's a very complicated game... non-linear." President Barack Obama.
"I'm making the universe. It's like I'm a God. I'm going to become a God! On this Go Board." Hikaru, in the Hikaru no Go anime.
A grand challenge for artificial intelligence: to beat a top professional. No program came close before AlphaGo.

Alternate Views Scientific achievement? Master of Go? Harbinger of automation?

Computer Go is hard
There are many possible games of Go:
  Number of Go games: $10^{511}$.
  Number of chess games: $10^{120}$.
  Number of atoms in the universe: $10^{80}$.
The branching factor is high:
  About 200 possible moves at each turn in Go.
  About 20 possible moves at each turn in chess.
  Hard for brute-force search!
Non-locality: The best move in a game of Go may be far away from the previous move.
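As a rough sanity check on the figures above, the sketch below assumes the number of games is approximately b^d, where b is the branching factor and d is the game length in moves. That model is a simplifying assumption, not something stated on the slide, and the function name is hypothetical. Solving for d shows what game length each pair of figures implies.

```python
import math


def implied_game_length(branching_factor, log10_games):
    """Return d such that branching_factor ** d == 10 ** log10_games.

    Assumes the crude model (number of games) = b ** d, which ignores
    that the number of legal moves changes during a real game.
    """
    return log10_games / math.log10(branching_factor)


# Figures taken from the slide: about 200 moves per turn and 10^511 games
# for Go, about 20 moves per turn and 10^120 games for chess.
go_length = implied_game_length(200, 511)
chess_length = implied_game_length(20, 120)

print(f"Go: about {go_length:.0f} moves per game")
print(f"Chess: about {chess_length:.0f} moves per game")
```

Under this model the Go figures correspond to games a little over 220 moves long and the chess figures to games of roughly 90 moves, which illustrates why exhaustive search is out of reach in both games and far more so in Go.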

Why now?
Why the explosive progress of (Go) AI in the last 10 years?
Lots of computing power.
  Huge networks of computers (i.e. server farms).
  Specialized hardware: GPUs, TPUs.
Lots of data.
  User-generated content, social media.
  Ubiquitous sensors.
The main algorithms are not new.
  Neural networks [McCulloch and Pitts 1943, Rosenblatt 1957].
  (Stochastic) gradient descent [Cauchy 1847, Robbins and Monro 1951].
  Reinforcement learning [Sutton 1984].
  Backpropagation [Linnainmaa 1970].
  Monte Carlo Tree Search [Abramson 1987].

The Algorithms behind AlphaGo
1. Deep neural networks.
  A policy network used to predict which moves are most likely to be played.
  A value network used to predict how likely a move is to result in a win.
2. Reinforcement learning.
  AlphaGo Zero only uses reinforcement learning to train its networks.
3. Monte Carlo Tree Search (MCTS).
  A different way to predict how likely a move is to result in a win.
Image source: Silver, David et al. Mastering the game of Go with deep neural networks and tree search.

The Perceptron: A 1-node Neural Network
Inputs: $x_1, x_2$, which are either 0 or 1.
The weights $w_0, w_1, w_2$ are numbers that are set while training the neural network, but which don't change when evaluating it.
Output: 1 if $w_1 x_1 + w_2 x_2 \ge w_0$, and 0 otherwise.
Example: AND($x_1, x_2$). $w_1 = w_2 = 1$, $w_0 = 2$.
Example: NOT($x_1$). $w_1 = -1$, $w_2 = 0$, $w_0 = 0$.
[Diagram: inputs $x_1$ and $x_2$, weighted by $w_1$ and $w_2$, feed a single node that tests $w_1 x_1 + w_2 x_2 \ge w_0$ and outputs 1 if yes, 0 if no.]
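The perceptron rule can be written out in a few lines. The sketch below is a minimal illustration with hypothetical function names; it evaluates a perceptron with fixed weights and does not train one. Note that NOT needs a negative weight on $x_1$ (here $w_1 = -1$), so that the threshold test passes only when $x_1 = 0$.

```python
def perceptron(x1, x2, w0, w1, w2):
    """Output 1 if w1*x1 + w2*x2 >= w0, and 0 otherwise."""
    return 1 if w1 * x1 + w2 * x2 >= w0 else 0


def and_gate(x1, x2):
    # Weights for AND: the sum reaches the threshold 2 only if both inputs are 1.
    return perceptron(x1, x2, w0=2, w1=1, w2=1)


def not_gate(x1):
    # Weights for NOT: -x1 >= 0 holds only when x1 is 0. The second input is unused.
    return perceptron(x1, 0, w0=0, w1=-1, w2=0)


# Print the truth tables.
for a in (0, 1):
    for b in (0, 1):
        print(f"AND({a}, {b}) = {and_gate(a, b)}")
for a in (0, 1):
    print(f"NOT({a}) = {not_gate(a)}")
```

Deep networks stack many such nodes in layers, and training amounts to choosing the weights automatically rather than by hand.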

(Deep) Neural Networks Image source: https://twitter.com/gp_pulipaka/status/944590018957529088

Reinforcement Learning
Reinforcement learning = learning by self-play, in the case of games.
Learn given only a state space, a set of actions and a reward function.
  State space: Possible Go board configurations.
  Actions: Legal Go moves.
  Reward: Does playing the move result in a win or a loss?
Reinforcement learning does not need human-generated data.
  The first version of AlphaGo trained on some human games and then used reinforcement learning.
  AlphaGo Zero and AlphaZero used only reinforcement learning, and no human games or knowledge.
Image source: https://skymind.ai/wiki/deep-reinforcement-learning
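To make the state/action/reward framing concrete, here is a minimal sketch of learning by self-play. It is not AlphaGo's method: it swaps in tabular Q-learning on a toy game (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins) in place of deep networks on Go. The game, the function names, and the parameters are all assumptions chosen for illustration.

```python
import random


def actions(stones):
    """Actions: take 1, 2, or 3 stones, never more than remain."""
    return [a for a in (1, 2, 3) if a <= stones]


def greedy(q, stones):
    """Pick the action with the highest learned value (ties go to the smallest)."""
    return max(actions(stones), key=lambda a: q.get((stones, a), 0.0))


def train_by_self_play(episodes=5000, max_stones=10, alpha=0.5, seed=0):
    """Learn action values from self-play games alone, with no human data."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to an estimated value for the player moving
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)  # state: number of stones left
        while stones > 0:
            a = rng.choice(actions(stones))  # explore with random moves
            rest = stones - a
            if rest == 0:
                target = 1.0  # reward: the mover took the last stone and wins
            else:
                # Zero-sum game: the opponent's best value is the mover's loss.
                target = -max(q.get((rest, b), 0.0) for b in actions(rest))
            old = q.get((stones, a), 0.0)
            q[(stones, a)] = old + alpha * (target - old)
            stones = rest  # the same table now plays the other side
    return q


q = train_by_self_play()
print("Learned move with 5 stones left:", greedy(q, 5))
```

The only feedback the learner receives is the win at the end of each game, yet the table comes to prefer leaving the opponent a multiple of 4 stones, which is the known winning strategy for this toy game.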

Monte Carlo Tree Search (MCTS)
The basic algorithm:
1. Play a candidate move.
2. For each candidate move from Step (1), play out 10000 games randomly all the way to the end, and record in how many of the games black won.
3. Select the move that resulted in the most wins in Step (2).
Improvements:
  Store win rates for variations considered in Step (2) deeper in the search tree.
  Use these to bias further playouts in terms of exploitation and expansion.
Image source: Sensei's Library.
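The three basic steps above can be sketched on a toy game instead of Go. The example below is an illustration under stated assumptions: the game (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins) and all function names are hypothetical, and only the basic algorithm is shown, without the tree-based improvements.

```python
import random


def legal_moves(stones):
    """Candidate moves: take 1, 2, or 3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def random_playout(stones, mover_to_play, rng):
    """Play random moves to the end of the game.

    Returns True if the player who made the candidate move ends up
    taking the last stone. `mover_to_play` says whose turn it is now.
    """
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return mover_to_play
        mover_to_play = not mover_to_play
    # The pile was already empty: the candidate move itself won the game.
    return not mover_to_play


def flat_monte_carlo(stones, playouts=10000, seed=0):
    """Steps 1 to 3: try each move, run random playouts, keep the best."""
    rng = random.Random(seed)
    wins = {}
    for move in legal_moves(stones):  # Step 1: play a candidate move
        wins[move] = sum(             # Step 2: count wins over random playouts
            random_playout(stones - move, False, rng) for _ in range(playouts)
        )
    best = max(wins, key=wins.get)    # Step 3: select the move with most wins
    return best, wins


best, wins = flat_monte_carlo(5)
print("Wins per candidate move:", wins)
print("Selected move:", best)
```

With 5 stones left, taking 1 wins about two thirds of random playouts while the other moves win about half, so the procedure picks the move that is also correct under perfect play. In Go, purely random playouts are far noisier, which is why the improvements listed above matter.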

AlphaGo is strong
October 2015: Defeated Fan Hui 2P, European champion, 5-0.
March 2016: Defeated Lee Sedol 9P, 18-time world champion, 4-1.
December 2016 to January 2017: Defeated many top players in online games, 60-0.
May 2017: Defeated Ke Jie 9P, ranked #1 in the world at the time, 3-0.

Where does that leave us?
"... signalled the birth of a new age: an age of computers able to resolve specifically humanistic problems."
"Lockhart told me that his heart really sank at the news of AlphaGo's success. Go, he said, was supposed to be the one game computers can't beat humans at. It's the one." From The New Yorker.

What's bad about better AI?
The title of a slide from a presentation by Stuart Russell.
Ethical issues related to AI:
1. Weaponization.
2. Accountability.
3. Propagation of human bias.
4. Automation of work.

Killer robots?

More likely
Study by Frey and Osborne (2700 citations): 47% of American jobs are at high risk of being automated in the near future.
  Most likely: Telemarketers (99% chance).
  Least likely: Recreational therapists (0.3% chance).
  In the middle: Computer programmers (48% chance).
  Go players?

You are obsolete

A challenge for the 21st century
How to handle the automation of work?
1. Retraining programs. Everyone should learn how to program.
2. Universal income.
Editorial comment: These things aren't enough.
Image source: memegenerator.net.

All true at once Scientific achievement Master of Go Harbinger of automation

References
This talk is based on my article "AlphaGo and Artificial Intelligence" from March 2016. Available at https://hdbennett.wordpress.com/2016/03/18/alphago-and-artificial-intelligence/.
The AlphaGo papers:
  Silver, David et al. Mastering the game of Go with deep neural networks and tree search. Nature, Volume 529, pages 484-489 (28 January 2016).
  Silver, David et al. Mastering the game of Go without human knowledge. Nature, Volume 550, pages 354-359 (19 October 2017).
  Silver, David et al. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. Available at https://arxiv.org/abs/1712.01815.
Frey, Carl Benedikt and Michael A. Osborne. The Future of Employment: How Susceptible are Jobs to Computerisation? Technological Forecasting and Social Change, Volume 114, January 2017, pages 254-280.