The Game-Theoretic Approach to Machine Learning and Adaptation


The Game-Theoretic Approach to Machine Learning and Adaptation Nicolò Cesa-Bianchi Università degli Studi di Milano Nicolò Cesa-Bianchi (Univ. di Milano) Game-Theoretic Approach 1 / 25

Machine Learning
A wide range of applications:
  Categorization of documents, speech, images, genes
  Natural language processing
  Robot control
  Search engine quality
  Dynamic allocation of resources

Learning theory
Foundations of machine learning:
  Under what conditions can a machine learn from examples?
  How much information (e.g., training examples) is needed to achieve a given predictive performance?
  How many computational resources (time and space)?
  What is the best mathematical framework to study these phenomena?

The statistical learning vision
The training data are a statistical sample (i.i.d.).
Relate the empirical error of a predictor to its true error rate: a finite-sample estimation problem.
[Photo: Vladimir Vapnik]
Overfitting:
  The best predictor on the data is not guaranteed to have a small error rate if it is chosen from a large set.
  Enough data are needed to guarantee that the empirical error is close to the true error for each predictor in the set.
  How much is "enough" turns out to depend on a notion of combinatorial dimension of the set of predictors (the VC dimension).

The need for a different vision
The statistical approach is at the basis of the most successful applications of machine learning in the past twenty years.
As the range of machine learning applications widens, new paradigms are needed.
Some hard cases for statistical modelling:
  Data source is highly nonstationary
  Environment reacts to the learner (e.g., spam)
On a more philosophical level: is statistics the only language for describing the phenomenon of learning in machines?

Theory of repeated games
[Photos: James Hannan, David Blackwell]
Learning to play a game (1956): play a game repeatedly against a possibly suboptimal opponent.

Zero-sum 2-person games played more than once
$N \times M$ known loss matrix:
$$
\begin{array}{c|cccc}
       & 1         & 2         & \cdots & M \\ \hline
1      & \ell(1,1) & \ell(1,2) & \cdots &   \\
2      & \ell(2,1) & \ell(2,2) & \cdots &   \\
\vdots & \vdots    & \vdots    & \ddots &   \\
N      &           &           &        &
\end{array}
$$
Row player (player) has $N$ actions.
Column player (opponent) has $M$ actions.
For each game round $t = 1, 2, \dots$
  Player chooses action $i_t$ and opponent chooses action $y_t$.
  The player suffers loss $\ell(i_t, y_t)$ (= gain of opponent).
  Player can learn from the opponent's history of past choices $y_1, \dots, y_{t-1}$.

Prediction with expert advice
[Photos: Volodya Vovk, Manfred Warmuth]
The opponent's moves $y_1, y_2, \dots$ define a sequential prediction problem with loss function $\ell$:
1. Play action $I_t$ from $\{1, \dots, N\}$
2. Observe next value $y_t$
3. Incur loss $\ell(I_t, y_t)$

Exponentially weighted forecaster
At time $t$ pick action $i$ with probability proportional to $\exp(-\eta\,\mathrm{Loss}_{i,t})$, where $\mathrm{Loss}_{i,t}$ is the total loss of action $i$ up to now.
Expert's theorem: the average per-round expected loss of the forecaster converges to that of the best action for the observed sequence at rate $\sqrt{(\ln N)/T}$, where $N$ is the number of actions and $T$ is the number of time steps.
Note: no dependence on the number of the opponent's actions.
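
The forecaster on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's code: the function names, the toy loss matrix, and the tuning $\eta = \sqrt{8 \ln N / T}$ (a standard textbook choice for losses in $[0, 1]$) are assumptions introduced here.

```python
import math
import random

def exp_weights_probs(cum_losses, eta):
    """Probabilities proportional to exp(-eta * cumulative loss)."""
    m = min(cum_losses)  # shifting by the minimum avoids underflow and leaves the distribution unchanged
    weights = [math.exp(-eta * (loss - m)) for loss in cum_losses]
    total = sum(weights)
    return [w / total for w in weights]

def run_forecaster(loss_matrix, opponent_moves, eta, rng):
    """Full-information repeated game; loss_matrix[i][y] is assumed to lie in [0, 1].

    Returns (expected loss of the forecaster, realized loss, cumulative loss of each action).
    """
    n = len(loss_matrix)
    cum = [0.0] * n
    expected_loss = 0.0
    realized_loss = 0.0
    for y in opponent_moves:
        p = exp_weights_probs(cum, eta)
        action = rng.choices(range(n), weights=p)[0]  # the action actually played
        realized_loss += loss_matrix[action][y]
        expected_loss += sum(p[i] * loss_matrix[i][y] for i in range(n))
        for i in range(n):  # full information: every action's loss is revealed
            cum[i] += loss_matrix[i][y]
    return expected_loss, realized_loss, cum

if __name__ == "__main__":
    rng = random.Random(0)
    loss_matrix = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]  # hypothetical 3 x 2 game
    T = 200
    moves = [rng.randrange(2) for _ in range(T)]
    eta = math.sqrt(8 * math.log(len(loss_matrix)) / T)
    exp_loss, real_loss, cum = run_forecaster(loss_matrix, moves, eta, rng)
    print("expected regret:", exp_loss - min(cum))
```

With this tuning the expected regret is at most $\sqrt{(T \ln N)/2}$ for any sequence of opponent moves, which gives the $\sqrt{(\ln N)/T}$ per-round rate after dividing by $T$.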

The bandit problem: playing an unknown game
In order to keep the counts $\mathrm{Loss}_{i,t}$ for each action, we need to know the losses $\ell(i, y_t)$ also for the actions $i$ we did not play at round $t$.
What if we can only observe the loss of the played action $I_t$?
Examples: $N$ slot machines; dynamic content optimization.
Surprisingly, the convergence rate to the best action is $\sqrt{(N \ln N)/T}$.
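
One common way to cope with bandit feedback is to run exponential weights on importance-weighted loss estimates, in the style of the Exp3 algorithm. The slide does not name the algorithm, so treat the sketch below as an assumption: `exp3`, `loss_fn`, and the tuning $\eta = \sqrt{2 \ln N / (N T)}$ are illustrative choices made here.

```python
import math
import random

def exp3(loss_fn, n_actions, horizon, eta, rng):
    """Exp3-style sketch for losses in [0, 1].

    loss_fn(t, i) is queried only for the action i actually played at round t.
    Returns (total loss suffered, cumulative loss estimates).
    """
    est = [0.0] * n_actions  # estimated cumulative losses, one per action
    total_loss = 0.0
    for t in range(horizon):
        m = min(est)
        weights = [math.exp(-eta * (e - m)) for e in est]
        s = sum(weights)
        p = [w / s for w in weights]
        i = rng.choices(range(n_actions), weights=p)[0]
        loss = loss_fn(t, i)      # the only feedback received this round
        total_loss += loss
        est[i] += loss / p[i]     # unbiased estimate; unplayed actions are credited 0
    return total_loss, est

if __name__ == "__main__":
    rng = random.Random(0)
    T, N = 2000, 3
    eta = math.sqrt(2 * math.log(N) / (N * T))
    # Hypothetical slot machines: arm 0 is good, the others are bad.
    total, est = exp3(lambda t, i: 0.1 if i == 0 else 0.9, N, T, eta, rng)
    print("total loss:", total, "best arm would give:", 0.1 * T)
```

Dividing the observed loss by the probability of the played action keeps each estimate unbiased; the price is higher variance, which is where the extra factor $N$ in the rate comes from.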

Structured actions: adversarial routing
In certain problems, actions have a combinatorial structure (paths, trees, matchings).
If the loss is linear over the edges, then the bandit convergence rate to the best action is $\sqrt{(d \ln N)/T}$, where $d$ is the number of edges and $N$ is the number of actions (typically superpolynomial in $d$).

Partial monitoring: not observing any loss
Dynamic pricing:
1. Post a T-shirt price
2. Observe if next customer buys or not
3. Adjust price
Note: feedback does not reveal the player's loss.
Goal: converge to the average return of the best fixed price.
Convergence rate to the best fixed price is $T^{-1/3}$ rather than $T^{-1/2}$ as in the bandit case.

K-person games
There are $K$ players choosing actions $I_{1,t}, \dots, I_{K,t}$.
Each player $i$ has its own loss function $\ell_i(I_{1,t}, \dots, I_{K,t})$.
What happens if all players use exponentially weighted forecasting, or similar algorithms?
Convergence of the empirical distribution of plays.
[Figure labels: Hannan, Correlated, Nash]

From game theory to machine learning
[Diagram labels: unlabeled data, classification system, guessed label, true label, opponent]
Now the opponent's moves $y_t$ have side information $x_t \in \mathbb{R}^d$ (e.g., text on a document).
A repeated game between the player choosing a classifier and the opponent choosing an action $(x_t, y_t)$.
Convergence to the performance of the best classifier in a given class (e.g., linear classifiers with bounded norm).
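
The protocol on this slide (the learner commits to a classifier, the opponent reveals $(x_t, y_t)$, the learner updates) can be illustrated with the classical Perceptron. The slide does not say which online algorithm is meant, so the Perceptron, the function name, and the synthetic data stream below are assumptions used only to make the game concrete.

```python
import math
import random

def online_perceptron(stream):
    """Play the classification game round by round; labels are +1 or -1.

    Returns (final weight vector, number of prediction mistakes).
    """
    w = None
    mistakes = 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score >= 0 else -1   # the learner's guessed label
        if y_hat != y:                    # the true label is revealed by the opponent
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]  # local update on a mistake only
    return w, mistakes

if __name__ == "__main__":
    rng = random.Random(0)
    stream = []
    while len(stream) < 200:
        # Hypothetical data: points in [-1, 1]^2 separated by the line x0 = x1 with margin 0.2.
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        s = (x[0] - x[1]) / math.sqrt(2)
        if abs(s) >= 0.2:
            stream.append((x, 1 if s > 0 else -1))
    w, mistakes = online_perceptron(stream)
    print("mistakes:", mistakes)
```

On a stream that some linear classifier separates with margin $\gamma$, with $\|x_t\| \le R$, the Perceptron makes at most $(R/\gamma)^2$ mistakes whatever the order of the examples; this is the kind of guarantee that holds without statistical assumptions on the data.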

Online learning algorithms
Simple: easy to implement
Scalable: local optimization vs. global optimization
Robust: inherit game-theoretic performance guarantees
Versatile: classification, regression, ranking, structured prediction

Structured Prediction
A combinatorial label space (sequences, trees):
  POS tagging: sentence → sequence of POS tags
  Parsing: sentence → parse tree
  Bilingual alignment: sentence pair → alignment (matching)
  Letter to phoneme: word → phoneme sequence
  Phrase-based translation: source sentence → target sentence

Online learning in general spaces
Some applications:
  Reproducing kernel Hilbert spaces: efficiently embed data in a high-dimensional space where linear classifiers can do well (bioinformatics, vision, language)
  Linear space of matrices: integrating data sources, learning different tasks at once
  Banach spaces of models: financial data

Tracking linear classifiers
If the data source is not fitted well by any linear model, then comparing to the best linear model $f$ is trivial.
We instead want to compare to the best sequence $f_1, f_2, \dots$ of linear models.
Adversarial tracking: the bound on predictive performance reflects the opponent's trade-off between fit of the sequence and total shift $\sum_t \|f_t - f_{t-1}\|$ (dynamic overfitting control).
This is achieved by enforcing sparsity of the learner's model (expressed as a linear combination of past $x_t$'s).

Tracking a shifting topic

Online active learning
[Diagram labels: unlabeled data, classifier, guessed label, user, human expert, true label (upon request)]
Observing the data process is cheap.
Observing the label process is expensive: need to query the human expert.
Question: how much better can we do by subsampling adaptively the label process?

A game with the opponent
Opponent avoids causing mistakes on documents far away from the decision surface.
[Figure labels: vectorized document, decision surface]
Probability of querying a document is proportional to the inverse distance to the decision surface.
Performance guarantee remains unchanged w.r.t. the full sampling case.
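
A query rule with the flavor described on this slide can be sketched as follows. The exact formula is not given in the transcript; the sketch assumes the rule $b / (b + |w \cdot x|)$, a form known from the selective-sampling literature, with a hypothetical parameter $b > 0$, and it uses the unnormalized score $|w \cdot x|$ as a stand-in for the distance to the decision surface. The Perceptron-style update is likewise an assumption.

```python
import random

def query_probability(score, b=1.0):
    """Chance of asking the expert for the label; decreases as |score| grows."""
    return b / (b + abs(score))

def selective_perceptron(stream, b, rng):
    """Perceptron that sees a label only when it decides to query it.

    Returns (weights, number of queries, number of mistakes).
    """
    w = None
    queries = 0
    mistakes = 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score >= 0 else -1
        if y_hat != y:
            mistakes += 1  # evaluation bookkeeping; hidden from the learner unless it queries
        if rng.random() < query_probability(score, b):
            queries += 1   # the label is bought from the human expert
            if y_hat != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, queries, mistakes

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical stream of vectorized documents with +1 / -1 labels.
    stream = [((1.0, 0.5), 1), ((-1.0, -0.2), -1), ((0.8, 0.1), 1), ((-0.6, -0.9), -1)] * 10
    w, queries, mistakes = selective_perceptron(stream, 1.0, rng)
    print("queries:", queries, "of", len(stream), "mistakes:", mistakes)
```

Documents that land near the decision surface are queried almost surely, while confident predictions rarely cost a label; larger values of `b` make the learner query more often.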

Experiments on Reuters corpus

Prediction on graphs
Web, social networks, biological networks
Predict labels on nodes (or links)
The game-theoretic framework allows one to derive principled algorithms without statistical assumptions.

Node prediction
What is the optimal number of mistakes when sequentially predicting the node labels of a given graph?
This number is captured (to within log factors) by the cutsize of the graph's random spanning tree.
This is a density-independent regularity measure of the graph labeling, and there are efficient predictors that achieve this.
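
To make the quantity on this slide concrete, the sketch below samples a spanning tree and counts the tree edges whose endpoints carry different labels. The slide does not say how the tree is drawn; the sketch assumes a uniformly random spanning tree, sampled with the Aldous-Broder random walk, and a connected graph. The function names and the toy graph are illustrative.

```python
import random

def random_spanning_tree(adj, rng):
    """Aldous-Broder sampler: walk at random and keep each edge that first enters a new node.

    adj maps every node to a list of its neighbours; the graph must be connected.
    """
    nodes = list(adj)
    current = rng.choice(nodes)
    visited = {current}
    tree = []
    while len(visited) < len(nodes):
        nxt = rng.choice(adj[current])
        if nxt not in visited:
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree

def cutsize(edges, labels):
    """Number of edges whose two endpoints have different labels."""
    return sum(1 for u, v in edges if labels[u] != labels[v])

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 4-cycle with two positive and two negative nodes.
    adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    labels = {0: 1, 1: 1, 2: -1, 3: -1}
    tree = random_spanning_tree(adj, rng)
    print("tree edges:", tree, "tree cutsize:", cutsize(tree, labels))
```

Adding more edges between differently labeled regions raises the cutsize of the graph itself, but a spanning tree always has one edge fewer than there are nodes, so its cutsize cannot grow with edge density in the same way.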

Conclusions
Online game-theoretic analysis provides nonstochastic foundations to machine learning: good for nonstationary, adversarial sources.
Algorithms typically have good scaling properties due to local (rather than global) optimization.
Fruitful exchange of concepts between game theory and machine learning.
Interacting learners:
  Multitask learning: same side information, different objectives
  Multiview learning: different side information, same objective