Game Theory for Safety and Security
Arunesh Sinha

Motivation: Real-World Security Issues

Central Problem
Allocating limited security resources against an adaptive, intelligent adversary

Prior Work
Stackelberg games have been very successful in practice

Defender-Adversary Interaction: Stackelberg Game
- The defender moves first, committing to a randomized defense
- The adversary knows the defender's mixed strategy, but not the coin flips (the realized allocation)
- Stackelberg equilibrium: optimal randomization
- Example: mixed strategy p1 = 0.75, p2 = 0.25 over the payoffs (1,0), (3,3), (9,9), (1,0)

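The commit-then-best-respond logic above can be sketched in code. Below is a minimal grid-search version for a 2-target security game; the payoff numbers and the strong-Stackelberg tie-breaking rule are illustrative assumptions, not the slide's exact game.

```python
# Minimal sketch of a (strong) Stackelberg equilibrium for a 2-target
# security game, found by grid search over the defender's mixed strategy.
# All payoff numbers are illustrative assumptions.

# (defender, adversary) payoffs when the adversary attacks target t:
covered = {1: (1, 0), 2: (1, 0)}       # attack on a covered target fails
uncovered = {1: (-3, 3), 2: (-9, 9)}   # attack on an uncovered target succeeds

def expected_payoffs(p1, target):
    """Expected payoffs if target 1 is covered with probability p1."""
    p = p1 if target == 1 else 1 - p1
    d = p * covered[target][0] + (1 - p) * uncovered[target][0]
    a = p * covered[target][1] + (1 - p) * uncovered[target][1]
    return d, a

def best_response(p1):
    """Adversary attacks its best target; ties are broken in the
    defender's favor (the standard strong Stackelberg assumption)."""
    pays = {t: expected_payoffs(p1, t) for t in (1, 2)}
    top = max(a for _, a in pays.values())
    ties = [t for t, (_, a) in pays.items() if abs(a - top) < 1e-9]
    return max(ties, key=lambda t: pays[t][0])

best = None
for i in range(101):                   # defender commits first: pick best p1
    p1 = i / 100
    d, _ = expected_payoffs(p1, best_response(p1))
    if best is None or d > best[0]:
        best = (d, p1)

print(best)  # (defender utility, optimal coverage probability of target 1)
```

With these numbers the adversary is made indifferent at p1 = 0.25, where the defender's loss is minimized; this is exactly the "optimal randomization" idea on the slide.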
Outline
- Threat Screening Games
- Audit Games
- Crime Prediction using Learning in Games

Threat Screening Games: Screening for Threats

Airport Passenger Screening Problem
- Transportation Security Administration (TSA) screens 800 million passengers
- Dynamic Aviation Risk Management Solution (DARMS) [with USC/CREATE and Milind Tambe]
- An intelligent approach to screening passengers: screening effectiveness and timely screening

Actors
- Screener (TSA)
- Adversary (e.g., a terrorist)
- Benign screenees

Current Screening Approach
- Two broad passenger categories: TSA PreCheck and general
- Same type of screening within each category (some exceptions, e.g., children)
- Long queues; a lot of screening time is spent on benign passengers

Proposed Solution
- Finer categories for passengers: risk level (low/high) and flight (domestic/international)
- Randomized screening over technology combinations (e.g., X-ray + metal detector, X-ray + AIT)
[Table: randomized screening probabilities per passenger category and screening technology]

Actions of Players
- Defender: allocation of screening teams to passengers
  - Resource capacity constraints: for example, an X-ray machine can be used only 40 times per hour
  - Passenger flow constraints: all passengers in all categories must be screened
- Adversary: chooses a passenger category to arrive in

Payoffs of Players
- Defender payoff: measures the loss incurred from a successful attack
- Adversary payoff: measures the gain from a successful attack
- The probability of a successful attack is a function of both the defender and adversary strategies

Optimization Problem
- Maximize defender payoff (i.e., minimize loss)
  - A function of the defender's randomized strategy and the adversary's best response
- Subject to: the adversary plays a best response

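The optimization can be illustrated on a toy instance. The 2-category, 2-team setup and all numbers below are illustrative assumptions: the defender maximizes the minimum detection probability (since the adversary best-responds by attacking the least-protected category) subject to capacity and flow constraints. Real instances are vastly larger than this brute-force search can handle, which motivates more scalable algorithms.

```python
# Toy sketch of the screening optimization: the defender chooses marginal
# screening assignments under capacity and flow constraints; the adversary
# picks the category with the lowest detection probability. All numbers
# are illustrative assumptions.

PASSENGERS = {"A": 100, "B": 50}            # passengers per category
EFFICACY = {"thorough": 0.9, "basic": 0.4}  # detection prob. per team type
CAPACITY = 60                               # thorough team screens <= 60/hour

def detection(frac_thorough):
    """Detection probability for a category when frac_thorough of it goes
    to the thorough team and the rest to the basic team (flow constraint:
    everyone is screened by exactly one team)."""
    return (EFFICACY["thorough"] * frac_thorough
            + EFFICACY["basic"] * (1 - frac_thorough))

best = None
for i in range(101):
    for j in range(101):
        a, b = i / 100, j / 100        # fractions of A and B sent to thorough
        # Capacity constraint on the thorough team.
        if PASSENGERS["A"] * a + PASSENGERS["B"] * b > CAPACITY + 1e-9:
            continue
        # Adversary attacks the least-protected category, so the defender
        # maximizes the minimum detection probability.
        val = min(detection(a), detection(b))
        if best is None or val > best[0]:
            best = (val, a, b)

print(best)  # (worst-case detection probability, fraction of A, fraction of B)
```

Here the optimum equalizes the two categories' detection probabilities while saturating the thorough team's capacity, mirroring the capacity-constrained randomization in the slides.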
Technical Challenges
- Very large game: ~10^41 defender actions
- The equilibrium computation is NP-hard
- Current large-scale optimization approaches like column generation (CG) fail
- The compact representation (CR) approach produces invalid solutions

Technical Contribution
- We propose the Marginal Guided Algorithm (MGA)
- Brown, Sinha, Schlenker, Tambe. One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats [AAAI 2016]
[Plots: runtime in seconds (log scale) and screener utility vs. number of flights (10-50), comparing MGA and CG]

General Model for Screening

Audit Games: Privacy Concerns in Healthcare

What's Going On?
- Permissive access control regime
- Employees are trusted to do the right thing
- Malicious insiders can cause breaches

Auditing
- Post-hoc inspection of employee accesses to patient health records: detect violations, punish violators
- Auditing is ubiquitous and effective against insider threats (e.g., financial auditing, computer security auditing)

Audit Game Model
- Auditors: k inspections, k ≤ n
- n suspicious cases
- Adversary

Actions of Players
- Defender chooses a randomized allocation of limited inspection resources, and also a punishment level
- Adversary plays his best response: chooses a misdeed to commit
- The adversary is punished if the misdeed is caught

Payoffs of Players
- Defender payoff includes the loss incurred from a successful breach
  - The probability of a breach is a function of the defender and adversary strategies
- Defender payoff includes the loss from a high punishment level (punishment is not free)
  - A high punishment level creates a negative work environment, a loss for the organization
  - Punishment also carries an immediate loss: suspension or firing costs the organization
- Adversary payoff includes the gain from a successful breach
- Adversary payoff includes the loss due to punishment when caught

Optimization Problem
- Maximize defender payoff (i.e., minimize loss)
  - A function of the defender's randomized strategy, the punishment level, and the adversary's best response
- Subject to: the adversary plays a best response (a non-linear constraint)
- Used techniques such as second-order cone programs for fast computation [IJCAI 2013, AAAI 2015]

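The joint choice of inspection coverage and punishment level can be sketched on a toy single-insider instance. All numbers and the functional forms below are illustrative assumptions, not the papers' model: the defender picks an inspection probability p (capped by its k-of-n budget) and a punishment level x, the insider violates only if the expected gain outweighs the expected punishment, and punishment itself costs the organization.

```python
# Toy audit game: grid search over (inspection probability, punishment
# level). The insider best-responds: violate iff expected gain exceeds
# expected punishment. All numbers are illustrative assumptions.

GAIN = 5.0         # insider's gain from an uncaught violation
LOSS = 10.0        # organization's loss from an uncaught violation
PUNISH_COST = 0.5  # organizational cost per unit of punishment level
MAX_P = 0.4        # inspection budget: at most k of n suspicious cases

def defender_utility(p, x):
    adv_util = (1 - p) * GAIN - p * x
    if adv_util > 1e-9:                # insider violates (deterrence fails)
        return -(1 - p) * LOSS - PUNISH_COST * x
    return -PUNISH_COST * x            # deterred; only punishment overhead

best = None
for i in range(41):                    # p in 0 .. MAX_P
    for j in range(101):               # x in 0 .. 10
        p, x = i / 100, j / 10
        u = defender_utility(p, x)
        if best is None or u > best[0]:
            best = (u, p, x)

print(best)  # (defender utility, inspection probability, punishment level)
```

The optimum uses the full inspection budget and the smallest punishment that still deters, illustrating the slide's tradeoff: harsher punishment buys deterrence but is itself costly.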
General Model for Auditing
- Punishment costs lead to a tradeoff between deterrence and loss due to misdeeds
- The optimal inspection allocation and punishment policy can be computed efficiently

Crime Prediction: A Big Problem
- Urban crime in 2009: 7,857,000 crimes, $10,994,562,000

Challenges
- Modeling the behavior of criminals: what is their utility? Criminals are not homogeneous
- Crime has spatial aspects (opportunity)
- Real-world data about frequent defender-adversary interaction is available

Predictive Policing Solution
- Our contribution [AAMAS 2015, 2016]: learn crime and how it evolves in response to patrolling, then design optimal patrols
- Distinct from the "crime predicts crime" philosophy in criminology [Chen 2004; McCue 2015]
- Deployment: licensed to a start-up (ArmorWay); deployed at the University of Southern California
[Chart: prediction accuracy of EMC2 vs. "crime predicts crime" vs. random]

The Role of Learning in Stackelberg Games
Data about past interactions -> learn adversary behavior -> adversary model -> plan optimal defender strategy -> defender strategy

Domain Description
- Five patrol areas; eight-hour shifts
- Crime data: number of crimes per shift per area
- Patrol data: number of officers per shift per area

Learning Model
- Dynamic Bayesian Network (AAMAS 2015): captures the interaction between officers and criminals
  - D: number of defenders (known)
  - X: number of criminals (hidden)
  - Y: number of crimes (known)
  - One time step = one shift
- Learned via Expectation-Maximization with intelligent factoring

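To make the DBN structure concrete, here is a toy forward simulation in which the hidden criminal counts X respond to the known defender counts D and generate the observed crime counts Y. The transition and observation rules (and all numbers) are illustrative assumptions, not the learned model; in the actual approach these conditional distributions would be fit from data with EM.

```python
# Toy forward simulation of the slide's DBN variables: hidden criminal
# counts X evolve in response to known defender counts D, and observed
# crime counts Y depend on both. All rules and numbers are illustrative
# assumptions, not the paper's learned CPDs.
import random

random.seed(0)
AREAS = 5

def step(x, d):
    """One shift: criminals tend to leave heavily patrolled areas."""
    new_x = [0] * AREAS
    for area in range(AREAS):
        for _ in range(x[area]):
            # With probability growing in patrol presence, the criminal
            # relocates to a uniformly random other area.
            if random.random() < min(1.0, 0.2 * d[area]):
                dest = random.choice([a for a in range(AREAS) if a != area])
                new_x[dest] += 1
            else:
                new_x[area] += 1
    return new_x

def observe(x, d):
    """Crimes: each criminal commits a crime unless deterred by patrols."""
    return [sum(random.random() > 0.3 * d[a] for _ in range(x[a]))
            for a in range(AREAS)]

x = [10, 10, 10, 10, 10]   # hidden initial criminal counts
d = [3, 0, 0, 0, 0]        # heavy patrol only in area 0
for _ in range(20):        # simulate 20 shifts
    x = step(x, d)
y = observe(x, d)
print(x, y)
```

Running this shows criminals draining out of the patrolled area into unpatrolled ones, the kind of patrol-dependent crime evolution the DBN is meant to capture.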
Planning
- Search problem: the search space grows exponentially with the number of steps planned ahead
- The DBN (criminal model) maps a defender strategy (input) to predicted crime numbers (output)
- DOGS algorithm (AAMAS 2015): applies dynamic programming to the search problem

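The dynamic-programming idea can be sketched as follows: naive search over patrol sequences is exponential in the horizon, but if the criminal model's predictions depend only on a compact state, memoizing the value-to-go on (shift, state) makes planning linear in the horizon. The 2-area "pressure" model and all numbers below are illustrative assumptions, not the DOGS algorithm itself.

```python
# Sketch of dynamic programming for patrol planning: memoize the minimum
# total predicted crime from each (shift, state) pair instead of
# enumerating all allocation sequences. The 2-area crime model is an
# illustrative assumption.
from functools import lru_cache

ALLOCATIONS = [(2, 0), (1, 1), (0, 2)]   # ways to split 2 officers
HORIZON = 6

def next_state(state, alloc):
    """Criminal 'pressure' per area: rises when unpatrolled, falls when
    patrolled (clamped to 0..3)."""
    return tuple(min(3, max(0, s + (1 if a == 0 else -a)))
                 for s, a in zip(state, alloc))

def crimes(state, alloc):
    """Predicted crimes this shift: pressure not suppressed by patrols."""
    return sum(max(0, s - a) for s, a in zip(state, alloc))

@lru_cache(maxsize=None)
def best_cost(t, state):
    """Minimum total predicted crime from shift t onward."""
    if t == HORIZON:
        return 0
    return min(crimes(state, a) + best_cost(t + 1, next_state(state, a))
               for a in ALLOCATIONS)

print(best_cost(0, (2, 2)))  # minimum predicted crime over the horizon
```

The cache bounds the work by (number of shifts) x (number of states) x (number of allocations), instead of the exponential number of allocation sequences.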
Experimental Results
[Heat maps: crime without patrol, with random patrol, and with optimal patrol]

Data Enables Learning in Games
Data about past interactions -> learn adversary behavior -> adversary model -> plan optimal defender strategy -> defender strategy

Takeaway
Game theory enables intelligent randomized allocation of limited security resources against an adaptive adversary

Thank You