Predictive Analytics: Understanding and Addressing the Power and Limits of Machines, and What We Should Do About It

Daniel T. Maxwell, Ph.D.
President, KaDSci LLC
Copyright KaDSci LLC 2018. All Rights Reserved.

The Functions of Models in Analysis

Models are used to:
01. Explain, account for, or describe a phenomenon (Diagnostic)
02. Predict, forecast, or estimate (Predictive)
03. Recommend a course of action (Prescriptive)

Analysis uses models to

Items 1 and 2 above are data focused:
- Issues of fact

Item 3 above is also predictive and includes the decision maker (or decision-making system):
- Goals and preferences
- Objectives
- Risk tolerance

Synthesis: Tell the story
- Narrative
- Structured process

Analysis: What are the pieces? How do they work?

BLUF: You can't forget the thinking part!!!

Predictive Analytics & AI Are Here, and They Are All Models

What Is Behind the Magic?

ALGORITHMS!!
- Regression
- With all sorts of new and fancy names:
  - Bayesian Learning
  - Neural Nets
  - Case Based Reasoning
  - Influence Diagrams
  - ..

The Promise
The Perils

All algorithms (and by extension AI tools) rely on data.

Not All Data Are Created Equal

- Noisy / Accurate
- Uncertain / Certain
- Sparse / Dense
- Small / Big
- Perishable / Persistent
- Unstructured / Structured
- Coarse / Precise

"Without data you're just another person with an opinion" (W. Edwards Deming)

Data Science and Data Engineering are different things. The latter is necessary but not sufficient for providing effective analytics.

Stanislav Petrov: The Man Who Prevented Nuclear War

Time: September 26, 1983, three weeks after KAL 007 was shot down.
- Lt. Col. Stanislav Petrov (Soviet Air Defence Forces) observed sensor alerts indicating the US had launched an ICBM, followed by four more.
- Soviet standing orders called for an immediate counter strike against the US and NATO allies.
- He disobeyed those orders and declared the alert a false alarm.
- He was neither praised nor punished.

The cause: a rare alignment of clouds, sunlight, and Soviet satellites that watched North Dakota.

Why did he not report it:
- Five missiles were inconsistent with his understanding of how the US would attack
- Lack of corroborative evidence
- Alert system was new
- Alert passed through 30 layers of verification too quickly
- His civilian experience helped him to make that judgment

He believed that if one of his purely military colleagues had been on duty, the outcome could have been very different.

An Attempt to Optimize Law Enforcement Officer Hiring

Real situation, notional numbers, anonymized organization.

Regression results:
- Officer coefficient ~ 100
- Mean squared error ~ 1000
- Y intercept ~ -100

Mean squared error is 10X bigger than the coefficient.

Hiring strategy:
- Could be inexpensive and effective
- Could be expensive and ineffective
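To make the point about error size concrete, here is a minimal sketch that plugs the slide's notional numbers into a straight-line model and attaches a rough error band. The units, the meaning of the outcome variable, the function name `predict_with_band`, and the normal approximation of plus or minus 1.96 times the root mean squared error are all assumptions for illustration; the slide does not describe how the original regression was evaluated.

```python
import math

# Notional values from the slide. Units and variable meanings are assumptions.
COEFFICIENT = 100.0  # change in the outcome per additional officer
INTERCEPT = -100.0
MSE = 1000.0         # mean squared error of the fit


def predict_with_band(officers, z=1.96):
    """Return (low, point, high) using a rough band of +/- z * sqrt(MSE).

    This ignores uncertainty in the fitted coefficients themselves,
    so the real band would be wider.
    """
    point = INTERCEPT + COEFFICIENT * officers
    half_width = z * math.sqrt(MSE)
    return point - half_width, point, point + half_width


low, point, high = predict_with_band(1)
print(f"point estimate: {point:.1f}, rough band: {low:.1f} to {high:.1f}")
```

Under these assumptions the band is about 62 units either side of any point estimate, which is more than half of the effect attributed to one officer. A strategy built on the point estimate alone could therefore land on either side of the "effective" line.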

Analytics (Operations Research)

Diagram regions:
- Academic & Scientific Expertise
- Domain Expertise
- Applied Sciences & Engineering

Contributors shown on the diagram: Mathematicians, Psychologists, Sociologists, Theoreticians, Political Scientists, Historians, Technologists, Software Engineers, Gov't Executives, Gov't Managers, Gov't Service Providers, Military, Hardware Engineers, Systems Engineers, Algorithm Developers, Data Modelers, Operations Researchers

Good Analytics:
- Provides the foundation for useful models and tools
- Is multi-disciplinary, drawing on expertise from across the spectrum of science and engineering

Drawing on Psychology: Human Perceptions of Uncertainty

Words of estimative probability (Sherman Kent, CIA):

100%  Certainty
The General Area of Possibility:
 93%  give or take about 6%   Almost certain
 75%  give or take about 12%  Probable
 50%  give or take about 10%  Chances about even
 30%  give or take about 10%  Probably not
  7%  give or take about 5%   Almost certainly not
  0%  Impossibility

Note the large number of dots outside the selected regions.

Representing uncertainty is messy. Ambiguity exists even in epistemic uncertainty.
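The bands on this slide can be written down as a small lookup, which makes their gaps and overlaps visible. This is only a sketch: the function name `kent_phrase` and the choice to return the first matching band where two bands touch are assumptions, not part of Kent's scheme.

```python
# (phrase, center, tolerance) in percent, taken from the bands on the slide.
KENT_SCALE = [
    ("Almost certain", 93, 6),
    ("Probable", 75, 12),
    ("Chances about even", 50, 10),
    ("Probably not", 30, 10),
    ("Almost certainly not", 7, 5),
]


def kent_phrase(percent):
    """Return the phrase whose band contains percent, or None if it falls in a gap."""
    if percent == 100:
        return "Certainty"
    if percent == 0:
        return "Impossibility"
    for phrase, center, tolerance in KENT_SCALE:
        if center - tolerance <= percent <= center + tolerance:
            return phrase  # first match wins where bands touch (an assumption)
    return None


for value in (95, 80, 62, 15):
    print(value, kent_phrase(value))
```

Values such as 62% or 15% fall between bands and return `None`, one small illustration of why representing uncertainty with words is messy.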

Drawing on the Science of Evidence

Facts:
- Data is nothing more (or less) than evidence
- There is science behind evidence (Schum, 1994)
- It has been applied (loosely) by the Intelligence Community (IC)
- The weight (value) of evidence is based on quality criteria. Things like:
  - Bias
  - Veracity
  - Observational Sensitivity

Assertion:
- Failure to account for these evidential factors undermines the quality of machine reasoning
- These factors can and should be systematically addressed (and reported)

Pulling It All Together: The Analytic Problem Space

[Figure: the analytic problem space. Axes: Outcome Consequences (Limited to Vast); Problem Complexity (Simple, Complicated, Complex, Wicked); Data (Sparse/Noisy to Plentiful/Reliable). Regions: Human Driven, Machine Informed; Machine Driven.]

- Bounded rationality is real. The goal should be to have machines and humans collaborate for more effective decisions.
- Speed sometimes kills. Decisions should be timely, not necessarily fast.

The Bottom Line: What We Know

- The importance of a human in the loop increases with situational complexity:
  - Numbers and fluidity of objectives (wickedness of the problem)
  - Impact and size of irreducible uncertainty on the situation
- There is settled science we can draw upon to improve predictive (inferential) performance
- It appears the solution lies in improving Man Machine Collaboration
- Nugget: IARPA is already working the problem (Hybrid Forecasting Program)

"There is no substitute for knowledgeable human eyes on the data" (Professor Loerch, GMU)

Bottom Line: What We Should Do for Every Analysis

01. Develop a set of objectives that reflect organizational priorities and can be estimated (you can't measure the future)
02. Clearly describe your decision space
    a. Alternatives
    b. Outcomes
    c. Risk
03. Identify some way to estimate the relative contribution of your alternatives against the objectives (e.g. simulation, analytic models, expert opinion, etc.)
04. Identify and consider your constraints explicitly (financial, physical, ..)
05. Synthesize all of these factors into a model or models using optimization or other search technique
06. Exercise the heck out of the models so you understand what you see

Then AND ONLY THEN are we ready to:
- Trust predictions
- Make recommendations
- Automate decisions (implement as AI)
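The six steps above can be sketched in a few lines. The weighted-sum scoring, the exhaustive ranking, and every name and number below (`rank_alternatives`, alternatives A to C, the weights, costs, and budget) are hypothetical stand-ins chosen for illustration; the slide does not prescribe a particular model or search technique.

```python
def rank_alternatives(scores, weights, costs, budget):
    """Rank feasible alternatives by weighted-sum score, best first."""
    ranked = []
    for name, by_objective in scores.items():
        if costs[name] > budget:  # step 04: apply the constraint explicitly
            continue
        # step 05: synthesize objectives into one score (simple weighted sum)
        total = sum(weights[obj] * by_objective[obj] for obj in weights)
        ranked.append((name, total))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Steps 01 to 03: hypothetical objectives, alternatives, and estimated scores.
# Scores run from 0 to 1, and higher is better on both objectives.
scores = {
    "A": {"effectiveness": 0.90, "risk": 0.30},
    "B": {"effectiveness": 0.60, "risk": 0.80},
    "C": {"effectiveness": 0.95, "risk": 0.90},
}
costs = {"A": 50, "B": 40, "C": 120}
BUDGET = 100

# Step 06: exercise the model by varying the weights and watching the ranking.
baseline = rank_alternatives(scores, {"effectiveness": 0.6, "risk": 0.4}, costs, BUDGET)
shifted = rank_alternatives(scores, {"effectiveness": 0.7, "risk": 0.3}, costs, BUDGET)
print([name for name, _ in baseline])
print([name for name, _ in shifted])
```

In this toy case alternative C is excluded by the budget, and a modest change in the weights flips the order of A and B. That kind of sensitivity is what step 06 is meant to surface before a recommendation is trusted or automated.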

Questions??? Comments

You can't forget the thinking part!!!

References

Hammond, K.R. (1996) Human Judgment and Social Policy: Irreducible Uncertainty, Inevitable Error, Unavoidable Injustice. Oxford University Press, New York.
Kent, S. (1964) Words of estimative probability. Studies in Intelligence, 8, pp. 49-65.
Schum, D. (1994) The Evidential Foundations of Probabilistic Reasoning. John Wiley and Sons, New York.

Backup: Examples of Good Predictive Analytics

Decision Support for Magazine Cover Design

Modeling Approach

- Document variables for 257 issues
- Next Step
- Normalize sales to 2016 levels
- Generate synthetic data that expands 257 issues to 6,000+
- Learn the forecasting model: machine-learned Bayesian network
- Input: cover attributes
- Output: estimated sales, with variance

Results as of 16 February

[Chart: reported vs. predicted sales for issues 1 to 6; vertical axis from 0 to 700,000; a major snowstorm date is marked; data labels 2%, 6%, 21%, 3%, 11%, 5%.]

- Predicted sales have been within 5.5% of reported sales 86% of the time
- When anomalies occur, they will most likely be over-predictions
- Given that perfect prediction is not possible:
  - The goal is not perfect forecasting
  - The goal is better decisions

"It's tough to make predictions, especially about the future" (Yogi Berra)
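A statistic like "within 5.5% of reported sales 86% of the time" can be computed as the share of forecasts whose relative error stays inside a tolerance. The sketch below shows one way to do that. The function name `share_within_tolerance` and the three data points are made up for illustration and are not the magazine's figures.

```python
def share_within_tolerance(predicted, reported, tolerance=0.055):
    """Fraction of forecasts whose relative error is within the tolerance."""
    hits = sum(
        abs(p - r) / r <= tolerance
        for p, r in zip(predicted, reported)
    )
    return hits / len(reported)


# Hypothetical figures, not taken from the slide.
predicted = [510_000, 480_000, 600_000]
reported = [500_000, 500_000, 450_000]

share = share_within_tolerance(predicted, reported)
print(f"{share:.0%} of forecasts within 5.5% of reported sales")
```

Here the third forecast over-predicts by about a third, so two of three forecasts fall inside the tolerance. Tracking the sign of each miss as well as its size would show whether anomalies are mostly over-predictions, as the slide suggests.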

Forecast Issue

Forecasted sales for the March 7th 2016 cover options: 513,9009; 486,596; 498,177; 593,582

Selected Cover

If we can believe the model, and assuming a $3.00 profit per unit sold at the newsstand, the opportunity cost of this decision is ~$285K.
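The opportunity cost figure can be reproduced with simple arithmetic. The transcription does not say which option was the selected cover. The sketch below assumes it was the 498,177 forecast, because that is the option whose gap to the best forecast (593,582) comes out near the quoted ~$285K. The function name `opportunity_cost` is likewise only illustrative.

```python
PROFIT_PER_UNIT = 3.00  # dollars per newsstand sale, as stated on the slide


def opportunity_cost(best_forecast, selected_forecast, profit_per_unit=PROFIT_PER_UNIT):
    """Profit forgone by not choosing the option with the highest forecast."""
    return (best_forecast - selected_forecast) * profit_per_unit


best = 593_582
selected = 498_177  # assumption: the slide text does not identify the selected cover

cost = opportunity_cost(best, selected)
print(f"opportunity cost: ${cost:,.0f}")
```

Under that assumption the difference is 95,405 units, or about $286K, consistent with the rounded figure on the slide.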

Integrate Data: Geographic Disaggregation

- Current model is at the summary level
- Social media analysis indicates that opinions about celebrities vary by location
- For example, Adele is popular in the northeast US, with a weak negative perception in the Pacific Northwest

Modeling approach with geographic inputs:
- Document variables for 257 issues
- Sales geographically
- Next Step
- Normalize sales to 2016 levels
- Generate synthetic data that expands 257 issues to 6,000+
- Celebrity appeal geographically
- Learn the forecasting model: machine-learned Bayesian network
- Input: cover attributes
- Output: estimated sales, with variance