MACHINE LEARNING AND FINTECH


Spencer McManus*

CITE AS: 1 GEO. L. TECH. REV. 428 (2017)
https://perma.cc/an4a-h6yg
2017 GEORGETOWN LAW TECHNOLOGY REVIEW Vol 1:2

INTRODUCTION
A BRIEF HISTORY OF MACHINE LEARNING
PATTERN RECOGNITION AND EXPLANATION-BASED LEARNING: AN E-COMMERCE EXAMPLE
    Training Data
    Decision Trees
    Evaluating the Model
EBL AND FRAUD DETECTION
CONCLUSION

INTRODUCTION

Machine learning describes the process through which computers can learn without continued human input. In the era of big data, machine learning is particularly promising because it allows for identification of patterns in large data sets. Machine learning has applications in fields as diverse as medicine, e-commerce, and banking. This essay will discuss the application of machine learning, particularly explanation-based learning, to the financial tech industry, focusing on fraud detection.

A BRIEF HISTORY OF MACHINE LEARNING

The concept of machine learning first arose in 1950 with Alan Turing's paper "Computing Machinery and Intelligence,"[1] in which Turing proposed to answer the question, "Can machines think?" To answer this question, Turing crafted what became known as the Turing Test, which has three participants: one human judge, one human player, and one computer. The judge, placed separately from the human and the computer, aims to determine which of the two is a human and which is a computer. A computer passes the Turing Test when the judge cannot consistently distinguish the computer player from the human player.[2] Turing predicted that humans could program computers that would pass the test by 2000.[3]

Over the next four decades, scholars and programmers refined the concept of machine learning and developed new tests. In 1959, IBM programmer Arthur Samuel created a checkers program in which the computer improved progressively the more it played.[4] Programmers focused on developing machines that performed pattern recognition over the next two decades. These efforts culminated in the introduction of Explanation-Based Learning ("EBL"),[5] in which a machine uses a set of programmer-supplied training data to identify patterns, synthesize rules, and apply the rules to new sets of data.[6]

From the 1990s to today, work has transitioned to developing machines that can handle large amounts of data to draw conclusions.[7] Machine learning has been extended to include deep learning, which involves use of increased processing power to analyze visual and auditory data in real time.[8] Large technology companies have developed their own proprietary machine learning code that acts as the backbone for certain features of their products.[9] Future development focuses on continued improvement in natural language processing, which allows for human voice interaction with devices,[10] and on applying machine learning to new industries.

PATTERN RECOGNITION AND EXPLANATION-BASED LEARNING: AN E-COMMERCE EXAMPLE

Although there are several methodologies for machine learning, this article focuses on explanation-based learning. Explanation-based learning ("EBL") involves teaching a machine to detect patterns in data based on a set of programmer-supplied training data, using the patterns to create a rule, and then applying the rule to larger sets of data to make predictions. A simplified but powerful example from the e-commerce industry will help illustrate the process.[11]

Retail companies face the challenge of catering to individual customers in a growing global economy. Machine learning can help retailers by providing extremely personalized predictions about how an individual's shopping habits may change given a change in personal circumstances. The simple system illustrated here predicts whether a customer is pregnant.

Training Data

The first requirement for a machine learning system is training data. Training data consists of different data points (called "features"), which come together to form an individual "record," and an output value (the "target").[12] Training data is necessary because the machine cannot make predictions without examples of how the different features affect the output. In our example, the features will be the customer's age and whether or not she purchases two products commonly associated with pregnancy. These features come together to form ten "records"; each record, in this case, is a purchasing history for one customer. The target is whether the customer was actually pregnant.[13] Table 1 shows this data.

#    AGE   PREGNANCY TEST?   PRE-NATAL SUPPLEMENTS?   PREGNANT?
1    24    Yes               Yes                      Yes
2    36    Yes               No                       No
3    18    No                No                       Yes
4    40    No                Yes                      Yes
5    25    No                Yes                      Yes
6    33    Yes               No                       Yes
7    25    Yes               Yes                      Yes
8    32    No                No                       No
9    35    No                No                       No
10   20    Yes               Yes                      No

TABLE 1. Training data for hypothetical pregnancy prediction.

Decision Trees

From this set of ten records and their corresponding outputs, the computer can form a decision tree, a process for evaluating the probability of the output occurring given the value of each feature. This is a decision tree for this problem, with row (A) showing the probabilities of pregnancy given the training data in Table 1.

* GLTR Staff Member; Georgetown Law, J.D. expected 2018; University of California, Davis, B.A.S. 2015. 2017, Spencer McManus.
[1] A.M. Turing, Computing Machinery & Intelligence, 59 MIND 433 (1950).
[2] Id. at 442.
[3] Id.
[4] See Bernard Marr, A Short History of Machine Learning Every Manager Should Read, FORBES (Feb. 19, 2016, 2:31 AM), https://www.forbes.com/sites/bernardmarr/2016/02/19/ashort-history-of-machine-learning-every-manager-should-read/#67840fd815e7 [https://perma.cc/cgd3-deye].
[5] See infra Part II.
[6] See generally Gerald DeJong & Raymond J. Mooney, Explanation-Based Learning: An Alternative View, 1 MACH. LEARNING 145 (1986).
[7] See Marr, supra note 4.
[8] See Robert D. Hof, Deep Learning, MIT TECH. REV., https://www.technologyreview.com/s/513696/deep-learning/ [https://perma.cc/9f68-thtj].
[9] Examples include Facebook's DeepFace, which powers the social network's facial detection feature, and numerous digital assistant applications, including Apple's Siri, Amazon's Alexa, and Microsoft's Cortana. See, e.g., Steven Levy, The iBrain is Here, and It's Already Inside Your Phone, BACKCHANNEL (Aug. 24, 2016), https://backchannel.com/an-exclusive-look-athow-ai-and-machine-learning-work-at-apple-8dbfb131932b#.6wi4d8qcy [https://perma.cc/z9kg-8lq8] (explaining how Apple uses machine learning in its products, including Siri).
[10] See Perry Li, Natural Language Processing, 1 GEO. L. TECH. REV. 98 (2016), https://www.georgetownlawtechreview.org/natural-language-processing/gltr-11-2016/ [https://perma.cc/s2au-skzl]; see generally Winfred Phillips, Introduction to Natural Language Processing, CONSORTIUM ON COGNITIVE SCI., INSTRUCTION (2006), http://www.mind.ilstu.edu/curriculum/protothinker/natural_language_processing.php [https://perma.cc/vlb9-jnhj].
[11] This example is based on the (in)famous Target "baby club" story, in which Target, using a machine learning model, predicted that a teenage customer was pregnant. Target started sending her baby coupons, which were discovered by her father, who had not yet been informed about the pregnancy. For background, including more on how e-commerce companies leverage machine learning, see Charles Duhigg, How Companies Learn Your Secrets, N.Y. TIMES MAGAZINE (Feb. 16, 2012), https://nyti.ms/2jebotd [https://perma.cc/9y2l-hclh].
[12] Michael Manapat, A Primer on Machine Learning for Fraud Detection, STRIPE, https://stripe.com/radar/guide (last visited Feb. 5, 2017) [https://perma.cc/d2qt-fwps].
[13] This hypothetical example ignores the difficulty (and potentially the legality) of determining if a customer was actually pregnant. See id.
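To make the idea of probabilities read off a tree concrete, the sketch below encodes the ten records of Table 1 and computes, for each combination of the two purchase features, the fraction of customers who were pregnant. It is only an illustration under stated assumptions: the splits used in the article's figure are not reproduced here, so this version assumes a tree that branches on the two purchases and ignores age, and the names TABLE_1 and leaf_probabilities are hypothetical.

```python
from collections import defaultdict

# Table 1 records: (age, bought_pregnancy_test, bought_prenatal_supplements, pregnant)
TABLE_1 = [
    (24, True, True, True),
    (36, True, False, False),
    (18, False, False, True),
    (40, False, True, True),
    (25, False, True, True),
    (33, True, False, True),
    (25, True, True, True),
    (32, False, False, False),
    (35, False, False, False),
    (20, True, True, False),
]


def leaf_probabilities(records):
    """Group records by the two purchase features (an assumed tree shape)
    and return, per group, the fraction of customers who were pregnant."""
    counts = defaultdict(lambda: [0, 0])  # leaf -> [pregnant count, total count]
    for _age, test, supplements, pregnant in records:
        leaf = (test, supplements)
        counts[leaf][0] += int(pregnant)
        counts[leaf][1] += 1
    return {leaf: p / n for leaf, (p, n) in counts.items()}


probs = leaf_probabilities(TABLE_1)
for (test, supplements), p in sorted(probs.items(), reverse=True):
    print(f"test={test!s:5} supplements={supplements!s:5} P(pregnant)={p:.2f}")
```

With only ten records each group holds two or three customers, so the resulting fractions are coarse; a real system would also learn which features to split on rather than having them fixed by hand.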

[FIGURE 1. Decision tree for data in TABLE 1. Spencer McManus, 2017.]

With this small amount of data, the decision tree is not particularly useful to the retailer. The ten records do not capture the purchasing trends of the entire customer base; two branches of the tree remain empty. Imagine instead that a larger set of training data (say, with 10,000 records) produced the probabilities in row (B) in Figure 1. This data would be useful to a retailer, especially where the data produced high or low probabilities.

Evaluating the Model

Now that the model has been developed from an adequate training set, the retailer can use the model to make predictions about new customers. A retailer could, say, send coupons for baby products to a potentially pregnant customer who fits in one of the high-probability categories. If the customer is indeed pregnant, the coupons might encourage her to shop at the retailer.

However, no decision tree is perfect because of practical limitations in data collection. In this case, the decision tree uses a limited set of data to produce probabilities that a customer with a given shopping history is pregnant. The retailer needs to evaluate whether its model is actually effective at predicting whether a customer is pregnant.
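The coupon decision described above amounts to looking up a customer's probability and comparing it with a cutoff. The following sketch shows that step under stated assumptions: the row (B) probabilities are not reproduced in the text, so the values here are stand-ins derived from Table 1 by grouping on the two purchase features, and the function name, the sample customers, and the two thresholds are all hypothetical.

```python
# Stand-in probabilities derived from Table 1 by grouping on the two purchase
# features (pregnancy test, pre-natal supplements); age is ignored here.
LEAF_PROBABILITY = {
    (True, True): 2 / 3,
    (True, False): 1 / 2,
    (False, True): 1.0,
    (False, False): 1 / 3,
}


def should_send_coupons(bought_test, bought_supplements, threshold):
    """Return True when the customer's probability meets the threshold."""
    return LEAF_PROBABILITY[(bought_test, bought_supplements)] >= threshold


# Hypothetical new customers: (label, bought_test, bought_supplements)
new_customers = [
    ("A", True, True),
    ("B", False, True),
    ("C", False, False),
]

for threshold in (0.6, 0.9):
    targeted = [label for label, test, supp in new_customers
                if should_send_coupons(test, supp, threshold)]
    print(f"threshold={threshold}: send coupons to {targeted}")
```

Raising the threshold from 0.6 to 0.9 drops customer A from the mailing, which is the kind of choice the evaluation step has to justify.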

[FIGURE 2. Evaluating the efficacy of a machine learning program. Spencer McManus, 2017.]

The shaded areas are where the retailer uses the model and determines that a customer is pregnant; the white areas are where the retailer determined the customer was not pregnant. The retailer should aim to minimize the number of women in the red shaded area (a false positive, where the retailer determined someone was pregnant, but she wasn't) and maximize the number of women in the green shaded area (where the retailer correctly identified someone as pregnant). The retailer can evaluate this by calculating precision and recall.[14] In our example, precision is the percentage of customers predicted to be pregnant who actually are. Higher precision indicates fewer false positives. Recall is the percentage of all pregnant customers who are identified by the model. Higher recall indicates fewer false negatives.

There is a relationship between precision and recall. As a retailer raises the probability threshold for predicting someone is pregnant, it will reduce false positives (and thus increase precision), but it will also increase false negatives (thus reducing recall). If a retailer decides that the model must have 85% certainty that a customer is pregnant instead of 75%, it will exclude customers whose product purchases suggest between a 75% and 85% probability of pregnancy. Customers in this range may well be pregnant, so the higher threshold will produce false negatives for them. A retailer faces obvious obstacles in determining how many customers are actually pregnant, but this could be accomplished through customer surveys. By further refining the model through evaluation of the most predictive features, the trade-off between precision and recall can be reduced, creating a higher quality model and giving the retailer the maximum benefits of a machine learning system.

EBL AND FRAUD DETECTION

EBL is commonly used in the financial technology space to detect credit card fraud. Financial institutions often license fraud-detection software from third parties. This software, in its most simplified form, uses hundreds or thousands of features to form a decision tree, producing probabilities used to predict if a transaction is fraudulent.[15] Using a system similar to that in the pregnancy example, fraud detection companies identify features that, when analyzed together, are highly predictive of fraud. In this simplified example, a fraud detection company could build a system using three different features to detect basic instances of fraud on a single card: the country of use for a charge, the charge amount, and the number of countries in which the card was used in a given time period.[16]

#   COUNTRY   CHARGE AMOUNT   NO. COUNTRIES IN 24-HOUR PERIOD   FRAUD?
1   USA       $20             1                                 No
2   RUS       $150            2                                 Yes
3   USA       $200            2                                 No
4   CAN       $10             1                                 Yes
5   CAN       $15             1                                 No

TABLE 2. Training data for a simple fraud detection model.

[14] See Tom Fawcett, An Introduction to ROC Analysis, 27 PATTERN RECOGNITION LETTERS 861, 865 (2006).
[15] See, e.g., Manapat, supra note 12.
[16] See id.
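The precision and recall measures from the pregnancy example apply to Table 2 as well. The sketch below encodes the five records and scores one candidate rule per feature, which is one simple way to see how predictive each feature is on its own. The three rules and their cutoffs are hypothetical illustrations, not any vendor's method, and five records are far too few to support real conclusions.

```python
# Table 2 records: (country, charge_amount_usd, countries_in_24h, fraud)
TABLE_2 = [
    ("USA", 20, 1, False),
    ("RUS", 150, 2, True),
    ("USA", 200, 2, False),
    ("CAN", 10, 1, True),
    ("CAN", 15, 1, False),
]


def precision_recall(records, rule):
    """Score a rule (record -> bool, where True means 'flag as fraud').

    Precision and recall are reported as 0.0 when their denominator is zero.
    """
    tp = sum(1 for r in records if rule(r) and r[3])        # flagged, fraud
    fp = sum(1 for r in records if rule(r) and not r[3])    # flagged, not fraud
    fn = sum(1 for r in records if not rule(r) and r[3])    # missed fraud
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical single-feature rules, one per feature in Table 2.
RULES = {
    "country is not USA": lambda r: r[0] != "USA",
    "charge amount >= $100": lambda r: r[1] >= 100,
    "2+ countries in 24 hours": lambda r: r[2] >= 2,
}

scores = {name: precision_recall(TABLE_2, rule) for name, rule in RULES.items()}
for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On this toy data the rule based on the number of countries flags one fraudulent and one legitimate charge and misses the fraudulent charge in record 4, so both its precision and recall are one half.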

Fraud detection can be quite difficult. From a cursory examination of the training data, there does not seem to be a consistent pattern. Although some transactions may quite obviously be fraudulent (such as record 2, where a charge was made in a country not present in other records), other patterns are not so evident (such as a correlation between number of countries and fraudulent charges). Machine learning becomes particularly useful in the fraud detection industry because it enables companies to quickly analyze complex sets of data. For example, the developer of this model may determine that the third feature is not particularly predictive of fraud. One benefit of EBL in this space is that companies can identify relevant features and exclude irrelevant ones.[17]

EBL continues to grow in other financial spheres as well. Banks use EBL to analyze customer traits (including past defaults, job status, and marital status) to approve or reject loans.[18] Other financial institutions use EBL to power "robo-advisors" that advise customers on allocating investments and financial instruments.[19] In the future, EBL could power new security systems for banking (such as facial recognition) or even finance-specific customer service systems.[20]

CONCLUSION

Since Alan Turing first hypothesized a thinking machine in 1950, machine learning has developed into a powerful tool. In explanation-based learning, one of the many different types of machine learning, a human provides a set of training data, which includes several features and records, from which a machine extrapolates patterns and creates rules. We encounter these systems every day: in e-commerce and fraud detection, machine learning forms a critical backbone. Future development of EBL will focus on applying the technology to new industries in the era of big data.

[17] Machine Learning and Fraud Prevention, RAVELIN (last visited Feb. 9, 2017), https://www.ravelin.com/resources/machine-learning-and-fraud-prevention [https://perma.cc/8qmj-accl].
[18] Daniel Faggella, Machine Learning in Finance: Present and Future Applications, TECHEMERGENCE (Aug. 15, 2016), http://techemergence.com/machine-learning-in-financeapplications/ [https://perma.cc/982m-ts47].
[19] Id.
[20] Id.