Randomized Evaluations in Practice: Opportunities and Challenges. Kyle Murphy, Policy Manager, J-PAL. January 30th, 2017


Overview: Background. What is a randomized evaluation? Why randomize? Advantages and drawbacks of randomized evaluations. Conclusions.

J-PAL's mission is to ensure that policy is informed by evidence and that research is translated into action.

J-PAL's network of 142 professors uses randomized evaluations to inform policy.

We have more than 770 ongoing and completed projects across 8 sectors in 69 countries.

Our work focuses on 8 sectors.

Agriculture project map.

Since the start of ATAI:
Farmers surveyed: 111,351
Female farmers surveyed: 47,845
Farmers whose behavior has changed: 17,932
ATAI awards: 55
Unique ATAI projects: 42
Countries with ATAI projects: 14
Researchers on ATAI projects: 89

Background

Evaluation, program evaluation, impact evaluation, RCTs: each is a narrower subset of the one before.

What is the impact of this program? (Chart: the primary outcome over time after the program starts, with impact shown as a gap.)

How to measure impact? Impact is defined as a comparison between: 1. the outcome some time after the program has been introduced, and 2. the outcome at that same point in time had the program not been introduced (the counterfactual).
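In code, this comparison is just a difference in mean outcomes, with the control group standing in for the unobservable counterfactual. A minimal sketch with invented outcome data (the numbers are illustrative, not from any study):

```python
from statistics import mean

# Hypothetical outcomes (e.g., crop yields) measured some time after
# the program was introduced. The control group approximates the
# counterfactual: what participants would have experienced without it.
treatment_outcomes = [5.1, 4.8, 5.6, 5.3, 4.9]
control_outcomes = [4.2, 4.5, 4.1, 4.4, 4.3]

# Estimated impact = outcome with the program minus the (approximated)
# outcome without it.
impact = mean(treatment_outcomes) - mean(control_outcomes)
print(round(impact, 2))  # → 0.84
```

This subtraction is only a credible impact estimate if the control group really does mimic the counterfactual, which is what randomization is designed to guarantee.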

Impact: What is it? (Charts: the primary outcome over time; impact is the gap between the observed outcome and the counterfactual trajectory after the program starts.)

Counterfactual. The counterfactual represents the state of the world that program participants would have experienced in the absence of the program. Problem: the counterfactual cannot be observed. Solution: we need to mimic or construct the counterfactual.

Constructing the counterfactual. Usually done by selecting a group of individuals who did not participate in the program. This group is usually referred to as the control group or comparison group. How this group is selected is a key decision in the design of any impact evaluation.

Selecting the comparison group. Idea: select a group that is exactly like the group of participants in all ways except one: their exposure to the program being evaluated. Goal: to be able to attribute differences in outcomes between the group of participants and the comparison group to the program (and not to other factors).

The problem of selection bias. Individuals who participate in a program and those who do not are often different. Comparing outcomes of these groups yields: impact of the program + pre-existing differences.

Impact evaluation methods. Randomized: randomized experiments, random assignment studies, randomized field trials, social experiments, randomized controlled trials (RCTs), randomized controlled experiments. Non- or quasi-experimental: pre-post, simple difference, difference-in-differences, multivariate regression, statistical matching, interrupted time series, instrumental variables, regression discontinuity.

What is a randomized evaluation?

A population is randomly split into 2 or more groups; one group receives the intervention, outcomes for all groups are measured, and the groups are compared.

Key steps in conducting a randomized evaluation: 1. Design the study carefully. 2. Collect baseline data. 3. Randomly assign people to treatment or control. 4. Verify that assignment is random. 5. Monitor the process so that the integrity of the experiment is not compromised.
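Steps 3 and 4 can be sketched in a few lines: randomly split the sample, then check that baseline characteristics are balanced across groups. All names and numbers below are illustrative:

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the assignment is reproducible

# Hypothetical study sample with one baseline covariate.
sample = [{"id": i, "baseline_yield": random.gauss(5.0, 1.0)} for i in range(200)]

# Step 3: randomly assign half to treatment, half to control.
random.shuffle(sample)
treatment, control = sample[:100], sample[100:]

# Step 4: verify the assignment by checking baseline balance; with a
# valid randomization the gap should be close to zero.
gap = mean(p["baseline_yield"] for p in treatment) - mean(p["baseline_yield"] for p in control)
print(f"baseline gap: {gap:.3f}")
```

In practice balance is checked for many covariates at once, but the logic is the same: treatment and control should look alike before the program starts.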

Key advantage of experiments: because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the program rather than to other factors. Fewer assumptions, clearly explainable results.

What can we learn from randomized evaluations?

NERICA in Sierra Leone. Problem: adoption of high-yielding crop varieties has been low. Potential solution: offer subsidies and training to increase take-up and yields. What levels of subsidy are most effective? Does agronomic training help increase yields?

Design: 120 communities were randomly assigned to one of three price arms (free rice, 50% subsidy, or full price), and within each price arm to either trained or not trained.

Yields only increased with training: yields rose 16% for trained farmers, with no increase without training. The randomized design disentangled the intervention's components and revealed the cost of ignoring extension.
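The cross-cutting design above — three price arms, each split into trained and untrained — can be sketched as a factorial assignment. Cell sizes and labels here are illustrative, not the study's actual allocation:

```python
import itertools
import random

random.seed(0)

# Factorial (cross-cutting) design: each community gets one price arm
# and, independently of price, a training status. Splitting 120
# communities evenly gives 20 per cell.
prices = ["free", "50% subsidy", "full price"]
training = ["trained", "not trained"]
cells = list(itertools.product(prices, training))  # 6 cells

communities = list(range(120))
random.shuffle(communities)

# Assign 20 randomly ordered communities to each price-by-training cell.
assignment = {c: cells[i // 20] for i, c in enumerate(communities)}

cell_sizes = {cell: sum(1 for a in assignment.values() if a == cell) for cell in cells}
print(cell_sizes)
```

Because training is randomized within every price arm, comparing trained to untrained communities at the same price isolates the effect of training — this is how the design disentangles the intervention's components.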

Benefits and Limitations of Randomized Evaluations

Benefits of randomized evaluations: tailoring the evaluation to the question; prospective evaluation; few assumptions and transparent findings.

When to do a randomized evaluation? When there is an important question you want or need to answer. When budgets are limited. When the timing is right: not too early and not too late. When the program is representative (not gold-plated) or tests a basic concept you need tested. When there are the time, expertise, and money to do it right. Develop an evaluation plan to prioritize.

When NOT to do an RE: when the program is premature and still requires considerable tinkering to work well; when the project is on too small a scale to randomize into two representative groups; if a positive impact has been proven using rigorous methodology and resources are sufficient to cover everyone; after the program has already begun and you are not expanding elsewhere.

Common Real-World Constraints

Constraints: political. These are not as severe as often claimed. Lotteries are simple, common, and transparent: winners are randomly chosen from the applicant pool, and participants know the winners and losers. A simple lottery is useful when there is no a priori reason to discriminate; it is perceived as fair and transparent.

Constraints: resources. Most programs have limited resources (e.g., vouchers, farmer training programs), resulting in more eligible recipients than the resources can serve. Limited resources can be an evaluation opportunity.

Constraints: contamination (spillovers/crossovers). Remember the counterfactual! If the control group differs from the counterfactual, our results can be biased. This can occur due to spillovers or crossovers.

Constraints: logistics. Research designs need to recognize logistical constraints. E.g., individual de-worming treatment by health workers: workers have many responsibilities, not just de-worming, and serve members from both treatment and control groups. Can they follow different procedures for different groups?

Constraints: fairness and politics. Options include randomizing at the child level within classes, at the class level within schools, or at the community level.

Constraints: sample size. The program is only large enough to serve a handful of communities. This is primarily an issue of statistical power.

Conclusion

Impact evaluations are hard to do well. A badly done impact evaluation can be very misleading: it can suggest that ineffective programs work and create a noise of competing claims that drowns out good evidence. Good impact evaluations require good outcome measures and enough sample size to measure impact precisely. Do an impact evaluation when we can answer an important question well; otherwise do a process evaluation and don't make impact claims. Complement impact evaluations with other methods.

Integrating REs into an evaluation strategy. Good descriptive work is important for diagnosing the problem and selecting possible solutions (if children get one vaccine but don't complete the course, it is probably not a cultural barrier). Business case assessment: what would the impact need to be for this program to be cost-effective? Literature reviews tell you the existing evidence; don't reinvent the wheel.

Integrating REs into an evaluation strategy. Process evaluation is always needed and can be dramatically improved: what percentage of eligible people are taking up the product? Do people know more at the end of the training than at the beginning? If a program is shown to be effective in many contexts, it is time to scale, and scaling needs to be complemented with a good process evaluation. Randomized evaluations can be most useful where there is an important question for both programming and academia.

Thank you! Kyle Murphy kmurphy@povertyactionlab.org

What if you have 500 applicants for 500 slots? Consider non-standard lottery designs. You could increase outreach activities. Is this ethical?

Sometimes screening matters. Suppose there are 2,000 applicants, screening of applications produces 500 worthy candidates, and there are 500 slots. A simple lottery will not work. What are our options?

Consider the screening rules. What are they screening for? Which elements are essential? Selection procedures may exist only to reduce eligible candidates in order to meet a capacity constraint. If certain filtering mechanisms appear arbitrary (although not random), randomization can serve the purpose of filtering and help us evaluate.

Randomization in the bubble. Sometimes a partner may not be willing to randomize among eligible people but might be willing to randomize "in the bubble". People in the bubble are borderline in terms of eligibility: just short of the threshold, not eligible but almost. What treatment effect do we measure? What does it mean for external validity?

Randomization in the bubble: within the bubble, compare treatment to control. (Diagram: applicants with scores above 700 all participate, applicants with scores below 500 do not, and those in between are randomized into treatment and control.)
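A sketch of randomization in the bubble, using the slide's hypothetical score cutoffs of 500 and 700; the applicant pool is simulated:

```python
import random

random.seed(1)

# Hypothetical applicant pool with eligibility scores.
applicants = [{"id": i, "score": random.randint(300, 900)} for i in range(1000)]

# Clearly eligible applicants all participate; clearly ineligible ones
# do not. Only borderline applicants — the "bubble" — are randomized.
participants = [a for a in applicants if a["score"] > 700]
non_participants = [a for a in applicants if a["score"] < 500]
bubble = [a for a in applicants if 500 <= a["score"] <= 700]

# Randomize within the bubble only.
random.shuffle(bubble)
half = len(bubble) // 2
bubble_treatment, bubble_control = bubble[:half], bubble[half:]

print(len(participants), len(non_participants), len(bubble))
```

The resulting estimate applies to borderline applicants, which is exactly the external-validity caveat raised on the previous slide.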

When screening matters: partial lottery. Program officers can maintain discretion. Examples: a training program; the expansion of consumer credit in South Africa.

Phase-in: takes advantage of expansion. Everyone gets the program eventually. This is a natural approach when an expanding program faces resource constraints. What determines which schools, branches, etc. will be covered in which year?

Phase-in design. Round 1: 1/3 treatment, 2/3 control. Round 2: 2/3 treatment, 1/3 control. Round 3: 3/3 treatment, no control; the randomized evaluation ends.
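The phase-in rollout can be sketched by randomly assigning each unit a starting round. The number of communities is illustrative:

```python
import random

random.seed(7)

# Phase-in: each community is randomly assigned a phase (1, 2, or 3)
# and starts treatment in that round, so everyone is treated by round 3.
communities = list(range(90))
random.shuffle(communities)
phase = {c: (i % 3) + 1 for i, c in enumerate(communities)}

def is_treated(community, current_round):
    """A community is treated once its assigned phase has begun."""
    return phase[community] <= current_round

share_treated = [sum(is_treated(c, r) for c in communities) / len(communities)
                 for r in (1, 2, 3)]
print(share_treated)  # one third treated in round 1, two thirds in round 2, all in round 3
```

In each round, communities whose phase has not yet arrived serve as the control group for those already treated.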

Phase-in designs. Advantages: everyone gets something eventually; provides incentives to maintain contact. Concerns: can complicate estimating long-run effects; care is required with phase-in windows; do expectations change actions today?

Rotation design: groups get treatment in turns. Advantages? Concerns?

Rotation design. Round 1: 1/2 treatment, 1/2 control. Round 2: Round 1's control group receives treatment, and Round 1's treatment group becomes the control.

Encouragement design: what to do when you can't randomize access. Sometimes it is practically or ethically impossible to randomize program access, but most programs have less than 100% take-up. Randomize the encouragement to receive treatment.

Encouragement design. Compare the encouraged group to the not-encouraged group; do not compare participants to non-participants. Encouragement and participation must be correlated. Adjust for non-compliance (compliers vs. non-compliers) in the analysis phase.

What is encouragement? Something that makes some people more likely to use the program than others; it is not itself a treatment. For whom are we estimating the treatment effect? Think about who responds to the encouragement.
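The non-compliance adjustment mentioned above is, in its simplest form, a Wald (instrumental variables) estimator: divide the encouraged-vs-not-encouraged outcome difference by the difference in take-up. A toy sketch with invented numbers:

```python
from statistics import mean

# Each record: (encouraged, participated, outcome). All values invented.
data = [
    (1, 1, 10.0), (1, 1, 11.0), (1, 0, 6.0), (1, 1, 9.0),
    (0, 0, 6.0), (0, 1, 10.5), (0, 0, 5.5), (0, 0, 6.5),
]

encouraged = [d for d in data if d[0] == 1]
not_encouraged = [d for d in data if d[0] == 0]

# Compare encouraged to not encouraged (never participants to
# non-participants, since those groups self-select).
itt = mean(d[2] for d in encouraged) - mean(d[2] for d in not_encouraged)

# Encouragement must actually move participation for this to work.
take_up_gap = mean(d[1] for d in encouraged) - mean(d[1] for d in not_encouraged)

# Effect for those who participate because of the encouragement.
late = itt / take_up_gap
print(round(late, 2))  # → 3.75
```

The result is the effect for compliers only — those whose participation responds to the encouragement — which is the "for whom" question on this slide.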

To summarize, the possible designs are: simple lottery; randomization in the bubble; randomized phase-in; rotation; encouragement design. Note: these are not mutually exclusive.

Methods of randomization, recap. Design: basic lottery. Most useful when: the program is oversubscribed. Advantages: familiar; easy to understand; easy to implement; can be implemented in public. Disadvantages: the control group may not cooperate; differential attrition.

Methods of randomization, recap. Design: phase-in. Most useful when: the program is expanding over time and everyone must receive treatment eventually. Advantages: easy to understand; the constraint is easy to explain; the control group complies because they expect to benefit later. Disadvantages: anticipation of treatment may impact short-run behavior; difficult to measure long-term impact.

Methods of randomization, recap. Design: rotation. Most useful when: everyone must receive something at some point and there are not enough resources per time period for all. Advantages: more data points than phase-in. Disadvantages: difficult to measure long-term impact.

Methods of randomization, recap. Design: encouragement. Most useful when: the program has to be open to all comers and take-up is low but can easily be improved with an incentive. Advantages: can randomize at the individual level even when the program is not administered at that level. Disadvantages: measures the impact only for those who respond to the incentive; needs a large enough inducement to improve take-up; the encouragement itself may have a direct effect.