From Proof to Impact | by Katherine Hoffmann & Lilian Lehmann, Stanford Social Innovation Review, Dec 5, 2013
Excerpts – In recent years randomized controlled trials (RCTs) have emerged as a “go-to” methodology for evaluating development interventions. Despite ongoing discussions around the ethics and epistemology of the method, we’re glad to see more evidence generated in a sector marked by so much uncertainty around which interventions work and why.
We’d like to offer four suggestions for bringing such interventions to scale. These insights are informed by our experience with the rollout of the Dispensers for Safe Water program in East Africa, though the views presented are our own. We believe that these lessons are also more broadly applicable to other projects looking to scale RCT-validated results.
1. Redefine your benchmarks.
RCTs are great at identifying effective interventions but less so at identifying cost-effective ones. Even when researchers include theoretical cost-effectiveness calculations in academic trial conclusions (and this is rare), it is difficult to project expenses at scale; only expansion makes such analysis possible.
For example, Innovations for Poverty Action (IPA) did an initial test of a chlorine dispenser system aimed at encouraging rural Kenyans to treat their drinking water. The original study focused on identifying optimal strategies for promoting community behavior change, but at scale, service delivery costs to remote rural areas grew increasingly salient. We at Dispensers for Safe Water had a theory about which delivery model was best—a cheap hub-and-spoke distribution system using schools or clinics as delivery points—and we tested it. While delivery costs were low under this model, adoption and the quality of service delivery also suffered. Surprisingly, a much pricier strategy—direct delivery to water points—was equally cost-effective because of increased impact through higher adoption.
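To make that arithmetic concrete, here is a minimal sketch of a cost-per-adopter comparison. All figures, the function name, and the adoption rates are our own hypothetical illustration, not the program's actual numbers; the point is only that a model costing twice as much per round can break even on cost-effectiveness if it doubles adoption.

```python
# Illustrative cost-effectiveness comparison with hypothetical numbers
# (not Dispensers for Safe Water's actual figures).

def cost_per_adopter(delivery_cost: float, adoption_rate: float,
                     households_reached: int) -> float:
    """Cost per household that actually adopts chlorination."""
    adopters = adoption_rate * households_reached
    return delivery_cost / adopters

# Hypothetical hub-and-spoke model: cheap delivery, but low adoption.
hub_and_spoke = cost_per_adopter(delivery_cost=10_000,
                                 adoption_rate=0.20,
                                 households_reached=5_000)

# Hypothetical direct-delivery model: twice the cost, double the adoption.
direct = cost_per_adopter(delivery_cost=20_000,
                          adoption_rate=0.40,
                          households_reached=5_000)

print(f"hub-and-spoke:   ${hub_and_spoke:.2f} per adopter")  # $10.00
print(f"direct delivery: ${direct:.2f} per adopter")         # $10.00
```

The benchmark that matters at scale, in other words, is cost per adopter rather than cost per delivery.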
2. Source ideas creatively.
As field staff gain experience with an intervention, encourage them to contribute to program research and development. In Uganda, we invited staff members to compete to see who could most effectively increase chlorine adoption at randomly selected water points within a fixed budget. While these ideas are “team-sourced,” we will evaluate them rigorously, and the outcomes will be actionable. The jury’s still out, but we’re excited about the proposals generated and even more excited to see the results.
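As a rough illustration of the design behind such a competition, here is a sketch of randomly allocating water points across competing strategies plus a status-quo control, so that outcomes can be compared fairly. The strategy names, counts, and seed are hypothetical assumptions for the example, not the program's actual setup.

```python
# Sketch: randomly assign water points to competing staff-proposed
# strategies (plus a control arm). Names and counts are hypothetical.
import random

water_points = [f"WP-{i:03d}" for i in range(1, 121)]
strategies = ["control", "leader_engagement", "household_visits", "radio_spots"]

rng = random.Random(2013)  # fixed seed so the assignment is reproducible
rng.shuffle(water_points)

# Deal shuffled water points round-robin into equal-sized arms.
assignment = {s: water_points[i::len(strategies)]
              for i, s in enumerate(strategies)}

for strategy, points in assignment.items():
    print(strategy, len(points), points[:3])
```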
3. Learn from and test past experiences.
Natural variation in implementation contexts can lead to unpredictable results. Prepare projects so that there is room to reflect regularly (to determine what worked well) and test proactively (to determine whether it will work well in the future).
When we first expanded into Uganda, one pilot was extremely successful, exceeding our average adoption rate by about 30 percentage points, so we went back and talked to that pilot’s implementing staff to figure out why (we believe it was a combination of time spent with local leaders and frequent follow-up with communities). Dispensers for Safe Water is now trying to replicate the strategies used in that early pilot on a larger scale, testing the result with a randomized controlled evaluation, of course.
4. Iterate, evaluate, and course-correct continuously.
Scale-up offers an incredible opportunity for learning through increased sample size, but it’s important to recall that the ultimate aim of scale is to improve people’s lives. The benefit of moving beyond the confines of an RCT is that once you’ve gathered convincing evidence that a new idea works, you can change course midway without worrying about preserving the integrity of the “treatment” and “control” groups for the duration of a formal trial. This dramatically shortens the learning and iteration cycle.
In Kenya, we piloted the use of megaphones to publicize community education meetings at the village level, in the hope of increasing meeting attendance (a strong correlate of adoption). When a preliminary analysis showed that megaphone announcements were associated with higher attendance across a sample of about 200 dispensers, we adopted them for the remaining dispensers in that installation round, iterating in real time.
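For readers curious what a preliminary analysis of this kind can look like, here is a minimal sketch comparing mean attendance at sites with and without megaphones, using Welch's t-test on simulated placeholder data. All figures are invented for illustration; this is an assumed analysis shape, not the program's actual one.

```python
# Sketch: compare mean meeting attendance at sites with and without
# megaphone announcements. Data below are simulated placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
attendance_megaphone = rng.poisson(lam=28, size=100)     # ~100 sites with megaphones
attendance_no_megaphone = rng.poisson(lam=22, size=100)  # ~100 comparison sites

# Welch's t-test does not assume equal variances across the two groups.
t_stat, p_value = stats.ttest_ind(attendance_megaphone,
                                  attendance_no_megaphone,
                                  equal_var=False)

print(f"mean with megaphone:    {attendance_megaphone.mean():.1f}")
print(f"mean without megaphone: {attendance_no_megaphone.mean():.1f}")
print(f"Welch t = {t_stat:.2f}, p = {p_value:.4f}")
```

Note that a comparison like this shows association, not causation, which is exactly why it supports a quick mid-round course correction rather than a formal impact claim.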