Mauricio Romero

Ph.D. Candidate – Department of Economics

Job Market Paper

  • Outsourcing Service Delivery in a Fragile State: Experimental Evidence from Liberia (Joint with Justin Sandefur and Wayne Sandholtz)

  • Can outsourcing public service delivery improve outcomes in fragile states? We present results from a field experiment to study the Partnership Schools for Liberia program, which delegated management of 93 public schools (staffed by government teachers and run free of charge to students) to private providers. We randomly assigned treatment status at the school level and sampled students from pre-treatment enrollment records to identify the effectiveness of the treatment without confounding the effect of endogenous sorting of pupils into schools. After one academic year, students in outsourced schools scored 0.18σ higher in English and math than those in control schools. Private providers had higher expenditures per student, and mediation analysis suggests that roughly half of the learning gains were due to more inputs while half were due to better management practices. Our design allows us to study heterogeneity across providers: While the highest-performing providers generated increases in learning of 0.27σ, the lowest-performing providers had no impact on learning. We find behavior that is consistent with the incentives and rules set by the contracts. There is no evidence that providers engaged in student selection, which was explicitly prohibited. However, one provider shifted pupils from oversubscribed schools and underperforming teachers to other government schools. This provider was the only one whose funding was not linked to the number of students enrolled, and whose contract did not forbid direct dismissal of teachers. These results suggest that leveraging the private sector to improve service delivery in fragile states is promising, but also highlight the importance of procurement rules and contracting details in aligning public and private interests.

Research Statement

Working Papers

  • Inputs, Incentives, and Complementarities in Primary Education: Experimental Evidence from Tanzania
    (Joint with Karthik Muralidharan, Isaac Mbiti, Youdi Schipper, Constantine Mandak, and Rakesh Rajani) - AEA RCT registration
    Draft available upon request.

  • The idea that complementarities across jointly implemented policies can lead to increasing returns has a long tradition in economics. Yet there is limited evidence that clearly identifies such complementarities. We present evidence from a randomized experiment across a representative sample of 350 schools in Tanzania that studied the impact of providing schools with (a) unconditional capitation grants, (b) bonus payments to teachers based on student performance, and (c) both of the above. At the end of two years, we find no impact on student test scores from providing either the grants or teacher incentives but find significant positive effects from providing both. We find strong evidence of complementarities between improving school inputs and teacher incentives, with the combined effect being significantly greater than the sum of the individual effects. Our results suggest that improving teacher incentives can also improve the productivity of additional school resources, whereas simply augmenting school inputs may not have much impact on learning outcomes.

  • Cross-Age Tutoring: Experimental Evidence from Kenya (Joint with Lisa Chen and Noriko Magari)

  • There is an increasing wealth of evidence that teaching appropriate to the student's learning level can improve learning outcomes in low-income countries. Cross-age tutoring, where older students tutor younger students, is an inexpensive alternative for providing personalized instruction to younger students at the cost of the older student's time. We present the results from a large RCT in Kenya, in which schools are randomly assigned to implement either an English or a math tutoring program. Students in grades 3-7 tutor students in grades 1-2 and preschool. We find that tutoring in math, relative to tutoring in English, has a small positive effect (0.06 SD, p-value of 0.073) on math test scores. These results do not hold true for English tutoring, however: relative to math tutoring, it has no positive effect on English test scores (we can rule out an effect of 0.077 SD with 95% confidence). We show that there is considerable heterogeneity according to the student's baseline learning level: The effect is largest for students in the middle of the ability distribution (0.144 SD, p-value of 0.005), while the point estimates are almost zero for students with either very low or very high baseline learning levels. Finally, we show that tutors are neither harmed by nor benefit from the program.

  • Benefit plans, insurer competition, and pharmaceutical prices: Evidence from Colombia

  • Public health benefit plans must choose what services are to be covered with public funds. This coverage choice may affect the prices of covered services through multiple channels. First, coverage reduces out-of-pocket expenditure, making consumers less sensitive to the cost of treatment; in an environment where suppliers have market power (as is often the case with pharmaceutical drugs) this could result in higher prices. The second channel is an increase in competition among drugs listed in the benefit plan with the same therapeutic properties, which could result in lower prices. Thus, the net effect on prices is unclear and depends on consumer sensitivity to prices and the level of competition among drugs. Using a difference-in-differences strategy, I study the effect of including a pharmaceutical drug in the national benefit plan of Colombia, a country with a competitive health insurance market in which all insurance companies offer the same plan (the national benefit plan) and charge the same premium. I find that drug prices decrease by 16% on average after they are listed in the benefit plan and that sales increase by 124%. However, if a drug faces no competition and is listed in the benefit plan, the price increases by 11%. Coverage also affects the prices of unlisted services: Within a therapeutic class, the prices of drugs that are not listed in the benefit plan decrease as the market share of competing drugs listed in the benefit plan increases. I conclude with a discussion of the role of financial incentives in health care markets.

  • Local incentives and national tax evasion: The response of illegal mining to a tax reform in Colombia (Joint with Santiago Saavedra)

  • National governments can tax only the economic activity that they observe directly or that municipal authorities report. In this paper, we investigate how illegal mining, a very common phenomenon in Colombia, changed with a tax reform that reduced the share of revenue transferred back to mining municipalities. To overcome the challenge of measuring illegal activity, we construct a novel dataset using machine learning predictions on satellite imagery features. Theoretically, we expect illegal mining to increase because the amount required to bribe the local authority is smaller after the reform. Using a difference-in-differences strategy, with Peru as the control, we find that illegal mining increased by 1.41 percentage points as a share of overall mining activity. In addition, we provide suggestive evidence that illegal mines have more harmful health effects on the surrounding population than legal mines. These results illustrate the unintended effects of tax revenue redistribution.

  • The Effect of Gold Mining on the Health of Newborns (Joint with Santiago Saavedra)

  • Mining can propel economic growth, but often results in heavy metal releases that could negatively impact human health. Using a difference-in-differences strategy, we estimate the impact of gold mining on the health of newborns in Colombia. We find heterogeneous effects depending on where mothers are located with respect to a mine. Mothers living in the vicinity of a mine are positively affected, experiencing a reduction of 0.51 percentage points in the probability of having a child with a low APGAR score at birth (from a base of 4.5%). However, we find a negative effect on mothers living downstream from a mine, whose probability of having a child with a low APGAR score at birth increases by 0.45 percentage points. We provide suggestive evidence that contaminated fish consumption in the first weeks of gestation is the mechanism behind our results, based on an exogenous increase in fish consumption caused by a religious celebration.

  • Using Instrumental Variables under Partial Observability of Endogenous Variables for Assessing Effects of Air Pollution on Health
    (Joint with Tarik Benmarhnia and Prashant Bharadwaj) - Submitted - Draft available upon request

  • Instrumental variable (IV) methods are frequently used to estimate causal effects in epidemiological studies due to unmeasured confounders in observational studies. While this method has been used for a long time in the economics literature, its use has extended more recently into the medical and public health literature. In this paper, we review the literature that uses IV methods to assess the impact of atmospheric air pollution on health outcomes and point out an important but largely unemphasized assumption that is implicit in most papers using this methodology. The intuition that forms the basis of this paper is simple: While instruments are often used to create plausibly exogenous variation in single pollutants, it is important to recognize that pollutants are generally co-produced and that any instrument that affects one component of pollution (PM10, for instance) is likely to affect other pollutants not considered in the analysis (SO2, for instance). If pollutants are co-produced, IV models that only treat a single pollutant as endogenous would still lead to biased estimates. The direction of bias depends on how co-pollutants interact with each other and the instrument. Hence, in some cases it will not be possible to assess whether biased IV estimates are any closer to the true effect than OLS estimates. We recommend that authors who use IV methods to examine the impact of air quality on health make specific assumptions about co-pollutant production and the way in which their chosen instrument interacts with these co-pollutants.

  • Cross-cutting Treatments and (Incorrect) Inference in Experiments (Joint with Karthik Muralidharan and Kaspar Wuthrich)
    To be presented at the C4ED conference on Development Economics (November 29 to December 1, 2017) in Mannheim, Germany.
    Slides Available after December 1st, 2017.

  • Cross-cutting or factorial designs are widely used in both field and lab experiments. These designs provide a cost-effective way to study multiple treatments in the same experiment, but depend crucially on the assumption that there are no interaction effects between treatments. We show that incorrectly assuming that interactions are zero leads to incorrect inference about the treatments of interest. We show that an alternative approach, where the researcher first tests whether the interaction is significant and then decides to ignore it if it is not significant, also leads to incorrect statistical inference (due to ignoring the model selection implied by this procedure). We document that a large number of experimental studies feature such designs and do not account for these challenges to inference. We discuss different approaches to mitigate this problem: First, we investigate inference approaches which impose prior knowledge about the interaction. We demonstrate that while incorporating prior information can improve power, it will naturally result in misleading inferences whenever the prior information is incorrect. Second, we consider different approaches that control size irrespective of the validity of the prior information. We discuss a Bonferroni-style correction and a nearly optimal test targeting power to likely values of the interaction and compare them to a simple t-test based on the long model. While Bonferroni-style corrections do not lead to power improvements relative to the t-test based on the long model, the nearly optimal tests can lead to considerable power improvements relative to t-tests.
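
    A minimal sketch of why the no-interaction assumption matters, using a noise-free, balanced 2x2 design. This is an illustration only, not the paper's method or data. The function names, the coefficient values in the usage note, and the label "short model" (the specification that omits the interaction, by contrast with the long model mentioned in the abstract) are assumptions made here. In a balanced design, the short-model OLS coefficient on treatment 1 equals the difference in mean outcomes pooled over treatment 2, so the sketch computes that difference directly instead of running a regression.

```python
from statistics import mean


def simulate_cells(beta1, beta2, beta12):
    """Noise-free mean outcome in each cell (t1, t2) of a 2x2 factorial design.

    beta1 and beta2 are the effects of each treatment alone; beta12 is the
    interaction. All three are hypothetical inputs chosen by the caller.
    """
    return {
        (t1, t2): beta1 * t1 + beta2 * t2 + beta12 * t1 * t2
        for t1 in (0, 1)
        for t2 in (0, 1)
    }


def short_model_effect(cells):
    """Effect of treatment 1 when the interaction is ignored.

    Assumes equal-sized cells: the short-model coefficient is then the
    difference in outcomes pooled across both arms of treatment 2.
    """
    treated = mean(cells[(1, t2)] for t2 in (0, 1))
    control = mean(cells[(0, t2)] for t2 in (0, 1))
    return treated - control


def long_model_effect(cells):
    """Effect of treatment 1 among units that did not receive treatment 2."""
    return cells[(1, 0)] - cells[(0, 0)]
```

    With hypothetical values beta1 = 0.1, beta2 = 0.2, and beta12 = 0.3, the long-model effect recovers beta1, while the short-model effect equals beta1 + beta12 / 2, so it mixes half of the interaction into the estimate for treatment 1. With beta12 = 0, the two coincide.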

  • Improving The Effectiveness of Replication in Economics (Joint with Paul Gertler and Sebastian Galiani) [NBER Working Paper No. 23576]
    Conditionally accepted at Nature

  • Replication is a critical component of scientific credibility, as it increases our confidence in the reliability of the knowledge generated by original research. Yet replication is the exception rather than the rule in economics. In this paper, we examine why replication is so rare and propose changes to the incentives to replicate. Our study focuses on software code replication, which seeks to replicate the results in the original paper using the same data as the original study and verify that the analysis code is correct. We analyze the effectiveness of the current model for code replication in the context of three desirable characteristics: Lack of bias, fairness, and efficiency. We find substantial evidence of "overturn bias" that likely leads to many false positives in terms of "finding" or claiming mistakes in the original analysis. Overturn bias comes from the fact that replications that overturn original results are much easier to publish than those that confirm original results. In a survey of editors, almost all responded they would in principle publish a replication study that overturned the results of the original study, but only 29% responded that they would consider publishing a replication study that confirmed the original study results. We also find that most replication effort is devoted to so-called important papers and that the cost of replication is high, because posted data and software are very hard to use. We outline a new model in which journals conduct replication after acceptance and before publication, in order to solve the incentive problems raised in this paper.

Work in progress

Policy and Popular Writing

Non-Economics Publications