Local average treatment effect explained

In econometrics and related empirical fields, the local average treatment effect (LATE), also known as the complier average causal effect (CACE), is the effect of a treatment for subjects who comply with the experimental treatment assigned to their sample group. It is not to be confused with the average treatment effect (ATE), which includes compliers and non-compliers together. Compliance refers to the human-subject response to a proposed experimental treatment condition. The LATE is calculated in a similar way to the ATE, but excludes non-compliant subjects. If the goal is to evaluate the effect of a treatment in ideal, compliant subjects, the LATE is the more appropriate quantity to estimate. However, it may lack external validity because it ignores the non-compliance that is likely to occur in the real-world deployment of a treatment method. The LATE can be estimated as the ratio of the estimated intent-to-treat effect to the estimated proportion of compliers, or alternatively through an instrumental variable estimator.

The LATE was first introduced in the econometrics literature by Guido W. Imbens and Joshua D. Angrist in 1994, who shared one half of the 2021 Nobel Memorial Prize in Economic Sciences.[1] [2] As summarized by the Nobel Committee, the LATE framework "significantly altered how researchers approach empirical questions using data generated from either natural experiments or randomized experiments with incomplete compliance to the assigned treatment. At the core, the LATE interpretation clarifies what can and cannot be learned from such experiments."

The phenomenon of non-compliant subjects (patients) is also known in medical research.[3] In the biostatistics literature, Baker and Lindeman (1994) independently developed the LATE method for a binary outcome with the paired availability design and the key monotonicity assumption.[4] Baker, Kramer, Lindeman (2016) summarized the history of its development.[5] Various papers called both Imbens and Angrist (1994) and Baker and Lindeman (1994) seminal.[6] [7] [8] [9]

An early version of LATE involved one-sided noncompliance (and hence no monotonicity assumption). In 1983 Baker wrote a technical report describing LATE for one-sided noncompliance that was published in 2016 in a supplement. In 1984, Bloom published a paper on LATE with one-sided noncompliance.[10] For a history of multiple discoveries involving LATE see Baker and Lindeman (2024).[11]

General definition

The typical terminology of the Rubin causal model is used to measure the LATE, with units indexed $i=1,\ldots,N$ and a binary treatment indicator, $z_i$, for unit $i$. The term $Y_i(z_i)$ is used to denote the potential outcome of unit $i$ under treatment $z_i$.

In an ideal experiment, all subjects assigned to the treatment will comply with the treatment, while those that are assigned to control will remain untreated. In reality, however, the compliance rate is often imperfect, which prevents researchers from identifying the ATE. In such cases, estimating the LATE becomes the more feasible option. The LATE is the average treatment effect among a specific subset of the subjects, who in this case would be the compliers.

Potential outcome framework

The LATE is defined within the potential outcomes framework of causal inference. The treatment effect for subject $i$ is $Y_i(1)-Y_i(0)$. It is impossible to simultaneously observe $Y_i(1)$ and $Y_i(0)$ for the same subject. At any given time, a subject can be observed only in its treated state, $Y_i(1)$, or its untreated state, $Y_i(0)$.

Through random assignment, the expected untreated potential outcome of the control group is the same as that of the treatment group, and the expected treated potential outcome of the treatment group is the same as that of the control group. The random assignment assumption thus allows one to take the difference between the average outcome in the treatment group and the average outcome in the control group as the overall average treatment effect, such that:

$$ATE = E[Y_i(1)-Y_i(0)] = E[Y_i(1)] - E[Y_i(0)] = E[Y_i(1) \mid Z_i=1] - E[Y_i(0) \mid Z_i=0]$$
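The identity above can be checked by brute force on a small made-up population. The following is a minimal sketch, not part of the original derivation: the potential outcomes `y0` and `y1` are hypothetical values, full compliance is assumed, and `average_difference_in_means` is a helper name introduced here. It enumerates every way of assigning 4 of the 9 units to treatment and averages the difference-in-means estimate over those equally likely assignments.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical potential outcomes for 9 units (illustrative values only).
y0 = [4, 3, 1, 5, 4, 2, 6, 5, 2]    # untreated outcomes Y_i(0)
y1 = [7, 5, 5, 8, 10, 8, 10, 9, 5]  # treated outcomes Y_i(1)


def average_difference_in_means(y0, y1, n_treated):
    """Average the difference-in-means estimate over all complete randomizations."""
    n = len(y0)
    estimates = []
    for treated in combinations(range(n), n_treated):
        treated_set = set(treated)
        control = [i for i in range(n) if i not in treated_set]
        # Only Y_i(1) is seen for treated units and only Y_i(0) for control units.
        mean_treated = Fraction(sum(y1[i] for i in treated), n_treated)
        mean_control = Fraction(sum(y0[i] for i in control), len(control))
        estimates.append(mean_treated - mean_control)
    return sum(estimates, Fraction(0)) / len(estimates)


true_ate = Fraction(sum(a - b for a, b in zip(y1, y0)), len(y0))
expected_estimate = average_difference_in_means(y0, y1, n_treated=4)
print(true_ate, expected_estimate)
```

Any single assignment can produce an estimate that differs from the ATE; the equality holds on average over the random assignment.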

Non-compliance framework

Researchers frequently encounter non-compliance problems in their experiments, whereby subjects fail to comply with their experimental assignments. In an experiment with non-compliance, the subjects can be divided into four subgroups: compliers, always-takers, never-takers and defiers. The term $d_i(z)$ represents the treatment that subject $i$ actually takes when their treatment assignment is $z_i$.

Compliers are subjects who will take the treatment if and only if they were assigned to the treatment group, i.e., the subpopulation with $d_i(1)=1$ and $d_i(0)=0$.

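Non-compliers are composed of the three remaining subgroups:

Always-takers are subjects who take the treatment regardless of their assignment, i.e., the subpopulation with $d_i(z)=1$ for both $z=0$ and $z=1$.

Never-takers are subjects who do not take the treatment regardless of their assignment, i.e., the subpopulation with $d_i(z)=0$ for both $z=0$ and $z=1$.

Defiers are subjects who do the opposite of their treatment assignment, i.e., the subpopulation with $d_i(1)=0$ and $d_i(0)=1$.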
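The four subgroups can be written as a lookup on the pair of potential treatments. This is a small illustrative sketch; `compliance_type` is a name introduced here. In real data only one of $d_i(0)$ and $d_i(1)$ is observed for each subject, so the function applies to hypothetical schedules of potential treatments rather than to observed individuals.

```python
def compliance_type(d0, d1):
    """Classify a subject from its potential treatments d_i(0) and d_i(1)."""
    types = {
        (0, 1): "complier",      # treated only when assigned to treatment
        (1, 1): "always-taker",  # treated under either assignment
        (0, 0): "never-taker",   # untreated under either assignment
        (1, 0): "defier",        # does the opposite of the assignment
    }
    return types[(d0, d1)]


for d0, d1 in [(0, 1), (1, 1), (0, 0), (1, 0)]:
    print(d0, d1, compliance_type(d0, d1))
```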

Non-compliance can take two forms: one-sided and two-sided. In the case of one-sided non-compliance, a number of the subjects who were assigned to the treatment group remain untreated, while no subject assigned to the control group receives the treatment. Subjects are thus divided into compliers and never-takers, such that $d_i(0)=0$ for all $i$, while $d_i(1)=0$ or $1$. In the case of two-sided non-compliance, a number of the subjects assigned to the treatment group fail to receive the treatment, while a number of the subjects assigned to the control group receive the treatment. In this case, subjects are divided into the four subgroups, such that both $d_i(0)$ and $d_i(1)$ can be 0 or 1.

Given non-compliance, certain assumptions are required to estimate the LATE. Under one-sided non-compliance, non-interference and excludability are assumed. Under two-sided non-compliance, non-interference, excludability, and monotonicity are assumed.

Assumptions under one-sided non-compliance

The non-interference assumption, otherwise known as the Stable Unit Treatment Value Assumption (SUTVA), is composed of two parts.[12]

First, the treatment status, $d_i$, of subject $i$ depends only on the subject's own treatment assignment status, $z_i$. The treatment assignment status of other subjects will not affect the treatment status of subject $i$. Formally, if $z_i = z_i'$, then $d_i(\mathbf{z}) = d_i(\mathbf{z}')$, where $\mathbf{z}$ denotes the vector of treatment assignment status for all individuals.[13]

Second, subject $i$'s potential outcomes are affected by its own treatment assignment, and the treatment it receives as a consequence of that assignment. The treatment assignment and treatment status of other subjects will not affect subject $i$'s outcomes. Formally, if $z_i = z_i'$ and $d_i = d_i'$, then $Y_i(\mathbf{z},\mathbf{d}) = Y_i(\mathbf{z}',\mathbf{d}')$.

The excludability assumption requires that potential outcomes respond to treatment itself, $d_i$, not treatment assignment, $z_i$. Formally, $Y_i(z,d) = Y_i(d)$. So under this assumption, only $d$ matters.[14] The plausibility of the excludability assumption must also be assessed on a case-by-case basis.

Assumptions under two-sided non-compliance

In addition to the assumptions above, two-sided non-compliance requires the monotonicity assumption: for each subject $i$, $d_i(1) \geq d_i(0)$. This states that if a subject were moved from the control to the treatment group, $d_i$ would either remain unchanged or increase. The monotonicity assumption rules out defiers, since their potential treatment statuses are characterized by $d_i(1) < d_i(0)$.[1] Monotonicity cannot be tested, so like the non-interference and excludability assumptions, its validity must be determined on a case-by-case basis.

Identification

The $LATE = \frac{ITT}{ITT_D}$, whereby

$$ITT = E[Y_i(z=1)] - E[Y_i(z=0)]$$

$$ITT_D = E[d_i(z=1)] - E[d_i(z=0)]$$

The $ITT$ measures the average effect of experimental assignment on outcomes without accounting for the proportion of the group that was actually treated (i.e., an average of those assigned to treatment minus the average of those assigned to control). In experiments with full compliance, $ITT = ATE$.

The $ITT_D$ measures the proportion of subjects who are treated when they are assigned to the treatment group, minus the proportion who would have been treated even if they had been assigned to the control group, i.e., $ITT_D$ = the share of compliers.
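With observed data, the two quantities are replaced by sample differences in means between assignment groups. The sketch below assumes three equal-length lists (assignment `z`, treatment received `d`, outcome `y`); the data values and the function name `wald_late` are made up for illustration.

```python
def wald_late(z, d, y):
    """Return (ITT, ITT_D, ITT / ITT_D) estimated from observed data."""
    def group_mean(values, arm):
        selected = [v for v, zi in zip(values, z) if zi == arm]
        return sum(selected) / len(selected)

    itt = group_mean(y, 1) - group_mean(y, 0)    # effect of assignment on outcome
    itt_d = group_mean(d, 1) - group_mean(d, 0)  # effect of assignment on take-up
    if itt_d == 0:
        raise ValueError("assignment does not change take-up; the ratio is undefined")
    return itt, itt_d, itt / itt_d


# Hypothetical data: 4 units assigned to treatment, 4 assigned to control.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]
y = [7, 5, 8, 5, 4, 1, 5, 8]

itt, itt_d, late = wald_late(z, d, y)
print(itt, itt_d, late)
```

The ratio is interpretable as the LATE only under the assumptions listed earlier; the code itself cannot verify them.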

Proof

Under one-sided noncompliance, all subjects assigned to control group will not take the treatment, therefore:

$$E[d_i(z=0)] = 0,$$

so that

$$ITT_D = E[d_i(z=1)] = P[d_i(1)=1]$$

If all subjects were assigned to treatment, the expected potential outcomes would be a weighted average of the treated potential outcomes among compliers, and the untreated potential outcomes among never-takers, such that

\begin{align}
E[Y_i(z=1)] &= E[Y_i(z=1, d_i(1))] \\
&= E[Y_i(z=1,d=1) \mid d_i(1)=1] \cdot P[d_i(1)=1] \\
&\quad + E[Y_i(z=1,d=0) \mid d_i(1)=0] \cdot (1-P[d_i(1)=1])
\end{align}

If all subjects were assigned to control, however, the expected potential outcomes would be a weighted average of the untreated potential outcomes among compliers and never-takers, such that

\begin{align}
E[Y_i(z=0)] &= E[Y_i(z=0, d=0)] \\
&= E[Y_i(z=0,d=0) \mid d_i(1)=1] \cdot P[d_i(1)=1] \\
&\quad + E[Y_i(z=0,d=0) \mid d_i(1)=0] \cdot (1-P[d_i(1)=1])
\end{align}

Through substitution, the ITT is expressed as a weighted average of the ITT among the two subpopulations (compliers and never-takers), such that

\begin{align}
ITT &= E[Y_i(z=1)] - E[Y_i(z=0)] \\
&= E[Y_i(z=1,d=1) - Y_i(z=0,d=0) \mid d_i(1)=1] \cdot P[d_i(1)=1] \\
&\quad + E[Y_i(z=1,d=0) - Y_i(z=0,d=0) \mid d_i(1)=0] \cdot P[d_i(1)=0]
\end{align}

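Given the exclusion and monotonicity assumptions, the second term of this equation is zero: by excludability, $Y_i(z=1,d=0) = Y_i(z=0,d=0)$ for never-takers, since their outcomes depend only on the treatment received and not on the assignment.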

As such,

\begin{align}
\frac{ITT}{ITT_D} &= \frac{E[Y_i(z=1,d=1) - Y_i(z=0,d=0) \mid d_i(1)=1] \cdot P[d_i(1)=1]}{P[d_i(1)=1]} \\
&= E[Y_i(d=1) - Y_i(d=0) \mid d_i(1)=1] \\
&= LATE
\end{align}

Application: hypothetical schedule of the potential outcome under two-sided noncompliance

The table below lays out the hypothetical schedule of potential outcomes under two-sided noncompliance.

The ATE is calculated by the average of $Y_i(d=1)-Y_i(d=0)$.

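Hypothetical Schedule of Potential Outcomes under Two-sided Noncompliance

| Observation | $Y_i(0)$ | $Y_i(1)$ | $Y_i(1)-Y_i(0)$ | $d_i(z=0)$ | $d_i(z=1)$ | Type |
|---|---|---|---|---|---|---|
| 1 | 4 | 7 | 3 | 0 | 1 | Complier |
| 2 | 3 | 5 | 2 | 0 | 0 | Never-taker |
| 3 | 1 | 5 | 4 | 0 | 1 | Complier |
| 4 | 5 | 8 | 3 | 1 | 1 | Always-taker |
| 5 | 4 | 10 | 6 | 0 | 1 | Complier |
| 6 | 2 | 8 | 6 | 0 | 0 | Never-taker |
| 7 | 6 | 10 | 4 | 0 | 1 | Complier |
| 8 | 5 | 9 | 4 | 0 | 1 | Complier |
| 9 | 2 | 5 | 3 | 1 | 1 | Always-taker |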

$$ATE = \frac{3+2+4+3+6+6+4+4+3}{9} = \frac{35}{9} \approx 3.9$$

LATE is calculated by ATE among compliers, so

$$LATE = \frac{3+4+6+4+4}{5} = 4.2$$

ITT is calculated by the average of $Y_i(z=1)-Y_i(z=0)$, so

$$ITT = \frac{3+0+4+0+6+0+4+4+0}{9} = \frac{21}{9} \approx 2.3$$

$ITT_D$ is the share of compliers:

$$ITT_D = \frac{5}{9}$$

$$\frac{ITT}{ITT_D} = \frac{21/9}{5/9} = \frac{21}{5} = 4.2 = LATE$$
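The arithmetic above can be reproduced directly from the schedule. The following sketch copies the nine rows of the table into a list of tuples and uses exact fractions; the variable and function names are introduced here for illustration.

```python
from fractions import Fraction

# (Y_i(0), Y_i(1), d_i(z=0), d_i(z=1)) for observations 1 to 9, copied from the table.
schedule = [
    (4, 7, 0, 1),
    (3, 5, 0, 0),
    (1, 5, 0, 1),
    (5, 8, 1, 1),
    (4, 10, 0, 1),
    (2, 8, 0, 0),
    (6, 10, 0, 1),
    (5, 9, 0, 1),
    (2, 5, 1, 1),
]
n = len(schedule)


def outcome(y0, y1, d):
    """Outcome realized when the treatment actually received is d."""
    return y1 if d == 1 else y0


# ATE: average of Y_i(1) - Y_i(0) over everyone.
ate = Fraction(sum(y1 - y0 for y0, y1, _, _ in schedule), n)

# LATE: the same average restricted to compliers (d_i(1) > d_i(0)).
compliers = [(y0, y1) for y0, y1, d0, d1 in schedule if d1 > d0]
late = Fraction(sum(y1 - y0 for y0, y1 in compliers), len(compliers))

# ITT: average of Y_i(z=1) - Y_i(z=0), which is zero for never- and always-takers.
itt = Fraction(
    sum(outcome(y0, y1, d1) - outcome(y0, y1, d0) for y0, y1, d0, d1 in schedule), n
)

# ITT_D: average of d_i(z=1) - d_i(z=0), the share of compliers.
itt_d = Fraction(sum(d1 - d0 for _, _, d0, d1 in schedule), n)

print(ate, late, itt, itt_d, itt / itt_d)
```

Because the schedule lists both potential outcomes for every unit, this is a check on the identification result rather than an estimator; with real data only one potential outcome per unit is available.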

Others: LATE in instrumental variable framework

LATE can be thought of through an IV framework.[15] Treatment assignment $z_i$ is the instrument that drives the causal effect on outcome $Y_i$ through the variable of interest $d_i$, such that $z_i$ only influences $Y_i$ through the endogenous variable $d_i$, and through no other path. This would produce the treatment effect for compliers.

In addition to the potential outcomes framework mentioned above, LATE can also be estimated through the Structural Equation Modeling (SEM) framework, originally developed for econometric applications.

SEM is derived through the following equations:

$$D_i = \alpha_0 + \alpha_1 Z_i + \xi_{1i}$$

$$Y_i = \beta_0 + \beta_1 Z_i + \xi_{2i}$$

The first equation captures the first-stage effect of $z_i$ on $d_i$, adjusting for variance, where

$$\alpha_1 = \mathrm{Cov}(D,Z)/\mathrm{Var}(Z)$$

In the second equation, $\beta_1$ captures the reduced-form effect of $z_i$ on $Y_i$,

$$\beta_1 = \mathrm{Cov}(Y,Z)/\mathrm{Var}(Z)$$

The covariate adjusted IV estimator is the ratio

$$\tau_{LATE} = \frac{\beta_1}{\alpha_1} = \frac{\mathrm{Cov}(Y,Z)/\mathrm{Var}(Z)}{\mathrm{Cov}(D,Z)/\mathrm{Var}(Z)} = \frac{\mathrm{Cov}(Y,Z)}{\mathrm{Cov}(D,Z)}$$
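The covariance form of the ratio can be computed in a few lines. This is an illustrative sketch with made-up data and a hand-written `covariance` helper (population covariance, dividing by the number of observations); it is not tied to any particular statistics package.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (v - mean_b) for x, v in zip(a, b)) / len(a)


# Hypothetical data: assignment z, treatment received d, outcome y.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]
y = [7, 5, 8, 5, 4, 1, 5, 8]

alpha_1 = covariance(d, z) / covariance(z, z)  # first-stage coefficient
beta_1 = covariance(y, z) / covariance(z, z)   # reduced-form coefficient
tau_late = beta_1 / alpha_1                    # equals Cov(Y,Z) / Cov(D,Z)

print(alpha_1, beta_1, tau_late)
```

With a binary instrument, `alpha_1` and `beta_1` coincide with the differences in group means of `d` and `y` between the two assignment arms, so the ratio matches $ITT/ITT_D$.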

Similar to the nonzero compliance assumption, the coefficient $\alpha_1$ in the first-stage regression needs to be significant to make $z$ a valid instrument.

However, because of SEM’s strict assumption of constant effect on every individual, the potential outcomes framework is in more prevalent use today.

Generalizing LATE

The primary goal of running an experiment is to obtain causal leverage, and it does so by randomly assigning subjects to experimental conditions, which sets it apart from observational studies. In an experiment with perfect compliance, the average treatment effect can be obtained. However, many experiments are likely to experience either one-sided or two-sided non-compliance. In the presence of non-compliance, the ATE can no longer be recovered. Instead, what is recovered is the average treatment effect for a certain subpopulation known as the compliers, which is the LATE.

When there may exist heterogeneous treatment effects across groups, the LATE is unlikely to be equivalent to the ATE. In one example, Angrist (1989)[16] attempts to estimate the causal effect of serving in the military on earnings, using the draft lottery as an instrument. The compliers are those who were induced by the draft lottery to serve in the military. If the research interest is on how to compensate those involuntarily taxed by the draft, LATE would be useful, since the research targets compliers. However, if researchers are concerned about a more universal draft for future interpretation, then the ATE would be more important (Imbens 2009).

Generalizing from the LATE to the ATE thus becomes an important issue when the research interest lies with the causal treatment effect on a broader population, not just the compliers. In these cases, the LATE may not be the parameter of interest, and researchers have questioned its utility.[17] [18] Other researchers, however, have countered this criticism by proposing new methods to generalize from the LATE to the ATE.[19] [20] [21] Most of these involve some form of reweighting from the LATE, under certain key assumptions that allow for extrapolation from the compliers.

Reweighting

The intuition behind reweighting comes from the notion that, within a given stratum, the distribution among the compliers may not reflect the distribution of the broader population. Thus, to retrieve the ATE, it is necessary to reweight based on the information gleaned from compliers. There are a number of ways that reweighting can be used to obtain the ATE from the LATE.

Reweighting by ignorability assumption

By leveraging instrumental variables, Aronow and Carnegie (2013) propose a new reweighting method called Inverse Compliance Score weighting (ICSW), with an intuition similar to that behind inverse probability weighting (IPW). This method assumes that compliance propensity is a pre-treatment covariate and that compliers have the same average treatment effect as other subjects within their strata. ICSW first estimates the conditional probability of being a complier (the compliance score) for each subject by maximum likelihood, given the covariates, and then reweights each unit by the inverse of its compliance score, so that the compliers have a covariate distribution that matches the full population. ICSW is applicable under both one-sided and two-sided noncompliance.

Although one's compliance score cannot be directly observed, the probability of compliance can be estimated by observing the compliance behavior of subjects in the same stratum, in other words those that share the same covariate profile. The compliance score is treated as a latent pretreatment covariate, which is independent of treatment assignment $Z$. For each unit $i$, the compliance score is denoted as $P_{Ci} = \Pr(D_1 > D_0 \mid X = x_i)$, where $x_i$ is the covariate vector for unit $i$.

In the one-sided noncompliance case, the population consists of only compliers and never-takers. All units assigned to the treatment group that take the treatment will be compliers. Thus, a simple bivariate regression of D on X among the units assigned to treatment can predict the probability of compliance.

In the two-sided noncompliance case, the compliance score is estimated using maximum likelihood estimation.

Assuming a probit model for compliance and a Bernoulli distribution for D, the estimated compliance score is

$$\hat{P}_{Ci} = \hat{\Pr}(D_1 > D_0 \mid X = x_i) = F(\hat{\theta}_{A,C,x_i})\,(1 - F(\hat{\theta}_{A|A,C,x_i}))$$

where $\theta$ is a vector of coefficients to be estimated and $F(\cdot)$ is the cumulative distribution function for a probit model.

By the LATE theorem,[1] the average treatment effect for compliers can be estimated with the equation:

$$\tau_{LATE} = \frac{\sum_{i=1}^{n} Z_i Y_i / \sum_{i=1}^{n} Z_i - \sum_{i=1}^{n} (1-Z_i) Y_i / \sum_{i=1}^{n} (1-Z_i)}{\sum_{i=1}^{n} Z_i D_i / \sum_{i=1}^{n} Z_i - \sum_{i=1}^{n} (1-Z_i) D_i / \sum_{i=1}^{n} (1-Z_i)}$$

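Define $\hat{w}_{Ci} = 1/\hat{P}_{Ci}$, the inverse of the estimated compliance score. The ICSW estimator is the same expression with each unit weighted by $\hat{w}_{Ci}$:

$$\tau_{ATE} = \frac{\sum_{i=1}^{n} \hat{w}_{Ci} Z_i Y_i / \sum_{i=1}^{n} \hat{w}_{Ci} Z_i - \sum_{i=1}^{n} \hat{w}_{Ci} (1-Z_i) Y_i / \sum_{i=1}^{n} \hat{w}_{Ci} (1-Z_i)}{\sum_{i=1}^{n} \hat{w}_{Ci} Z_i D_i / \sum_{i=1}^{n} \hat{w}_{Ci} Z_i - \sum_{i=1}^{n} \hat{w}_{Ci} (1-Z_i) D_i / \sum_{i=1}^{n} \hat{w}_{Ci} (1-Z_i)}$$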

This estimator is equivalent to using the 2SLS estimator with weights $\hat{w}_{Ci}$.
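The weighted ratio can be sketched as follows. This is an illustration of the formula only, not Aronow and Carnegie's implementation: the compliance scores are taken as given (in practice they would have to be estimated), and the data, the scores, and the name `icsw_estimate` are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def icsw_estimate(z, d, y, compliance_scores):
    """Weighted Wald ratio with weights equal to inverse compliance scores."""
    weights = [1.0 / p for p in compliance_scores]

    def arm_mean(values, arm):
        # Weighted mean of `values` among units assigned to `arm`.
        vals = [v for v, zi in zip(values, z) if zi == arm]
        wts = [w for w, zi in zip(weights, z) if zi == arm]
        return weighted_mean(vals, wts)

    numerator = arm_mean(y, 1) - arm_mean(y, 0)
    denominator = arm_mean(d, 1) - arm_mean(d, 0)
    return numerator / denominator


# Hypothetical data and hypothetical (already estimated) compliance scores.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]
y = [7, 5, 8, 5, 4, 1, 5, 8]
scores = [0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

unweighted = icsw_estimate(z, d, y, [1.0] * len(z))  # reduces to the plain ratio
weighted = icsw_estimate(z, d, y, scores)
print(unweighted, weighted)
```

With equal scores the function returns the unweighted ratio from the LATE formula; unequal scores shift the estimate toward units that are under-represented among compliers.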

An essential assumption of ICSW is treatment homogeneity within strata, which means the treatment effect should on average be the same for everyone in a stratum, not just for the compliers. If this assumption holds, the LATE is equal to the ATE within each covariate profile. Formally:

$$\text{for all } x \in \mathrm{Supp}(X),\quad E[Y_1 - Y_0 \mid D_1 > D_0, X = x] = E[Y_1 - Y_0 \mid X = x]$$

Notice this is a less restrictive assumption than the traditional Ignorability assumption, as this only concerns the covariate sets that are relevant to compliance score, which further leads to heterogeneity, without considering all sets of covariates.

The second assumption is consistency of $\hat{P}_{Ci}$ for $P_{Ci}$, and the third assumption is nonzero compliance within each stratum, which extends the IV assumption of nonzero compliance over the population. This is a reasonable requirement: if the compliance score were zero for a certain stratum, its inverse would be infinite.

The ICSW estimator is more sensitive than the IV estimator: because it incorporates more covariate information, it may have higher variance. This is a general problem for IPW-style estimation. The problem is exacerbated when there is only a small population in certain strata and the compliance rate is low. One way to mitigate it is to winsorize the estimated compliance scores; Aronow and Carnegie set the threshold at 0.275, so that any compliance score lower than 0.275 is replaced by this value. Bootstrapping the entire process is also recommended to account for estimation uncertainty (Abadie 2002).[22]
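The winsorizing step amounts to flooring the estimated scores before inverting them. A minimal sketch, with made-up scores and a function name introduced here; the default floor is the 0.275 value quoted above.

```python
def winsorize_scores(scores, floor=0.275):
    """Replace every compliance score below `floor` with `floor`."""
    return [max(p, floor) for p in scores]


# Hypothetical estimated compliance scores, one of them very small.
scores = [0.05, 0.275, 0.60]
floored = winsorize_scores(scores)
weights = [1.0 / p for p in floored]  # inverse compliance score weights

print(floored)
print(weights)
```

Without the floor, the first unit would receive a weight of 20 and dominate the estimate; after flooring, no weight exceeds 1/0.275.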

Reweighting under monotonicity assumption

In another approach, one might assume that an underlying utility model links the never-takers, compliers, and always-takers. The ATE can be estimated by reweighting based on an extrapolation of the complier treated and untreated potential outcomes to the never-takers and always-takers. The following method is one that has been proposed by Amanda Kowalski.

First, all subjects are assumed to have a utility function, determined by their individual gains from treatment and costs from treatment. Based on an underlying assumption of monotonicity, the never-takers, compliers, and always-takers can be arranged on the same continuum based on their utility function. This assumes that the always-takers have such a high utility from taking the treatment that they will take it even without encouragement. On the other hand, the never-takers have such a low utility function that they will not take the treatment despite encouragement. Thus, the never-takers can be aligned with the compliers with the lowest utilities, and the always-takers with the compliers with the highest utility functions.

In an experimental population, several aspects can be observed: the treated potential outcomes of the always-takers (those who are treated in the control group); the untreated potential outcomes of the never-takers (those who remain untreated in the treatment group); the treated potential outcomes of the always-takers and compliers (those who are treated in the treatment group); and the untreated potential outcomes of the compliers and never-takers (those who are untreated in the control group). However, the treated and untreated potential outcomes of the compliers should be extracted from the latter two observations. To do so, the LATE must be extracted from the treated population.

Assuming no defiers, it can be assumed that the treated group in the treatment condition consists of both always-takers and compliers. From the observations of the treated outcomes in the control group, the average treated outcome for always-takers can be extracted, as well as their share of the overall population. As such, the weighted average can be undone and the treated potential outcome for the compliers can be obtained; then, the LATE is subtracted to get the untreated potential outcomes for the compliers. This move will then allow extrapolation from the compliers to obtain the ATE.
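The step of undoing the weighted average can be made concrete. The sketch below is an illustration of that arithmetic only, not Kowalski's implementation: the function name is introduced here, and the inputs are taken from the hypothetical schedule in the application section, treated as the full population.

```python
from fractions import Fraction


def complier_treated_mean(share_treated_z1, mean_treated_z1, share_always, mean_always):
    """Back out the compliers' mean treated outcome from a mixture with always-takers.

    Units treated under assignment to treatment are always-takers plus compliers,
    so their mean outcome is a share-weighted average of the two groups.
    """
    share_compliers = share_treated_z1 - share_always
    mixture_total = share_treated_z1 * mean_treated_z1
    return (mixture_total - share_always * mean_always) / share_compliers


# From the hypothetical schedule: 7 of 9 units are treated when assigned to
# treatment (mean treated outcome 54/7); 2 of 9 are always-takers (mean 13/2).
treated_mean = complier_treated_mean(
    share_treated_z1=Fraction(7, 9),
    mean_treated_z1=Fraction(54, 7),
    share_always=Fraction(2, 9),
    mean_always=Fraction(13, 2),
)

late = Fraction(21, 5)                # LATE from the same schedule
untreated_mean = treated_mean - late  # compliers' mean untreated outcome

print(treated_mean, untreated_mean)
```

In the schedule, the compliers' treated outcomes average 8.2 and their untreated outcomes average 4, which is what the calculation recovers.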

Returning to the weak monotonicity assumption, which assumes that the utility function always runs in one direction, the utility of a marginal complier would be similar to the utility of a never-taker on one end, and that of an always-taker on the other end. The always-takers will have the same untreated potential outcomes as the compliers, which is its maximum untreated potential outcome. Again, this is based on the underlying utility model linking the subgroups, which assumes that the utility function of an always-taker would not be lower than the utility function of a complier. The same logic would apply to the never-takers, who are assumed to have a utility function that will always be lower than that of a complier.

Given this, extrapolation is possible by projecting the untreated potential outcomes of the compliers to the always-takers, and the treated potential outcomes of the compliers to the never-takers. In other words, if it is assumed that the untreated compliers are informative about always-takers, and the treated compliers are informative about never-takers, then comparison is now possible among the treated always-takers to their “as-if” untreated always-takers, and the untreated never-takers can be compared to their “as-if” treated counterparts. This will then allow the calculation of the overall treatment effect. Extrapolation under the weak monotonicity assumption will provide a bound, rather than a point-estimate.

Limitations

Extrapolating from the LATE to the ATE requires certain key assumptions, which may vary from one approach to another. While some may assume homogeneity within covariates, and thus extrapolate based on strata, others may instead assume monotonicity. All will assume the absence of defiers within the experimental population. Some of these assumptions may be weaker than others; for example, the monotonicity assumption is weaker than the ignorability assumption. However, there are other trade-offs to consider, such as whether the estimates produced are point estimates or bounds. Ultimately, the literature on generalizing the LATE relies entirely on key assumptions. It is not a design-based approach per se, and the field of experiments is not usually in the habit of comparing groups unless they are randomly assigned. Even when assumptions are difficult to verify, researchers can build checks into the design of the experiment. For example, in a typical field experiment where the instrument is “encouragement to treatment”, treatment heterogeneity could be detected by varying the intensity of encouragement. If the compliance rate remains stable under different intensities, it could be a signal of homogeneity across groups.


Notes and References

  1. Imbens, Guido W.; Angrist, Joshua D. (March 1994). "Identification and Estimation of Local Average Treatment Effects". Econometrica. 62 (2): 467. doi:10.2307/2951620.
  2. The Committee for the Prize in Economic Sciences in Memory of Alfred Nobel (2021-10-11). "Answering causal questions using observational data". Scientific Background on the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2021.
  3. Moerbeek, M.; Schie, S. van (2019). "What are the statistical implications of treatment non-compliance in cluster randomized trials: A simulation study". Statistics in Medicine. 38 (26): 5071–5084. https://doi.org/10.1002/sim.8351
  4. Baker, Stuart G.; Lindeman, Karen S. (1994-11-15). "The paired availability design: A proposal for evaluating epidural analgesia during labor". Statistics in Medicine. 13 (21): 2269–2278. doi:10.1002/sim.4780132108.
  5. Baker, Stuart G.; Kramer, Barnett S.; Lindeman, Karen S. (2018-10-30). "Latent class instrumental variables: A clinical and biostatistical perspective". Statistics in Medicine. 38 (5): 901. doi:10.1002/sim.6612.
  6. Swanson, Sonja A.; Hernán, Miguel A.; Miller, Matthew; Robins, James M.; Richardson, Thomas S. (2018-04-03). "Partial Identification of the Average Treatment Effect Using Instrumental Variables: Review of Methods for Binary Instruments, Treatments, and Outcomes". Journal of the American Statistical Association. 113 (522): 933–947. doi:10.1080/01621459.2018.1434530.
  7. Lee, Kwonsang; Lorch, Scott A.; Small, Dylan S. (2019-02-20). "Sensitivity analyses for average treatment effects when outcome is censored by death in instrumental variable models". Statistics in Medicine. 38 (13): 2303–2316. doi:10.1002/sim.8117. arXiv:1802.06711.
  8. Sheng, E. (2019). "Estimating causal effects of treatment in RCTs with provider and subject noncompliance". Statistics in Medicine. 38 (5): 738–750. doi:10.1002/sim.8012.
  9. Wang, L. (2016). "Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variable". arXiv:1611.09925v4.
  10. Bloom, Howard S. (April 1984). "Accounting for No-Shows in Experimental Evaluation Designs". Evaluation Review. 8 (2): 225–246. doi:10.1177/0193841X8400800205.
  11. Baker, Stuart G.; Lindeman, Karen S. (2024-04-02). "Multiple Discoveries in Causal Inference: LATE for the Party". CHANCE. 37 (2): 21–25. doi:10.1080/09332480.2024.2348956.
  12. Rubin, Donald B. (January 1978). "Bayesian Inference for Causal Effects: The Role of Randomization". The Annals of Statistics. 6 (1): 34–58. doi:10.1214/aos/1176344064.
  13. Angrist, Joshua D.; Imbens, Guido W.; Rubin, Donald B. (June 1996). "Identification of Causal Effects Using Instrumental Variables". Journal of the American Statistical Association. 91 (434): 444–455. doi:10.1080/01621459.1996.10476902.
  14. Imbens, G. W.; Rubin, D. B. (1997-10-01). "Estimating Outcome Distributions for Compliers in Instrumental Variables Models". The Review of Economic Studies. 64 (4): 555–574. doi:10.2307/2971731.
  15. Hanck, Christoph (2009-10-24). "Joshua D. Angrist and Jörn-Steffen Pischke (2009): Mostly Harmless Econometrics: An Empiricist's Companion". Statistical Papers. 52 (2): 503–504. doi:10.1007/s00362-009-0284-y.
  16. Angrist, Joshua (September 1990). "The Draft Lottery and Voluntary Enlistment in the Vietnam Era". Cambridge, MA. doi:10.3386/w3514.
  17. Deaton, Angus (January 2009). "Instruments of development: Randomization in the tropics, and the search for the elusive keys to economic development". Cambridge, MA. doi:10.3386/w14690.
  18. Heckman, James J.; Urzúa, Sergio (May 2010). "Comparing IV with structural models: What simple IV can and cannot identify". Journal of Econometrics. 156 (1): 27–37. doi:10.1016/j.jeconom.2009.09.006.
  19. Aronow, Peter M.; Carnegie, Allison (2013). "Beyond LATE: Estimation of the Average Treatment Effect with an Instrumental Variable". Political Analysis. 21 (4): 492–506. doi:10.1093/pan/mpt013.
  20. Imbens, Guido W. (June 2010). "Better LATE Than Nothing: Some Comments on Deaton (2009) and Heckman and Urzua (2009)". Journal of Economic Literature. 48 (2): 399–423. doi:10.1257/jel.48.2.399.
  21. Kowalski, Amanda (2016). "Doing More When You're Running LATE: Applying Marginal Treatment Effect Methods to Examine Treatment Effect Heterogeneity in Experiments". NBER Working Paper No. 22363. doi:10.3386/w22363.
  22. Abadie, Alberto (March 2002). "Bootstrap Tests for Distributional Treatment Effects in Instrumental Variable Models". Journal of the American Statistical Association. 97 (457): 284–292. doi:10.1198/016214502753479419.