Difference in differences (DID[1] or DD[2]) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment.[3] It calculates the effect of a treatment (i.e., an explanatory variable or an independent variable) on an outcome (i.e., a response variable or dependent variable) by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group. Although it is intended to mitigate the effects of extraneous factors and selection bias, depending on how the treatment group is chosen, this method may still be subject to certain biases (e.g., mean regression, reverse causality and omitted variable bias).
In contrast to a time-series estimate of the treatment effect on subjects (which analyzes differences over time) or a cross-section estimate of the treatment effect (which measures the difference between treatment and control groups), difference in differences uses panel data to measure the differences, between the treatment and control group, of the changes in the outcome variable that occur over time.
Difference in differences requires data measured from a treatment group and a control group at two or more different time periods, specifically at least one time period before "treatment" and at least one time period after "treatment." In the example pictured, the outcome in the treatment group is represented by the line P and the outcome in the control group is represented by the line S. The outcome (dependent) variable in both groups is measured at time 1, before either group has received the treatment (i.e., the independent or explanatory variable), represented by the points P1 and S1. The treatment group then receives or experiences the treatment and both groups are again measured at time 2. Not all of the difference between the treatment and control groups at time 2 (that is, the difference between P2 and S2) can be explained as being an effect of the treatment, because the treatment group and control group did not start out at the same point at time 1. DID, therefore, calculates the "normal" difference in the outcome variable between the two groups (the difference that would still exist if neither group experienced the treatment), represented by the dotted line Q. (Notice that the slope from P1 to Q is the same as the slope from S1 to S2.) The treatment effect is the difference between the observed outcome (P2) and the "normal" outcome (the difference between P2 and Q).
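The geometric construction above can be sketched numerically. The four outcome values below are hypothetical and chosen only for illustration; the names `p1`, `p2`, `s1`, `s2` and `q` mirror the points P1, P2, S1, S2 and Q in the description.

```python
# Hypothetical outcome values for the four observed points (assumed, not from any study).
p1, p2 = 10.0, 18.0   # treatment group (line P) at time 1 and time 2
s1, s2 = 6.0, 9.0     # control group (line S) at time 1 and time 2

# Counterfactual point Q: where the treatment group would be at time 2
# if it had changed by the same amount as the control group
# (the slope from P1 to Q equals the slope from S1 to S2).
q = p1 + (s2 - s1)

# Treatment effect: observed outcome P2 minus the counterfactual outcome Q.
effect = p2 - q

# The same number written as a difference of two differences.
did = (p2 - p1) - (s2 - s1)

print(q, effect, did)
```

Both expressions give the same value because subtracting Q from P2 removes the control group's change from the treatment group's change.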
Consider the model
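\[y_{it} = \gamma_{s(i)} + \lambda_t + \delta I(\dots) + \varepsilon_{it}\]
where \(y_{it}\) is the dependent variable for individual \(i\) at time \(t\), \(s(i)\) is the group to which \(i\) belongs (the treatment or the control group), and \(I(\dots)\) is shorthand for the dummy variable equal to 1 when the event described in \((\dots)\) is true and 0 otherwise. In a plot of \(Y\) against time by group, \(\gamma_s\) is the vertical intercept of the line for group \(s\), and \(\lambda_t\) is the time trend shared by both groups under the parallel trend assumption. \(\delta\) is the treatment effect and \(\varepsilon_{it}\) is the residual term.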
Consider the average of the dependent variable and dummy indicators by group and time:
\begin{align}
n_s &= \text{number of individuals in group } s \\
\overline{y}_{st} &= \frac{1}{n_s} \sum_{i=1}^n y_{it}\, I(s(i) = s), \\
\overline{\gamma}_s &= \frac{1}{n_s} \sum_{i=1}^n \gamma_{s(i)}\, I(s(i) = s) = \gamma_s, \\
\overline{\lambda}_{st} &= \frac{1}{n_s} \sum_{i=1}^n \lambda_t\, I(s(i) = s) = \lambda_t, \\
D_{st} &= \frac{1}{n_s} \sum_{i=1}^n I(s(i) = \text{treatment},\ t \text{ in after period})\, I(s(i) = s) = I(s = \text{treatment},\ t \text{ in after period}), \\
\overline{\varepsilon}_{st} &= \frac{1}{n_s} \sum_{i=1}^n \varepsilon_{it}\, I(s(i) = s),
\end{align}
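and suppose for simplicity that \(s = 1, 2\) and \(t = 1, 2\). Note that \(D_{st}\) is not random; it only encodes how the groups and the periods are labeled. Then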
\begin{align}
&(\overline{y}_{11}-\overline{y}_{12})-(\overline{y}_{21}-\overline{y}_{22})\\[6pt]
={}&[(\gamma_1+\lambda_1+\delta D_{11}+\overline{\varepsilon}_{11})-(\gamma_1+\lambda_2+\delta D_{12}+\overline{\varepsilon}_{12})]\\
&{}-[(\gamma_2+\lambda_1+\delta D_{21}+\overline{\varepsilon}_{21})-(\gamma_2+\lambda_2+\delta D_{22}+\overline{\varepsilon}_{22})]\\[6pt]
={}&\delta(D_{11}-D_{12})+\delta(D_{22}-D_{21})+\overline{\varepsilon}_{11}-\overline{\varepsilon}_{12}+\overline{\varepsilon}_{22}-\overline{\varepsilon}_{21}.
\end{align}
The strict exogeneity assumption then implies that
\[\operatorname{E}\left[(\overline{y}_{11}-\overline{y}_{12})-(\overline{y}_{21}-\overline{y}_{22})\right] = \delta(D_{11}-D_{12})+\delta(D_{22}-D_{21}).\]
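Without loss of generality, assume that \(s=2\) is the treatment group and \(t=2\) is the after period. Then \(D_{22}=1\) and \(D_{11}=D_{12}=D_{21}=0\), giving the DID estimator
\[\hat{\delta} = (\overline{y}_{11}-\overline{y}_{12})-(\overline{y}_{21}-\overline{y}_{22}),\]
which can be interpreted as the treatment effect of the treatment indicated by \(D_{st}\). The model as written is over-parametrized; one remedy is to set one of the dummy-variable coefficients to zero, for example \(\gamma_1=0\).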
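As a minimal sketch of the estimator above, the following computes the group-by-period averages and the DID estimate from individual-level records. The data are made up for illustration, and the names `observations`, `group_mean` and `ybar` are hypothetical; group 2 in period 2 is assumed to be the treated cell, matching the labeling used above.

```python
from statistics import mean

# Hypothetical records: (group s, period t, outcome y).
# Group 2 is assumed to be treated in period 2.
observations = [
    (1, 1, 4.0), (1, 1, 6.0), (1, 2, 7.0), (1, 2, 9.0),
    (2, 1, 9.0), (2, 1, 11.0), (2, 2, 16.0), (2, 2, 18.0),
]

def group_mean(s, t):
    """Average outcome for group s in period t."""
    return mean(y for (g, p, y) in observations if g == s and p == t)

# ybar[s, t] holds the average for group s in period t.
ybar = {(s, t): group_mean(s, t) for s in (1, 2) for t in (1, 2)}

# DID estimator: (ybar_11 - ybar_12) - (ybar_21 - ybar_22).
delta_hat = (ybar[1, 1] - ybar[1, 2]) - (ybar[2, 1] - ybar[2, 2])

print(ybar)
print(delta_hat)
```

In this toy data the control group rises by 3 and the treated group by 7, so the estimate attributes the remaining 4 to the treatment.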
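All the assumptions of the OLS model apply equally to DID. In addition, DID requires a parallel trend assumption. The parallel trend assumption says that \(\lambda_2-\lambda_1\) is the same in both \(s=1\) and \(s=2\). If the formal definition above accurately represents reality, this assumption holds automatically. However, a model with group-specific time effects \(\lambda_{st}\) such that \(\lambda_{22}-\lambda_{21} \neq \lambda_{12}-\lambda_{11}\) may well be more realistic.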
As illustrated to the right, the treatment effect is the difference between the observed value of y and what the value of y would have been with parallel trends, had there been no treatment. The Achilles' heel of DID is when something other than the treatment changes in one group but not the other at the same time as the treatment, implying a violation of the parallel trend assumption.
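The consequence of a violated parallel trend assumption can be illustrated with a small noise-free sketch. All numbers are assumptions chosen for illustration, and `group_means` and `did` are hypothetical helper names: the treated group receives a true effect of 2, and in the second scenario it also experiences an unrelated shock of 1.5 at the same time as the treatment.

```python
def did(y11, y12, y21, y22):
    """DID estimator from the four group-by-period averages."""
    return (y11 - y12) - (y21 - y22)

def group_means(true_effect, common_trend, extra_trend_treated):
    """Noise-free averages for control (s=1) and treated (s=2) groups.

    extra_trend_treated is a change that hits only the treated group
    and is unrelated to the treatment (a parallel trend violation).
    """
    y11 = 5.0
    y12 = y11 + common_trend
    y21 = 10.0
    y22 = y21 + common_trend + extra_trend_treated + true_effect
    return y11, y12, y21, y22

# Parallel trends hold: the estimator recovers the true effect.
parallel = did(*group_means(2.0, 3.0, 0.0))

# Parallel trends violated: the unrelated shock is absorbed into the estimate.
violated = did(*group_means(2.0, 3.0, 1.5))

print(parallel, violated, violated - parallel)
```

The estimator cannot separate the treatment from any other group-specific change that coincides with it, so the bias equals the size of the unrelated shock.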
To guarantee the accuracy of the DID estimate, the composition of individuals of the two groups is assumed to remain unchanged over time. When using a DID model, various issues that may compromise the results, such as autocorrelation[5] and Ashenfelter dips, must be considered and dealt with.
The DID method can be implemented according to the table below, where the lower right cell is the DID estimator.
\(y_{st}\) | \(s=2\) | \(s=1\) | Difference
---|---|---|---
\(t=2\) | \(y_{22}\) | \(y_{12}\) | \(y_{12}-y_{22}\)
\(t=1\) | \(y_{21}\) | \(y_{11}\) | \(y_{11}-y_{21}\)
Change | \(y_{21}-y_{22}\) | \(y_{11}-y_{12}\) | \((y_{11}-y_{21})-(y_{12}-y_{22})\)
Running a regression analysis gives the same result. Consider the OLS model
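\[y = \beta_0 + \beta_1 T + \beta_2 S + \beta_3 (T \cdot S) + \varepsilon\]
where \(T\) is a dummy variable for the period, equal to \(1\) when \(t=2\), and \(S\) is a dummy variable for group membership, equal to \(1\) when \(s=2\). The composite variable \((T \cdot S)\) is a dummy variable indicating when \(S=T=1\). The parameter estimates relate to the group and period averages as follows: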
\begin{align}
\hat{\beta}_0 &= \widehat{E}(y \mid T=0,~S=0)\\[8pt]
\hat{\beta}_1 &= \widehat{E}(y \mid T=1,~S=0)-\widehat{E}(y \mid T=0,~S=0)\\[8pt]
\hat{\beta}_2 &= \widehat{E}(y \mid T=0,~S=1)-\widehat{E}(y \mid T=0,~S=0)\\[8pt]
\hat{\beta}_3 &= [\widehat{E}(y \mid T=1,~S=1)-\widehat{E}(y \mid T=0,~S=1)]\\
&\quad {}-[\widehat{E}(y \mid T=1,~S=0)-\widehat{E}(y \mid T=0,~S=0)],
\end{align}
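where \(\widehat{E}(\dots \mid \dots)\) stands for conditional averages computed on the sample; for example, \(T=1\) is the indicator for the after period and \(S=0\) is the indicator for the control group. Accordingly, \(\hat{\beta}_1\) is the average change over time in the control group, which serves as a proxy for the counterfactual change in the treatment group; it should not be read as an effect of belonging to the control group. Similarly, \(\hat{\beta}_2\) is the pre-treatment difference between the treatment and control groups, which under the parallel trend assumption is also the difference that would have been observed at \(T=1\) without treatment. When every unit is observed in both periods, regressing the first difference of the outcome \((\Delta Y_i = Y_{i,1} - Y_{i,0})\) on \(S\) differences out the group term, leaving \(\hat{\beta}_1\) as the intercept and \(\hat{\beta}_3\) as the slope, so \(\hat{\beta}_2\) is not needed to obtain \(\hat{\beta}_3\). To see the relation between this notation and the formal definition, consider for example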
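\begin{align}
\widehat{E}(y \mid T=1,~S=0) &= \widehat{E}(y \mid \text{after period, control}) \\[3pt]
&= \frac{\widehat{E}(y\, I(\text{after period, control}))}{\widehat{P}(\text{after period, control})} \\[3pt]
&= \frac{\sum_{i=1}^n y_{i,\text{after}}\, I(i \text{ in control})}{n_{\text{control}}} = \overline{y}_{\text{control, after}} \\[3pt]
&= \overline{y}_{12}
\end{align}
and so on for the other values of \(T\) and \(S\), which is equivalent to
\[\hat{\beta}_3 = (y_{11}-y_{21})-(y_{12}-y_{22}).\]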
But this is the expression for the treatment effect that was given in the formal definition and in the above table.
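The equivalence between the regression coefficient and the difference of group averages can be checked numerically. The sketch below uses made-up data and solves the OLS normal equations with a small hand-written Gauss-Jordan routine so that it needs only the standard library; `solve`, `cell_mean` and the data are hypothetical names and values, not part of any cited study or package.

```python
from statistics import mean

# Hypothetical records: (S, T, y), with S=1 the treated group and T=1 the after period.
data = [
    (0, 0, 4.0), (0, 0, 6.0), (0, 1, 7.0), (0, 1, 9.0),
    (1, 0, 9.0), (1, 0, 11.0), (1, 1, 16.0), (1, 1, 18.0),
]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Design matrix columns: intercept, T, S, T*S.
X = [[1.0, float(t), float(s), float(t * s)] for (s, t, _) in data]
y = [obs[2] for obs in data]

# OLS via the normal equations (X'X) beta = X'y.
k = 4
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
beta = solve(xtx, xty)

def cell_mean(s, t):
    """Sample average of y for group indicator s and period indicator t."""
    return mean(v for (g, p, v) in data if g == s and p == t)

# DID from averages: change in the treated group minus change in the control group.
did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(beta)
print(did)
```

Because the model has one parameter per group-period cell, the fitted values reproduce the four cell averages exactly, which is why the interaction coefficient coincides with the DID of averages.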
The Card and Krueger article on minimum wage in New Jersey, published in 1994,[6] is considered one of the most famous DID studies; Card was later awarded the 2021 Nobel Memorial Prize in Economic Sciences in part for this and related work. Card and Krueger compared employment in the fast food sector in New Jersey and in Pennsylvania in February 1992 and in November 1992, before and after New Jersey's minimum wage rose from $4.25 to $5.05 in April 1992. Observing the change in employment in New Jersey alone, before and after the treatment, would fail to control for omitted variables such as weather and regional macroeconomic conditions. Including Pennsylvania as a control in a difference-in-differences model implicitly controls for any bias caused by variables common to New Jersey and Pennsylvania, even when these variables are unobserved. Assuming that New Jersey and Pennsylvania have parallel trends over time, Pennsylvania's change in employment can be interpreted as the change New Jersey would have experienced had it not increased the minimum wage, and vice versa. The evidence suggested that the increased minimum wage did not induce a decrease in employment in New Jersey, contrary to what some economic theory would suggest. The table below shows Card and Krueger's estimates of the treatment effect on employment, measured in full-time equivalents (FTEs). Card and Krueger estimate that the $0.80 minimum wage increase in New Jersey led to a 2.75 FTE increase in employment.
Period | New Jersey | Pennsylvania | Difference
---|---|---|---
February | 20.44 | 23.33 | −2.89
November | 21.03 | 21.17 | −0.14
Change | 0.59 | −2.16 | 2.75
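The arithmetic behind the table can be reproduced directly from the four reported averages. This is only a sketch of the table's calculation, not of Card and Krueger's full analysis; the dictionary `fte` and its keys are hypothetical names.

```python
# Average FTE employment from the table above.
fte = {
    ("NJ", "Feb"): 20.44, ("NJ", "Nov"): 21.03,
    ("PA", "Feb"): 23.33, ("PA", "Nov"): 21.17,
}

# Change over time within each state.
change_nj = fte["NJ", "Nov"] - fte["NJ", "Feb"]
change_pa = fte["PA", "Nov"] - fte["PA", "Feb"]

# Difference between the states within each month.
diff_feb = fte["NJ", "Feb"] - fte["PA", "Feb"]
diff_nov = fte["NJ", "Nov"] - fte["PA", "Nov"]

# The DID estimate can be read off either margin of the table.
did_by_change = change_nj - change_pa
did_by_diff = diff_nov - diff_feb

print(round(change_nj, 2), round(change_pa, 2))
print(round(did_by_change, 2), round(did_by_diff, 2))
```

Rounding is applied only when printing, since the subtraction of two-decimal values is not exact in binary floating point.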
A software implementation of this example is available in the Stata command -diff-,[7] authored by Juan Miguel Villa.