Concept: Compliance refers to whether actual treatment coincides with assigned treatment (Gerber and Green 2012:131). One-sided noncompliance is a case of failure to treat: no one in the control group gets treated, and some people in the treatment group do not get treated.
Let \(d_i(z)\) be a potential outcome that tells us person \(i\)’s treatment status \(d_i\) when their treatment assignment is \(z_i\). For example, \(d_i(1) = 0\) refers to person \(i\) who has been assigned to treatment (\(z_i=1\)) but is not treated (\(d_i = 0\)).
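We can now think of four types of subjects in any experiment:
Type | \(d_i(1)\) | \(d_i(0)\) |
---|---|---|
Compliers | 1 | 0 |
Never-takers | 0 | 0 |
Always-takers | 1 | 1 |
Defiers | 0 | 1 |
Under one-sided noncompliance no one in the control group is treated, so \(d_i(0) = 0\) for everyone and only compliers and never-takers remain.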
Theorem 5.1 shows that the CACE (complier average causal effect, or the average treatment effect among compliers) is a ratio of two intent-to-treat parameters: the effect of treatment assignment on the outcome \(Y_i\) (called \(ITT\) or \(ITT_Y\)), and the effect of treatment assignment on treatment status \(D_i\) (called \(ITT_D\)). More specifically, \(CACE = \frac{ITT}{ITT_D}\).
Proof: Start with the ITT, or intent-to-treat effect:
\(\underbrace{E[Y_i | z_i = 1]}_{\text{Avg Outcome in Trt Group}} - \underbrace{E[Y_i | z_i = 0]}_{\text{Avg Outcome in Control Group}}\)
We can rewrite each of these quantities as a weighted average of potential outcomes of different types of subjects:
In the treatment group:
\(E[Y_i | z_i = 1] = E[Y_i | z_i = 1, d_i = 1]\cdot \pi_C + E[Y_i | z_i = 1, d_i = 0]\cdot(1 - \pi_C)\)
where \(\pi_C\) is the proportion of compliers in the treatment group. Since we assume there are only two types of subjects, \((1 - \pi_C)\) is the proportion of never takers in the treatment group.
We know that compliers reveal their treated potential outcome or \(Y_i(1)\) in the treatment group, and never takers reveal their untreated potential outcome or \(Y_i(0)\) in the treatment group. So we can simplify the above expression to read:
\(E[Y_i | z_i = 1] = E[Y_i(1) | \text{Complier}]\cdot \pi_C + E[Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)
In the control group:
\(E[Y_i | z_i = 0] = E[Y_i | z_i = 0, \text{Complier}]\cdot \pi_C + E[Y_i | z_i = 0, \text{NT}]\cdot(1 - \pi_C)\)
In the control group, both compliers and never takers reveal their untreated potential outcome. So we can simplify this expression to read:
\(E[Y_i | z_i = 0] = E[Y_i(0) | \text{Complier}]\cdot \pi_C + E[Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)
Plugging in these quantities, the \(ITT\) can be written as a weighted average of types:
\(E[Y_i | z_i = 1] - E[Y_i | z_i = 0] = E[Y_i(1) - Y_i(0) |\text{Complier}]\cdot \pi_C + E[Y_i(0) - Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)
Since we randomly assigned subjects to treatment groups and assume excludability (treatment assignment \(z_i\) has no effect on outcomes other than through the treatment \(d_i\) itself), never-takers have the same expected untreated outcome in the treatment and control groups. In other words, \(E[Y_i(0) | z_i = 1, \text{NT}] = E[Y_i(0) | z_i = 0, \text{NT}]\), so \(E[Y_i(0) - Y_i(0) | \text{NT}] = 0\). The second term cancels out and we are left with:
\(E[Y_i | z_i = 1] - E[Y_i | z_i = 0] = E[Y_i(1) - Y_i(0) |\text{Complier}]\cdot \pi_C\)
Re-arranging terms we get:
\(E[Y_i(1) - Y_i(0) |\text{Complier}] = \frac{E[Y_i | z_i = 1] - E[Y_i | z_i = 0]}{\pi_C}\)
Note that \(ITT_D = E[d_i | z_i = 1] - E[d_i | z_i = 0] = \pi_C\) (i.e. the average value of \(d_i\) in the treatment group is just the proportion of compliers, and the average value of \(d_i\) in the control group is 0 since no one is treated, so the difference is just the proportion of compliers or \(\pi_C\)).
Theorem 5.1 states exactly this: \(CACE = \frac{ITT}{ITT_D} = \frac{E[Y_i | z_i = 1] - E[Y_i | z_i = 0]}{\pi_C}\)
Identification of the CACE requires the following four assumptions: random assignment, non-interference, excludability, and \(ITT_D > 0\).
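The theorem can be checked numerically with a small simulation. This is only a sketch: the sample size, the 60% complier share, and the treatment effect of 2 are invented for illustration, and the data-generating process is built so that the four assumptions hold by construction.

```r
# Simulated illustration; every number below is an assumption made for the sketch.
set.seed(42)
n <- 100000

# Latent type: compliers take treatment if assigned; never-takers never do.
complier <- rbinom(n, 1, 0.6)

# Potential outcomes: untreated outcomes differ by type, treatment adds 2.
y0 <- rnorm(n, mean = 1 + complier)
y1 <- y0 + 2

# Random assignment with one-sided noncompliance: d = 1 only if z = 1 and complier.
z <- rbinom(n, 1, 0.5)
d <- z * complier

# Excludability: z affects the observed outcome only through d.
y <- ifelse(d == 1, y1, y0)

itt_y <- mean(y[z == 1]) - mean(y[z == 0])
itt_d <- mean(d[z == 1]) - mean(d[z == 0])
cace_hat <- itt_y / itt_d

c(ITT_Y = itt_y, ITT_D = itt_d, CACE = cace_hat)
```

In this setup the ITT is diluted by the never-takers (it should land near 0.6 × 2), while the ratio should land near the true complier effect of 2.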
Can you estimate the average treatment effect for compliers assuming one-sided non-compliance and using Theorem 5.1?
Quantity | Control | Treatment |
---|---|---|
Percent reached by callers | 0.00 | 47.31 |
Turnout among those not contacted by canvassers | 55.89 | 40.50 |
Overall turnout | 55.89 | 56.53 |
Solution:
Let's start by rewriting the above information in a more intuitive way:
Quantity | Control | Treatment |
---|---|---|
Percent reached by callers | 0 | 47.31 |
Turnout among Never Takers | 40.5 | 40.5 |
Turnout among Compliers | Don’t Know | Don’t Know |
Overall turnout | 55.89 | 56.53 |
Step 1: To get the average control outcome for compliers, write the control group mean as a weighted average of types:
\(E[Y_i | Z = 0] = E[Y_i | Z=0, NT](1 - \pi_{C}) + E[Y_i | Z=0, C]\pi_{C}\)
\(0.5589 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)
So \(x = \frac{0.5589 - (0.405 \cdot 0.5269)}{0.4731} = 0.7303\)
Step 2: To get the mean treated outcome for compliers, write the treatment group mean as a weighted average of types:
\(E[Y_i | Z = 1] = E[Y_i | Z=1, NT](1-\pi_{C}) + E[Y_i | Z=1, C]\pi_{C}\)
\(0.5653 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)
Then \(x = \frac{0.5653 - (0.405 \cdot 0.5269)}{0.4731} = 0.7438\)
Step 3: The complier average causal effect can then be estimated in two ways:
\(CACE = E[Y_i | Z=1, \text{Compliers}] - E[Y_i | Z=0, \text{Compliers}] = 0.7438 - 0.7303 \approx 0.0135\)
Alternatively \(CACE = \frac{ITT_Y}{ITT_D} = \frac{0.5653 - 0.5589}{0.4731} \approx 0.0135\)
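The three steps can be reproduced in a few lines of R. The sketch below only restates the arithmetic above; the variable names are chosen here for illustration.

```r
# Figures taken from the table above, converted from percentages to proportions.
pi_c        <- 0.4731   # share reached in the treatment group (ITT_D)
y_control   <- 0.5589   # overall turnout, control group
y_treatment <- 0.5653   # overall turnout, treatment group
y_nt        <- 0.4050   # turnout among never-takers

# Steps 1 and 2: back out the complier means from the weighted averages.
y0_complier <- (y_control   - y_nt * (1 - pi_c)) / pi_c
y1_complier <- (y_treatment - y_nt * (1 - pi_c)) / pi_c

# Step 3: two routes to the same estimate.
cace_diff  <- y1_complier - y0_complier
cace_ratio <- (y_treatment - y_control) / pi_c

round(c(y0_complier, y1_complier, cace_diff, cace_ratio), 4)
## [1] 0.7303 0.7438 0.0135 0.0135
```

The two routes agree because the never-taker term is identical in both weighted averages and drops out of the difference.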
ivrobust
We will use iv_robust from the estimatr package to compute the complier average causal effect. For this part, I use the Guan and Green (2006) dataset. This is an experiment in which 4024 students at Peking University were randomly assigned either to a treatment condition (to be contacted by a canvasser who encouraged them to vote) or to a control condition in which they were not contacted by anyone.
Let \(Z\) be the treatment assignment: \(z_i=1\) if assigned to be contacted by a canvasser, \(z_i=0\) if assigned not to be contacted. Let \(D\) be the treatment received: \(d_i=1\) if a canvasser actually contacted the student, and \(d_i=0\) if not. The outcome, turnout, equals 1 if the student voted and 0 otherwise.
Let's load the data set and look at compliance with treatment assignment in this study:
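library(tidyverse)
library(estimatr)
library(knitr)  # provides kable()
# Read the data set (assumes W9_GuanGreen.csv is in the working directory)
dat <- read_csv("W9_GuanGreen.csv") %>%
  mutate(
    Z = treat2,   # treatment assignment
    D = contact   # treatment actually received
  ) %>%
  select(-c(contact, treat2))
# Count subjects in each assignment-by-treatment cell
compliance <- dat %>%
  group_by(Z, D) %>%
  summarise(N = n(), .groups = "drop")
kable(compliance,
      caption = "Compliance With Treatment Assignment",
      caption.above = TRUE)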
Z | D | N |
---|---|---|
0 | 0 | 1334 |
1 | 0 | 307 |
1 | 1 | 2383 |
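The compliance rate can be read straight off this table. As a quick check, the sketch below re-enters the three cell counts by hand (so it does not need the data file) and computes the share treated in each arm; the helper name share_treated is made up for this example.

```r
# Cell counts copied from the compliance table above.
counts <- data.frame(
  Z = c(0, 1, 1),
  D = c(0, 0, 1),
  N = c(1334, 307, 2383)
)

# Share of subjects actually treated within one assignment arm.
share_treated <- function(arm) {
  rows <- counts[counts$Z == arm, ]
  sum(rows$N[rows$D == 1]) / sum(rows$N)
}

# ITT_D: treated share in the treatment arm minus treated share in control.
itt_d <- share_treated(1) - share_treated(0)
round(itt_d, 3)
## [1] 0.886
```

No one in the control arm was contacted, which is the one-sided noncompliance pattern described earlier, so \(ITT_D\) is simply the contact rate in the treatment arm.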
We can use lm_robust to calculate the ITT (\(Y \sim Z\)) and ITT_D (\(D \sim Z\)), then take a ratio of those quantities to get the CACE:
# Approach 1: Estimate CACE by separately computing ITT and ITT_D
model_itt <- tidy(lm_robust(turnout ~ Z, data = dat))
kable(model_itt,
digits = 3)
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.669 | 0.013 | 51.866 | 0 | 0.643 | 0.694 | 4020 | turnout |
Z | 0.132 | 0.015 | 8.783 | 0 | 0.102 | 0.161 | 4020 | turnout |
model_ittd <- tidy(lm_robust(D ~ Z, data = dat))
kable(model_ittd,
digits = 3)
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.000 | 0.000 | 5376.000 | 0 | 0.000 | 0.000 | 4022 | D |
Z | 0.886 | 0.006 | 144.474 | 0 | 0.874 | 0.898 | 4022 | D |
# CACE is a ratio of ITT/ITT_D
model_itt %>% filter(term=="Z") %>% pull(estimate) /
model_ittd %>% filter(term=="Z") %>% pull(estimate)
## [1] 0.148926
Alternatively, we can use iv_robust, which runs a two-stage least squares (2SLS) regression and returns the complier average causal effect. iv_robust uses the following formula: Outcome ~ Treatment Status | Treatment Assignment.
More generally, iv_robust follows the logic of an instrumental variable analysis, so the formula is: Outcome ~ Endogenous Regressor + Covariates | Exogenous Instrument + Covariates. Here is the same estimate using iv_robust:
# Approach 2: Use iv_robust
model_cace <- iv_robust(turnout ~ D | Z, data = dat) %>% tidy()
kable(model_cace,
digits = 3)
term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.669 | 0.013 | 51.866 | 0 | 0.643 | 0.694 | 4020 | turnout |
D | 0.149 | 0.017 | 8.774 | 0 | 0.116 | 0.182 | 4020 | turnout |
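To see why the ratio and 2SLS point estimates coincide, it can help to run the two stages by hand. The sketch below uses base R and simulated data rather than the Guan and Green file; the sample size, complier share, and effect size are assumptions made for illustration. With one binary instrument and no covariates, regressing the outcome on the first-stage fitted values reproduces the ratio \(ITT_Y / ITT_D\).

```r
# Simulated data; all parameter values are invented for this sketch.
set.seed(7)
n <- 5000
complier <- rbinom(n, 1, 0.8)
z <- rbinom(n, 1, 0.5)          # random assignment
d <- z * complier               # one-sided noncompliance
y <- 0.5 + 0.3 * complier + 0.15 * d + rnorm(n, sd = 0.5)

# Ratio (Wald) estimator: ITT_Y divided by ITT_D.
wald <- unname(coef(lm(y ~ z))["z"] / coef(lm(d ~ z))["z"])

# Manual 2SLS: the first stage predicts d from z,
# the second stage regresses y on that prediction.
d_hat <- fitted(lm(d ~ z))
tsls  <- unname(coef(lm(y ~ d_hat))["d_hat"])

all.equal(wald, tsls)
## [1] TRUE
```

Only the point estimate carries over: the standard errors reported by the second-stage lm ignore the first-stage estimation, which is one reason to use a dedicated 2SLS routine for inference.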
Concept: Placebo designs aim to identify compliers in the control group by closely mimicking the administration of treatment without actually treating anyone in that group. For example, in a canvassing experiment, compliers are people who open the door and listen to the canvasser's message when assigned to the treatment group, and who receive no message when assigned to the control group. The problem is that we do not knock on doors in the control group, so we do not know which people would have opened their door had they been assigned to the treatment group. A placebo design addresses this by having canvassers knock on doors in the control group and deliver some other message that does not affect outcomes. Because of random assignment, people who open the door in the control group are, in expectation, similar to people who open the door in the treatment group. Both comply with the research design. A random subset of compliers is treated; the remaining subset is not. We can compare these two groups to estimate the complier average causal effect.
In placebo designs, we assign subjects to one of three conditions. In the control condition (\(Z_i=0\)), both compliers and never takers reveal their untreated potential outcomes and we cannot identify the subject’s type. In the placebo (\(Z_i=1\)) and treatment (\(Z_i=2\)) conditions, we can identify types (compliers and never takers). The move in such designs is to “screen out” never takers and “eliminate the noise generated by [their] presence in both treatment and control groups”. Typically then, such designs produce two estimates of the CACE:
Method 1: \(\hat{CACE} = \frac{\hat{ITT_Y}}{\hat{ITT_D}} = \frac{E[Y_i|Z=2] - E[Y_i|Z=0]}{E[D_i|Z=2] - E[D_i|Z=0]}\)
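Method 2: \(\hat{CACE} = E[Y_i | Z_i=2, \text{Complier}] - E[Y_i | Z_i=1, \text{Complier}]\), the difference in mean outcomes between compliers in the treatment condition (who are treated) and compliers in the placebo condition (who are not).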
Note: (1) The difference-in-means estimator used in Method 2 is unbiased, while the ratio estimator used in Method 1 is biased. Both estimators are consistent. (2) Placebo designs are “over-identified” because the same data produce two estimates of the complier average causal effect.
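The two methods can be compared on simulated data. The sketch below is illustrative only: the three-arm assignment, the 30% complier share, and the true effect of 0.1 are invented, and it assumes the placebo message has no effect on the outcome.

```r
# Simulated three-arm placebo design; all parameter values are assumptions.
set.seed(11)
n <- 90000
complier <- rbinom(n, 1, 0.3)               # would open the door if visited
z <- sample(0:2, n, replace = TRUE)         # 0 = control, 1 = placebo, 2 = treatment

# Contact is observed only where a canvasser knocks (placebo and treatment arms).
contacted <- as.integer(z > 0 & complier == 1)
# Only contacted subjects in the treatment arm hear the treatment message.
d <- as.integer(z == 2 & complier == 1)

# The outcome depends on type and on treatment, not on the placebo message.
y <- 0.4 + 0.2 * complier + 0.1 * d + rnorm(n, sd = 0.5)

# Method 1: ratio of ITT_Y to ITT_D, treatment arm versus control arm.
cace_ratio <- (mean(y[z == 2]) - mean(y[z == 0])) /
  (mean(d[z == 2]) - mean(d[z == 0]))

# Method 2: difference in means among contacted subjects, treatment versus placebo.
cace_placebo <- mean(y[z == 2 & contacted == 1]) -
  mean(y[z == 1 & contacted == 1])

c(ratio = cace_ratio, placebo = cace_placebo)
```

Both estimates should fall near 0.1. Re-running with different seeds gives a feel for how much each estimator varies when never-takers are, or are not, screened out.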