Concepts

Concept: Compliance refers to whether actual treatment coincides with assigned treatment (Gerber and Green 2012:131). One-sided noncompliance is a case of failure to treat: no one in the control group is treated, but some people assigned to the treatment group go untreated.

Subject Types and Potential Outcomes

Let \(d_i(z)\) be a potential outcome that gives person \(i\)’s treatment status \(d_i\) when their treatment assignment is \(z_i = z\). For example, \(d_i(1) = 0\) describes a person \(i\) who, when assigned to treatment (\(z_i=1\)), is not treated (\(d_i = 0\)).

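We can now think of four types of subjects in any experiment:

| Type | \(d_i(1)\) | \(d_i(0)\) |
|---|---|---|
| Compliers | 1 | 0 |
| Always-takers | 1 | 1 |
| Never-takers | 0 | 0 |
| Defiers | 0 | 1 |

Source: Gerber and Green (2012)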
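As a minimal sketch of these definitions, the R snippet below classifies subjects from a made-up schedule of potential treatment statuses. The vectors `d1` and `d0` (standing in for \(d_i(1)\) and \(d_i(0)\)) and the helper `classify_type` are hypothetical names introduced only for illustration. In real data we observe just one of the two statuses per subject, so types cannot be read off this directly.

```r
# Hypothetical potential treatment statuses for six subjects.
# d1 = status if assigned to treatment, d0 = status if assigned to control.
d1 <- c(1, 1, 0, 0, 1, 0)
d0 <- c(0, 1, 0, 1, 0, 0)

# Map each (d1, d0) pair to one of the four subject types.
classify_type <- function(d1, d0) {
  ifelse(d1 == 1 & d0 == 0, "Complier",
  ifelse(d1 == 1 & d0 == 1, "Always-taker",
  ifelse(d1 == 0 & d0 == 0, "Never-taker", "Defier")))
}

types <- classify_type(d1, d0)
table(types)

# One-sided noncompliance rules out anyone with d0 == 1,
# leaving only compliers and never-takers.
one_sided <- all(d0 == 0)
```

In this toy schedule `one_sided` is `FALSE` because two subjects would be treated even when assigned to control.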


Theorem 5.1

Theorem 5.1 shows that CACE (complier average causal effect or the average treatment effect among compliers) is a ratio of two intent-to-treat parameters: the effect of treatment assignment on the outcome \(Y_i\) (called \(ITT\) or \(ITT_Y\)); and the effect of treatment assignment on treatment status \(D_i\) (called \(ITT_D\)). More specifically, \(CACE = \frac{ITT}{ITT_D}\)

Proof: Start with the ITT or intent-to-treat effect:

\(\underbrace{E[Y_i | z_i = 1]}_{\text{Avg Outcome in Trt Group}} - \underbrace{E[Y_i | z_i = 0]}_{\text{Avg Outcome in Control Group}}\)

We can rewrite each of these quantities as a weighted average of potential outcomes of different types of subjects:

In the treatment group:

\(E[Y_i | z_i = 1] = E[Y_i | z_i = 1, d_i = 1]\cdot \pi_C + E[Y_i | z_i = 1, d_i = 0]\cdot(1 - \pi_C)\)

where \(\pi_C\) is the proportion of compliers in the treatment group. Under one-sided noncompliance no one is treated when assigned to control, so there are only two types of subjects (compliers and never takers), and \((1 - \pi_C)\) is the proportion of never takers in the treatment group.

We know that compliers reveal their treated potential outcome or \(Y_i(1)\) in the treatment group, and never takers reveal their untreated potential outcome or \(Y_i(0)\) in the treatment group. So we can simplify the above expression to read:

\(E[Y_i | z_i = 1] = E[Y_i(1) | \text{Complier}]\cdot \pi_C + E[Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)

In the control group:

\(E[Y_i | z_i = 0] = E[Y_i | z_i = 0, \text{Complier}]\cdot \pi_C + E[Y_i | z_i = 0, \text{NT}]\cdot(1 - \pi_C)\)

In the control group, both compliers and never takers reveal their untreated potential outcome. So we can simplify this expression to read:

\(E[Y_i | z_i = 0] = E[Y_i(0) | \text{Complier}]\cdot \pi_C + E[Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)

Plugging these quantities into the \(ITT\), we can write it as a weighted average over types:

\(E[Y_i | z_i = 1] - E[Y_i | z_i = 0] = E[Y_i(1) - Y_i(0) |\text{Complier}]\cdot \pi_C + E[Y_i(0) - Y_i(0) | \text{NT}]\cdot(1 - \pi_C)\)

Since we randomly assigned subjects to treatment groups, and assume excludability (treatment assignment (\(z_i\)) has no effect on outcomes other than through the treatment (\(d_i\)) itself), never-takers in the treatment and control groups have the same outcome on average. In other words, \(E[Y_i | z_i = 1, \text{NT}] = E[Y_i | z_i = 0, \text{NT}] = E[Y_i(0) | \text{NT}]\). The second term is therefore zero, and we are left with:

\(E[Y_i | z_i = 1] - E[Y_i | z_i = 0] = E[Y_i(1) - Y_i(0) |\text{Complier}]\cdot \pi_C\)

Re-arranging terms we get:

\(E[Y_i(1) - Y_i(0) |\text{Complier}] = \frac{E[Y_i | z_i = 1] - E[Y_i | z_i = 0]}{\pi_C}\)

Note that \(ITT_D = E[d_i | z_i = 1] - E[d_i | z_i = 0] = \pi_C\) (i.e. the average value of \(d_i\) in the treatment group is just the proportion of compliers, and the average value of \(d_i\) in the control group is 0 since no one is treated, so the difference is just the proportion of compliers or \(\pi_C\)).

Theorem 5.1 states exactly this: \(CACE = \frac{ITT}{ITT_D} = \frac{E[Y_i | z_i = 1] - E[Y_i | z_i = 0]}{\pi_C}\)
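To see the theorem at work, here is a minimal sketch in R that builds a made-up population with a full schedule of potential outcomes and checks that \(ITT / ITT_D\) equals the average effect among compliers. Every number and variable name below is an assumption chosen for illustration, not part of any real study, and the check uses the full schedule rather than a single random assignment.

```r
# Hypothetical population under one-sided noncompliance (all values made up).
set.seed(1)
N <- 1000
complier <- rbinom(N, 1, 0.6)              # 1 = complier, 0 = never-taker
Y0 <- rbinom(N, 1, 0.4 + 0.2 * complier)   # untreated potential outcome
Y1 <- pmax(Y0, rbinom(N, 1, 0.3))          # treated potential outcome

# Treatment received under each assignment.
d_if_treat   <- complier    # d_i(1): only compliers take the treatment
d_if_control <- rep(0, N)   # d_i(0): no one is treated in control

# Outcome each subject would reveal under each assignment (excludability:
# assignment matters only through treatment received).
Y_if_treat   <- ifelse(d_if_treat == 1, Y1, Y0)
Y_if_control <- ifelse(d_if_control == 1, Y1, Y0)

itt_y     <- mean(Y_if_treat) - mean(Y_if_control)
itt_d     <- mean(d_if_treat) - mean(d_if_control)
cace_true <- mean((Y1 - Y0)[complier == 1])

c(ratio = itt_y / itt_d, cace = cace_true)
```

The two printed values agree because never-takers contribute \(Y_i(0) - Y_i(0) = 0\) to the numerator, which is the cancellation step in the proof.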


Assumptions

Identification of the CACE requires the following four assumptions: random assignment, non-interference, excludability, and \(ITT_D > 0\).


Estimation

Applying the Formula

Exercise

Can you estimate the average treatment effect for compliers assuming one-sided non-compliance and using Theorem 5.1?

Table: Hypothetical Michigan Turnout Experiment

| Quantity | Control | Treatment |
|---|---|---|
| Percent reached by callers | 0.00 | 47.31 |
| Turnout among those not contacted by canvassers (%) | 55.89 | 40.50 |
| Overall turnout (%) | 55.89 | 56.53 |

Solution:

Let's start by rewriting the above information in a more intuitive way:

| Quantity | Control | Treatment |
|---|---|---|
| Percent reached by callers | 0.00 | 47.31 |
| Turnout among Never Takers (%) | 40.50 | 40.50 |
| Turnout among Compliers (%) | Don’t Know | Don’t Know |
| Overall turnout (%) | 55.89 | 56.53 |

Step 1: To get the average control outcome for compliers, write the control group mean as a weighted average of types:

\(E[Y_i | Z = 0] = E[Y_i | Z=0, NT](1 - \pi_{C}) + E[Y_i | Z=0, C]\pi_{C}\)

\(0.5589 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

So \(x = \frac{0.5589 - (0.405 \cdot 0.5269)}{0.4731} = 0.7303\)

Step 2: To get the mean treated outcome for compliers, write the treatment group mean as a weighted average of types:

\(E[Y_i | Z = 1] = E[Y_i | Z=1, NT](1-\pi_{C}) + E[Y_i | Z=1, C]\pi_{C}\)

\(0.5653 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

Then \(x = \frac{0.5653 - (0.405 \cdot 0.5269)}{0.4731} = 0.7438\)

Step 3: The complier average causal effect can then be estimated in two ways:

\(CACE = E[Y_i | Z=1, \text{Compliers}] - E[Y_i | Z=0, \text{Compliers}] = 0.7438 - 0.7303 \approx 0.0135\)

Alternatively \(CACE = \frac{ITT_Y}{ITT_D} = \frac{0.5653 - 0.5589}{0.4731} \approx 0.0135\)
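The three steps can be reproduced with a few lines of arithmetic. The sketch below uses only the figures from the table above, converted to proportions; the variable names are chosen here for illustration.

```r
# Figures taken from the table above, expressed as proportions.
pi_c        <- 0.4731  # share reached in the treatment group (ITT_D)
y_control   <- 0.5589  # overall turnout, control group
y_treatment <- 0.5653  # overall turnout, treatment group
y_nt        <- 0.4050  # turnout among never-takers

# Steps 1 and 2: back out complier means from the weighted averages.
y0_compliers <- (y_control   - y_nt * (1 - pi_c)) / pi_c
y1_compliers <- (y_treatment - y_nt * (1 - pi_c)) / pi_c

# Step 3: two equivalent routes to the CACE.
cace_diff  <- y1_compliers - y0_compliers
cace_ratio <- (y_treatment - y_control) / pi_c

round(c(y0_compliers, y1_compliers, cace_diff, cace_ratio), 4)
# 0.7303 0.7438 0.0135 0.0135
```

The two routes agree exactly because the never-taker term is identical in both weighted averages and drops out of the difference.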


Using iv_robust

We will use iv_robust from the estimatr package to compute the complier average causal effect. For this part, I use the Guan and Green (2006) dataset. This is an experiment in which 4024 students at Peking University were randomly assigned to a treatment condition (contact by a canvasser who encouraged them to vote) or a control condition in which they were not contacted by anyone.

Let \(Z\) be the treatment assignment: \(z_i=1\) if assigned to be contacted by a canvasser, \(z_i=0\) if assigned not to be contacted. Let \(D\) be the treatment received: \(d_i=1\) if a canvasser actually contacted the student, and \(d_i=0\) if the student was not contacted by a canvasser. The outcome, turnout, equals 1 if the student votes and 0 if they do not.

Let's load the data set and look at compliance with treatment assignment in this study:

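library(tidyverse)
library(estimatr)
library(knitr)  # provides kable()

# Load the data set and rename assignment (Z) and treatment received (D)
dat <- read_csv("W9_GuanGreen.csv") %>%
  mutate(
    Z = treat2,
    D = contact
  ) %>%
  select(-c(contact, treat2))

# Count subjects in each assignment-by-treatment cell
compliance <- dat %>% group_by(Z, D) %>% summarise(N = n(), .groups = "drop")

kable(compliance,
      caption = "Compliance With Treatment Assignment",
      caption.above = TRUE)

Compliance With Treatment Assignment

| Z | D | N |
|---|---|---|
| 0 | 0 | 1334 |
| 1 | 0 | 307 |
| 1 | 1 | 2383 |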
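The compliance counts already pin down \(ITT_D\). As a quick check, the sketch below recomputes it by hand from the three cell counts in the compliance table; the variable names are illustrative only.

```r
# Cell counts copied from the compliance table above.
n_control           <- 1334  # Z = 0, D = 0
n_treat_uncontacted <- 307   # Z = 1, D = 0
n_treat_contacted   <- 2383  # Z = 1, D = 1

# Share treated in each arm.
d_bar_treat   <- n_treat_contacted / (n_treat_contacted + n_treat_uncontacted)
d_bar_control <- 0 / n_control  # no one in control was contacted

itt_d <- d_bar_treat - d_bar_control
round(itt_d, 3)
# 0.886
```

About 89 percent of students assigned to treatment were actually contacted, and noncompliance is one-sided because the control group has no contacted students.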

We can use lm_robust to calculate the ITT (\(Y \sim Z\)) and ITT_D (\(D \sim Z\)), then take a ratio of those quantities to get the CACE:

# Approach 1: Estimate CACE by separately computing ITT and ITT_D

model_itt <- tidy(lm_robust(turnout ~ Z, data = dat))

kable(model_itt,
      digits = 3)

| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 0.669 | 0.013 | 51.866 | 0 | 0.643 | 0.694 | 4020 | turnout |
| Z | 0.132 | 0.015 | 8.783 | 0 | 0.102 | 0.161 | 4020 | turnout |

model_ittd <- tidy(lm_robust(D ~ Z, data = dat))

kable(model_ittd,
      digits = 3)

| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 0.000 | 0.000 | 5376.000 | 0 | 0.000 | 0.000 | 4022 | D |
| Z | 0.886 | 0.006 | 144.474 | 0 | 0.874 | 0.898 | 4022 | D |

# CACE is a ratio of ITT/ITT_D
model_itt %>% filter(term == "Z") %>% pull(estimate) /
  model_ittd %>% filter(term == "Z") %>% pull(estimate)

## [1] 0.148926

Alternatively, we can use iv_robust, which runs a two-stage least squares (2SLS) regression and returns the complier average causal effect. iv_robust uses the following formula: Outcome ~ Treatment Status | Treatment Assignment.

More generally, iv_robust follows the logic of an instrumental variable analysis and so the formula is: Outcome ~ Endogenous Regressor + Covariates | Exogenous Instrument + Covariates. Here is the same estimate using iv_robust:

# Approach 2: Use iv_robust

model_cace <- iv_robust(turnout ~ D | Z, data = dat) %>% tidy()

kable(model_cace,
      digits = 3)

| term | estimate | std.error | statistic | p.value | conf.low | conf.high | df | outcome |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 0.669 | 0.013 | 51.866 | 0 | 0.643 | 0.694 | 4020 | turnout |
| D | 0.149 | 0.017 | 8.774 | 0 | 0.116 | 0.182 | 4020 | turnout |
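Why do the two approaches give the same point estimate? With a single binary instrument and no covariates, the 2SLS coefficient is algebraically the ratio \(ITT_Y / ITT_D\). The sketch below illustrates this on simulated data using only base R; the data-generating values are made up, and the manual two-step regression is shown for intuition only, since its second-stage standard errors are not valid (that is one reason to use iv_robust in practice).

```r
# Simulated data under one-sided noncompliance (all values are made up).
set.seed(42)
n <- 2000
Z <- rbinom(n, 1, 0.5)            # random assignment
complier <- rbinom(n, 1, 0.7)     # latent type
D <- Z * complier                 # treated only if assigned AND a complier
Y <- rbinom(n, 1, 0.5 + 0.1 * D)  # outcome depends on D, not on Z directly

# Ratio (Wald) estimator: ITT_Y / ITT_D.
itt_y <- mean(Y[Z == 1]) - mean(Y[Z == 0])
itt_d <- mean(D[Z == 1]) - mean(D[Z == 0])
wald  <- itt_y / itt_d

# Manual 2SLS: regress D on Z, then Y on the fitted values.
D_hat <- fitted(lm(D ~ Z))
tsls  <- unname(coef(lm(Y ~ D_hat))["D_hat"])

c(wald = wald, tsls = tsls)
```

The two numbers match up to floating-point error, mirroring the agreement between Approach 1 and Approach 2 above.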

Review

Placebo Designs

Concept: Placebo designs aim to identify compliers in the control group by closely mimicking the administration of treatment without actually treating anyone in that group. For example, in a canvassing experiment, compliers are people who would open the door and listen to the canvasser’s message if assigned to the treatment group, and who would receive no message if assigned to the control group. The problem is that we do not knock on doors in the control group, so we do not know which people would have opened their door had they been assigned to the treatment group. A placebo design addresses this by having canvassers knock on doors in the control group and deliver some other message that does not affect outcomes. Because of random assignment, people who open the door in the control group are, in expectation, similar to people who open the door in the treatment group. Both groups comply with the research design. A random subset of compliers is treated; the remaining subset is not. We can compare these two groups to estimate the complier average causal effect.

Two-Stage Placebo Design

In placebo designs, we assign subjects to one of three conditions. In the control condition (\(Z_i=0\)), both compliers and never takers reveal their untreated potential outcomes and we cannot identify the subject’s type. In the placebo (\(Z_i=1\)) and treatment (\(Z_i=2\)) conditions, we can identify types (compliers and never takers). The move in such designs is to “screen out” never takers and “eliminate the noise generated by [their] presence in both treatment and control groups”. Typically then, such designs produce two estimates of the CACE:

Method 1: \(\hat{CACE} = \frac{\hat{ITT_Y}}{\hat{ITT_D}} = \frac{E[Y_i|Z=2] - E[Y_i|Z=0]}{E[D_i|Z=2] - E[D_i|Z=0]}\)

Method 2: \(\hat{CACE} = E[Y_i(Z=2,D=1)|C] - E[Y_i(Z=1,D=0)|C]\)

Note: (1) The difference-in-means estimator used in Method 2 is unbiased, while the ratio estimator used in Method 1 is biased. Both estimators are consistent. (2) Placebo designs are “over-identified” because the same data produce two estimates of the complier average causal effect.
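The two methods can be compared on simulated data. The sketch below is a toy three-arm design in base R in which every quantity (sample size, complier share, baseline turnout, and a true CACE of 0.10) is an assumption made for illustration; `opened` is a hypothetical indicator for answering the door, which is observable only in the placebo and treatment arms.

```r
# Simulated three-arm placebo design (all values are made up).
set.seed(7)
n <- 60000
Z <- sample(rep(0:2, each = n / 3))          # 0 = control, 1 = placebo, 2 = treatment
complier <- rbinom(n, 1, 0.4)                # would open the door if visited
opened <- as.integer(Z > 0 & complier == 1)  # observed only in placebo/treatment
D <- as.integer(Z == 2 & complier == 1)      # received the real message
true_cace <- 0.10
Y <- rbinom(n, 1, 0.3 + 0.2 * complier + true_cace * D)

# Method 1: ratio estimator comparing the treatment and pure control arms.
itt_y <- mean(Y[Z == 2]) - mean(Y[Z == 0])
itt_d <- mean(D[Z == 2]) - mean(D[Z == 0])
cace_ratio <- itt_y / itt_d

# Method 2: difference in means among those who opened the door,
# treatment arm versus placebo arm.
cace_placebo <- mean(Y[Z == 2 & opened == 1]) - mean(Y[Z == 1 & opened == 1])

c(method1 = cace_ratio, method2 = cace_placebo)
```

Both estimates should land near the assumed true value, with Method 2 typically less noisy because never takers are screened out before the comparison.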

Method 2: Identifying Compliers and the CACE