Notes from Week V

Manipulating Compliance

Suppose that we run an experiment in which some units are assigned to control (\(Z=0\)), others to a condition in which minimal effort is made to treat units (\(Z=1\)), and the rest to a third condition in which maximal effort is made to treat units (\(Z=2\)). In essence, we are manipulating compliance rates and want to compare treatment effects for different types of compliers. Under one-sided non-compliance, we can define the types and estimands as follows.

Defining Types

Type D(Z=0) D(Z=1) D(Z=2)
Never Takers 0 0 0
Minimal Compliers 0 1 1
Maximal Compliers 0 0 1

Estimands

  • To estimate the treatment effect on all compliers, compare the second treatment group to the control group:

\(E[Y_i | Z = 2] - E[Y_i | Z = 0]\)

\(= E[Y_i(0) | Z=2, NT]\pi_{NT} + E[Y_i(1) | Z=2, Min]\pi_{Min} + E[Y_i(1) | Z=2, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=0, NT]\pi_{NT} + E[Y_i(0) | Z=0, Min]\pi_{Min} + E[Y_i(0) | Z=0, Max]\pi_{Max}\}\)

Because of random assignment, each experimental group is a “random subset” of the population with similar proportions of each type (\(\pi\)’s). Due to the excludability assumption, \(E[Y_i(0) | Z=2, NT] = E[Y_i(0) | Z=0, NT]\). This is because potential outcomes respond to \(d\), not \(z\), and \(d(z=2) = d(z=0) = 0\) for never takers.

This then simplifies to:

\(E[Y_i(Z=2,D=1) - Y_i(Z=0,D=0) | Min]\pi_{Min} + E[Y_i(Z=2,D=1) - Y_i(Z=0,D=0) | Max]\pi_{Max}\)

\(= E[Y_i(1) - Y_i(0) | Complier](\pi_{Min} + \pi_{Max})\)

Thus, writing \(\pi_C = \pi_{Min} + \pi_{Max}\): \(E[Y_i | Z = 2] - E[Y_i | Z = 0] = CACE \cdot \pi_{C}\)

And \(CACE = \frac{E[Y_i | Z = 2] - E[Y_i | Z = 0]}{\pi_C}\)

  • To estimate the treatment effect on minimal compliers, compare the first treatment group to the control group:

\(E[Y_i | Z = 1] - E[Y_i | Z = 0]\)

\(= E[Y_i(0) | Z=1, NT]\pi_{NT} + E[Y_i(1) | Z=1, Min]\pi_{Min} + E[Y_i(0) | Z=1, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=0, NT]\pi_{NT} + E[Y_i(0) | Z=0, Min]\pi_{Min} + E[Y_i(0) | Z=0, Max]\pi_{Max}\}\)

\(= E[Y_i(1) - Y_i(0)|Min]\pi_{Min}\)

And then: \(CACE_{Min} = \frac{E[Y_i | Z = 1] - E[Y_i | Z = 0]}{\pi_{Min}}\)

  • To estimate the treatment effect on maximal compliers, compare the second and first treatment groups:

\(E[Y_i | Z = 2] - E[Y_i | Z = 1]\)

\(= E[Y_i(0) | Z=2, NT]\pi_{NT} + E[Y_i(1) | Z=2, Min]\pi_{Min} + E[Y_i(1) | Z=2, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=1, NT]\pi_{NT} + E[Y_i(1) | Z=1, Min]\pi_{Min} + E[Y_i(0) | Z=1, Max]\pi_{Max}\}\)

\(= E[Y_i(1) - Y_i(0)|Max]\pi_{Max}\)

And then: \(CACE_{Max} = \frac{E[Y_i | Z = 2] - E[Y_i | Z = 1]}{\pi_{Max}}\)

Task

Using the table from Question 9, compute the CACE for minimal and maximal compliers.

Table: Gerber and Green 2012:169
Quantity Control Min_Effort Max_Effort
Percent reached by callers 0.00 29.97 47.31
Percent Voting 55.89 55.91 56.53
N 317182 7500 7500

\(CACE_{Min} = \frac{0.5591-0.5589}{0.2997} \approx 0.0007\)

\(CACE_{Max} = \frac{0.5653 - 0.5591}{0.4731 - 0.2997} \approx 0.0358\)
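
The same arithmetic can be sketched in R. This is only an illustration: the vector and variable names are my own, and the values are copied from the table above as proportions. Under one-sided non-compliance the contact rates identify the type shares, so \(\pi_{Min}\) is the contact rate in the minimal-effort group and \(\pi_{Max}\) is the difference between the two contact rates.

```r
# Values copied from the table above (Gerber and Green 2012:169), as proportions.
reached <- c(control = 0, min_effort = 0.2997, max_effort = 0.4731)
voting  <- c(control = 0.5589, min_effort = 0.5591, max_effort = 0.5653)

# Under one-sided non-compliance, contact rates identify the type shares.
pi_min <- reached[["min_effort"]] - reached[["control"]]
pi_max <- reached[["max_effort"]] - reached[["min_effort"]]
pi_c   <- pi_min + pi_max

# Each CACE is a difference in turnout divided by the relevant share.
cace_min <- (voting[["min_effort"]] - voting[["control"]]) / pi_min
cace_max <- (voting[["max_effort"]] - voting[["min_effort"]]) / pi_max
cace_all <- (voting[["max_effort"]] - voting[["control"]]) / pi_c

round(c(min = cace_min, max = cace_max, all = cace_all), 4)
```

The third quantity, `cace_all`, is the effect on all compliers from the first estimand and is included only for comparison.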

Control Mean Among Compliers

Consider now a variant of the above table. Can you estimate the mean outcome for compliers in the treatment and control groups, under one-sided non-compliance?

Table: Hypothetical Michigan Turnout Experiment
Quantity Control Treatment
Percent reached by callers 0.00 47.31
Turnout among those not contacted by canvassers 55.89 40.50
Overall turnout 55.89 56.53

Solution:

Let's start by rewriting the above information in a more intuitive way:

Quantity Control Treatment
Percent reached by callers 0 47.31
Turnout among Never Takers 40.5 40.5
Turnout among Compliers Don’t Know Don’t Know
Overall turnout 55.89 56.53

Step 1: To get the mean control outcome for compliers, write the control group mean as a weighted average of types:

\(E[Y_i | Z = 0] = E[Y_i | Z=0, NT]\pi_{NT} + E[Y_i | Z=0, C]\pi_{C}\)

\(0.5589 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

So \(x = \frac{0.5589 - (0.405 \cdot 0.5269)}{0.4731} = 0.7303\)

Step 2: To get the mean treated outcome for compliers, write the treatment group mean as a weighted average of types:

\(E[Y_i | Z = 1] = E[Y_i | Z=1, NT]\pi_{NT} + E[Y_i | Z=1, C]\pi_{C}\)

\(0.5653 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

Then \(x = \frac{0.5653 - (0.405 \cdot 0.5269)}{0.4731} = 0.7438\)

Step 3: The complier average causal effect can then be estimated in two ways:

\(CACE = E[Y_i | Z=1, Compliers] - E[Y_i | Z=0, Compliers] = 0.7438 - 0.7303 \approx 0.0135\)

Alternatively \(CACE = \frac{ITT_Y}{ITT_D} = \frac{0.5653 - 0.5589}{0.4731} \approx 0.0135\)
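
Steps 1 to 3 can be checked with a short R sketch. The helper `complier_mean()` is a name I made up for this illustration; it simply solves the weighted-average equation for the complier mean, using the numbers from the hypothetical table.

```r
# Back out E[Y | complier] in one arm from that arm's overall mean:
# overall = nt_mean * (1 - pi_c) + complier_mean * pi_c
complier_mean <- function(overall, nt_mean, pi_c) {
  (overall - nt_mean * (1 - pi_c)) / pi_c
}

pi_c    <- 0.4731   # share reached in the treatment group
nt_mean <- 0.405    # turnout among never takers (the same in both arms)

y_c_control <- complier_mean(0.5589, nt_mean, pi_c)
y_c_treated <- complier_mean(0.5653, nt_mean, pi_c)

# Two routes to the same CACE
cace_diff  <- y_c_treated - y_c_control
cace_ratio <- (0.5653 - 0.5589) / pi_c

round(c(control = y_c_control, treated = y_c_treated,
        cace_diff = cace_diff, cace_ratio = cace_ratio), 4)
```

The two routes agree because the never-taker term is identical in both arms and cancels when the complier means are differenced.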

Properties of \(\frac{ITT_Y}{ITT_D}\)

Gerber and Green (2012) make several important points about the complier average causal effect (pp. 147-9). Here, I discuss four of them in detail.

Changes in \(ITT_D\) and \(CACE\)

Since the \(CACE = \frac{ITT_Y}{ITT_D}\), a misconception may arise that if \(ITT_D\) increases, the CACE decreases because we are dividing \(ITT_Y\) by a bigger number. This is not the case because ‘increasing the share of compliers also change[s] the numerator [\(ITT_Y\)], depending on how these extra compliers respond to treatment’ (Gerber and Green 2012:147). This is demonstrated using a small dataset below.

Task: Estimate the complier average causal effect for the first four rows of the dataset, then for the full dataset. Does the compliance rate increase or decrease with the additional observation? How does the CACE change as a result of this?

Unit Y Z D
1 10 1 1
2 6 1 0
3 6 0 0
4 4 0 0
5 20 1 1

Solution:

Considering only the first four rows:

\(ITT_Y = \frac{16}{2} - \frac{10}{2} = 3\)

\(ITT_D = \frac{1}{2} - 0 = 0.5\)

Consequently, \(CACE = \frac{3}{0.5} = 6\)

Now consider the full dataset:

\(ITT_Y = \frac{36}{3} - \frac{10}{2} = 12 - 5 = 7\)

\(ITT_D = \frac{2}{3} - 0 = 0.667\)

And thus, \(CACE = \frac{7}{0.667} \approx 10.5\).

Bottom line: Both the compliance rate and the CACE can increase simultaneously. In this example, \(ITT_D\) moved from 0.5 to 0.667 and the \(CACE\) from 6 to 10.5. This is because the additional complier reported a high outcome value that considerably increased \(ITT_Y\).
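
For readers who prefer to verify this by machine, here is a minimal R sketch using the five-row dataset above. The function name `wald_cace()` is my own; it computes the two intent-to-treat quantities as differences in means and takes their ratio.

```r
# The five-row dataset from the task
dat <- data.frame(
  Y = c(10, 6, 6, 4, 20),
  Z = c(1, 1, 0, 0, 1),
  D = c(1, 0, 0, 0, 1)
)

# ITT_Y, ITT_D and their ratio, each as a difference in group means
wald_cace <- function(d) {
  itt_y <- mean(d$Y[d$Z == 1]) - mean(d$Y[d$Z == 0])
  itt_d <- mean(d$D[d$Z == 1]) - mean(d$D[d$Z == 0])
  c(ITT_Y = itt_y, ITT_D = itt_d, CACE = itt_y / itt_d)
}

first_four <- wald_cace(dat[1:4, ])
full_data  <- wald_cace(dat)
rbind(first_four, full_data)
```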

\(\hat{CACE}\) is consistent but biased

We know that the complier average causal effect is a ratio of two quantities: \(\frac{ITT_Y}{ITT_D}\). Using a data sample, we estimate these two quantities: \(\hat{ITT_Y}\) and \(\hat{ITT_D}\). This ratio (\(\frac{\hat{ITT_Y}}{\hat{ITT_D}}\)) is a consistent but biased estimator of the true CACE.

Intuition: We know that \(\hat{ITT_Y}\) and \(\hat{ITT_D}\) are unbiased estimators of \(ITT_Y\) and \(ITT_D\). In other words: \(E[\hat{ITT_Y}] = ITT_Y\) and \(E[\hat{ITT_D}] = ITT_D\). However, ‘the ratio of two unbiased estimators is not an unbiased estimator for the ratio of the two estimands’ (Gerber and Green 2012:151):

\(E[\frac{\hat{ITT_Y}}{\hat{ITT_D}}] \neq \frac{E[\hat{ITT_Y}]}{E[\hat{ITT_D}]} = \frac{ITT_Y}{ITT_D}\).

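This is because for any two random variables \(X\) and \(Y\) with \(E[Y] \neq 0\), the definition of covariance gives \(Cov[\frac{X}{Y},Y] = E[X] - E[\frac{X}{Y}]E[Y]\), and therefore \(E[\frac{X}{Y}] = \frac{E[X] - Cov[\frac{X}{Y},Y]}{E[Y]}\). In our context: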

\(E[\frac{\hat{ITT_Y}}{\hat{ITT_D}}] = \frac{E[\hat{ITT_Y}] - Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{E[\hat{ITT_D}]} = \frac{ITT_Y - Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{ITT_D}\)

This can be written as the true complier average causal effect minus a covariance term (in red), which is the source of the bias:

\(\frac{ITT_Y}{ITT_D} - \color{red}{\frac{Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{ITT_D}}\)
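
The identity for \(E[\frac{X}{Y}]\) can be checked numerically. The sketch below uses simulated draws that I chose arbitrarily (they are not ITT estimates), and a helper `cov_n()` of my own that divides by \(n\) rather than \(n-1\), so that the identity holds exactly for the empirical distribution.

```r
set.seed(1)  # arbitrary seed; the identity below holds for any draws
x <- rnorm(1000, mean = 2)
y <- runif(1000, min = 0.2, max = 1)  # kept away from zero so x / y is well behaved
r <- x / y

# Covariance with denominator n (base R's cov() uses n - 1)
cov_n <- function(a, b) mean(a * b) - mean(a) * mean(b)

lhs   <- mean(r)                            # E[X / Y]
rhs   <- (mean(x) - cov_n(r, y)) / mean(y)  # (E[X] - Cov[X/Y, Y]) / E[Y]
naive <- mean(x) / mean(y)                  # E[X] / E[Y]

c(lhs = lhs, rhs = rhs, naive = naive)
```

Here `lhs` and `rhs` coincide up to floating-point error, while `naive` differs from both because the covariance term is not zero.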

Exclusion restriction violations and \(ITT_D\)

Claim: When the \(ITT_D\) is close to zero, even a slight violation of the exclusion restriction may severely bias the estimation of the CACE (Gerber and Green 2012:149)

Proof: Under one-sided non-compliance, write \(ITT_Y\) as a weighted average over compliers and never takers:

\(ITT_Y = E[Y_i(z=1,d=1) - Y_i(z=0,d=0) |C] \cdot ITT_D + E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]\cdot (1 - ITT_D)\)

Now, if we divide the entire equation by \(ITT_D\):

\(\frac{ITT_Y}{ITT_D} = CACE + \color{red}{\frac{1 - ITT_D}{ITT_D} E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]}\)

Note that if the exclusion restriction holds, \(E[Y_i(z=1,d=0) | NT] = E[Y_i(z=0,d=0) | NT]\) and the term in red is zero. But even a slight violation can produce a large bias when \(ITT_D\) is small, because \(\frac{1-ITT_D}{ITT_D} \rightarrow \infty\) as \(ITT_D \rightarrow 0\).
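
To get a feel for the magnitudes, the bias term can be tabulated for a fixed violation. This is a sketch with a hypothetical violation of 0.01 (assignment alone shifts never takers' outcomes by one point); the function name `cace_bias()` is my own.

```r
# Bias of ITT_Y / ITT_D relative to the CACE when never takers' outcomes
# shift by `violation` in response to assignment alone.
cace_bias <- function(itt_d, violation) {
  (1 - itt_d) / itt_d * violation
}

itt_d_values <- c(0.5, 0.2, 0.05, 0.01)
violation    <- 0.01  # hypothetical direct effect of assignment on never takers

data.frame(ITT_D = itt_d_values,
           bias  = cace_bias(itt_d_values, violation))
```

With half the sample complying, the bias equals the violation itself; with a compliance rate of 1%, the same violation is multiplied by 99.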

What happens when there are defiers?

Consider a situation in which there is two-sided non-compliance, and we have defiers. How does the CACE theorem break down in such cases?

  • \(ITT_Y \neq (ATE|C) \cdot \pi_C\)

Write the ITT as a weighted-average of types:

\(E[Y|Z=1] - E[Y|Z=0]\)

\(= E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]\pi_{NT}\)

\(+ E[Y_i(z=1,d=1) - Y_i(z=0,d=1) | AT]\pi_{AT}\)

\(+ E[Y_i(z=1,d=1) - Y_i(z=0,d=0) | C]\pi_{C}\)

\(+ E[Y_i(z=1,d=0) - Y_i(z=0,d=1) | D]\pi_{D}\)

If the exclusion restriction holds, the first two terms equal zero. But if \(\pi_D \neq 0\), then \(ITT_Y = (ATE | C)\pi_{C} - (ATE| D)\pi_{D}\).

Note: \((ATE|D) = E[Y_i(d=1) - Y_i(d=0) | D]\), and the defier term in \(ITT_Y\) is \(E[Y_i(d=0) - Y_i(d=1) | D]\), which equals \(-(ATE|D)\). Hence the minus sign in the final expression: \(ITT_Y = (ATE | C)\pi_{C} - (ATE| D)\pi_{D}\)

  • \(ITT_D \neq \pi_C\)

Furthermore, when there are defiers, \(ITT_D\) no longer measures the proportion of compliers (\(\pi_C\)):

\(ITT_D = E[D|Z=1] - E[D|Z=0] = (\pi_{C} + \pi_{AT}) - (\pi_{AT} + \pi_{D}) = \pi_{C} - \pi_{D}\)

  • Summary

When there are defiers, the ratio \(\frac{ITT_Y}{ITT_D} = \frac{(ATE | C)\pi_{C} - (ATE| D)\pi_{D}}{\pi_{C} - \pi_{D}}\).
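
A small numerical illustration shows how misleading this ratio can be. All shares and effects below are hypothetical values I picked for the example; they do not come from any study.

```r
# Hypothetical type shares and average effects (not from any real study)
pi_c <- 0.5; ate_c <- 2    # compliers
pi_d <- 0.1; ate_d <- 10   # defiers

itt_y <- ate_c * pi_c - ate_d * pi_d  # numerator with defiers present
itt_d <- pi_c - pi_d                  # denominator with defiers present
ratio <- itt_y / itt_d

c(ITT_Y = itt_y, ITT_D = itt_d, ratio = ratio)
```

In this example the treatment raises outcomes for both compliers and defiers, yet the ratio is zero because the two contributions to \(ITT_Y\) cancel.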

Monotonicity and Subject Types

As we have seen in two-sided noncompliance, the monotonicity assumption (\(d_i(1) \geq d_i(0)\)) rules out defiers. I will now demonstrate how such an assumption restricts the subject type space in other cases.

Setup: Consider a case in which there are two treatment assignments, \(Z \in \{0,1\}\), but treatment status is three-tiered, \(D \in \{0,1,2\}\). For example, we have an encouragement design in which subjects are incentivized to watch either a non-political television show (\(Z=0\)) or a political program (\(Z=1\)). For simplicity, let there be three forms of treatment: subjects watch a non-political show (\(D=0\)), a mayoral debate (\(D=1\)), or the news (\(D=2\)).

Question: How many types of subjects are there with two treatment assignments and three forms of actual treatment?

Answer: \(3^2 = 9\) types. More generally, there are \((\text{No. of Treatments})^{(\text{No. of Assignments})}\) types. Here they are:

Type D(Z=0) D(Z=1)
1 0 0
2 0 1
3 1 0
4 1 1
5 2 0
6 0 2
7 1 2
8 2 1
9 2 2

Now consider a monotonicity stipulation: \(d_i(z=1) \geq d_i(z=0)\). What types are ruled out as a consequence of this?

Type D(Z=0) D(Z=1) Ruled Out
1 0 0 -
2 0 1 -
3 1 0 X
4 1 1 -
5 2 0 X
6 0 2 -
7 1 2 -
8 2 1 X
9 2 2 -
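
The enumeration above can be reproduced in a few lines of R. The column names `d_z0` and `d_z1` are my own labels for \(D(Z=0)\) and \(D(Z=1)\), and the row order produced by `expand.grid()` differs from the type numbering in the table.

```r
# Every combination of D(Z=0) and D(Z=1) with D in {0, 1, 2}
types <- expand.grid(d_z0 = 0:2, d_z1 = 0:2)

# Monotonicity requires d_i(z = 1) >= d_i(z = 0); flag the violations
types$ruled_out <- types$d_z1 < types$d_z0

nrow(types)               # number of types before the restriction
sum(!types$ruled_out)     # number of types that survive
types[types$ruled_out, ]  # the excluded types
```

Changing `0:2` to a longer range, or adding a third argument for another assignment, extends the same count to other designs.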

Estimating CACE using ivreg

We will use ivreg from the AER package to compute the complier average causal effect. For this part, I use the Mullainathan, Washington, and Azari (2010) dataset. This is an experiment in which 1,000 subjects were randomly assigned to a treatment condition (encouragement to watch the mayoral debate), or a control condition in which they were encouraged to watch some non-political show. Note that we encounter two-sided non-compliance: some respondents would watch the debate irrespective of encouragement (“always takers”), and some would not watch these debates irrespective of their assignment status (“never takers”). It is reasonable to believe there are no “defiers”: respondents who would not watch the mayoral debate when encouraged to do so, and watch it when encouraged to watch non-political shows.

Let \(Z\) be the assignment status (\(Z=1\) if encouraged to watch the mayoral debate), \(D\) be the treatment (\(D=1\) if a respondent watches the debate), and \(Y\) be the change in their view of candidates.

# Download the dataset (http://hdl.handle.net/10079/kh189dd)
library(knitr)     # kable()
library(estimatr)  # lm_robust() and tidy()
dat5 <- read.csv("W5_MayoralDebates.csv")
kable(head(dat5))
Z D Y
0 0 0
0 0 0
1 0 0
1 1 1
1 1 0
0 0 1
# Approach 1: Estimate CACE by separately computing ITT and ITT_D

model_itt <- tidy(lm_robust(Y ~ Z, data = dat5))
kable(model_itt)
coefficient_name coefficients se p ci_lower ci_upper df outcome
(Intercept) 0.4181818 0.0221928 0.0000000 0.3746318 0.4617318 998 Y
Z 0.0570657 0.0314219 0.0696533 -0.0045949 0.1187263 998 Y
model_ittd <- tidy(lm_robust(D ~ Z, data = dat5))
kable(model_ittd)
coefficient_name coefficients se p ci_lower ci_upper df outcome
(Intercept) 0.1616162 0.0165615 0 0.1291168 0.1941156 998 D
Z 0.2047205 0.0271084 0 0.1515244 0.2579166 998 D
# CACE is a ratio of ITT/ITT_D
model_itt$coefficients[2]/model_ittd$coefficients[2]
## [1] 0.2787494
# Approach 2: Use ivreg to do 2SLS in one step
library(AER)

model_cace <- ivreg(Y ~ D | Z, data = dat5)

# Note that ivreg takes a formula of the type: Outcome ~ Endogenous
# Regressor + Covariates | Exogenous Instrument + Covariates

summary(model_cace)
## 
## Call:
## ivreg(formula = Y ~ D | Z, data = dat5)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6519 -0.3731 -0.3731  0.6269  0.6269 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.37313    0.04346   8.585   <2e-16 ***
## D            0.27875    0.15299   1.822   0.0688 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4952 on 998 degrees of freedom
## Multiple R-Squared: 0.00992, Adjusted R-squared: 0.008928 
## Wald test:  3.32 on 1 and 998 DF,  p-value: 0.06876

Note: The point estimate for the CACE is 0.2787, and the 95% confidence interval is \(\hat{CACE} \pm 1.96 \cdot 0.15299\), that is: \([-0.0211, 0.5786]\).
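
To see why the two approaches give the same point estimate, the sketch below runs two-stage least squares by hand in base R. It uses simulated data rather than the Mullainathan, Washington, and Azari dataset, and the type shares and the effect size of 0.25 are assumptions made purely for illustration.

```r
set.seed(42)  # arbitrary seed
n <- 1000
Z <- rbinom(n, 1, 0.5)  # random encouragement
type <- sample(c("AT", "NT", "C"), n, replace = TRUE,
               prob = c(0.15, 0.65, 0.20))  # hypothetical type shares
D <- ifelse(type == "AT", 1, ifelse(type == "C", Z, 0))
Y <- rbinom(n, 1, 0.4 + 0.25 * D)  # hypothetical effect of 0.25

sim <- data.frame(Z, D, Y)

# Wald ratio from two separate regressions
itt_y <- coef(lm(Y ~ Z, data = sim))[["Z"]]
itt_d <- coef(lm(D ~ Z, data = sim))[["Z"]]
wald  <- itt_y / itt_d

# Manual two-stage least squares: regress Y on the first-stage fitted values
sim$D_hat <- fitted(lm(D ~ Z, data = sim))
tsls <- coef(lm(Y ~ D_hat, data = sim))[["D_hat"]]

c(wald = wald, tsls = tsls)
```

The two numbers agree up to floating-point error. The standard errors from the manual second stage are not valid, however, which is one reason to rely on a dedicated routine such as ivreg for inference.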

Placebo Designs

  • These are three-arm experiments that work in two steps: we identify compliers, then randomly assign them to a treatment or placebo condition.

Two-Stage Placebo Design

  • In practice, we assign subjects to one of three conditions. In the control condition, both compliers and never takers reveal their untreated potential outcomes and we cannot identify the subject’s type. In the placebo and treatment conditions, we can identify types (compliers and never takers). The move in such designs is to “screen out” never takers and “eliminate the noise generated by [their] presence in both treatment and control groups”. Typically then, such designs produce two estimates of the CACE:

Method 1: \(\hat{CACE} = \frac{\hat{ITT_Y}}{\hat{ITT_D}} = \frac{E[Y_i|Z=2] - E[Y_i|Z=0]}{E[D_i|Z=2] - E[D_i|Z=0]}\)

Method 2: \(\hat{CACE} = E[Y_i(Z=2,D=1)|C] - E[Y_i(Z=1,D=0)|C]\)

Note: (1) The difference-in-means estimator used in Method 2 is unbiased, while the ratio estimator used in Method 1 is biased. Both estimators are consistent. (2) Placebo designs are “over-identified” because the same data produce two estimates of the complier average causal effect.
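
The two methods can be compared on a toy example. The data below are entirely hypothetical and deterministic (10 subjects per arm, 4 of them compliers, no sampling noise), so both methods recover the same value by construction; all variable names are my own.

```r
# Hypothetical data: 10 subjects per arm, 4 compliers and 6 never takers in each.
# Z = 0 control, Z = 1 placebo, Z = 2 treatment.
arm      <- rep(c(0, 1, 2), each = 10)
complier <- rep(rep(c(TRUE, FALSE), times = c(4, 6)), times = 3)

# Compliers are reached, and so observed as compliers, only in arms 1 and 2.
reached <- complier & arm > 0
D       <- as.numeric(complier & arm == 2)  # only the treatment arm delivers treatment

# Untreated outcomes: 1 for compliers, 0 for never takers; treatment adds 2.
Y <- ifelse(complier, 1, 0) + 2 * D

# Method 1: ratio of intent-to-treat effects, treatment vs. control
itt_y   <- mean(Y[arm == 2]) - mean(Y[arm == 0])
itt_d   <- mean(D[arm == 2]) - mean(D[arm == 0])
method1 <- itt_y / itt_d

# Method 2: difference in means among reached subjects, treatment vs. placebo
method2 <- mean(Y[arm == 2 & reached]) - mean(Y[arm == 1 & reached])

c(method1 = method1, method2 = method2)
```

With real data the two estimates would differ because of sampling variability, and Method 2 would typically be the less noisy of the two since never takers are screened out.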

Method 2: Identifying Compliers and the CACE
