Notes from Week V

Manipulating Compliance

Suppose that we run an experiment in which some units are assigned to control (\(Z=0\)), others to a condition in which minimal effort is made to treat units (\(Z=1\)), and the rest to a condition in which maximal effort is made to treat units (\(Z=2\)). In essence, we are manipulating compliance rates, and we want to compare treatment effects for different types of compliers. Under one-sided non-compliance, we can define the subject types and estimands as follows.

Defining Types

Type                Z=0   Z=1   Z=2
Never Takers        0     0     0
Minimal Compliers   0     1     1
Maximal Compliers   0     0     1

Estimands

  • To estimate treatment effect on all compliers, compare the second treatment group to the control group:

\(E[Y_i | Z = 2] - E[Y_i | Z = 0]\)

\(= E[Y_i(0) | Z=2, NT]\pi_{NT} + E[Y_i(1) | Z=2, Min]\pi_{Min} + E[Y_i(1) | Z=2, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=0, NT]\pi_{NT} + E[Y_i(0) | Z=0, Min]\pi_{Min} + E[Y_i(0) | Z=0, Max]\pi_{Max}\}\)

Because of random assignment, each experimental group is a “random subset” of the population with similar proportions of each type (\(\pi\)’s). Due to the excludability assumption, \(E[Y_i(0) | Z=2, NT] = E[Y_i(0) | Z=0, NT]\). This is because potential outcomes respond to \(d\), not \(z\), and \(d(z=2) = d(z=0) = 0\) for never takers.

This then simplifies to:

\(E[Y_i(Z=2,D=1) - Y_i(Z=0,D=0) | Min]\pi_{Min} + E[Y_i(Z=2,D=1) - Y_i(Z=0,D=0) | Max]\pi_{Max}\)

\(= E[Y_i(1) - Y_i(0) | Complier](\pi_{Min} + \pi_{Max})\), where \(\pi_{C} = \pi_{Min} + \pi_{Max}\) is the share of all compliers.

Thus: \(E[Y_i | Z = 2] - E[Y_i | Z = 0] = CACE \pi_{C}\)

And \(CACE = \frac{E[Y_i | Z = 2] - E[Y_i | Z = 0]}{\pi_C}\)

  • To estimate the treatment effect on minimal compliers, compare the first treatment group to the control group:

\(E[Y_i | Z = 1] - E[Y_i | Z = 0]\)

\(= E[Y_i(0) | Z=1, NT]\pi_{NT} + E[Y_i(1) | Z=1, Min]\pi_{Min} + E[Y_i(0) | Z=1, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=0, NT]\pi_{NT} + E[Y_i(0) | Z=0, Min]\pi_{Min} + E[Y_i(0) | Z=0, Max]\pi_{Max}\}\)

\(= E[Y_i(1) - Y_i(0)|Min]\pi_{Min}\)

And then: \(CACE_{Min} = \frac{E[Y_i | Z = 1] - E[Y_i | Z = 0]}{\pi_{Min}}\)

  • To estimate the treatment effect on maximal compliers, compare the second and first treatment groups:

\(E[Y_i | Z = 2] - E[Y_i | Z = 1]\)

\(= E[Y_i(0) | Z=2, NT]\pi_{NT} + E[Y_i(1) | Z=2, Min]\pi_{Min} + E[Y_i(1) | Z=2, Max]\pi_{Max}\)

\(- \{E[Y_i(0) | Z=1, NT]\pi_{NT} + E[Y_i(1) | Z=1, Min]\pi_{Min} + E[Y_i(0) | Z=1, Max]\pi_{Max}\}\)

\(= E[Y_i(1) - Y_i(0)|Max]\pi_{Max}\)

And then: \(CACE_{Max} = \frac{E[Y_i | Z = 2] - E[Y_i | Z = 1]}{\pi_{Max}}\)

Task

Using the table from Question 9, compute the CACE for minimal and maximal compliers.

Table: Gerber and Green 2012:169
Quantity                     Control   Min_Effort   Max_Effort
Percent reached by callers   0.00      29.97        47.31
Percent voting               55.89     55.91        56.53
N                            317182    7500         7500

\(CACE_{Min} = \frac{0.5591-0.5589}{0.2997} \approx 0.0007\)

\(CACE_{Max} = \frac{0.5653 - 0.5591}{0.4731 - 0.2997} \approx 0.0358\)
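The arithmetic above can be reproduced with a short script. This is a minimal sketch in Python; the function name `cace` and the dictionary names are illustrative assumptions, and the inputs are the proportions from the table. It also computes the CACE for all compliers (maximal-effort group versus control), using the formula derived earlier.

```python
def cace(y_high, y_low, d_high, d_low=0.0):
    """Difference in mean outcomes divided by the difference in
    contact rates between two assignment groups."""
    return (y_high - y_low) / (d_high - d_low)

# Proportions taken from the table (Gerber and Green 2012:169).
voting = {"control": 0.5589, "min": 0.5591, "max": 0.5653}
reached = {"control": 0.0, "min": 0.2997, "max": 0.4731}

# Minimal compliers: first treatment group vs. control.
cace_min = cace(voting["min"], voting["control"], reached["min"], reached["control"])
# Maximal compliers: second vs. first treatment group.
cace_max = cace(voting["max"], voting["min"], reached["max"], reached["min"])
# All compliers: second treatment group vs. control.
cace_all = cace(voting["max"], voting["control"], reached["max"], reached["control"])

print(round(cace_min, 4), round(cace_max, 4), round(cace_all, 4))
```

Rounded to four decimals, this gives 0.0007 for minimal compliers, 0.0358 for maximal compliers, and 0.0135 for all compliers.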

Control Mean Among Compliers

Consider now a variant of the above table. Can you estimate the mean outcome for compliers in the treatment and control groups, under one-sided non-compliance?

Table: Hypothetical Michigan Turnout Experiment
Quantity                                          Control   Treatment
Percent reached by callers                        0.00      47.31
Turnout among those not contacted by canvassers   55.89     40.50
Overall turnout                                   55.89     56.53

Solution:

Let's start by rewriting the above information in a more intuitive way:

Quantity                     Control      Treatment
Percent reached by callers   0            47.31
Turnout among Never Takers   40.5         40.5
Turnout among Compliers      Don’t Know   Don’t Know
Overall turnout              55.89        56.53

Step 1: To get the mean control outcome for compliers, write the control group mean as a weighted average of types:

\(E[Y_i | Z = 0] = E[Y_i | Z=0, NT]\pi_{NT} + E[Y_i | Z=0, C]\pi_{C}\)

\(0.5589 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

So \(x = \frac{0.5589 - (0.405 \cdot 0.5269)}{0.4731} = 0.7303\)

Step 2: To get the mean treated outcome for compliers, write the treatment group mean as a weighted average of types:

\(E[Y_i | Z = 1] = E[Y_i | Z=1, NT]\pi_{NT} + E[Y_i | Z=1, C]\pi_{C}\)

\(0.5653 = 0.405\cdot(1 - 0.4731) + x\cdot(0.4731)\)

Then \(x = \frac{0.5653 - (0.405 \cdot 0.5269)}{0.4731} = 0.7438\)

Step 3: The complier average causal effect can then be estimated in two ways:

\(CACE = E[Y_i | Z=1, Compliers] - E[Y_i | Z=0, Compliers] = 0.7438 - 0.7303 \approx 0.0135\)

Alternatively \(CACE = \frac{ITT_Y}{ITT_D} = \frac{0.5653 - 0.5589}{0.4731} \approx 0.0135\)
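Steps 1 to 3 can be checked numerically. The following is a minimal sketch; the helper name `complier_mean` is an assumption introduced for illustration, and the inputs are the proportions from the hypothetical Michigan table.

```python
def complier_mean(group_mean, never_taker_mean, pi_c):
    """Solve group_mean = never_taker_mean * (1 - pi_c) + x * pi_c for x."""
    return (group_mean - never_taker_mean * (1 - pi_c)) / pi_c

pi_c = 0.4731            # share reached in the treatment group (ITT_D)
nt_mean = 0.405          # turnout among never takers
control_mean = 0.5589    # overall turnout, control group
treatment_mean = 0.5653  # overall turnout, treatment group

y0_compliers = complier_mean(control_mean, nt_mean, pi_c)    # Step 1
y1_compliers = complier_mean(treatment_mean, nt_mean, pi_c)  # Step 2

# Step 3: the two routes to the CACE.
cace_diff = y1_compliers - y0_compliers
cace_ratio = (treatment_mean - control_mean) / pi_c

print(round(y0_compliers, 4), round(y1_compliers, 4))
print(round(cace_diff, 4), round(cace_ratio, 4))
```

The two routes agree because the never-taker term is identical in both groups and cancels when the complier means are differenced.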

Properties of \(\frac{ITT_Y}{ITT_D}\)

Gerber and Green (2012) make several important points about the complier average causal effect (pp. 147-9). Here, I discuss four of them in detail.

Changes in \(ITT_D\) and \(CACE\)

Since the \(CACE = \frac{ITT_Y}{ITT_D}\), a misconception may arise that if \(ITT_D\) increases, the CACE decreases because we are dividing \(ITT_Y\) by a bigger number. This is not the case because ‘increasing the share of compliers also change[s] the numerator [\(ITT_Y\)], depending on how these extra compliers respond to treatment’ (Gerber and Green 2012:147). This is demonstrated using a small dataset below.

Task: Estimate the complier average causal effect for the first four rows of the dataset, then for the full dataset. Does the compliance rate increase or decrease with the additional observation? How does the CACE change as a result of this?

Unit Y Z D
1 10 1 1
2 6 1 0
3 6 0 0
4 4 0 0
5 20 1 1

Solution:

Considering only the first four rows:

\(ITT_Y = \frac{16}{2} - \frac{10}{2} = 3\)

\(ITT_D = \frac{1}{2} - 0 = 0.5\)

Consequently, \(CACE = \frac{3}{0.5} = 6\)

Now consider the full dataset:

\(ITT_Y = \frac{36}{3} - \frac{10}{2} = 12 - 5 = 7\)

\(ITT_D = \frac{2}{3} - 0 = 0.667\)

And thus, \(CACE = \frac{7}{0.667} \approx 10.5\).

Bottom line: Both the compliance rate and the CACE can increase simultaneously. In this example, \(ITT_D\) moved from 0.5 to 0.667, and the \(CACE\) from 6 to 10.5. This is because the additional complier reported a high outcome value that considerably increased the \(ITT_Y\).
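The same calculation can be scripted. This is a minimal sketch using only the five rows of the dataset above; the function name `itt_estimates` is an illustrative assumption.

```python
# Each row is (unit, Y, Z, D), copied from the small dataset above.
rows = [
    (1, 10, 1, 1),
    (2, 6, 1, 0),
    (3, 6, 0, 0),
    (4, 4, 0, 0),
    (5, 20, 1, 1),
]

def mean(values):
    return sum(values) / len(values)

def itt_estimates(data):
    """Return (ITT_Y, ITT_D, CACE) as differences in means across Z."""
    treated = [r for r in data if r[2] == 1]
    control = [r for r in data if r[2] == 0]
    itt_y = mean([r[1] for r in treated]) - mean([r[1] for r in control])
    itt_d = mean([r[3] for r in treated]) - mean([r[3] for r in control])
    return itt_y, itt_d, itt_y / itt_d

first_four = itt_estimates(rows[:4])  # units 1-4 only
full = itt_estimates(rows)            # all five units

print(first_four)
print(full)
```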

\(\hat{CACE}\) is consistent but biased

We know that the complier average causal effect is a ratio of two quantities: \(\frac{ITT_Y}{ITT_D}\). Using a data sample, we estimate these two quantities: \(\hat{ITT_Y}\) and \(\hat{ITT_D}\). This ratio (\(\frac{\hat{ITT_Y}}{\hat{ITT_D}}\)) is a consistent but biased estimator of the true CACE.

Intuition: We know that \(\hat{ITT_Y}\) and \(\hat{ITT_D}\) are unbiased estimators of \(ITT_Y\) and \(ITT_D\). In other words: \(E[\hat{ITT_Y}] = ITT_Y\) and \(E[\hat{ITT_D}] = ITT_D\). However, ‘the ratio of two unbiased estimators is not an unbiased estimator for the ratio of the two estimands’ (Gerber and Green 2012:151):

\(E[\frac{\hat{ITT_Y}}{\hat{ITT_D}}] \neq \frac{E[\hat{ITT_Y}]}{E[\hat{ITT_D}]} = \frac{ITT_Y}{ITT_D}\).

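This is because, for any two random variables \(X\) and \(Y\) with \(E[Y] \neq 0\), the definition of covariance gives \(Cov[\frac{X}{Y},Y] = E[X] - E[\frac{X}{Y}]E[Y]\), which rearranges to \(E[\frac{X}{Y}] = \frac{E[X] - Cov[\frac{X}{Y},Y]}{E[Y]}\). In our context: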

\(E[\frac{\hat{ITT_Y}}{\hat{ITT_D}}] = \frac{E[\hat{ITT_Y}] - Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{E[\hat{ITT_D}]} = \frac{ITT_Y - Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{ITT_D}\)

This can be restated as the true complier average causal effect minus a covariance term (in red) that is the source of the bias:

\(\frac{ITT_Y}{ITT_D} - \color{red}{\frac{Cov[\frac{\hat{ITT_Y}}{\hat{ITT_D}},\hat{ITT_D}]}{ITT_D}}\)
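The bias can be seen exactly in a tiny example. The sketch below uses a hypothetical four-unit population (three compliers and one never taker; all potential outcomes are made-up illustrative values, not data from Gerber and Green) and enumerates every way of assigning two units to treatment under one-sided non-compliance.

```python
from itertools import combinations

# Hypothetical population: potential outcomes and complier indicator.
y0 = [2, 4, 6, 8]
y1 = [6, 4, 8, 8]
complier = [1, 1, 1, 0]  # unit 4 is a never taker
units = range(4)

def avg(values):
    values = list(values)
    return sum(values) / len(values)

# True estimands for this population.
true_cace = avg(y1[i] - y0[i] for i in units if complier[i])
true_itt_d = avg(complier)
true_itt_y = avg((y1[i] - y0[i]) * complier[i] for i in units)

itt_y_hats, itt_d_hats, ratios = [], [], []
for treated in combinations(units, 2):
    control = [i for i in units if i not in treated]
    # Treated compliers reveal Y(1); everyone else reveals Y(0).
    y_treated = avg(y1[i] if complier[i] else y0[i] for i in treated)
    y_control = avg(y0[i] for i in control)
    d_treated = avg(complier[i] for i in treated)  # control group has D = 0
    itt_y_hats.append(y_treated - y_control)
    itt_d_hats.append(d_treated)
    ratios.append((y_treated - y_control) / d_treated)

mean_itt_y_hat = avg(itt_y_hats)
mean_itt_d_hat = avg(itt_d_hats)
mean_ratio = avg(ratios)

print(true_itt_y, mean_itt_y_hat)
print(true_itt_d, mean_itt_d_hat)
print(true_cace, mean_ratio)
```

In this hypothetical population, the averages of \(\hat{ITT_Y}\) and \(\hat{ITT_D}\) across the six possible assignments equal their true values (1.5 and 0.75), but the average of the ratio is 3 while the true CACE is 2. With only four units the bias is large; consistency means it shrinks as the sample grows.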

Exclusion restriction violations and \(ITT_D\)

Claim: When the \(ITT_D\) is close to zero, even a slight violation of the exclusion restriction may severely bias the estimation of the CACE (Gerber and Green 2012:149).

Proof: Under one-sided non-compliance, write \(ITT_Y\) as a weighted average of the effects of assignment among compliers and never takers:

\(ITT_Y = E[Y_i(z=1,d=1) - Y_i(z=0,d=0) |C] \cdot ITT_D + E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]\cdot (1 - ITT_D)\)

Now, if we divide the entire equation by \(ITT_D\):

\(\frac{ITT_Y}{ITT_D} = CACE + \color{red}{\frac{1 - ITT_D}{ITT_D} E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]}\)

Note that if the exclusion restriction holds, \(E[Y_i(z=1,d=0) | NT] = E[Y_i(z=0,d=0) | NT]\) and the term in red vanishes. But even a slight violation can produce a large bias when \(ITT_D\) is small: as \(ITT_D \rightarrow 0\), \(\frac{1-ITT_D}{ITT_D} \rightarrow \infty\).
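To get a feel for how quickly the bias term grows, the sketch below evaluates it for a fixed, hypothetical violation of 0.01 (a one percentage point direct effect of assignment on never takers) at several compliance rates. The function name and the numbers are illustrative assumptions.

```python
def exclusion_bias(itt_d, nt_violation):
    """Bias term (1 - ITT_D) / ITT_D * E[Y(z=1,d=0) - Y(z=0,d=0) | NT]."""
    return (1 - itt_d) / itt_d * nt_violation

violation = 0.01  # hypothetical direct effect of assignment on never takers
for itt_d in (0.5, 0.1, 0.01):
    print(itt_d, round(exclusion_bias(itt_d, violation), 4))
```

With half the subjects complying, the bias equals the violation itself (0.01). At a compliance rate of 0.01 the same violation produces a bias of 0.99.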

What happens when there are defiers?

Consider a situation in which there is two-sided non-compliance, and we have defiers. How does the CACE theorem break down in such cases?

  • \(ITT_Y \neq (ATE|C) \cdot \pi_C\)

Write the ITT as a weighted-average of types:

\(E[Y|Z=1] - E[Y|Z=0]\)

\(= E[Y_i(z=1,d=0) - Y_i(z=0,d=0) | NT]\pi_{NT}\)

\(+ E[Y_i(z=1,d=1) - Y_i(z=0,d=1) | AT]\pi_{AT}\)

\(+ E[Y_i(z=1,d=1) - Y_i(z=0,d=0) | C]\pi_{C}\)

\(+ E[Y_i(z=1,d=0) - Y_i(z=0,d=1) | D]\pi_{D}\)

If the exclusion restriction holds, the first two terms equal zero. But if \(\pi_D \neq 0\), then \(ITT = (ATE | C)\pi_{C} - (ATE| D)\pi_{D}\).

Note: \((ATE|D) = E[Y_i(d=1) - Y_i(d=0) | D]\), and the ITT term has \(E[Y_i(d=0) - Y_i(d=1) | D]\), which equals \(-(ATE|D)\). Hence the minus sign in the final expression: \(ITT = (ATE | C)\pi_{C} - (ATE| D)\pi_{D}\)

  • \(ITT_D \neq \pi_C\)

Furthermore, when there are defiers, \(ITT_D\) no longer measures the proportion of compliers (\(\pi_C\)):

\(ITT_D = E[D|Z=1] - E[D|Z=0] = (\pi_{C} + \pi_{AT}) - (\pi_{AT} + \pi_{D}) = \pi_{C} - \pi_{D}\)

  • Summary

When there are defiers, the ratio \(\frac{ITT_Y}{ITT_D} = \frac{(ATE | C)\pi_{C} - (ATE| D)\pi_{D}}{\pi_{C} - \pi_{D}}\).
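A quick numerical illustration shows how misleading this ratio can be. The values below are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def ratio_with_defiers(ate_c, pi_c, ate_d, pi_d):
    """ITT_Y / ITT_D when both compliers and defiers are present,
    assuming the exclusion restriction holds."""
    itt_y = ate_c * pi_c - ate_d * pi_d
    itt_d = pi_c - pi_d
    return itt_y / itt_d

# Hypothetical effects: both groups benefit from treatment.
no_defiers = ratio_with_defiers(ate_c=2.0, pi_c=0.5, ate_d=4.0, pi_d=0.0)
with_defiers = ratio_with_defiers(ate_c=2.0, pi_c=0.5, ate_d=4.0, pi_d=0.25)

print(no_defiers, with_defiers)
```

Without defiers the ratio recovers the complier effect of 2. With a 25 percent share of defiers whose effect is 4, the ratio is 0, even though treatment has a positive effect on both compliers and defiers.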

Monotonicity and Subject Types

As we have seen in two-sided noncompliance, the monotonicity assumption (\(d_i(1) \geq d_i(0)\)) rules out defiers. I will now demonstrate how such an assumption restricts the subject type space in other cases.

Setup: Consider a case in which there are two treatment assignments, \(Z \in \{0,1\}\), but treatment status is three-tiered, \(D \in \{0,1,2\}\). For example, we have an encouragement design in which subjects are incentivized to watch either a non-political television show (\(Z=0\)) or a political program (\(Z=1\)). For simplicity, let there be three forms of treatment: subjects watch a non-political show (\(D=0\)), a mayoral debate (\(D=1\)), or the news (\(D=2\)).

Question: How many types of subjects are there with two treatment assignments and three forms of actual treatment?

Answer: \(3^2 = 9\) types. More generally, there are \((\text{number of treatment statuses})^{(\text{number of assignments})}\) types. Here they are:

Type D(Z=0) D(Z=1)
1 0 0
2 0 1
3 1 0
4 1 1
5 2 0
6 0 2
7 1 2
8 2 1
9 2 2
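The type space can also be enumerated mechanically. This is a minimal sketch; the function name `subject_types` is an assumption, and the tuples come out in a different order from the table above.

```python
from itertools import product

def subject_types(statuses, n_assignments):
    """All mappings from assignments to treatment statuses.
    Each tuple lists D(Z=0), D(Z=1), ... in order."""
    return list(product(statuses, repeat=n_assignments))

types = subject_types((0, 1, 2), 2)
for d_z0, d_z1 in types:
    print(d_z0, d_z1)

print(len(types))
```

Passing `(0, 1)` as the statuses gives the four familiar types from the binary case: never takers, compliers, defiers, and always takers.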

Now consider a monotonicity stipulation: \(d_i(z=1) \geq d_i(z=0)\). Which types are ruled out as a consequence of this?

Type D(Z=0) D(Z=1) Ruled Out
1 0 0 -
2 0 1 -
3 1 0 X
4 1 1 -
5 2 0 X
6 0 2 -
7 1 2 -
8 2 1 X