Notes from Week VIII

Trimming Bounds

With binary outcome variables, my code for trimming bounds breaks: filtering on a quantile cutoff discards every observation tied at the cutoff value, so far more than the share \(Q\) of observations is trimmed. To see this, let’s attempt Question 8 (Chapter 7) using the section code:

library(tidyverse)
library(estimatr)

# Create the dataset using the table on page 246

data <- data.frame(
    Hispanic = c(rep(0, 106), rep(1, 111)),
    Y = c(rep(1, 50), rep(0, 28), rep(NA, 28), rep(1, 68), rep(0, 26), rep(NA, 17)),
    Observed = c(rep(1, 50 + 28), rep(0, 28), rep(1, 68 + 26), rep(0, 17))
)

# Calculate Q

Q <- with(data, (mean(Observed[Hispanic == 1]) - mean(Observed[Hispanic == 0])) /
    mean(Observed[Hispanic == 1]))
Q
## [1] 0.1310719
# Subsetting to only observed data

observed_treatment <- data %>% filter(Hispanic == 1 & Observed == 1)

observed_control <- data %>% filter(Hispanic == 0 & Observed == 1)

mean(observed_control$Y)  # E[Y0 | AR]
## [1] 0.6410256
# Identify the cutoffs for the lower and upper bounds

quantile(observed_treatment$Y, Q)
## 13.10719% 
##         0
quantile(observed_treatment$Y, (1 - Q))
## 86.89281% 
##         1
# Earlier: Trimming-off the lowest and highest values

observed_treatment_high <- filter(observed_treatment, Y > quantile(Y, probs = Q))

observed_treatment_low <- filter(observed_treatment, Y < quantile(Y, probs = (1 - 
    Q)))

# Error: We overtrim. With a binary Y the cutoff quantile(Y, Q) is 0, so Y > 0
# drops all 26 zeros (every observation tied at the cutoff), not just 12

length(observed_treatment$Y)  #n = 94
## [1] 94
length(observed_treatment_high$Y)  # n should be 82 (removing .13*94 = 12 observations)
## [1] 68
# Thus, the bound estimate is wrong:
mean(observed_treatment_high$Y) - mean(observed_control$Y)
## [1] 0.3589744
# Instead: Arrange the treated observations in descending order

observed_treatment <- observed_treatment %>% arrange(desc(Y))

# Calculate the number of ITRs

Q * length(observed_treatment$Y)  # approx 12
## [1] 12.32075
observed_treatment_high <- observed_treatment[1:82, ]

observed_treatment_low <- observed_treatment[13:94, ]


# Upper bound
mean(observed_treatment_high$Y) - mean(observed_control$Y)
## [1] 0.1882427
# Lower bound
mean(observed_treatment_low$Y) - mean(observed_control$Y)
## [1] 0.04190119
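
The sort-and-slice fix above hard-codes the row indices. A minimal sketch of the same idea as a reusable function follows; `trim_bounds` and its argument names are hypothetical (not part of the section code), and rounding `Q * n` to the nearest integer is an assumption that matches the 12 observations trimmed above.

```r
# Hypothetical helper: trimming bounds by sorting rather than by quantile cutoffs.
# y_treat: observed outcomes in the treatment group
# y_control: observed outcomes in the control group
# q: share of treated reporters to trim
trim_bounds <- function(y_treat, y_control, q) {
  n <- length(y_treat)
  k <- round(q * n)  # number of observations to trim (rounding rule is an assumption)
  y_sorted <- sort(y_treat, decreasing = TRUE)
  upper <- mean(y_sorted[seq_len(n - k)]) - mean(y_control)  # drop the k lowest values
  lower <- mean(y_sorted[(k + 1):n]) - mean(y_control)       # drop the k highest values
  c(lower = lower, upper = upper, n_trimmed = k)
}

# Same observed data as in the example above
y_treat <- c(rep(1, 68), rep(0, 26))
y_control <- c(rep(1, 50), rep(0, 28))
q <- (94 / 111 - 78 / 106) / (94 / 111)

bounds <- trim_bounds(y_treat, y_control, q)
bounds
```

Because the function counts observations instead of comparing values to a cutoff, ties in a binary outcome no longer cause overtrimming.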

Non-Interference Assumption

Non-interference is an implicit but important assumption for unbiased causal inference. Typically, the assumption is that subject \(i\)’s potential outcomes depend solely on \(i\)’s own assignment (\(z_i\)) and treatment status (\(d_i\)), not on the treatment assignment or status of any other subject \(j \neq i\). Formally, we state this as:

\(Y_i(\textbf{z},\textbf{d}) = Y_i(\textbf{z'},\textbf{d'})\) where \(z_i = z'_i\) and \(d_i = d'_i\)

Note that we have already assumed non-interference when we say that units have only two potential outcomes \(Y_i(1)\) and \(Y_i(0)\), or that the outcome they reveal is determined by the switching equation \(Y_i = Y_i(1)\cdot d_i + Y_i(0)\cdot[1-d_i]\).
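
As a minimal illustration of the switching equation, the sketch below uses made-up potential outcomes for three hypothetical units. The values and the vector names `y1`, `y0`, and `d` are assumptions for illustration, not data from the readings.

```r
# Made-up potential outcomes for three hypothetical units
y1 <- c(5, 7, 3)  # outcome if treated
y0 <- c(2, 6, 3)  # outcome if untreated
d  <- c(1, 0, 1)  # realized treatment status

# Switching equation: each unit reveals exactly one of its two potential outcomes
y <- y1 * d + y0 * (1 - d)
y
```

Each unit’s revealed outcome depends only on its own `d`, which is exactly what non-interference asserts.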

So what happens if this assumption breaks down, that is, if \(i\)’s outcomes are sensitive to \(j\)’s treatment assignment or status? In this section, we will look at the three-step solution to spillovers:

  1. Defining an exposure model and potential outcomes.

  2. Writing an estimand in terms of the potential outcome model.

  3. Using a design that randomly samples from those potential outcomes.

Defining Potential Outcomes

When there is interference between units, the first step is to posit an exposure model and define potential outcomes. This entails specifying the types of interference, writing potential outcomes that reflect the underlying interactions between units, and “restabilizing” outcomes.

Example: In Camerer (1998)’s study, bets were placed on one of two horses running in the same race. For each pair \(i\) and \(j\), Camerer randomly selected one horse and placed two $500 bets on that horse before the start of the race. The outcome of interest is the change in total bets for each horse (i.e. the difference between total bets placed post-treatment and pre-treatment). Crucially, bets depend on the horse’s odds or the proportion of total bets placed on that horse. This means that horse \(i\)’s betting odds are sensitive to the bets placed on horse \(j\). This leads to an interference problem: if Camerer bet on horse \(i\) in some race, \(i\)’s treatment status affects \(j\)’s betting odds (and outcomes).

To resolve this issue, the first step is to define an exposure model:

Horse \(i\)’s potential outcomes are affected by: (a) \(i\)’s treatment status \(d_i\); and (b) \(j\)’s treatment status \(d_j\). Accordingly, horse \(i\)’s potential outcomes are not sensitive to the treatment status of any horse \(k\) in some other pair and race.

If we believe this exposure model, horse \(i\) has four possible outcomes:

\(Y_i(d_i = 1, d_j = 1)\)

\(Y_i(d_i = 1, d_j = 0)\)

\(Y_i(d_i = 0, d_j = 1)\)

\(Y_i(d_i = 0, d_j = 0)\)

Note that Camerer (1998)’s random assignment protocol (a matched-pair design) allows only one horse in each pair to be treated. Thus horse \(i\) reveals neither \(Y_i(d_i = 1, d_j = 1)\) nor \(Y_i(d_i = 0, d_j = 0)\): both have zero probability of being observed.
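
A small sketch of this exposure model: enumerate the four conceivable conditions for a pair and flag the ones the matched-pair design can actually produce. The object and column names (`exposures`, `revealed`) are hypothetical.

```r
# All conceivable exposure conditions for horse i in a pair (i, j)
exposures <- expand.grid(d_i = 0:1, d_j = 0:1)

# Matched-pair design: exactly one horse per pair receives the bets
exposures$revealed <- (exposures$d_i + exposures$d_j) == 1
exposures
```

Only the rows with `d_i = 1, d_j = 0` and `d_i = 0, d_j = 1` are flagged, so any estimand must be built from those two potential outcomes.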

Practice 1: Village Study

Consider Figure 8.1 in Gerber and Green (2012). There are six locations (labeled A-F), of which five are inhabited (labeled 1-5). Say a researcher randomly selects one village for treatment, and specifies the following exposure model:

Village \(x\)’s outcomes depend on two things: \(x\)’s treatment status (\(d_x\)), and the treatment status of its immediate neighbor.

Figure 1: Spatial Spillovers (Figure 8.1 in Gerber and Green 2012)


What is the set of possible potential outcomes?

Answer: A village can have three possible outcomes: \(Y_{00}\) when it is untreated, and its immediate neighbors are also untreated; \(Y_{01}\) when the village is treated but its neighbors are untreated; and \(Y_{10}\) when the village is untreated but its neighbors are treated.

Practice 2: Classroom Study

Consider an experiment in which I randomly assign two colleagues in the classroom to treatment, and stipulate the following exposure model:

Student \(i\)’s outcomes depend on two things: (a) \(i\)’s treatment status \(d_i\); and (b) the treatment status of an immediate neighbor (defined here as someone sitting on a chair to \(i\)’s left or right).

Figure 2: Spatial Spillovers (Classroom Example)


Given this model, list \(i\)’s possible potential outcomes.

Answer: Each student can have seven possible outcomes. Let the potential outcomes be stated in the following format: \(Y_{L,i,R}\) (where “right” and “left” are defined from Student 1’s perspective). Then we have:

  • \(Y_{000}\) when \(i\) is untreated, and so are the immediate neighbors.

  • \(Y_{100}\) when \(i\) is untreated and so is the neighbor to the right, but the neighbor to the left is treated.

  • \(Y_{010}\) when \(i\) is treated, but the immediate neighbors are untreated.

  • \(Y_{001}\) when \(i\) is untreated and so is the neighbor to the left, but the neighbor to the right is treated.

  • \(Y_{011}\) when \(i\) is treated and so is the neighbor to the right, but the neighbor to the left is untreated.

  • \(Y_{110}\) when \(i\) is treated and so is the neighbor to the left, but the neighbor to the right is untreated.

  • \(Y_{101}\) when \(i\) is untreated but the neighbors are treated.

Note: Some subjects may not express certain potential outcomes. For instance, Student 1 never expresses \(Y_{001}\); Student 2 never reveals \(Y_{110}\) (or \(Y_{100}\)); and Student 5 never reports \(Y_{001}\) (or \(Y_{101}\)).
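
The count of seven can be checked by enumeration. The sketch below lists every combination of (left neighbor, \(i\), right neighbor) and drops the one that would need three treated students, since only two are assigned to treatment. The object names are hypothetical, and the sketch ignores seat position, so it does not reproduce the student-specific exclusions in the note above.

```r
# Every conceivable combination of treatment statuses for (left, i, right)
combos <- expand.grid(L = 0:1, i = 0:1, R = 0:1)

# Only two students are treated, so at most two of the three can be treated
feasible <- combos[rowSums(combos) <= 2, ]

# Label each feasible combination in the Y_{L,i,R} format
labels <- paste0("Y_", feasible$L, feasible$i, feasible$R)
labels
```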

Practice 3: Sinclair, McConnell, and Green (2010)

Consider a canvassing experiment in which Sinclair, McConnell, and Green (2010) randomly assigned all, half, or none of the members of each nine-digit zip code to receive mail. For simplicity, assume there are only one-person households, and the following exposure model is stipulated:

Respondent \(i\)’s potential outcomes depend on: (a) \(i\)’s treatment status (whether she received the encouragement message or not); and (b) the proportion of \(i\)’s zipcode that is treated.

What are the possible potential outcomes in such an experiment?

Answer: There are four possible outcomes for each respondent: \(Y_{0,0}\) when \(i\) is untreated and no one in the zipcode is treated; \(Y_{0,0.5}\) when \(i\) is untreated but half the members in the zipcode are treated; \(Y_{1,0.5}\) when \(i\) is treated along with half the members in the zipcode; and \(Y_{1,1}\) when \(i\) is treated along with all other members in the zipcode. Note that \(Y_{1,0}\) and \(Y_{0,1}\) are not realized with positive probability because of the random assignment procedure. (In a completely untreated zipcode, no member can be treated, so we never observe \(Y_{1,0}\); and in a completely treated zipcode, no member is untreated, so we never observe \(Y_{0,1}\).)
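
The same enumeration logic applies here. In the sketch below (object and column names are hypothetical), each row pairs \(i\)’s own status with the treated share of the zipcode, and the two logically impossible rows are flagged as infeasible.

```r
# Own treatment status crossed with the share of the zipcode that is treated
conditions <- expand.grid(d_i = 0:1, share_treated = c(0, 0.5, 1))

# Impossible: treated in a fully untreated zipcode, or untreated in a fully treated one
conditions$feasible <- !(conditions$d_i == 1 & conditions$share_treated == 0) &
  !(conditions$d_i == 0 & conditions$share_treated == 1)
conditions
```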

Defining Estimands

Once we have defined potential outcomes and regained “stability”, the next step is to write an estimand. This is the causal quantity or comparison of interest.

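Camerer’s study: Any horse \(i\) can reveal two potential outcomes: \(Y_{1,0}\) (horse \(i\) is treated and its paired horse \(j\) is not) and \(Y_{0,1}\) (horse \(j\) is treated and \(i\) is not). A possible quantity of interest at the unit level is then \(Y_i(d_i = 1, d_j = 0) - Y_i(d_i = 0, d_j = 1)\), and the estimand becomes \(E[Y_{1,0}] - E[Y_{0,1}]\).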

Practice 1: Village Study

We know that in the village spillover example there are three possible outcomes: \(Y_{00}\), \(Y_{10}\), and \(Y_{01}\). What estimands might we be interested in? Think about “direct” and “indirect” treatment effects.

Answer: For the direct effect, we have \(E[Y_{01} - Y_{00}]\). And for the spillover effect we have \(E[Y_{10} - Y_{00}]\).

Practice 2: Sinclair, McConnell, and Green (2010)

In the canvassing experiment, there are four possible outcomes: \(Y_{0,0}\), \(Y_{0,0.5}\), \(Y_{1,0.5}\) and \(Y_{1,1}\). What estimands can we write for this study?

Answer: For the direct effect we have only one possible estimand: \(E[Y_{1,0.5} - Y_{0,0.5}]\). Note that \(E[Y_{1,1} - Y_{0,1}]\) also captures the direct effect but is not estimable with these data because \(Y_{0,1}\) is never observed.

For indirect effects, we have two estimands: \(E[Y_{1,1} - Y_{1,0.5}]\) and \(E[Y_{0,0.5} - Y_{0,0}]\).

Properties of Estimands

In each of these cases, the estimator is unbiased when three assumptions are satisfied: random assignment, excludability, and non-interference. Crucially, the last assumption now means something different:

  • In Camerer (1998)’s study it implies that horse \(i\)’s outcomes are not affected by the treatment status of a horse in some other pair and race. They can, however, depend on horse \(j\)’s treatment status (if \(i\) and \(j\) are a pair).

  • In the village spillover example, it implies that village \(i\)’s potential outcomes are not affected by the treatment status of non-contiguous villages. For instance, Village 2’s potential outcomes are not affected by Village 4’s treatment status (i.e. there is no spillover between non-contiguous villages).

  • In the Sinclair, McConnell, and Green (2010) study, non-interference implies that potential outcomes depend solely on \(i\)’s treatment status and the treatment status of others in the same zipcode. The treatment status of those outside \(i\)’s zipcode has no impact (i.e. there is no spillover across zipcodes).

Estimating Effects

Once we have stable potential outcomes and well-defined estimands, we can proceed to estimate treatment effects. In doing so, we need to keep two things in mind:

  1. Exclude observations that: (a) do not express any of the potential outcomes in an estimand, given their assignment status; and (b) have undefined potential outcomes for the estimand.

  2. Subjects may have different probabilities of expressing a potential outcome. This is particularly the case with spillover outcomes in spatial settings: central nodes (or units in dense clusters) are more likely to express a spillover outcome than outlying nodes (or peripheral units). To deal with this we need to estimate every unit’s probability of expressing each type of potential outcome, then apply inverse probability weighting. In small datasets (such as the village study), these probabilities can be computed by hand. In larger datasets (such as Table 8.4), we need to simulate a large number of random assignments to estimate these probabilities.
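
The simulation approach in point 2 can be sketched as follows. The adjacency matrix is a hypothetical layout (four units on a line plus one isolated unit), not the geography of Figure 8.1 or Table 8.4, and the names `adj`, `exposure_draws`, and `probs` are assumptions for illustration. One unit is treated per draw, and each unit’s exposure is recorded.

```r
# Hypothetical layout: units 1-2-3-4 lie on a line, unit 5 has no neighbors
adj <- matrix(0, 5, 5)
adj[cbind(1:3, 2:4)] <- 1
adj <- adj + t(adj)  # make the adjacency matrix symmetric

set.seed(1)
n_sims <- 10000

# Each column is one simulated assignment; each row is one unit's exposure
exposure_draws <- replicate(n_sims, {
  d <- rep(0, 5)
  d[sample(5, 1)] <- 1                          # treat one unit at random
  neighbor_treated <- as.vector(adj %*% d > 0)  # is any neighbor treated?
  ifelse(d == 1, "01", ifelse(neighbor_treated, "10", "00"))
})

# Share of draws in which each unit expresses each exposure condition
probs <- sapply(c("00", "01", "10"), function(e) rowMeans(exposure_draws == e))
round(probs, 2)

# Inverse probability weight for a unit observed in condition "00" would be 1 / probs[, "00"]
```

The isolated unit never expresses the spillover condition, which is why rule 1(b) removes such units from spillover comparisons.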

Practice 1: Village Study

Step 1: Estimate each village’s probability of expressing \(Y_{00}\), \(Y_{01}\) and \(Y_{10}\). (We will work through this on the board in section.)

Table 1: Adapted from Table 8.3, Gerber and Green (2012)


Step 2: Using the potential outcomes table together with rules 1 and 2 above, estimate the quantities of interest:

Table 2: Adapted from Table 8.2, Gerber and Green (2012)


Accordingly, the direct effect estimate is:

\(\widehat{E[Y_{01} - Y_{00}]} = \frac{\frac{9}{0.2}}{\frac{1}{0.2}} - \frac{\frac{0}{0.6} + \frac{9}{0.4} + \frac{9}{0.8}}{\frac{1}{0.6} + \frac{1}{0.4} + \frac{1}{0.8}} = 9 - \frac{33.75}{5.4167} \approx 2.769\)

(Following rule 1(a) we exclude Village 3 from this calculation because it reports the spillover outcome \(Y_{10}\).)

And, the indirect effect estimate is:

\(\widehat{E[Y_{10} - Y_{00}]} = \frac{\frac{6}{0.4}}{\frac{1}{0.4}} - \frac{\frac{0}{0.6} + \frac{9}{0.4}}{\frac{1}{0.6} + \frac{1}{0.4}} = 6 - \frac{22.5}{4.1667} = 0.6\)

(Following rule 1(b) we exclude Village 5 from this calculation because it has an undefined spillover outcome \(Y_{10}\).)
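
The two weighted averages can be checked with `weighted.mean()`, using the outcomes and exposure probabilities that appear in the formulas above and weights equal to the inverse of each probability. The names `direct` and `indirect` are just labels for this sketch.

```r
# Direct effect: Y01 for the treated village vs. weighted Y00 for three villages
direct <- weighted.mean(9, 1 / 0.2) -
  weighted.mean(c(0, 9, 9), 1 / c(0.6, 0.4, 0.8))

# Spillover effect: Y10 for Village 3 vs. weighted Y00, excluding Village 5
indirect <- weighted.mean(6, 1 / 0.4) -
  weighted.mean(c(0, 9), 1 / c(0.6, 0.4))

round(c(direct = direct, indirect = indirect), 3)
```

Carrying full precision through the denominators gives roughly 2.769 for the direct effect and 0.6 for the spillover effect.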

Hotspots Experiment and IPW

I will now use the hotspots experiment data (see Tables 8.4 and 8.5) to estimate the direct effect \(E[Y_{01} - Y_{00}]\) for two subsets: (i) 11 hotspots that lie outside the spillover range (500 meters); and (ii) 19 hotspots that lie within the spillover range. Note that there are \(n = 30\) hotspots, of which \(m = 10\) are treated (where treatment is additional police patrols), and \(Y\) is the number of crimes.

  • 11 hotspots outside the spillover range
data <- read.csv("W8_Hotspots.csv")

head(data)
##   hotspot hotwitin500 prob00 prob01 prob10 prob11 hotspotchk prox500
## 1       1           1  0.436  0.230   0.23  0.104          1       1
## 2       2           0  0.667  0.333   0.00  0.000          2       0
## 3       3           3  0.177  0.104   0.49  0.229          3       3
## 4       4           0  0.667  0.333   0.00  0.000          4       0
## 5       5           3  0.177  0.104   0.49  0.229          5       3
## 6       6           0  0.667  0.333   0.00  0.000          6       0
##   prox750 y00 y01 y10 y11 assignment exposure  y
## 1       2  30  25  35  23          0        0 30
## 2       0  10   5  15   3          0        0 10
## 3       5  60  55  65  53          0       10 65
## 4       0  10   5  15   3          1        1  5
## 5       6  70  65  75  63          0       10 75
## 6       0  10   5  15   3          0        0 10
# Step 1: Subset to hotspots unaffected by spillovers:

data.periphery <- data %>% filter(prox500 == 0)

dim(data.periphery)
## [1] 11 16
# Step 2: Compute the true ATE from the full schedule of potential outcomes

with(data.periphery, mean(y01 - y00))
## [1] -5
# Step 3: Estimate the ATE using IPW

data.periphery <- data.periphery %>% mutate(probs = ifelse(assignment == 0, 
    prob00, prob01), w = 1/probs)

model1 <- lm_robust(y ~ assignment, weights = w, data = data.periphery)

summary(model1)
## 
## Call:
## lm_robust(formula = y ~ assignment, data = data.periphery, weights = w)
## 
## Weighted, Standard error type =  HC2 
## 
## Coefficients:
##             Estimate Std. Error  Pr(>|t|) CI Lower CI Upper DF
## (Intercept)   11.667      1.667 6.325e-05    7.896    15.44  9
## assignment     3.333     10.138 7.498e-01  -19.600    26.27  9
## 
## Multiple R-squared:  0.01009 ,   Adjusted R-squared:  -0.0999 
## F-statistic: 0.09175 on 1 and 9 DF,  p-value: 0.7688
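
To see what the weighted regression is doing, the sketch below rebuilds only the three periphery rows visible in the `head(data)` output above (hotspots 2, 4, and 6) and compares the weighted least squares coefficient with a difference in inverse-probability-weighted means. It uses base `lm()` rather than `lm_robust()`, which affects standard errors but not the point estimate; the object names are hypothetical, and three rows are of course not the full 11-hotspot subset.

```r
# Hotspots 2, 4, and 6 as printed by head(data)
periphery_head <- data.frame(
  y = c(10, 5, 10),
  assignment = c(0, 1, 0),
  prob00 = 0.667,
  prob01 = 0.333
)

# Inverse probability weights for the exposure each hotspot actually received
periphery_head$w <- 1 / ifelse(periphery_head$assignment == 0,
                               periphery_head$prob00, periphery_head$prob01)

# Weighted least squares coefficient on assignment
wls_coef <- coef(lm(y ~ assignment, data = periphery_head, weights = w))[["assignment"]]

# Equivalent difference in weighted means
ipw_diff <- with(periphery_head,
  weighted.mean(y[assignment == 1], w[assignment == 1]) -
    weighted.mean(y[assignment == 0], w[assignment == 0]))

c(wls = wls_coef, ipw_diff = ipw_diff)
```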
  • 19 hotspots that experience spillover:
# Step 1: Subset to hotspots affected by spillovers:

data.core <- data %>% filter(prox500 != 0)

dim(data.core)
## [1] 19 16
# Step 2: Recode the exposure variable:

data.core <- data.core %>% mutate(exposure = recode(exposure, `0` = "00", `10` = "10", 
    `1` = "01", `11` = "11"))

# Step 3: Compute the true direct effect from the full schedule of potential outcomes:

with(data.core, mean(y01 - y00))
## [1] -5
# Step 4: Estimate the direct effect using IPW

data.core <- data.core %>% mutate(probs = ifelse(exposure == "00", prob00, ifelse(exposure == 
    "10", prob10, ifelse(exposure == "01", prob01, prob11))), w = 1/probs)

model2 <- lm_robust(y ~ exposure, weights = w, data = data.core, subset = exposure == 
    "01" | exposure == "00")

summary(model2)
## 
## Call:
## lm_robust(formula = y ~ exposure, data = data.core, weights = w, 
##     subset = exposure == "01" | exposure == "00")
## 
## Weighted, Standard error type =  HC2 
## 
## Coefficients:
##             Estimate Std. Error  Pr(>|t|) CI Lower CI Upper DF
## (Intercept)    62.61      5.150 2.585e-07    51.13   74.080 10
## exposure01    -16.03      8.716 9.568e-02   -35.45    3.387 10
## 
## Multiple R-squared:  0.2833 ,    Adjusted R-squared:  0.2116 
## F-statistic: 3.952 on 1 and 10 DF,  p-value: 0.07487