Recap from Week I

  • Problem Set 1 has been graded and returned on Canvas. Please let me know if you have trouble viewing my comments.

  • Question 3 describes an encouragement design with experimental features. The treatment was a simplified procedure to purchase a private water connection at a 0% interest rate. A random subset of the sample received this “incentive”, while the rest did not. This allows us to identify the causal effect of the incentive and, with additional assumptions, the causal effect of having a private connection. In Chapters 5 and 6, we will discuss non-compliance and the assumptions needed to identify complier average causal effects.

\(d_i\) and \(D_i\)

  • \(d_i\) is the observed treatment assignment of unit \(i\).

  • \(D_i\) is a random variable that indicates whether unit \(i\) would be treated in a hypothetical experiment.

In words: \(d_i\) is a particular realization of \(D_i\).

An example with \(n = 4\), \(m = 2\)

Schedule of Potential Outcomes and Treatment Assignment
Unit Y1 Y0 d_i
1 10 15 1
2 8 13 1
3 6 11 0
4 4 9 0

What is \(E[Y_i(1) | d_i = 0]\)?

\(\frac{Y_3(1) + Y_4(1)}{2} = \frac{6+4}{2} = 5\)
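
The same conditional mean can be checked in base R. This is only a minimal sketch: the vector names `Y1` and `d` are illustrative labels for the table columns above, not objects defined elsewhere in these notes.

```r
# Potential outcomes under treatment and the observed assignment, copied from the table
Y1 <- c(10, 8, 6, 4)
d  <- c(1, 1, 0, 0)

# Average Y_i(1) among the units whose observed assignment is control (units 3 and 4)
mean(Y1[d == 0])
```

Note that these treated potential outcomes are never observed for units 3 and 4; we can compute the quantity only because the schedule lists both potential outcomes for every unit.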

How many ways can two of four units be assigned to treatment?

\({4 \choose 2} = \frac{4!}{2!\times 2!} = 6\) ways
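
As a quick check in base R (a sketch; `treated_sets` is a name chosen here for illustration), `choose()` gives the count and `combn()` lists which two units are treated in each possible assignment.

```r
# Number of ways to choose 2 treated units out of 4
choose(4, 2)

# Each column lists the two units assigned to treatment in one possible assignment
treated_sets <- combn(4, 2)
ncol(treated_sets)
```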

Let's use randomizr to get the six different assignment vectors. The R package manual is available here.

# If you do not have this package:
# install.packages('randomizr')

library(randomizr)  # must be loaded before calling declare_ra()

declaration <- declare_ra(N = 4, m = 2)  #this gives randomizr the necessary information
declaration
## Random assignment procedure: Complete random assignment 
## Number of units: 4 
## Number of treatment arms: 2 
## The possible treatment categories are 0 and 1.
## The probabilities of assignment are constant across units.
# If you are doing simple random assignment
declare_ra(N = 4, simple = TRUE)
## Random assignment procedure: Simple random assignment 
## Number of units: 4 
## Number of treatment arms: 2 
## The possible treatment categories are 0 and 1.
## The probabilities of assignment are constant across units.
conduct_ra(declaration)  # This uses that information to generate an assignment vector. This vector will be different every time we run the command. See below
## [1] 0 1 0 1
Z <- conduct_ra(declaration)
Z  # Different from the first assignment vector
## [1] 1 1 0 0
obtain_condition_probabilities(declaration, Z)  # This gives us each unit's probability of being assigned to treatment. When there are no blocks or clusters, this will be the same for all units.
## [1] 0.5 0.5 0.5 0.5
D <- obtain_permutation_matrix(declaration)
# Generates all possible assignment vectors for N=4, m=2
dim(D)
## [1] 4 6
print(D)
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    0    0    1    1    1
## [2,]    0    1    1    0    0    1
## [3,]    1    0    1    0    1    0
## [4,]    1    1    0    1    0    0

For any unit \(i \in \{1,2,3,4\}\), \(D_i = 1\) with probability \(\frac{1}{2}\) and 0 with probability \(\frac{1}{2}\).

We can manually check this in the matrix D. Unit 1 is assigned to treatment in three of the six assignment vectors (columns 4, 5, and 6). Unit 4 is also assigned to treatment in three of six (columns 1, 2, and 4).
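
The manual check can also be done without randomizr. The sketch below rebuilds the set of assignment vectors with base R's `combn()`; `D_manual` is a name introduced here for illustration, and its columns appear in a different order from those of `D` above, which does not affect the row averages.

```r
# Build all assignment vectors for N = 4, m = 2: one column per possible assignment
treated_sets <- combn(4, 2)
D_manual <- apply(treated_sets, 2, function(idx) as.integer(seq_len(4) %in% idx))

dim(D_manual)       # 4 units by 6 assignment vectors

# Share of assignment vectors in which each unit is treated
rowMeans(D_manual)
```

Each row average is the unit's probability of assignment to treatment under complete random assignment, since every assignment vector is equally likely.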

Now let's see the full science table:

Full Schedule of Potential Outcomes and Treatment Assignments
Unit Y1 Y0 d_i d1 d2 d3 d4 d5
1 10 15 1 0 0 0 1 1
2 8 13 1 0 1 1 0 0
3 6 11 0 1 0 1 0 1
4 4 9 0 1 1 0 1 0

What is the random assignment assumption?

Formally, we have:

\(Y_i(1), Y_i(0),X \perp\!\!\!\perp D_i\)

Full Schedule of Potential Outcomes and Treatment Assignments
Unit Y1 Y0 d_i d1 d2 d3 d4 d5
1 10 15 1 0 0 0 1 1
2 8 13 1 0 1 1 0 0
3 6 11 0 1 0 1 0 1
4 4 9 0 1 1 0 1 0
r(Y1,d) NA NA 0.89 -0.89 -0.45 0 0 0.45
r(Y0,d) NA NA 0.89 -0.89 -0.45 0 0 0.45

What might be the average value of \(r(Y_i(1),d)\) and \(r(Y_i(0),d)\) across the six assignment vectors?

Answer: Zero, which is an implication of random assignment.

Can you provide an intuition for why the correlation between any given assignment vector and treated/untreated potential outcomes is the same?

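Answer: This schedule uses a constant treatment effect, so \(Y1 = Y0 - 5\). This implies \(Var[Y1] = Var[Y0 - 5] = Var[Y0]\), and that \(Cov(Y1,d_i) = Cov(Y0-5,d_i) = Cov(Y0,d_i)\). Because a correlation is the covariance divided by the product of the two standard deviations, the two correlations are identical for every assignment vector.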
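
Both claims, the zero average and the equality of the two correlation rows, can be checked numerically. This is a sketch under the assumption that the six assignment vectors are rebuilt with base R's `combn()`; the names `D_manual`, `r_Y1`, and `r_Y0` are illustrative.

```r
# Potential outcomes from the schedule
Y1 <- c(10, 8, 6, 4)
Y0 <- c(15, 13, 11, 9)

# All six assignment vectors for N = 4, m = 2 (one per column)
D_manual <- apply(combn(4, 2), 2, function(idx) as.integer(seq_len(4) %in% idx))

# Correlation of each potential outcome with each assignment vector
r_Y1 <- apply(D_manual, 2, function(z) cor(Y1, z))
r_Y0 <- apply(D_manual, 2, function(z) cor(Y0, z))

round(r_Y1, 2)
round(r_Y0, 2)

# Averages across the six vectors (zero up to floating-point error)
mean(r_Y1)
mean(r_Y0)
```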

Finally, what is \(E[Y_i(1) | D_i = 0]\)?

Gerber and Green say:

“The notation \(E[Y_i(1) | D_i = 0]\) may be regarded as shorthand for \(E[E[Y_i(1)|d_i = 0, \textbf{d}]]\), where \(\textbf{d}\) refers to a vector of treatment assignments and \(d_i\) refers to its \(i\)th element. Given \(\textbf{d}\), we may calculate the probability distribution function for all \(\{Y(1),d\}\) pairs and the expectation given this set of assignments. Then we may take the expectation of this expected value by summing over all possible \(\textbf{d}\) vectors.” (P28, Footnote 3)

\(E[E[Y_i(1)|d_i = 0, \textbf{d}]]\) = \((\frac{1}{6})(\frac{6+4}{2}) + (\frac{1}{6})(\frac{10+8}{2}) + (\frac{1}{6})(\frac{10+6}{2}) + (\frac{1}{6})(\frac{10+4}{2}) + (\frac{1}{6})(\frac{8+6}{2}) + (\frac{1}{6})(\frac{8+4}{2})\)

Which equals: \(\frac{1}{6} \times \frac{84}{2} = 7\)

Unsurprisingly, \(E[Y_i(1)] = \frac{10+8+6+4}{4} = 7\)

This is what random assignment implies: \(E[Y_i(1) | D_i = 0] = E[Y_i(1) | D_i = 1] = E[Y_i(1)]\)
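
The hand calculation above can be reproduced by averaging the conditional means over all six assignment vectors. The sketch below assumes the vectors are rebuilt with base R's `combn()`; `D_manual`, `control_means`, and `treated_means` are names chosen here for illustration.

```r
# Treated potential outcomes from the schedule
Y1 <- c(10, 8, 6, 4)

# All six assignment vectors for N = 4, m = 2 (one per column)
D_manual <- apply(combn(4, 2), 2, function(idx) as.integer(seq_len(4) %in% idx))

# Inner expectation: mean of Y_i(1) in the control and treated groups, per vector
control_means <- apply(D_manual, 2, function(z) mean(Y1[z == 0]))
treated_means <- apply(D_manual, 2, function(z) mean(Y1[z == 1]))

# Outer expectation: average over the six equally likely vectors
mean(control_means)  # E[Y_i(1) | D_i = 0]
mean(treated_means)  # E[Y_i(1) | D_i = 1]
mean(Y1)             # E[Y_i(1)]
```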

Caveat: A correlation tells us only whether there is a linear association between two variables. By itself, a zero correlation is not evidence of statistical independence, because two variables might be non-linearly associated.
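
A small made-up example illustrates the caveat. The numbers below are hypothetical and are not taken from the schedule above: `y` is completely determined by `x`, yet the two are uncorrelated.

```r
# Hypothetical data: y is an exact (non-linear) function of x
x <- c(-2, -1, 0, 1, 2)
y <- x^2

# The linear correlation is zero even though y depends entirely on x
cor(x, y)
```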

Sampling Distribution of ATE

(For each assignment vector, manually calculate the estimated ATE (or ATT) using the table above. Then compare with the code-generated results below.)

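library(dplyr)
# data1 is assumed to already hold the full schedule above, including the two correlation rows
data1 <- data1 %>% slice(1:4)  # keep the four unit rows; drop the correlation rows
data1 <- data1 %>% select(Unit, Y1, Y0, d_i)  # drop the columns that hold the permutation matrix D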
data1
## # A tibble: 4 x 4
##    Unit    Y1    Y0   d_i
##   <chr> <chr> <chr> <chr>
## 1     1    10    15     1
## 2     2     8    13     1
## 3     3     6    11     0
## 4     4     4     9     0
## 'Loop and Function' approach to get the sampling distribution of the ATE

# Step 1: Define a function that computes the difference-in-means, given a
# treatment assignment vector Z
dif_in_means <- function(Y1, Y0, Z) {
    # We need to ensure the dataset columns are numerical vectors, not
    # characters
    as.num <- function(x) {
        as.numeric(as.character(x))
    }
    Y1 <- as.num(Y1)
    Y0 <- as.num(Y0)
    Z <- as.num(Z)
    # Switching equation
    Y <- Y1 * Z + (1 - Z) * Y0
    # ATE estimate
    estimate <- mean(Y[Z == 1], na.rm = TRUE) - mean(Y[Z == 0], na.rm = TRUE)
    return(estimate)
}

# Step 2: Create an output vector, and run a loop with this function

ate <- rep(NA, 6)  # We will have one ATE estimate for each assignment vector, hence 6 slots
# another way to do this is:
reps <- dim(D)[2]  # gives the number of columns in the permutation matrix D
ate <- rep(0, reps)

for (i in seq_len(reps)) {
    Z <- D[, i]
    ate[i] <- dif_in_means(data1$Y1, data1$Y0, Z)
}

print(ate)
## [1] -9 -7 -5 -5 -3 -1
mean(ate)  # Should be -5
## [1] -5
hist(ate)  # Sampling distribution of ATE estimate
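
As a cross-check on the loop above, the same sampling distribution can be computed without a loop, and its mean compared with the true ATE from the schedule. This is a sketch rather than a replacement for the code above: it rebuilds the assignment vectors with base R's `combn()`, and the names `D_manual` and `estimates` are illustrative.

```r
# Potential outcomes from the schedule
Y1 <- c(10, 8, 6, 4)
Y0 <- c(15, 13, 11, 9)

# All six assignment vectors for N = 4, m = 2 (one per column)
D_manual <- apply(combn(4, 2), 2, function(idx) as.integer(seq_len(4) %in% idx))

# Difference-in-means estimate under each assignment vector:
# treated units reveal Y1, control units reveal Y0
estimates <- apply(D_manual, 2, function(z) mean(Y1[z == 1]) - mean(Y0[z == 0]))

sort(estimates)    # same six values as the loop, possibly in a different column order
mean(estimates)    # average estimate across all possible assignments
mean(Y1 - Y0)      # true ATE computed directly from the schedule
```

The average of the six estimates equals the true ATE, which is what it means for the difference-in-means estimator to be unbiased under random assignment.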