Within-subject experiments track a single person or entity over a period of time, and random assignment determines *when* a treatment is administered (Gerber and Green 2012:273). Such designs typically make two non-interference assumptions: no anticipation (\(D_{i,t+1}\) does not affect \(Y_{i,t}\)) and no persistence (\(D_{i,t-1}\) does not affect \(Y_{i,t}\)). In this section, I use the “tetris” dataset to evaluate these assumptions (this corresponds to Question 10 (b) and (c) in the previous week’s problem set):

```
library(tidyverse)
library(randomizr)
library(estimatr)
library(knitr)
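# Load the dataset (a Stata .dta file in the working directory)
data <- foreign::read.dta("W8_Tetris.dta")
# Create the lagged (previous period) and lead (next period) treatment
# indicators; the dataset has 26 periods
data$run_lag <- c(NA, data$run[1:25])
data$run_anticp <- c(data$run[2:26], NA)
# Estimate the effects. Each column pairs an estimate with a text label, so
# the columns are stored as character and `digits` does not round them.
fit_bivariate <- lm_robust(tetris ~ run, data = data)
fit_persistence <- lm_robust(tetris ~ run + run_lag, data = data)
fit_anticipation <- lm_robust(tetris ~ run_anticp, data = data)
effects <- data.frame(
  Bivariate = c(fit_bivariate$coefficients[2],
                "(Coefficient from tetris ~ run )"),
  Persistence = c(summary(fit_persistence)$fstatistic[1],
                  "(F Statistic from tetris ~ run + run_lag)"),
  Anticipation = c(fit_anticipation$coefficients[2],
                   "(Coefficient from tetris ~ run_anticp)")
)
kable(effects, row.names = FALSE, caption = "Estimates", digits = 2)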
```

Bivariate | Persistence | Anticipation |
---|---|---|
13613.1 | 4.5445923571162 | 645.621212121212 |
(Coefficient from tetris ~ run ) | (F Statistic from tetris ~ run + run_lag) | (Coefficient from tetris ~ run_anticp) |

```
# Conduct randomization inference:
## Note that every time we generate a new assignment vector ('run'), the
## lagged and future values also change. So we need to write a loop for
## randomization inference.
## Step 1: Declare design, define output vectors
set.seed(343)
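declaration <- declare_ra(N = 26, prob = 0.5, simple = TRUE)
# With 2^26 possible assignments, this returns a random sample of 10,000
# assignment vectors (one per column) rather than every possible assignment
perms <- obtain_permutation_matrix(declaration)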
dim(perms)
```

`## [1] 26 10000`

```
bivariate.out <- rep(NA, 10000)
persistence.out <- rep(NA, 10000)
anticipation.out <- rep(NA, 10000)
# Step 2: Run a loop, estimating the quantities for each assignment vector
# from 'perms'
for (i in 1:10000) {
  # Define variables: the simulated assignment plus its lag and lead
  data$Z <- perms[, i]
  data$Z_lag <- c(NA, data$Z[1:25])
  data$Z_anticp <- c(data$Z[2:26], NA)
  # Store output
  bivariate.out[i] <- lm_robust(tetris ~ Z, data = data)$coefficients[2]
  persistence.out[i] <- summary(lm_robust(tetris ~ Z + Z_lag, data = data))$fstatistic[1]
  anticipation.out[i] <- lm_robust(tetris ~ Z_anticp, data = data)$coefficients[2]
}
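# Two-sided RI p-value for the coefficient on run in tetris ~ run:
# share of simulated coefficients at least as large in magnitude as the estimate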
mean(abs(bivariate.out) >= 13613.1)
```

`## [1] 0.0109`

```
# RI p-value for the F statistic in tetris ~ run + run_lag
# (F statistics are non-negative, so abs() is not needed)
mean(persistence.out >= 4.545)
```

`## [1] 0.019`

```
# Two-sided RI p-value for the coefficient on run_anticp in tetris ~ run_anticp
mean(abs(anticipation.out) >= 645.621)
```

`## [1] 0.8994`
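The same procedure can be wrapped in small helper functions, so that the observed statistic is computed rather than typed in by hand. The sketch below is an illustration only: it uses base R and simulated data in place of the tetris dataset, and the names `lag1`, `lead1`, `anticipation_stat`, and `ri_p_value` are hypothetical.

```r
# Sketch only: simulated data stand in for the tetris dataset.
set.seed(343)
n <- 26
z_obs <- rbinom(n, 1, 0.5)                       # simulated assignment (simple RA)
y <- 10000 + 5000 * z_obs + rnorm(n, sd = 3000)  # simulated outcome, no anticipation

# Shift a vector by one period, padding the missing period with NA
lag1 <- function(x) c(NA, x[-length(x)])
lead1 <- function(x) c(x[-1], NA)

# Coefficient on next-period treatment (the anticipation check)
anticipation_stat <- function(y, z) {
  z_lead <- lead1(z)
  unname(coef(lm(y ~ z_lead))["z_lead"])
}

# Two-sided RI p-value: redraw the assignment, rebuild the lead, re-estimate
ri_p_value <- function(y, z, stat, sims = 1000) {
  observed <- stat(y, z)
  simulated <- replicate(sims, stat(y, rbinom(length(z), 1, 0.5)))
  mean(abs(simulated) >= abs(observed), na.rm = TRUE)
}

p_anticipation <- ri_p_value(y, z_obs, anticipation_stat)
p_anticipation
```

Because the lead is rebuilt inside the statistic, every simulated assignment gets its own lead vector, which is the reason the loop above is needed in the first place.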

This section focuses on the *variability in treatment effects*, or \(Var[\tau_i]\). I begin with a discussion of how to detect heterogeneous treatment effects. There are two main approaches: we either place bounds on \(Var[\tau_i]\), or test the null hypothesis of equal variances, \(Var[Y_i(1)] = Var[Y_i(0)]\), which the constant effects assumption implies, by comparing \(\widehat{Var[Y_i(1)]}\) and \(\widehat{Var[Y_i(0)]}\). After this, I show some regression-based strategies to model heterogeneity: first for interactions between treatment and covariates, and then for interactions between treatments.

**Intuition:** We can never point-estimate \(Var[\tau_i]\) because we never observe both potential outcomes for the same unit, so \(Cov[Y_i(1),Y_i(0)]\) cannot be estimated. However, we can place bounds on \(Var[\tau_i]\) by finding the smallest and largest covariance between treated and untreated potential outcomes that is consistent with the observed data.

To see this, note that:

\(Var[\tau_i] = Var[Y_i(1) - Y_i(0)] = Var[Y_i(1)] + Var[Y_i(0)] - 2\cdot Cov[Y_i(1),Y_i(0)]\)
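The decomposition is an algebraic identity, so it can be checked numerically. The sketch below is a hypothetical simulation in which both potential outcomes are generated for every unit, which is never possible with real data; the variable names and parameter values are assumptions chosen for illustration.

```r
# Sketch only: simulate BOTH potential outcomes, which real data never reveal.
set.seed(1)
n <- 1000
y0 <- rnorm(n, mean = 10, sd = 2)   # untreated potential outcomes
tau <- rnorm(n, mean = 3, sd = 1)   # heterogeneous unit-level effects
y1 <- y0 + tau                      # treated potential outcomes

lhs <- var(y1 - y0)                         # Var[tau_i] computed directly
rhs <- var(y1) + var(y0) - 2 * cov(y1, y0)  # the decomposition
c(direct = lhs, decomposition = rhs)
```

With real data each unit reveals only one of `y1` or `y0`, so `cov(y1, y0)` is the single term in this expression that cannot be computed.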

We can use the observations in the treatment group to get \(\widehat{Var[Y_i(1)]}\); and similarly use the control group units to get \(\widehat{Var[Y_i(0)]}\). This leaves the covariance term:

- \(\widehat{Cov[Y_i(1),Y_i(0)]}\) is largest if we pair the smallest \(Y_i(1)\) values with the smallest \(Y_i(0)\) values, and the largest \(Y_i(1)\) values with the largest \(Y_i(0)\) values. Assuming there are equal numbers of observations in the treatment and control groups, this means sorting both sets of outcomes in ascending order and estimating the covariance between them. Call this \(\widehat{Cov_{max}[Y_i(1),Y_i(0)]}\).