Within-subject experiments track a single person or entity over a period of time, and random assignment determines when a treatment is administered (Gerber and Green 2012:273). Such designs typically make two non-interference assumptions: no anticipation (\(D_{i,t+1}\) does not affect \(Y_{i,t}\)) and no persistence (\(D_{i,t-1}\) does not affect \(Y_{i,t}\)). In this section, I use the “tetris” dataset to evaluate these assumptions (this corresponds to Question 10 (b) and (c) in the previous week’s problem set):
library(tidyverse)
library(randomizr)
library(estimatr)
library(knitr)
# Download the dataset
data <- foreign::read.dta("W8_Tetris.dta")
# We need to create two variables:
data$run_lag <- c(NA, data$run[1:25])
data$run_anticp <- c(data$run[2:26], NA)
# Estimate the effects:
effects <- data.frame(Bivariate = c(lm_robust(tetris ~ run, data = data)$coefficients[2],
"(Coefficient from tetris ~ run )"), Persistence = c(summary(lm_robust(tetris ~
run + run_lag, data = data))$fstatistic[1], "(F Statistic from tetris ~ run + run_lag)"),
Anticipation = c(lm_robust(tetris ~ run_anticp, data = data)$coefficients[2],
"(Coefficient from tetris ~ run_anticp)"))
kable(effects, row.names = F, caption = "Estimates", digits = 2)
Bivariate | Persistence | Anticipation |
---|---|---|
13613.1 | 4.5445923571162 | 645.621212121212 |
(Coefficient from tetris ~ run ) | (F Statistic from tetris ~ run + run_lag) | (Coefficient from tetris ~ run_anticp) |
# Conduct randomization inference:
## Note that every time we generate a new assignment vector ('run'), the
## lagged and future values also change. So we need to write a loop for
## randomization inference.
## Step 1: Declare design, define output vectors
set.seed(343)
declaration <- declare_ra(N = 26, prob = 0.5, simple = T)
perms <- obtain_permutation_matrix(declaration)
dim(perms)
## [1] 26 10000
bivariate.out <- rep(NA, 10000)
persistence.out <- rep(NA, 10000)
anticipation.out <- rep(NA, 10000)
# Step 2: Run a loop, estimating the quantities for each assignment vector
# from 'perms'
for (i in 1:10000) {
# Define variables
data$Z <- perms[, i]
data$Z_lag <- c(NA, data$Z[1:25])
data$Z_anticp <- c(data$Z[2:26], NA)
# Store output
bivariate.out[i] <- lm_robust(tetris ~ Z, data = data)$coefficients[2]
persistence.out[i] <- summary(lm_robust(tetris ~ Z + Z_lag, data = data))$fstatistic[1]
anticipation.out[i] <- lm_robust(tetris ~ Z_anticp, data = data)$coefficients[2]
}
# For RI p value in the regression tetris ~ run
mean(abs(bivariate.out) >= 13613.1)
## [1] 0.0109
# For RI p value on the F statistic in tetris ~ run + run_lag
mean(abs(persistence.out) >= 4.545)
## [1] 0.019
# For RI p value in the regression tetris ~ run_anticp
mean(abs(anticipation.out) >= 645.621)
## [1] 0.8994
This section focuses on the variability in treatment effects or \(Var[\tau_i]\). I will begin with a discussion on how to detect heterogeneous treatment effects. Mainly, we either place bounds on \(Var[\tau_i]\), or do a hypothesis test (\(\widehat{Var[Y_i(1)]} = \widehat{Var[Y_i(0)]}\)) under the constant effects assumption. After this, I will show some regression-based strategies to model heterogeneity: first when there is interaction between treatment and covariates; and then between treatments.
Intuition: We can never point-estimate \(Var[\tau_i]\) because \(Cov[Y_i(1),Y_i(0)]\) is never known. However, we can place bounds on this quantity by estimating the minimum and maximum covariance between treated and untreated potential outcomes.
To see this, note that:
\(Var[\tau_i] = Var[Y_i(1) - Y_i(0)] = Var[Y_i(1)] + Var[Y_i(0)] - 2\cdot Cov[Y_i(1),Y_i(0)]\)
We can use the observations in the treatment group to get \(\widehat{Var[Y_i(1)]}\); and similarly use the control group units to get \(\widehat{Var[Y_i(0)]}\). This leaves the covariance term:
Then we have two bounds:
Upper bound: \(\widehat{Var[Y_i(1)]} + \widehat{Var[Y_i(0)]} - 2\cdot \widehat{Cov_{min}[Y_i(1),Y_i(0)]}\)
Lower bound: \(\widehat{Var[Y_i(1)]} + \widehat{Var[Y_i(0)]} - 2\cdot \widehat{Cov_{max}[Y_i(1),Y_i(0)]}\)
Here is the code to do this in R
library(fabricatr)
library(lmtest)
library(ri2)
# Setup: Create some data using 'fabricatr'
set.seed(46)
data <- fabricate(N = 1000, X1 = rnorm(N), X2 = rnorm(N), Y0 = rnorm(N, mean = 0,
sd = 4) + X1 + X2, Y1 = Y0 + 5, Z = complete_ra(N), Y = Y0 * (1 - Z) + Y1 *
Z)
kable(head(data), caption = "Example Dataset")
ID | X1 | X2 | Y0 | Y1 | Z | Y |
---|---|---|---|---|---|---|
0001 | -0.8989165 | 1.1591073 | -0.5525463 | 4.447454 | 0 | -0.5525463 |
0002 | 0.2121311 | 0.5222966 | -3.7698224 | 1.230178 | 1 | 1.2301776 |
0003 | -0.7285901 | 1.0230716 | 0.1487505 | 5.148751 | 0 | 0.1487505 |
0004 | 1.2355204 | -0.0198303 | 3.1718613 | 8.171861 | 1 | 8.1718613 |
0005 | 1.1688400 | 0.5450542 | 2.5888604 | 7.588860 | 1 | 7.5888604 |
0006 | -0.6224851 | -1.3212777 | -3.1347933 | 1.865207 | 0 | -3.1347933 |
# Step 0: True Variance in Potential Outcomes (Var[Y1] = Var[Y0] given this
# data generating process)
true.var <- data %>% summarise(VarY0 = var(Y0), VarY1 = var(Y1))
kable(true.var)
VarY0 | VarY1 |
---|---|
17.89246 | 17.89246 |
# Step 1: Estimate Var[Y0] and Var[Y1]
var.estimates <- data %>% group_by(Z) %>% summarise(Variance = var(Y))
kable(var.estimates)
Z | Variance |
---|---|
0 | 17.78453 |
1 | 17.94717 |
var.dif <- var.estimates$Variance[2] - var.estimates$Variance[1]
var.dif
## [1] 0.1626368
# Step 2: Estimate the minimum and maximum values of Cov[Y0,Y1]
control.group <- data %>% filter(Z == 0) %>% arrange(Y)
treat.group.hightolow <- data %>% filter(Z == 1) %>% arrange(desc(Y))
treat.group.lowtohigh <- data %>% filter(Z == 1) %>% arrange(Y)
cov.max <- cov(control.group$Y, treat.group.lowtohigh$Y)
cov.max
## [1] 17.79711
cov.min <- cov(control.group$Y, treat.group.hightolow$Y)
cov.min
## [1] -17.82733
# Step 3: Estimate the upper and lower bound for Var[tau]
var.lower <- var.estimates$Variance[2] + var.estimates$Variance[1] - 2 * cov.max
var.lower
## [1] 0.1374816
var.upper <- var.estimates$Variance[2] + var.estimates$Variance[1] - 2 * cov.min
var.upper
## [1] 71.38636
So we end up with an interval estimate: \(Var[\tau_i] \in [0.1374816, 71.38636]\). Note that \(Var[\tau_i] \geq 0\), so the minimum value the lower bound can take is 0.
A second way to test for heterogeneity is to evaluate the hypothesis \(H_a: \widehat{Var[\tau_i]} \neq 0\). As I will show below, rejecting the hypothesis \(Var[Y_i(1)] = Var[Y_i(0)]\) also means rejecting the hypothesis \(Var[\tau_i] = 0\). To see this, note that:
\(Var[Y_i(1)] = Var[Y_i(0) + \tau_i] = Var[Y_i(0)] + Var[\tau_i] + 2\cdot Cov[Y_i(0),\tau_i]\)
Under the constant effects assumption, \(Var[\tau_i] = 0\) and:
\(2 \cdot Cov[Y_i(0), \tau_i] = 0\) because \(\tau_i = \tau \forall i\) (i.e. \(\tau_i\) is a constant, and the covariance between a random variable and a constant is zero).
This means under the null hypothesis of constant treatment effects:
\(Var[Y_i(1)] = Var[Y_i(0)]\) and \(Var[Y_i(1)] - Var[Y_i(0)] = 0\)
The strategy then is straightforward:
Step 1: Estimate the difference in variances or \(\widehat{Var[Y_i(1)]} - \widehat{Var[Y_i(0)]}\).
Step 2: Assuming constant effects, construct a potential outcomes table. Under this null hypothesis: \(Var[\tau_i] = 0\) and \(Var[Y_i(1)] = Var[Y_i(0)]\)
Step 3: Conduct randomization inference to get the \(p\) value for the observed difference-in-variance. In other words, the probability of obtaining the observed value or more extreme, given the null hypothesis is correct.
If \(p \geq \alpha\) (some arbitrarily selected critical value), we cannot reject the null hypothesis \(Var[\tau_i] = 0\) and \(Var[Y_i(1)] = Var[Y_i(0)]\). This means we cannot rule out constant treatment effects (or lack of heterogeneity).
Here is R code to perform this operation:
# Step 1: Estimate the difference in variances: Var[Y1] - Var[Y0]
var.dif
## [1] 0.1626368
# Step 2: Create a schedule of potential outcomes under the constant
# treatment effects assumption
ate_hat <- lm_robust(Y ~ Z, data = data)$coefficients[2]
ate_hat
## Z
## 5.421689
ri.data <- data %>% select(Y, Z) %>% mutate(Y0_hyp = Y - ate_hat * Z, Y1_hyp = Y0_hyp +
ate_hat)
kable(head(ri.data))
Y | Z | Y0_hyp | Y1_hyp |
---|---|---|---|
-0.5525463 | 0 | -0.5525463 | 4.869143 |
1.2301776 | 1 | -4.1915118 | 1.230178 |
0.1487505 | 0 | 0.1487505 | 5.570440 |
8.1718613 | 1 | 2.7501719 | 8.171861 |
7.5888604 | 1 | 2.1671710 | 7.588860 |
-3.1347933 | 0 | -3.1347933 | 2.286896 |
# Step 3A: Specify the random assignment
declaration <- declare_ra(N = 1000, m = 500)
# Step 3B: Write a function that estimates difference in variance
dif_in_var <- function(data) {
output <- var(data$Y1_hyp[data$Z == 1]) - var(data$Y0_hyp[data$Z == 0])
return(output)
}
dif_in_var(ri.data)
## [1] 0.1626368
# Step 3C: Conduct randomization inference using 'ri2'
ri.out1 <- conduct_ri(test_function = dif_in_var, assignment = "Z", declaration = declaration,
data = ri.data, sims = 10000)
summary(ri.out1)
## coefficient estimate two_tailed_p_value null_ci_lower
## 1 Custom Test Statistic 0.1626368 0.9221 -3.16951
## null_ci_upper
## 1 3.214688
plot(ri.out1)
As the hypothesis test reveals, \(p = 0.9221\) which suggests a high level of congruence between the null hypothesis (\(Var[\tau_i] = 0\) / \(Var[Y_i(1)] = Var[Y_i(0)]\)) and the observed test statistic (\(\widehat{Var[Y_i(1)]} - \widehat{Var[Y_i(0)]}\)). We fail to reject the null of constant treatment effects. This is in line with our expectations, given the data generating process specified for this example.
In this part, I discuss regression-based strategies to model treatment effect heterogeneity. I begin with treatment-by-covariate interactions, then move to treatment-by-treatment interactions.
Consider the dataset from the previous section (\(n=1000\), \(m=500\)). Say we are interested in knowing if treatment effects vary by covariate profile (i.e. different levels of \(X_1\) and \(X_2\)). How might we model and estimate that heterogeneity?
Intution: Interaction terms in an OLS regression capture heterogeneity in treatment effects. To see this, consider two models:
Restricted Model: \(Y_i = \alpha_1 + \beta_{1} Z_i + \gamma_{11} X_{1,i} + \gamma_{12} X_{2,i} + u_i\)
Unrestricted Model: \(Y_i = \alpha_2 + \beta_{2} Z_i + \gamma_{21} X_{1,i} + \gamma_{22} X_{2,i} + \gamma_{23} (Z_i \cdot X_{1,i}) + \gamma_{24} (Z_i \cdot X_{2,i}) + \epsilon_i\)
We want to know if \(\gamma_{23} = \gamma_{24} = 0\), or in words: there is no heterogeneity in treatment effects at different levels of \(X_1\) and \(X_2\). One way to do this is to estimate the F-statistic (joint significance of the interaction terms), and then do randomization inference to see how likely we are to observe that F-statistic value (or more extreme) if there was no heterogeneity in treatment effects (i.e. under the null hypothesis of constant effects). Here are the steps to do that:
Step 1: Estimate the F-statistic
where \(k\) is the number of parameters in the unrestricted model (6), and \(q\) the number of parameters in the restricted model (4). Put another way, \(k-q\) captures the number of additional coefficients whose statistical significance is being evaluated (two interaction terms in this case).
Step 2: Generate a full schedule of potential outcomes using \(\hat{\beta}\) from the restricted (“base”) model. This schedule is generated under the constant effects assumption.
Step 3: Conduct randomization inference to get the distribution of F statistics under the null hypothesis of constant treatment effects. We can then estimate a \(p\) value that gives the probability of observing our F-statistic (or more extreme), under the null hypothesis of constant treatment effects.
Here is the R code to perform this operation:
# Step 0: Estimate models
model1 <- lm(Y ~ Z + X1 + X2, data = data)
summary(model1)
##
## Call:
## lm(formula = Y ~ Z + X1 + X2, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.6128 -2.5867 -0.0092 2.6175 12.3267
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1351 0.1818 -0.743 0.458
## Z 5.3212 0.2571 20.693 < 2e-16 ***
## X1 0.8460 0.1253 6.751 2.49e-11 ***
## X2 0.8226 0.1325 6.207 7.92e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.061 on 996 degrees of freedom
## Multiple R-squared: 0.3475, Adjusted R-squared: 0.3456
## F-statistic: 176.8 on 3 and 996 DF, p-value: < 2.2e-16
model2 <- lm(Y ~ Z + X1 + X2 + Z * X1 + Z * X2, data = data)
summary(model2)
##
## Call:
## lm(formula = Y ~ Z + X1 + X2 + Z * X1 + Z * X2, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.3455 -2.6071 0.0317 2.5405 12.4775
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1349 0.1819 -0.742 0.458459
## Z 5.3075 0.2574 20.623 < 2e-16 ***
## X1 0.7518 0.1740 4.321 1.71e-05 ***
## X2 0.6438 0.1872 3.439 0.000609 ***
## Z:X1 0.1908 0.2508 0.761 0.446848
## Z:X2 0.3553 0.2650 1.341 0.180348
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.061 on 994 degrees of freedom
## Multiple R-squared: 0.3491, Adjusted R-squared: 0.3458
## F-statistic: 106.6 on 5 and 994 DF, p-value: < 2.2e-16
# Step 1: Perform an F-test (joint significance of interaction terms Z*X1
# and Z*X2)
test.out <- lmtest::waldtest(model1, model2, test = "F")
test.out
## Wald test
##
## Model 1: Y ~ Z + X1 + X2
## Model 2: Y ~ Z + X1 + X2 + Z * X1 + Z * X2
## Res.Df Df F Pr(>F)
## 1 996
## 2 994 2 1.1993 0.3018
# Step 2: Randomization Inference to get p value
## Declare the randomization procedure
declaration <- declare_ra(N = 1000, m = 500)
## Create potential outcomes table under the constant treatment effects
## assumption
ate_hat <- model1$coefficients["Z"]
ri.data <- data %>% select(Y, Z, X1, X2) %>% mutate(Y0_hyp = Y - ate_hat * Z,
Y1_hyp = Y0_hyp + ate_hat)
## Write a function for 'ri2' that computes F statistic
ftest <- function(data) {
data$Y_sim <- data$Y0_hyp
data$Y_sim[data$Z == 1] <- data$Y1_hyp[data$Z == 1]
model1 <- lm(Y_sim ~ Z + X1 + X2, data = data)
model2 <- lm(Y_sim ~ Z + X1 + X2 + Z * X1 + Z * X2, data = data)
test.out <- lmtest::waldtest(model1, model2, test = "F")
test.out$F[2]
}
# Step 3: Randomization inference on ri2
ri.out2 <- conduct_ri(test_function = ftest, assignment = "Z", declaration = declaration,
data = ri.data, sims = 10000)
summary(ri.out2)
## coefficient estimate two_tailed_p_value null_ci_lower
## 1 Custom Test Statistic 1.199303 0.3184 0.02875406
## null_ci_upper
## 1 3.871089
plot(ri.out2)
Finally, consider a factorial design in which there are two treatments: \(Z_1 \in \{0,1\}\) and \(Z_2 \in \{0,1\}\). We are interested in how \(Z_1\)’s effect varies at different levels of \(Z_2\) and vice versa. To do this, consider the following models:
Restricted Model: \(Y_i = \alpha_1 + \beta_{11} Z_1 + \beta_{12} Z_2 + u_i\)
Unrestricted Model: \(Y_i = \alpha_2 + \beta_{21} Z_1 + \beta_{22} Z_2 + \beta_{23} (Z_1 \cdot Z_2) + \epsilon_i\)
We are interested in the interaction term \(\beta_{23}\) because it captures the relevant heterogeneity. Visually speaking:
\(\beta_{23} = (d - b) - (c - a)\)
Alternatively: \(\beta_{23} = (d - c) - (b - a)\)
To do randomization inference on \(\beta_{23}\), we follow these steps:
Step 1: Estimate the restricted and unrestricted model.
Step 2: Estimate the F statistic, call this f.obs.
Step 3: Use the restricted (“base”) model to construct a full schedule of potential outcomes under the constant and additive effects assumption. Note that each observation has four potential outcomes, \(Y_i(Z_1,Z_2)\): \(Y_{00}\), \(Y_{01}\), \(Y_{10}\), \(Y_{11}\). So we must use \(\widehat{\beta_{11}}\) and \(\widehat{\beta_{12}}\) to generate potential outcomes.
Step 4: Declare the factorial design, and obtain the permutation matrix for randomization inference. (Think of this as a four-arm trial in which units can be assigned with equal probability to “00”,“01”,“10”,“11”).
Step 5: For each iteration of the loop, call Y_sim the revealed potential outcome of unit \(i\), which depends on \(i\)’s assignment status. Use Y_sim to estimate the two models, and the F-statistic. Repeat this process a large number of times to obtain a distribution of F-statistics under the null hypothesis of constant effects. Call this vector f.stats
Step 6: Obtain a \(p\) value by calculating mean(abs(f.stats) >= f.obs). This is the probability of observing f.obs or more extreme under the null hypothesis of constant effects (i.e. no heterogeneity across units or treatment conditions). If \(p \leq \alpha\), we can reject the null hypothesis. Put another way, the interaction term is statistically significant.
Practice: Lets attempt this by looking at the Rind and Bordia (1996) dataset from Question 6 in this week’s problem set.
data <- foreign::read.dta("W9_RindBordia.dta")
# Step 0: Create treatment indicators
data <- data %>% mutate(Sex = as.numeric(female == "Female"), Face = as.numeric(happyface ==
"Happy Face"))
head(data)
## female happyface tip xhappy tipround Sex Face
## 1 Female None 44.78 0 45 1 0
## 2 Female None 27.55 0 28 1 0
## 3 Female None 18.12 0 18 1 0
## 4 Male None 0.00 0 0 0 0
## 5 Female None 30.08 0 30 1 0
## 6 Male None 16.28 0 16 0 0
# Step 1: Estimate the two models, and joint significance of interaction
# term
model1 <- lm(tip ~ Sex + Face, data = data)
model2 <- lm(tip ~ Sex + Face + Sex * Face, data = data)
# Step 2: Estimate the F-statistic (significance of the interaction term)
test.out <- lmtest::waldtest(model1, model2, test = "F")
test.out # F= 3.9944
## Wald test
##
## Model 1: tip ~ Sex + Face
## Model 2: tip ~ Sex + Face + Sex * Face
## Res.Df Df F Pr(>F)
## 1 86
## 2 85 1 3.9944 0.04885 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Step 3: Construct schedule of potential outcomes under the constant
# effects assumption
ate_hat_sex <- model1$coefficients["Sex"]
ate_hat_face <- model1$coefficients["Face"]
ri.data <- data %>% mutate(Y00 = tip - ate_hat_face * Face - ate_hat_sex * Sex,
Y10 = Y00 + ate_hat_face * Face, Y01 = Y00 + ate_hat_sex * Sex, Y11 = Y00 +
ate_hat_face * Face + ate_hat_sex * Sex)
# Step 4: Declare randomization procedure, obtain permutation matrix
declaration = declare_ra(N = nrow(data), prob_each = c(0.25, 0.25, 0.25, 0.25),
condition_names = c("00", "10", "01", "11"))
perm <- obtain_permutation_matrix(declaration)
dim(perm)
## [1] 89 10000
Task: In pairs, write a loop for randomization inference and calculate the two-sided p value for f.obs (observed F statistic). Here is what I get:
# Step 5: Randomization inference using a loop
f.out <- rep(NA, 10000)
for (i in 1:ncol(perm)) {
# Realized assignment status
ri.data <- within(ri.data, {
Z_sim <- perm[, i]
Face_sim <- as.numeric(Z_sim == "10" | Z_sim == "11")
Sex_sim <- as.numeric(Z_sim == "01" | Z_sim == "11")
})
# Realized outcome values
ri.data <- within(ri.data, {
Y_sim <- Y00
Y_sim[Face_sim == 0 & Sex_sim == 1] <- Y01[Face_sim == 0 & Sex_sim ==
1]
Y_sim[Face_sim == 1 & Sex_sim == 0] <- Y10[Face_sim == 1 & Sex_sim ==
0]
Y_sim[Face_sim == 1 & Sex_sim == 1] <- Y11[Face_sim == 1 & Sex_sim ==
1]
})
# Estimate models
m1 <- lm(Y_sim ~ Face_sim + Sex_sim, data = ri.data)
m2 <- lm(Y_sim ~ Face_sim + Sex_sim + Face_sim * Sex_sim, data = ri.data)
# Report F statistic
f.out[i] <- lmtest::waldtest(m1, m2, test = "F")$F[2]
}
# Step 6: P Value and Graphic
mean(abs(f.out) >= test.out$F[2])
## [1] 0.0474
fig1 <- ggplot2::ggplot(data.frame(f.out), aes(x = f.out)) + geom_histogram() +
geom_vline(xintercept = test.out$F[2]) + theme_bw() + xlab("F Statistic Distribution Under the Null")
fig1