# Pearl’s Front Door Criterion

Let some treatment $$T$$ affect an outcome $$Y$$. Mediation is the study of mechanisms or the ways in which $$T$$ affects $$Y$$. For instance, we may think $$T$$ moves some intervening variable $$M$$, which then affects $$Y$$. Alternatively $$T$$ could move $$N$$ which in turn affects $$Y$$. There can also be a direct causal pathway connecting $$T$$ to $$Y$$.

Pearl shows that we can retrieve the effect of $$T$$ on $$Y$$ even when an unobserved variable $$U$$ confounds $$T$$ and $$Y$$. Call this the “front door method”. It involves:

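Step 1: Estimate the effect of $$T$$ on $$M$$ and $$N$$. We can retrieve $$\beta_{T \rightarrow M}$$ and $$\beta_{T \rightarrow N}$$ without conditioning on anything: the only backdoor paths from $$T$$ to the mediators run through $$U$$ and $$Y$$ (for example $$T \leftarrow U \rightarrow Y \leftarrow M$$), and they are blocked because $$Y$$ is a collider on them.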

Step 2: Estimate the effect of $$M$$ on $$Y$$ (and, separately, of $$N$$ on $$Y$$). We can retrieve $$\beta_{M \rightarrow Y}$$ and $$\beta_{N \rightarrow Y}$$ because conditioning on $$T$$ blocks all backdoor paths. In the regression $$Y \sim M + T$$, controlling for $$T$$ blocks $$M \leftarrow T \rightarrow N \rightarrow Y$$ and $$M \leftarrow T \leftarrow U \rightarrow Y$$; in the regression $$Y \sim N + T$$, it blocks $$N \leftarrow T \rightarrow M \rightarrow Y$$ and $$N \leftarrow T \leftarrow U \rightarrow Y$$.

Step 3: Finally, get the total effect of $$T$$ on $$Y$$:

$$(\beta_{T \rightarrow M} \cdot \beta_{M \rightarrow Y}) + (\beta_{T \rightarrow N} \cdot \beta_{N \rightarrow Y})$$
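The three steps can be checked on simulated data. The sketch below assumes a linear data-generating process with two mediators and an unobserved confounder $$U$$; every coefficient, the sample size, and the helper `ols` are hypothetical choices made for illustration, not part of Pearl's method.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients of y on the regressors, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical linear DAG: U -> T, U -> Y, T -> M -> Y, T -> N -> Y.
U = rng.normal(size=n)                    # unobserved confounder
T = 0.8 * U + rng.normal(size=n)
M = 0.5 * T + rng.normal(size=n)          # true T -> M effect: 0.5
N = -0.3 * T + rng.normal(size=n)         # true T -> N effect: -0.3
Y = 2.0 * M + 1.0 * N + 1.5 * U + rng.normal(size=n)
true_effect = 0.5 * 2.0 + (-0.3) * 1.0    # sum of the two indirect paths

# Step 1: effect of T on each mediator (no controls needed).
b_TM = ols(M, T)[1]
b_TN = ols(N, T)[1]

# Step 2: effect of each mediator on Y, conditioning on T.
b_MY = ols(Y, M, T)[1]
b_NY = ols(Y, N, T)[1]

# Step 3: combine the path-specific products.
front_door = b_TM * b_MY + b_TN * b_NY

# Naive regression of Y on T, which is confounded by U.
naive = ols(Y, T)[1]

print(f"true effect:         {true_effect:.3f}")
print(f"front-door estimate: {front_door:.3f}")
print(f"naive estimate:      {naive:.3f}")
```

Under these assumptions the front-door estimate should land close to the true effect, while the naive coefficient absorbs the confounding through $$U$$.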

Pearl’s estimation strategy invokes two assumptions:

1. Exhaustiveness: We must know all the causal pathways that connect $$T$$ to $$Y$$. That is: the conditioning variables ($$M$$ and $$N$$) intercept all directed paths from the causal variable $$T$$ to the outcome variable $$Y$$.

Intuition: Say we do not observe $$N$$, so $$T \rightarrow M \rightarrow Y$$ is not exhaustive. This is a problem because $$\beta_{T \rightarrow N}$$ and $$\beta_{N \rightarrow Y}$$ cannot be estimated; and thus the full causal effect of $$T$$ on $$Y$$ cannot be retrieved.

2. Isolation: The mechanisms ($$T \rightarrow M \rightarrow Y$$ and $$T \rightarrow N \rightarrow Y$$) should be “isolated” from all unblocked backdoor paths so that we can recover the full causal effect. This implies two things:

1. There are no unblocked back-door paths connecting $$T$$ and the mediators ($$M$$ and $$N$$)

2. All backdoor paths from the mediator ($$M$$ or $$N$$) to the outcome variable ($$Y$$) can be blocked by conditioning on the causal variable ($$T$$)

# Baron-Kenny equations

In regression terms, this roughly translates to estimating three models:

Model 1: $$M_i = \alpha_1 + aT_i + e_{1,i}$$ (equivalent to Step 1 in Pearl’s procedure but with one mediator)

Model 2: $$Y_i = \alpha_2 + cT_i + e_{2,i}$$ (for the total effect of $$T$$ on $$Y$$)

Model 3: $$Y_i = \alpha_3 + dT_i + b M_i + e_{3,i}$$ (equivalent to Step 2 in Pearl’s procedure where we condition on $$T$$ to get the effect of $$M$$ on $$Y$$).

Substituting Model 1 into Model 3 gives us:

$$Y_i = \alpha_3 + (d+ab)T_i + (\alpha_1 + e_{1,i})b + e_{3,i}$$

Where $$d$$ is the direct effect of $$T$$ on $$Y$$ (through paths other than $$M$$), and $$a\cdot b$$ is the indirect effect via $$M$$. The total effect $$c = d + (a \cdot b)$$.
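When all three models are fit by OLS on the same sample, the decomposition $$c = d + (a \cdot b)$$ holds exactly, not just approximately. A minimal numerical check, assuming a made-up linear data-generating process and a hypothetical helper `ols`:

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients of y on the regressors, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical data: the coefficients below are illustrative only.
T = rng.normal(size=n)
M = 0.5 * T + rng.normal(size=n)
Y = 0.3 * T + 2.0 * M + rng.normal(size=n)

a = ols(M, T)[1]            # Model 1: M on T
c = ols(Y, T)[1]            # Model 2: Y on T (total effect)
_, d, b = ols(Y, T, M)      # Model 3: Y on T and M

print(f"c         = {c:.6f}")
print(f"d + a * b = {d + a * b:.6f}")
```

The two printed numbers agree up to floating-point error, because the Model 3 residuals are orthogonal to $$T$$ in the sample.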

## Critiques

1. $$E[a_i \cdot b_i] \neq E[a_i]\cdot E[b_i]$$

In both Pearl’s approach and the Baron-Kenny equations, we estimate the average effect of $$T$$ on $$M$$ and the average effect of $$M$$ on $$Y$$, then multiply these coefficients to get the indirect effect. However, these effects might vary across units: for every unit $$i$$, $$a_i$$ captures the effect of $$T$$ on $$M$$, and $$b_i$$ the effect of $$M$$ on $$Y$$. For any given unit $$i$$ the indirect effect is:

$$a_i \times b_i$$

And averaging over all units, we get:

$$E[a_i b_i] = E[a_i] \cdot E[b_i] + Cov[a_i,b_i] = (a \cdot b) + Cov[a_i,b_i]$$

Here $$a$$ and $$b$$ are the regression coefficients from Models 1 and 3. Their product does not give the average indirect effect unless $$Cov[a_i,b_i]= 0$$. Constant effects guarantee this, but it cannot be assumed when effects vary across units.
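A small numerical illustration of this gap, assuming unit-level effects $$a_i$$ and $$b_i$$ drawn from a bivariate normal distribution with positive covariance. The means and covariance matrix are hypothetical, and the unit-level effects are taken as known here, which is never the case in real data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical unit-level effects with E[a_i] = 0.5, E[b_i] = 2.0, Cov = 0.3.
mean = [0.5, 2.0]
cov = [[0.25, 0.30],
       [0.30, 1.00]]
effects = rng.multivariate_normal(mean, cov, size=n)
a_i, b_i = effects[:, 0], effects[:, 1]

avg_indirect = np.mean(a_i * b_i)            # E[a_i * b_i]
product_of_means = a_i.mean() * b_i.mean()   # E[a_i] * E[b_i]
# Population-style (divide by n) sample covariance.
cov_ab = np.mean((a_i - a_i.mean()) * (b_i - b_i.mean()))

print(f"E[a_i b_i]       = {avg_indirect:.3f}")
print(f"E[a_i] * E[b_i]  = {product_of_means:.3f}")
print(f"Cov[a_i, b_i]    = {cov_ab:.3f}")
```

With these assumed values the product of the average effects understates the average indirect effect by roughly the covariance.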

2. Model 3 can yield biased estimates for two related reasons. First, we condition on a post-treatment variable ($$M_i$$). Second, $$M_i$$ is not randomly assigned and may be related to unmeasured causes of $$Y_i$$. Formally: $$M_i \not\!\perp\!\!\!\perp e_{3,i}$$. Put differently, Pearl’s isolation condition is violated because a backdoor path remains open even after conditioning on $$T_i$$: $$M_i \leftarrow e_{3,i} \rightarrow Y_i$$.
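The second critique can be illustrated with a simulation in which $$T$$ is randomized but an unmeasured variable (called `V` here, a hypothetical name) affects both $$M$$ and $$Y$$. All coefficients are assumptions chosen for illustration:

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients of y on the regressors, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(3)
n = 200_000

T = rng.normal(size=n)                    # randomized treatment
V = rng.normal(size=n)                    # unmeasured cause of both M and Y
M = 0.5 * T + V + rng.normal(size=n)      # true a = 0.5
Y = 0.3 * T + 2.0 * M + 1.5 * V + rng.normal(size=n)  # true d = 0.3, b = 2.0

a_hat = ols(M, T)[1]                      # Model 1: unbiased, T is randomized
c_hat = ols(Y, T)[1]                      # Model 2: unbiased total effect
_, d_hat, b_hat = ols(Y, T, M)            # Model 3: M is confounded by V

print(f"a: true 0.50, estimated {a_hat:.3f}")
print(f"b: true 2.00, estimated {b_hat:.3f}")
print(f"d: true 0.30, estimated {d_hat:.3f}")
print(f"c: true 1.30, estimated {c_hat:.3f}")
```

In this setup randomizing $$T$$ protects the estimates of $$a$$ and $$c$$, but the estimates of $$b$$ and $$d$$ are both distorted because `V` sits in the error term of Model 3.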