Let some treatment \(T\) affect an outcome \(Y\). Mediation analysis is the study of mechanisms, that is, the ways in which \(T\) affects \(Y\). For instance, we may think \(T\) moves some intervening variable \(M\), which then affects \(Y\). Alternatively, \(T\) could move \(N\), which in turn affects \(Y\). There can also be a direct causal pathway connecting \(T\) to \(Y\).

Pearl says that we can retrieve the effect of \(T\) on \(Y\) despite confounding by an unobserved variable \(U\). Call this the “front door method”. It involves:

**Step 1:** Estimate the effect of \(T\) on \(M\) and \(N\). We can retrieve \(\beta_{T \rightarrow M}\) and \(\beta_{T \rightarrow N}\) because nothing confounds \(T\) and the mediators: every backdoor path (for example \(T \leftarrow U \rightarrow Y \leftarrow M\)) is blocked by the collider \(Y\).

**Step 2:** Estimate the effect of \(M\) on \(Y\) (and separately \(N\) on \(Y\)). Again, we can retrieve \(\beta_{M \rightarrow Y}\) and \(\beta_{N \rightarrow Y}\) by blocking all backdoor paths if we condition on \(T\). That is, by controlling for \(T\), we block \(M \leftarrow T \rightarrow N \rightarrow Y\) and \(M \leftarrow T \leftarrow U \rightarrow Y\) in the regression \(Y \sim M + T\); and \(N \leftarrow T \rightarrow M \rightarrow Y\) and \(N \leftarrow T \leftarrow U \rightarrow Y\) in the regression \(Y \sim N + T\).

**Step 3:** Finally, get the total effect of \(T\) on \(Y\):

\((\beta_{T \rightarrow M} \cdot \beta_{M \rightarrow Y}) + (\beta_{T \rightarrow N} \cdot \beta_{N \rightarrow Y})\)
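The three steps can be checked on simulated data. The following is a minimal sketch, assuming a linear data-generating process with no direct \(T \rightarrow Y\) path; the helper `ols`, the coefficient values, and the sample size are illustrative assumptions, not part of Pearl's procedure.

```python
import numpy as np

def ols(y, *cols):
    """OLS coefficients of y on the given columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 200_000

# Assumed data-generating process (all coefficients are made up for illustration).
U = rng.normal(size=n)                      # unobserved confounder of T and Y
T = 0.8 * U + rng.normal(size=n)
M = 0.5 * T + rng.normal(size=n)            # true T -> M effect: 0.5
N = -0.3 * T + rng.normal(size=n)           # true T -> N effect: -0.3
Y = 2.0 * M + 1.0 * N + 1.5 * U + rng.normal(size=n)

# Step 1: effect of T on each mediator.
b_TM = ols(M, T)[1]
b_TN = ols(N, T)[1]

# Step 2: effect of each mediator on Y, conditioning on T.
b_MY = ols(Y, M, T)[1]
b_NY = ols(Y, N, T)[1]

# Step 3: sum of the products along each path.
front_door = b_TM * b_MY + b_TN * b_NY

# For comparison: the naive regression of Y on T is confounded by U.
naive = ols(Y, T)[1]

true_effect = 0.5 * 2.0 + (-0.3) * 1.0      # 0.7 under the assumed process
print(f"true={true_effect:.3f} front_door={front_door:.3f} naive={naive:.3f}")
```

Under these assumptions the front-door estimate should land close to the true effect, while the naive coefficient absorbs the confounding through \(U\).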

Pearl’s estimation strategy invokes two assumptions:

**Exhaustiveness:** We must know all the causal pathways that connect \(T\) to \(Y\). That is: the conditioning variables (\(M\) and \(N\)) intercept all directed paths from the causal variable \(T\) to the outcome variable \(Y\).

*Intuition:* Say we do not observe \(N\), so \(T \rightarrow M \rightarrow Y\) is not exhaustive. This is a problem because \(\beta_{T \rightarrow N}\) and \(\beta_{N \rightarrow Y}\) cannot be estimated; and thus the full causal effect of \(T\) on \(Y\) cannot be retrieved.

**Isolation:** The mechanisms (\(T \rightarrow M \rightarrow Y\) and \(T \rightarrow N \rightarrow Y\)) should be “isolated” from all unblocked backdoor paths so that we can recover the full causal effect. This implies two things:

- There are no unblocked backdoor paths connecting \(T\) and the mediators (\(M\) and \(N\)).
- All backdoor paths from the mediator (\(M\) or \(N\)) to the outcome variable (\(Y\)) can be blocked by conditioning on the causal variable (\(T\)).

In regression terms, this roughly translates to estimating three models:

**Model 1:** \(M_i = \alpha_1 + aT_i + e_{1,i}\) (equivalent to Step 1 in Pearl’s procedure but with one mediator)

**Model 2:** \(Y_i = \alpha_2 + cT_i + e_{2,i}\) (for the total effect of \(T\) on \(Y\))

**Model 3:** \(Y_i = \alpha_3 + dT_i + b M_i + e_{3,i}\) (equivalent to Step 2 in Pearl’s procedure where we condition on \(T\) to get the effect of \(M\) on \(Y\)).

Substituting Model 1 into Model 3 gives us:

\(Y_i = \alpha_3 + (d+ab)T_i + (\alpha_1 + e_{1,i})b + e_{3,i}\)

Where \(d\) is the direct effect of \(T\) on \(Y\) (through paths other than \(M\)), and \(a\cdot b\) is the indirect effect via \(M\). The total effect \(c = d + (a \cdot b)\).
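The decomposition \(c = d + (a \cdot b)\) can be verified numerically. Below is a minimal sketch, assuming a randomized binary treatment and constant linear effects; the helper `ols` and every numeric value are hypothetical choices made for illustration. With OLS, the identity holds exactly in-sample, not just on average.

```python
import numpy as np

def ols(y, *cols):
    """OLS coefficients of y on the given columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 10_000

# Assumed process: a = 0.6, d = 0.4, b = 1.5 (illustrative values only).
T = rng.integers(0, 2, size=n).astype(float)    # randomized treatment
M = 1.0 + 0.6 * T + rng.normal(size=n)
Y = 2.0 + 0.4 * T + 1.5 * M + rng.normal(size=n)

a_hat = ols(M, T)[1]              # Model 1: M on T
c_hat = ols(Y, T)[1]              # Model 2: Y on T (total effect)
_, d_hat, b_hat = ols(Y, T, M)    # Model 3: Y on T and M

indirect = a_hat * b_hat
print(f"c={c_hat:.4f}  d + a*b={d_hat + indirect:.4f}")
```

The two printed numbers agree up to floating-point error because the total-effect coefficient from Model 2 is algebraically the sum of the direct and product terms from Models 1 and 3.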

- \(E[a_i \cdot b_i] \neq E[a_i]\cdot E[b_i]\)

In both Pearl’s approach and the Baron-Kenny equations, we estimate the *average* effect of \(T\) on \(M\) and of \(M\) on \(Y\), and multiply these coefficients to get the indirect effect. However, these effects might vary across units: for every unit \(i\), \(a_i\) captures the effect of \(T\) on \(M\), and \(b_i\) the effect of \(M\) on \(Y\). For any given unit \(i\) the indirect effect is:

\(a_i \times b_i\)

And averaging over all units, we get:

\(E[a_i b_i] = E[a_i] \cdot E[b_i] + Cov[a_i,b_i] = (a \cdot b) + Cov[a_i,b_i]\)

Where \(a\) and \(b\) are regression coefficients from Models 1 and 3, which when multiplied do *not* give the average indirect effect unless \(Cov[a_i,b_i]= 0\). That condition holds under constant treatment effects, but it is not guaranteed when effects vary across units.
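To see how large the gap can be, here is a minimal numerical sketch of the identity \(E[a_i b_i] = E[a_i] \cdot E[b_i] + Cov[a_i,b_i]\). It assumes unit-level effects drawn from a bivariate normal distribution with positive covariance; the means and covariance matrix are hypothetical, and the sketch only illustrates the identity itself, not any estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Assumed unit-level effects: E[a_i] = 0.5, E[b_i] = 1.0, Cov[a_i, b_i] = 0.6.
mean = [0.5, 1.0]
cov = [[1.0, 0.6],
       [0.6, 1.0]]
a_i, b_i = rng.multivariate_normal(mean, cov, size=n).T

avg_indirect = np.mean(a_i * b_i)               # average of unit-level indirect effects
product_of_means = a_i.mean() * b_i.mean()      # what multiplying average effects gives
cov_ab = np.cov(a_i, b_i, ddof=0)[0, 1]         # population-style covariance (divide by n)

print(f"E[a*b]={avg_indirect:.3f}  E[a]E[b]={product_of_means:.3f}  Cov={cov_ab:.3f}")
```

With these assumed values the product of the averages understates the average indirect effect by roughly the covariance term.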

- Model 3 provides biased estimates for several reasons. First, we condition on a post-treatment variable (\(M_i\)). Second, \(M_i\) is not randomly assigned and is potentially related to unmeasured causes of \(Y_i\). Formally: \(M_i \not\!\perp\!\!\!\perp e_{3,i}\). Another way of saying this is that Pearl’s *isolation condition* is violated because there is an active backdoor path even after conditioning on \(T_i\): \(M_i \leftarrow e_{3,i} \rightarrow Y_i\).
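The bias from a non-randomized mediator can be illustrated with a short simulation. This is a sketch under stated assumptions: \(T\) is randomized, an unmeasured variable (called `V` here, a hypothetical name) drives both \(M\) and \(Y\), and all coefficients are made up for illustration.

```python
import numpy as np

def ols(y, *cols):
    """OLS coefficients of y on the given columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 100_000

# Assumed process: true a = 0.6, d = 0.4, b = 1.5; V is never observed.
T = rng.integers(0, 2, size=n).astype(float)    # randomized treatment
V = rng.normal(size=n)                          # unmeasured cause of M and Y
M = 0.6 * T + 1.0 * V + rng.normal(size=n)
Y = 0.4 * T + 1.5 * M + 2.0 * V + rng.normal(size=n)

c_hat = ols(Y, T)[1]              # Model 2: still fine, because T is randomized
_, d_hat, b_hat = ols(Y, T, M)    # Model 3: V sits in the error term and correlates with M

print(f"total effect c: true=1.30 est={c_hat:.3f}")
print(f"direct effect d: true=0.40 est={d_hat:.3f}")
print(f"M -> Y effect b: true=1.50 est={b_hat:.3f}")
```

In this setup the total effect is recovered, but both Model 3 coefficients are pulled away from their true values: the mediator coefficient is inflated and the direct effect even changes sign.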