# Moderation Analysis

A moderator changes the *strength* or *direction* of a relationship. For example, the effect of practice hours on exam scores might be stronger for students with high prior knowledge than for those starting from scratch — prior knowledge moderates the practice-performance relationship. In regression terms, moderation is an interaction: you add a product term X × Z to the model and test whether its coefficient is significantly non-zero. A significant interaction means the slope of Y on X is different at different values of Z. Simple slopes — plotting the X–Y relationship at specific Z levels — make the interaction interpretable.
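Concretely, grouping the X terms in the model shows where the moderation lives:

```
Y = b0 + b1·X + b2·Z + b3·(X × Z) + ε
  = b0 + (b1 + b3·Z)·X + b2·Z + ε
```

The slope of Y on X is b1 + b3·Z: it changes linearly with Z, and b3 measures how quickly.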

### Simulating a moderated relationship

Study time (X) improves test scores (Y), but the effect is amplified by prior knowledge (Z) — low-knowledge students gain less from studying than high-knowledge ones.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 300

X = rng.normal(0, 1, n)   # study hours (centred)
Z = rng.normal(0, 1, n)   # prior knowledge (centred)

# Interaction: effect of X on Y grows with Z
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)

df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})
print(f"Correlation X–Y: {df['X'].corr(df['Y']):.3f}")
print(f"Correlation Z–Y: {df['Z'].corr(df['Y']):.3f}")
print(f"Correlation X–Z: {df['X'].corr(df['Z']):.3f}")
```

```
Correlation X–Y: 0.645
Correlation Z–Y: 0.478
Correlation X–Z: 0.033
```
- Both X and Z are already centred (mean ≈ 0), which matters: with an interaction in the model, centering makes each main-effect coefficient interpretable as the slope when the other predictor is at its mean.
- The interaction coefficient of 0.8 means that each unit increase in Z amplifies the effect of X on Y by 0.8 — the key signal that moderation analysis should detect.
- X and Z are uncorrelated by construction, isolating the pure interaction without multicollinearity.
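Before fitting anything, a crude sanity check (not part of the analysis proper) can already reveal the moderation: splitting the sample at the median of Z, the X–Y correlation should be noticeably stronger in the high-Z half.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 300
X = rng.normal(0, 1, n)
Z = rng.normal(0, 1, n)
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)
df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})

# Split at the median of Z and compare X–Y correlations in each half
low = df[df["Z"] <= df["Z"].median()]
high = df[df["Z"] > df["Z"].median()]
print(f"X–Y correlation, low Z:  {low['X'].corr(low['Y']):.3f}")
print(f"X–Y correlation, high Z: {high['X'].corr(high['Y']):.3f}")
```

This median split is only a diagnostic; the interaction model in the next section tests moderation properly without discarding information.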

### Fitting the interaction model

Adding an `X:Z` term to the OLS formula tests the interaction. The coefficient on `X:Z` is the key estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 300
X = rng.normal(0, 1, n)
Z = rng.normal(0, 1, n)
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)
df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})

model = smf.ols("Y ~ X + Z + X:Z", data=df).fit()
print(model.summary().tables[1])
```

```
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -0.1277      0.087     -1.461      0.145      -0.300       0.044
X              1.8876      0.094     20.089      0.000       1.703       2.073
Z              1.3073      0.086     15.172      0.000       1.138       1.477
X:Z            0.8646      0.093      9.264      0.000       0.681       1.048
==============================================================================
```
- `X:Z` in the formula adds only the interaction term (product of X and Z). Using `X*Z` would add `X + Z + X:Z` automatically — both are equivalent here since X and Z are already in the formula.
- The coefficient on `X:Z` (should be close to 0.8) tests whether the slope of X on Y changes significantly with Z. A significant p-value confirms moderation.
- Main effects in interaction models lose their simple interpretation: `X`'s coefficient is now the effect of X when Z = 0, not the average effect across all Z values.
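The equivalence of the two formula spellings is easy to verify: `Y ~ X*Z` expands to `X + Z + X:Z`, so both fits produce identical coefficients (a quick check on the same simulated data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 300
X = rng.normal(0, 1, n)
Z = rng.normal(0, 1, n)
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)
df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})

explicit = smf.ols("Y ~ X + Z + X:Z", data=df).fit()
shorthand = smf.ols("Y ~ X*Z", data=df).fit()

# X*Z expands to X + Z + X:Z, so the design matrices are identical
print(np.allclose(explicit.params.values, shorthand.params.values))  # → True
```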

### Simple slopes plot

Simple slopes show the X–Y relationship at specific Z values (typically −1 SD, mean, +1 SD), making the interaction tangible.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 300
X = rng.normal(0, 1, n)
Z = rng.normal(0, 1, n)
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)
df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})

model = smf.ols("Y ~ X + Z + X:Z", data=df).fit()

x_range = np.linspace(df["X"].min(), df["X"].max(), 100)
z_levels = {"Z = −1 SD": -1.0, "Z = mean": 0.0, "Z = +1 SD": 1.0}
colors = ["steelblue", "gray", "tomato"]

fig, ax = plt.subplots(figsize=(8, 5))
for (label, z_val), color in zip(z_levels.items(), colors):
    pred_df = pd.DataFrame({"X": x_range, "Z": np.full(len(x_range), z_val)})
    y_hat = model.predict(pred_df)
    ax.plot(x_range, y_hat, color=color, linewidth=2, label=label)

ax.set_xlabel("Study time (X, centred)")
ax.set_ylabel("Test score (Y)")
ax.set_title("Simple slopes: effect of study time at different prior knowledge levels")
ax.legend()
plt.tight_layout()
plt.show()
```
- Each line is a simple slope: the predicted Y as a function of X while Z is fixed at a specific value.
- Converging or fan-shaped lines indicate moderation: the larger the difference in slopes across the lines, the stronger the interaction.
- If all three lines were parallel (same slope), the interaction coefficient would be zero and there would be no moderation.
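The plotted slopes can also be tested directly. The simple slope at a given Z is `b_X + Z·b_XZ`, and its standard error follows from the model's coefficient covariance matrix. A sketch computing this by hand from the fitted model (using `scipy.stats` for the p-values):

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 300
X = rng.normal(0, 1, n)
Z = rng.normal(0, 1, n)
Y = 2 * X + 1.5 * Z + 0.8 * X * Z + rng.normal(0, 1.5, n)
df = pd.DataFrame({"X": X, "Z": Z, "Y": Y})
model = smf.ols("Y ~ X + Z + X:Z", data=df).fit()

b = model.params
V = model.cov_params()
for z in (-1.0, 0.0, 1.0):
    # Simple slope b_X + z * b_XZ and its SE via Var(a + z*b)
    slope = b["X"] + z * b["X:Z"]
    se = np.sqrt(V.loc["X", "X"] + 2 * z * V.loc["X", "X:Z"]
                 + z**2 * V.loc["X:Z", "X:Z"])
    t = slope / se
    p = 2 * stats.t.sf(abs(t), df=model.df_resid)
    print(f"Z = {z:+.0f} SD: slope = {slope:.3f}, SE = {se:.3f}, p = {p:.4f}")
```

With these simulated data all three slopes are positive and significant; moderation shows up as the slopes growing with Z, not as any one of them being zero.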

### Conclusion

Moderation analysis adds a product term to a regression and asks whether the X → Y slope depends on Z. The interaction coefficient answers that directly; simple slopes plots make the pattern interpretable. Centre your predictors before computing the interaction: centering gives the main-effect coefficients a meaningful interpretation (the slope at the mean of the other predictor) and reduces multicollinearity between the main effects and the product term.
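The multicollinearity point is easy to demonstrate. With raw, uncentred predictors the product term is strongly correlated with its components; centering largely removes that overlap. A minimal sketch with made-up uncentred scales (the means of 5 and 3 are arbitrary, not from the tutorial's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
X_raw = rng.normal(5, 1, n)   # hypothetical raw study hours
Z_raw = rng.normal(3, 1, n)   # hypothetical raw prior-knowledge score

# Raw predictors: the product term tracks X almost linearly
corr_raw = np.corrcoef(X_raw, X_raw * Z_raw)[0, 1]

# Centred predictors: the overlap mostly vanishes
Xc = X_raw - X_raw.mean()
Zc = Z_raw - Z_raw.mean()
corr_centred = np.corrcoef(Xc, Xc * Zc)[0, 1]

print(f"corr(X, X*Z), raw:     {corr_raw:.3f}")
print(f"corr(X, X*Z), centred: {corr_centred:.3f}")
```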

For testing whether a third variable *carries* a relationship (mediation) rather than *moderating* it, see [mediation analysis with statsmodels](/tutorials/mediation-analysis-statsmodels). For the underlying OLS framework, see [multiple linear regression with scikit-learn](/tutorials/multiple-linear-regression-sklearn).