
# One-Way ANOVA with SciPy

When you need to compare more than two groups, running a separate [t-test](/tutorials/independent-samples-t-test-with-scipy) for every pair inflates the chance of a false positive: with five groups there are ten pairwise comparisons, and at alpha = 0.05 the chance of at least one spurious significant result is roughly 40%. One-way ANOVA avoids this by testing all groups simultaneously with a single F-statistic. The F-statistic is the ratio of variance between groups to variance within groups: a large value means the group means are spread further apart than the noise within each group would explain, which is evidence that they are not all equal. A significant ANOVA result tells you that *at least one* group differs, not which ones. Post-hoc tests like Tukey's HSD (available in `statsmodels`) are needed for pairwise follow-up.
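To make the multiple-comparisons arithmetic concrete, the family-wise error rate can be computed directly. This treats the ten tests as independent, which is a simplification (pairwise t-tests on shared groups are correlated), so take it as a back-of-the-envelope sketch:

```python
# Family-wise error: chance of at least one false positive across
# k comparisons at level alpha, assuming independent tests (a
# simplification; pairwise t-tests sharing groups are correlated).
alpha = 0.05
k = 10  # pairwise comparisons among five groups: 5 * 4 / 2

familywise_error = 1 - (1 - alpha) ** k
expected_false_positives = alpha * k

print(f"P(at least one false positive): {familywise_error:.3f}")   # ~0.401
print(f"Expected false positives: {expected_false_positives:.2f}") # 0.50
```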

### Basic One-Way ANOVA

ANOVA compares the spread of group means against the spread of measurements within each group. Here we simulate three groups with different means to produce a clear significant result.

```python
import numpy as np
from scipy import stats

np.random.seed(10)

group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

f_statistic, p_value = stats.f_oneway(group_a, group_b, group_c)

print(f"Mean of group A: {group_a.mean():.2f}")
print(f"Mean of group B: {group_b.mean():.2f}")
print(f"Mean of group C: {group_c.mean():.2f}")
print(f"F-statistic: {f_statistic:.3f}")
print(f"p-value: {p_value:.6f}")
```

```
Mean of group A: 68.79
Mean of group B: 73.47
Mean of group C: 78.71
F-statistic: 47.591
p-value: 0.000000
```
- `stats.f_oneway(group_a, group_b, group_c)` accepts any number of groups as separate arguments and tests the null hypothesis that all group means are equal.
- A large F-statistic means between-group variation dominates within-group variation — the group means are further apart than the noise within each group.
- The p-value is the probability of an F-statistic at least this large arising by chance if all the group means were actually equal.
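To see where the F-statistic comes from, the same ratio can be computed by hand: the between-group mean square divided by the within-group mean square. This is a verification sketch; the value should agree with what `stats.f_oneway` returns.

```python
import numpy as np
from scipy import stats

np.random.seed(10)
group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)
groups = [group_a, group_b, group_c]

# Between-group sum of squares: spread of group means around the grand mean,
# weighted by group size.
grand_mean = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Within-group sum of squares: spread of observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

k = len(groups)                  # number of groups
n = sum(len(g) for g in groups)  # total observations
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))

f_scipy, _ = stats.f_oneway(*groups)
print(f"Manual F: {f_manual:.3f}")
print(f"SciPy F:  {f_scipy:.3f}")  # the two values agree
```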

### Interpreting the Result

The F-test is an omnibus test — a significant result is the starting point for further investigation, not a final conclusion about which specific groups differ.

```python
import numpy as np
from scipy import stats

np.random.seed(10)

group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

f_statistic, p_value = stats.f_oneway(group_a, group_b, group_c)

if p_value < 0.05:
    print("Reject the null hypothesis: at least one group mean is different.")
else:
    print("Fail to reject the null hypothesis: the data do not show a significant difference.")
```

```
Reject the null hypothesis: at least one group mean is different.
```
- A significant p-value here only tells you the groups are not all equal — it doesn't identify which pairs differ.
- If the p-value is significant and you need to know *which* groups differ, run a post-hoc test such as Tukey's HSD from `statsmodels`.
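As a sketch of that follow-up step: besides `pairwise_tukeyhsd` in `statsmodels`, recent SciPy versions (1.8+) ship `stats.tukey_hsd`, which keeps the example within SciPy. Using the same simulated groups:

```python
import numpy as np
from scipy import stats

np.random.seed(10)
group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

# Tukey's HSD compares every pair of groups while controlling
# the family-wise error rate across all comparisons.
res = stats.tukey_hsd(group_a, group_b, group_c)
print(res)  # formatted table of pairwise mean differences and p-values
```

With these clearly separated groups, every pairwise comparison comes out significant; `res.pvalue` holds the p-values as a matrix if you need them programmatically.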

### Visualizing the Groups

Box plots are the natural companion to ANOVA: they show the center and spread of each group at a glance, making it easy to see whether the group means are well-separated relative to the within-group spread.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(10)

group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

plt.figure(figsize=(9, 5))
plt.boxplot([group_a, group_b, group_c], tick_labels=["A", "B", "C"])
plt.ylabel("Score")
plt.title("Three Independent Groups")
plt.grid(axis="y", linestyle="--", alpha=0.4)
plt.show()
```
- Well-separated boxes with little overlap correspond to a large F-statistic; heavily overlapping boxes correspond to a small one.
- The equal scale (variance) across groups visible here is the key ANOVA assumption — if one box were much taller than the others, the equal-variance assumption would be questionable.
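If you want a formal check rather than eyeballing box heights, Levene's test (`stats.levene`) tests the null hypothesis that all groups have equal variance. This is a supplementary diagnostic, not part of the ANOVA itself:

```python
import numpy as np
from scipy import stats

np.random.seed(10)
group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

# Levene's test: null hypothesis is equal variances across groups.
# A large p-value means no evidence against the equal-variance assumption.
stat, p = stats.levene(group_a, group_b, group_c)
print(f"Levene statistic: {stat:.3f}")
print(f"p-value: {p:.3f}")
```

Since all three groups here were simulated with the same `scale=4`, a non-significant result is expected.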

### Practical Example: Comparing Three Training Programs

This example tests whether three employee training programs produce different average performance outcomes.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(21)

program_a = np.random.normal(loc=71, scale=5, size=25)
program_b = np.random.normal(loc=75, scale=5, size=25)
program_c = np.random.normal(loc=81, scale=5, size=25)

result = stats.f_oneway(program_a, program_b, program_c)

print(f"Program means: {[round(x.mean(), 2) for x in [program_a, program_b, program_c]]}")
print(f"F-statistic: {result.statistic:.3f}")
print(f"p-value: {result.pvalue:.6f}")
print("Conclusion: significant difference across programs." if result.pvalue < 0.05 else "Conclusion: no significant difference detected.")

plt.figure(figsize=(9, 5))
plt.violinplot([program_a, program_b, program_c], showmeans=True)
plt.xticks([1, 2, 3], ["Program A", "Program B", "Program C"])
plt.ylabel("Outcome")
plt.title("Training Program Outcomes")
plt.grid(axis=":", linestyle=":", alpha=0.4)
plt.show()
```

```
Program means: [np.float64(69.9), np.float64(75.93), np.float64(81.67)]
F-statistic: 41.941
p-value: 0.000000
Conclusion: significant difference across programs.
```
- `result.statistic` and `result.pvalue` access the result fields without unpacking — both approaches work.
- A violin plot shows the full distribution shape rather than just the quartiles, which is useful when you suspect the distributions aren't symmetric.
- `showmeans=True` adds a marker at the mean of each group, making it easier to compare the means visually alongside the ANOVA result.

### Conclusion

One-way ANOVA is the right tool when you want to test three or more groups without inflating the false positive rate from multiple comparisons. A significant F-statistic tells you *something* differs — follow it up with a post-hoc test and a visualization to identify and communicate what.

To compare just two groups, see the [independent samples t-test](/tutorials/independent-samples-t-test-with-scipy). If your data clearly violates normality, the [Kruskal-Wallis test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html) is a nonparametric alternative to one-way ANOVA.
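As a minimal sketch of that nonparametric alternative, `stats.kruskal` has the same call shape as `f_oneway`, shown here on the same simulated groups as above:

```python
import numpy as np
from scipy import stats

np.random.seed(10)
group_a = np.random.normal(loc=68, scale=4, size=30)
group_b = np.random.normal(loc=73, scale=4, size=30)
group_c = np.random.normal(loc=79, scale=4, size=30)

# Kruskal-Wallis is rank-based: it tests whether at least one group's
# distribution tends to yield larger values, without assuming normality.
h_statistic, p_value = stats.kruskal(group_a, group_b, group_c)
print(f"H-statistic: {h_statistic:.3f}")
print(f"p-value: {p_value:.6f}")
```

Like the F-test, a significant Kruskal-Wallis result is an omnibus finding and still needs a pairwise follow-up (e.g. Dunn's test) to identify which groups differ.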