
# Kolmogorov-Smirnov Test with SciPy

Most statistical tests focus on a single summary such as the mean, the median, or a rank comparison. The Kolmogorov-Smirnov (KS) test is different: it compares entire distributions. It works by computing the empirical cumulative distribution function (CDF) of each sample, the running proportion of values at or below each point, and measuring the largest vertical gap between the two CDFs. A small gap means the distributions are similar everywhere; a large gap means they diverge substantially somewhere. The KS test comes in two forms: the two-sample version compares two empirical distributions, and the one-sample version compares a sample against a known theoretical distribution such as the normal.
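
To make the mechanics concrete, here is a minimal sketch that computes the two-sample statistic by hand: sort each sample, evaluate both empirical CDFs on the pooled data points, and take the largest absolute gap. The seed and sample sizes here are arbitrary; the result should match `stats.ks_2samp`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = np.sort(rng.normal(loc=0.0, scale=1.0, size=200))
b = np.sort(rng.normal(loc=0.5, scale=1.0, size=200))

# Evaluate both empirical CDFs at every pooled data point.
# side="right" counts values <= x, which is exactly the ECDF.
grid = np.concatenate([a, b])
ecdf_a = np.searchsorted(a, grid, side="right") / len(a)
ecdf_b = np.searchsorted(b, grid, side="right") / len(b)

# The KS statistic is the largest vertical gap between the two ECDFs
d_manual = np.max(np.abs(ecdf_a - ecdf_b))
d_scipy = stats.ks_2samp(a, b).statistic

print(f"Manual D: {d_manual:.6f}")
print(f"SciPy D:  {d_scipy:.6f}")  # the two values agree
```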

### Two-Sample KS Test

The two-sample test is useful when you want to know whether two datasets come from the same distribution — not just whether their means or medians differ.

```python
import numpy as np
from scipy import stats

np.random.seed(52)

sample_a = np.random.normal(loc=0, scale=1, size=120)
sample_b = np.random.normal(loc=0.6, scale=1.1, size=120)

result = stats.ks_2samp(sample_a, sample_b)

print(f"KS statistic: {result.statistic:.3f}")
print(f"P-value: {result.pvalue:.6f}")
```

```
KS statistic: 0.200
P-value: 0.016260
```

- `stats.ks_2samp(sample_a, sample_b)` finds the maximum absolute difference between the empirical CDFs of the two samples.
- A KS statistic of 0 would mean the two CDFs are identical everywhere; a statistic of 1 would mean they don't overlap at all.
- The two samples here have slightly different means and standard deviations — the KS test can detect this kind of combined shift even when neither difference alone would be obvious.
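
As a quick sanity check on that p-value, the classical large-sample approximation gives a critical value for the statistic directly: reject at level α when D exceeds c(α) · √((n + m) / (n · m)), with c(0.05) ≈ 1.358. This is a rule-of-thumb asymptotic bound, not necessarily the method SciPy uses internally for samples this size, but it agrees with the result above:

```python
import numpy as np

n, m = 120, 120
d_observed = 0.200  # statistic from the test above

# Asymptotic two-sample critical value at alpha = 0.05
c_alpha = 1.358
d_critical = c_alpha * np.sqrt((n + m) / (n * m))

print(f"Critical D at 5%: {d_critical:.3f}")  # ~0.175
print(f"Observed D:       {d_observed:.3f}")  # 0.200 > 0.175, so reject
```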

### Interpreting the Result

Unlike a t-test, which targets the mean specifically, a significant KS result doesn't tell you *where* the distributions differ, only that they differ *somewhere*. The plot in the next section shows where visually, and the sketch after the bullets below locates the gap numerically.

```python
import numpy as np
from scipy import stats

np.random.seed(52)

sample_a = np.random.normal(loc=0, scale=1, size=120)
sample_b = np.random.normal(loc=0.6, scale=1.1, size=120)

result = stats.ks_2samp(sample_a, sample_b)

if result.pvalue < 0.05:
    print("Reject the null hypothesis: the samples come from different distributions.")
else:
    print("Fail to reject the null hypothesis: the samples could come from the same distribution.")
```

```
Reject the null hypothesis: the samples come from different distributions.
```

- The null hypothesis is that both samples come from the same underlying distribution — a significant result means that's unlikely given the data.
- The KS test is sensitive to differences in shape (skew, spread) as well as location (mean), making it more comprehensive than a pure mean comparison.
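
`ks_2samp` reports the size of the largest gap but not its location. SciPy doesn't expose the location directly, but you can recover it from the empirical CDFs yourself; a minimal sketch reusing the samples from above:

```python
import numpy as np

np.random.seed(52)

sample_a = np.sort(np.random.normal(loc=0, scale=1, size=120))
sample_b = np.sort(np.random.normal(loc=0.6, scale=1.1, size=120))

# Evaluate both ECDFs on the pooled sample points and find the widest gap
grid = np.concatenate([sample_a, sample_b])
ecdf_a = np.searchsorted(sample_a, grid, side="right") / len(sample_a)
ecdf_b = np.searchsorted(sample_b, grid, side="right") / len(sample_b)

gaps = np.abs(ecdf_a - ecdf_b)
print(f"Largest gap {gaps.max():.3f} occurs near x = {grid[gaps.argmax()]:.2f}")
```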

### Visualizing the Samples

Overlapping density histograms help you see how the two distributions differ. Keep in mind that the KS statistic works on cumulative curves, so the largest CDF gap falls where the accumulated difference between the densities peaks, which is not necessarily where the histograms look most different.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(52)

sample_a = np.random.normal(loc=0, scale=1, size=120)
sample_b = np.random.normal(loc=0.6, scale=1.1, size=120)

plt.figure(figsize=(9, 5))
plt.hist(sample_a, bins=18, alpha=0.6, density=True, label="Sample A")
plt.hist(sample_b, bins=18, alpha=0.6, density=True, label="Sample B")
plt.title("Two Samples Compared with KS Test")
plt.xlabel("Value")
plt.ylabel("Density")
plt.legend()
plt.show()
```

- `density=True` normalizes both histograms to the same scale, making direct comparison valid even if the sample sizes differ.
- The rightward shift and slightly wider spread of sample B are visible here — both contribute to the KS statistic.
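
Histograms show the densities, but since the KS statistic lives on the cumulative curves, plotting the two empirical CDFs makes the gap itself visible. A sketch using the same samples (each step function jumps by 1/n at a sorted data point):

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(52)

sample_a = np.sort(np.random.normal(loc=0, scale=1, size=120))
sample_b = np.sort(np.random.normal(loc=0.6, scale=1.1, size=120))

# Each ECDF rises by 1/n at every sorted data point
steps_a = np.arange(1, len(sample_a) + 1) / len(sample_a)
steps_b = np.arange(1, len(sample_b) + 1) / len(sample_b)

plt.figure(figsize=(9, 5))
plt.step(sample_a, steps_a, where="post", label="Sample A ECDF")
plt.step(sample_b, steps_b, where="post", label="Sample B ECDF")
plt.title("Empirical CDFs: the KS Statistic Is the Largest Vertical Gap")
plt.xlabel("Value")
plt.ylabel("Cumulative proportion")
plt.legend()
plt.show()
```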

### One-Sample KS Test Against a Normal Distribution

The one-sample version tests whether a single dataset is consistent with a specified theoretical distribution — useful as a [normality check](/tutorials/normality-tests-with-scipy) or to validate model assumptions.

```python
import numpy as np
from scipy import stats

np.random.seed(70)

sample = np.random.normal(loc=0, scale=1, size=100)
result = stats.kstest(sample, "norm")

print(f"One-sample KS statistic: {result.statistic:.3f}")
print(f"P-value: {result.pvalue:.6f}")
```

```
One-sample KS statistic: 0.084
P-value: 0.453159
```

- `stats.kstest(sample, "norm")` compares the sample's empirical CDF to a standard normal distribution (mean=0, std=1).
- A high p-value here means the sample is consistent with a standard normal — expected since we drew it from `np.random.normal(0, 1)`.
- You can test against other distributions by passing their names: `"expon"`, `"uniform"`, `"lognorm"`, etc. Distributions with shape parameters, such as `"lognorm"`, also need those parameters supplied through the `args` keyword; see the SciPy documentation for the full list.
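
One caveat worth knowing: `kstest` compares against a *fully specified* distribution. If your data isn't standard normal, pass the parameters explicitly through `args`; and if you estimate those parameters from the same sample you're testing, the resulting p-value is too optimistic (this is the case the Lilliefors test corrects for). A sketch with synthetic data:

```python
import numpy as np
from scipy import stats

np.random.seed(70)

# Data that is normal, but not *standard* normal
sample = np.random.normal(loc=5.0, scale=2.0, size=100)

# Against the standard normal the test rejects, as it should
print(stats.kstest(sample, "norm").pvalue)  # essentially 0

# Against a normal with the correct parameters it does not
print(stats.kstest(sample, "norm", args=(5.0, 2.0)).pvalue)  # large

# Estimating loc/scale from the same sample biases the p-value upward;
# treat this version only as a rough screen, not a calibrated test.
print(stats.kstest(sample, "norm", args=(sample.mean(), sample.std(ddof=1))).pvalue)
```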

### Practical Example: Comparing Load Time Distributions

Here we use the KS test to check whether a website optimization actually changed the load time distribution, not just the average.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(83)

before = np.random.lognormal(mean=1.9, sigma=0.25, size=100)
after = np.random.lognormal(mean=1.7, sigma=0.20, size=100)

result = stats.ks_2samp(before, after)

print(f"KS statistic: {result.statistic:.3f}")
print(f"P-value: {result.pvalue:.6f}")
if result.pvalue < 0.05:
    print("Conclusion: load-time distributions differ significantly.")
else:
    print("Conclusion: no significant distribution difference detected.")

plt.figure(figsize=(9, 5))
plt.hist(before, bins=14, alpha=0.6, density=True, label="Before")
plt.hist(after, bins=14, alpha=0.6, density=True, label="After")
plt.xlabel("Load time")
plt.ylabel("Density")
plt.title("Before vs After Load Time Distributions")
plt.legend()
plt.show()
```

```
KS statistic: 0.380
P-value: 0.000001
Conclusion: load-time distributions differ significantly.
```

- Log-normal data is a common model for load times — most requests are fast, with a tail of slow ones.
- The `after` distribution has a lower mean and smaller spread, simulating a genuine improvement. The KS test detects whether this improvement is statistically significant across the full distribution.
- If the KS test is significant but the means are similar, the *shape* of the distribution changed; for example, the slow tail was reduced without moving the average. The sketch below demonstrates exactly this case.
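
Here is a small demonstration of that last point, using synthetic data with an arbitrary seed: the two samples share a mean, so a t-test on means finds nothing, while the KS test flags the change in spread.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

x = rng.normal(loc=10.0, scale=1.0, size=300)
y = rng.normal(loc=10.0, scale=2.0, size=300)  # same mean, twice the spread

t_result = stats.ttest_ind(x, y, equal_var=False)  # Welch's t-test on means
ks_result = stats.ks_2samp(x, y)

print(f"t-test p-value: {t_result.pvalue:.3f}")    # typically > 0.05
print(f"KS test p-value: {ks_result.pvalue:.6f}")  # typically well below 0.05
```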

### Conclusion

The KS test is the right choice when you want to compare whole distributions rather than just their centers. It makes no assumptions about distribution shape, works on any continuous data, and is sensitive to differences in location, spread, and shape simultaneously.

For mean comparisons, see the [independent samples t-test](/tutorials/independent-samples-t-test-with-scipy) or [Mann-Whitney U test](/tutorials/mann-whitney-u-test-with-scipy). For checking normality specifically, see [normality tests with SciPy](/tutorials/normality-tests-with-scipy).