A confidence interval tells you the range of plausible values for a population parameter given your sample. The classical approach (like a t-interval for the mean) uses a formula that assumes normally distributed data. Bootstrap confidence intervals take a different approach: they resample the data with replacement thousands of times, compute the statistic on each resample, and use the spread of those resampled values to estimate uncertainty. This works for *any* statistic — medians, ratios, correlation coefficients, model parameters — without needing a closed-form formula or normality assumption. It's especially valuable for small or non-normal samples.

### Bootstrapping the Mean

`scipy.stats.bootstrap` handles the resampling loop for you. You pass the data, the statistic function, and the number of resamples.

```python
import numpy as np
from scipy import stats

np.random.seed(22)
data = np.random.normal(loc=50, scale=8, size=80)

bootstrap_result = stats.bootstrap(
    (data,), np.mean, confidence_level=0.95, n_resamples=5000, random_state=22
)

print(f"Sample mean: {data.mean():.3f}")
print(f"95% bootstrap CI: ({bootstrap_result.confidence_interval.low:.3f}, {bootstrap_result.confidence_interval.high:.3f})")
```

- `(data,)` is passed as a tuple because `bootstrap` accepts multiple datasets for statistics that take more than one sample (like correlation).
- `n_resamples=5000` means 5000 resamples are drawn — more resamples give a more stable interval at the cost of computation time.
- `bootstrap_result.confidence_interval.low` and `.high` are the lower and upper bounds of the interval.

### Comparing Bootstrap and t-Based Intervals

For normally distributed data with a reasonable sample size, bootstrap and t-based intervals should be very close. The value of bootstrap shows up more clearly with small or non-normal samples.
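The t-based interval in the comparison below follows the familiar formula: sample mean, plus or minus a t critical value times the standard error of the mean. As a minimal sketch of that computation done by hand (reusing the same simulated data):

```python
import numpy as np
from scipy import stats

np.random.seed(22)
data = np.random.normal(loc=50, scale=8, size=80)

n = len(data)
sem = data.std(ddof=1) / np.sqrt(n)    # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value

low, high = data.mean() - t_crit * sem, data.mean() + t_crit * sem
print(f"Manual t-interval: ({low:.3f}, {high:.3f})")
```

This should match `stats.t.interval` exactly, since that function performs the same arithmetic internally.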
```python
import numpy as np
from scipy import stats

np.random.seed(22)
data = np.random.normal(loc=50, scale=8, size=80)

bootstrap_result = stats.bootstrap(
    (data,), np.mean, confidence_level=0.95, n_resamples=5000, random_state=22
)
t_interval = stats.t.interval(
    confidence=0.95,
    df=len(data) - 1,
    loc=data.mean(),
    scale=stats.sem(data),
)

print(f"Bootstrap CI: ({bootstrap_result.confidence_interval.low:.3f}, {bootstrap_result.confidence_interval.high:.3f})")
print(f"t-based CI: ({t_interval[0]:.3f}, {t_interval[1]:.3f})")
```

- `stats.sem(data)` computes the standard error of the mean, which the t-interval formula needs.
- Close agreement between the two intervals here is expected — the data are normally distributed and the sample is large enough for the t-interval formula to work well.
- If you tried this with skewed data (e.g., `np.random.exponential`), the two intervals would diverge more, and the bootstrap interval would generally be the more trustworthy one.

### Visualizing the Bootstrap Distribution

Plotting the bootstrap distribution of the statistic shows the full shape of the uncertainty — not just the interval endpoints — and helps you see whether the distribution is symmetric or skewed.
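As a quick numeric check of the earlier point about skewed data, here is a sketch using a small `np.random.exponential` sample (the scale and sample size are illustrative): the two intervals drift apart, and the bootstrap interval is no longer symmetric about the sample mean, which is exactly the kind of asymmetry a plot makes visible.

```python
import numpy as np
from scipy import stats

np.random.seed(22)
skewed = np.random.exponential(scale=10, size=30)  # small, right-skewed sample

boot = stats.bootstrap(
    (skewed,), np.mean, confidence_level=0.95, n_resamples=5000, random_state=22
)
t_int = stats.t.interval(
    confidence=0.95, df=len(skewed) - 1, loc=skewed.mean(), scale=stats.sem(skewed)
)

print(f"Sample mean:  {skewed.mean():.3f}")
print(f"Bootstrap CI: ({boot.confidence_interval.low:.3f}, {boot.confidence_interval.high:.3f})")
print(f"t-based CI:   ({t_int[0]:.3f}, {t_int[1]:.3f})")
```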
```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(22)
data = np.random.normal(loc=50, scale=8, size=80)

# Draw 4000 resamples by hand and record the mean of each
boot_means = []
for _ in range(4000):
    sample = np.random.choice(data, size=len(data), replace=True)
    boot_means.append(sample.mean())
boot_means = np.array(boot_means)

ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

plt.figure(figsize=(9, 5))
plt.hist(boot_means, bins=35, alpha=0.75)
plt.axvline(data.mean(), color="red", linestyle="--", label="Sample mean")
plt.axvline(ci_low, color="green", linestyle="--", label="95% CI")
plt.axvline(ci_high, color="green", linestyle="--")
plt.title("Bootstrap Distribution of the Mean")
plt.xlabel("Mean")
plt.ylabel("Count")
plt.legend()
plt.show()
```

- `np.random.choice(data, size=len(data), replace=True)` draws a resample the same size as the original, with replacement — some observations will appear multiple times, others not at all.
- `np.percentile(boot_means, [2.5, 97.5])` cuts off the bottom and top 2.5% of resampled means to form the 95% percentile interval — this is the simplest bootstrap interval method.
- A symmetric, bell-shaped bootstrap distribution (as expected here) confirms the t-interval formula would also work well.

### Practical Example: Average Delivery Time

Bootstrap intervals are useful whenever you want to report uncertainty without making strong distributional assumptions — for example, estimating average delivery time from operational data.
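The same machinery covers statistics with no closed-form interval at all. As a sketch, here is a bootstrap CI for the *median* of simulated delivery-style data (the distribution parameters are illustrative, not from real operations):

```python
import numpy as np
from scipy import stats

np.random.seed(37)
times = np.random.normal(loc=32, scale=4.5, size=60)  # simulated delivery times

# np.median plugs in exactly like np.mean did; no formula change needed
boot_median = stats.bootstrap(
    (times,), np.median, confidence_level=0.95, n_resamples=4000, random_state=37
)

print(f"Sample median: {np.median(times):.2f} minutes")
print(
    "95% bootstrap CI for the median: "
    f"({boot_median.confidence_interval.low:.2f}, {boot_median.confidence_interval.high:.2f})"
)
```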
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(37)
delivery_times = np.random.normal(loc=32, scale=4.5, size=60)

bootstrap_result = stats.bootstrap(
    (delivery_times,), np.mean, confidence_level=0.95, n_resamples=4000, random_state=37
)

print(f"Average delivery time: {delivery_times.mean():.2f} minutes")
print(
    "95% bootstrap CI: "
    f"({bootstrap_result.confidence_interval.low:.2f}, {bootstrap_result.confidence_interval.high:.2f}) minutes"
)

plt.figure(figsize=(8, 5))
plt.hist(delivery_times, bins=15, alpha=0.7)
plt.axvline(delivery_times.mean(), color="crimson", linestyle="--", linewidth=2)
plt.title("Observed Delivery Times")
plt.xlabel("Minutes")
plt.ylabel("Count")
plt.show()
```

- The CI here says: given this sample, the true average delivery time plausibly lies in this range. More precisely, intervals constructed this way would contain the true mean about 95% of the time over repeated sampling.
- The histogram helps communicate the spread of raw delivery times, which is different from the uncertainty in the *mean* — the CI answers "how precisely do we know the average?", not "how spread out are the deliveries?".

### Conclusion

Bootstrap confidence intervals are one of the most broadly applicable tools in statistics — they require minimal assumptions and work for statistics that have no closed-form interval. SciPy's `bootstrap` function makes them easy to apply to any statistic. For a related approach to measuring uncertainty in regression slope estimates, see [linear regression with SciPy](/tutorials/linear-regression-with-scipy). For testing whether two groups differ without normality assumptions, see the [Mann-Whitney U test](/tutorials/mann-whitney-u-test-with-scipy).