Independent Samples t-Test with SciPy

An independent samples t-test is used when you want to compare the means of two separate groups. A common example is comparing a control group and a treatment group to see whether the treatment appears to change the outcome.

This tutorial shows how to run an independent samples t-test with SciPy, interpret the output, visualize the data, and understand when to use Welch's t-test instead of the equal-variance version.

### Basic Independent Samples t-Test

Let's start with two independent groups whose means are visibly different.

```python
import numpy as np
from scipy import stats

np.random.seed(42)

# Simulated scores for two independent groups
group_a = np.random.normal(loc=72, scale=6, size=40)
group_b = np.random.normal(loc=78, scale=6, size=40)

# Two-sided test; equal_var=True (pooled variance) is SciPy's default
t_statistic, p_value = stats.ttest_ind(group_a, group_b)

print(f"Mean of group A: {group_a.mean():.2f}")
print(f"Mean of group B: {group_b.mean():.2f}")
print(f"t-statistic: {t_statistic:.3f}")
print(f"p-value: {p_value:.6f}")
```

Output:

```
Mean of group A: 70.69
Mean of group B: 77.83
t-statistic: -5.549
p-value: 0.000000
```

- **`stats.ttest_ind(group_a, group_b)`** runs a two-sided test of the null hypothesis that the two groups have equal population means.
- A small p-value means a difference at least this large would be unlikely if the null hypothesis of equal means were true.
- The p-value prints as `0.000000` because it rounds to zero at six decimal places, not because it is exactly zero.
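To make the comparison of means more concrete, here is a minimal sketch that recomputes the equal-variance t-statistic by hand and checks it against `ttest_ind`. The helper name `pooled_t_statistic`, the sample names, and the seed are illustrative assumptions, not part of SciPy or of the example above.

```python
import numpy as np
from scipy import stats


def pooled_t_statistic(x, y):
    """Equal-variance t-statistic computed by hand (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_x, n_y = x.size, y.size
    # Pooled variance: each group's sample variance weighted by its degrees of freedom
    pooled_var = ((n_x - 1) * x.var(ddof=1) + (n_y - 1) * y.var(ddof=1)) / (n_x + n_y - 2)
    # Standard error of the difference between the two sample means
    standard_error = np.sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return (x.mean() - y.mean()) / standard_error


# Illustrative data; the seed and parameters are arbitrary
rng = np.random.default_rng(0)
sample_x = rng.normal(loc=72, scale=6, size=40)
sample_y = rng.normal(loc=78, scale=6, size=40)

manual_t = pooled_t_statistic(sample_x, sample_y)
scipy_t = stats.ttest_ind(sample_x, sample_y).statistic

print(f"manual t: {manual_t:.6f}")
print(f"SciPy t:  {scipy_t:.6f}")
```

The two values should agree to floating-point precision, which shows that the statistic is the difference in sample means divided by its estimated standard error.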

### Interpreting the Result

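The sign of the t-statistic shows the direction of the difference (negative here because group A's sample mean is lower than group B's), and its absolute value measures how many standard errors separate the two sample means. The p-value helps decide whether that difference is statistically significant at a chosen threshold `alpha`.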

```python
import numpy as np
from scipy import stats

np.random.seed(42)

group_a = np.random.normal(loc=72, scale=6, size=40)
group_b = np.random.normal(loc=78, scale=6, size=40)

t_statistic, p_value = stats.ttest_ind(group_a, group_b)
alpha = 0.05  # significance threshold, chosen before looking at the data

print(f"t-statistic: {t_statistic:.3f}")
print(f"p-value: {p_value:.6f}")

if p_value < alpha:
    print("Reject the null hypothesis: the group means differ significantly.")
else:
    print("Fail to reject the null hypothesis: the data do not show a significant difference.")
```

Output:

```
t-statistic: -5.549
p-value: 0.000000
Reject the null hypothesis: the group means differ significantly.
```

- If the p-value is below your significance threshold, the sample provides evidence that the population means differ.
- In this example, the groups were generated with different means, so a significant result is the expected outcome.
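The link between the t-statistic and the p-value can be sketched directly. Assuming the default equal-variance test, the two-sided p-value is the probability, under a t distribution with `n_a + n_b - 2` degrees of freedom, of a statistic at least as extreme as the one observed. The variable names, seed, and parameters below are illustrative.

```python
import numpy as np
from scipy import stats

# Illustrative data; the seed and parameters are arbitrary
rng = np.random.default_rng(1)
sample_x = rng.normal(loc=72, scale=6, size=40)
sample_y = rng.normal(loc=75, scale=6, size=40)

result = stats.ttest_ind(sample_x, sample_y)  # equal_var=True by default

# Degrees of freedom for the pooled (equal-variance) test
degrees_of_freedom = sample_x.size + sample_y.size - 2

# Two-sided p-value: twice the upper-tail area beyond |t|
manual_p = 2 * stats.t.sf(abs(result.statistic), degrees_of_freedom)

print(f"SciPy p-value:  {result.pvalue:.6g}")
print(f"manual p-value: {manual_p:.6g}")
```

Printing with `:.6g` instead of `:.6f` also avoids very small p-values being displayed as `0.000000`.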

### Visualizing the Two Groups

A chart helps you see the group distributions before or alongside the statistical result.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)

group_a = np.random.normal(loc=72, scale=6, size=40)
group_b = np.random.normal(loc=78, scale=6, size=40)

plt.figure(figsize=(9, 5))
# tick_labels requires Matplotlib 3.9 or newer; older versions use labels= instead
plt.boxplot([group_a, group_b], tick_labels=["Group A", "Group B"])
plt.ylabel("Score")
plt.title("Comparison of Two Independent Groups")
plt.grid(axis="y", linestyle="--", alpha=0.4)
plt.show()
```

- The box plot makes the difference in center and spread easier to inspect.
- This is useful context for interpreting the t-test result.
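If you also want numbers to go with the plot, the sketch below computes the quartiles, interquartile range, mean, and standard deviation for each group. The helper name `box_plot_summary`, the seed, and the sample names are illustrative assumptions; the quartiles should correspond to the box edges and the median line in a default box plot.

```python
import numpy as np


def box_plot_summary(values):
    """Center and spread statistics for one group (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,              # height of the box
        "mean": values.mean(),
        "std": values.std(ddof=1),   # sample standard deviation
    }


# Illustrative data; the seed and parameters are arbitrary
rng = np.random.default_rng(2)
summary_a = box_plot_summary(rng.normal(loc=72, scale=6, size=40))
summary_b = box_plot_summary(rng.normal(loc=78, scale=6, size=40))

for label, summary in (("Group A", summary_a), ("Group B", summary_b)):
    formatted = ", ".join(f"{name}={value:.2f}" for name, value in summary.items())
    print(f"{label}: {formatted}")
```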

### Using Welch's t-Test

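The standard independent samples t-test assumes that both groups have the same population variance, and this is what `ttest_ind` does by default (`equal_var=True`). If that assumption is questionable, Welch's t-test, which estimates each group's variance separately, is often a safer choice.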

```python
import numpy as np
from scipy import stats

np.random.seed(7)

# Same sample size, but group B is much more spread out
group_a = np.random.normal(loc=72, scale=4, size=35)
group_b = np.random.normal(loc=77, scale=9, size=35)

equal_var_test = stats.ttest_ind(group_a, group_b, equal_var=True)
welch_test = stats.ttest_ind(group_a, group_b, equal_var=False)

print("Standard t-test:")
print(f"  t-statistic: {equal_var_test.statistic:.3f}")
print(f"  p-value: {equal_var_test.pvalue:.6f}")

print("Welch's t-test:")
print(f"  t-statistic: {welch_test.statistic:.3f}")
print(f"  p-value: {welch_test.pvalue:.6f}")
```

Output:

```
Standard t-test:
  t-statistic: -2.673
  p-value: 0.009406
Welch's t-test:
  t-statistic: -2.673
  p-value: 0.010341
```

- **`equal_var=False`** tells SciPy to run Welch's t-test.
- The two t-statistics are identical here because the groups have the same size, in which case the pooled and Welch standard errors coincide. The p-values still differ because Welch's test uses fewer degrees of freedom when the sample variances differ.
- Welch's t-test is generally preferred when the groups may have unequal variances.
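As a rough illustration of what changes under `equal_var=False`, the sketch below computes Welch's statistic and the Welch-Satterthwaite degrees of freedom by hand, then compares them with SciPy. The helper name `welch_t_and_df`, the seed, and the unequal sample sizes are assumptions chosen so that the standard and Welch statistics differ.

```python
import numpy as np
from scipy import stats


def welch_t_and_df(x, y):
    """Welch's t-statistic and approximate degrees of freedom (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each group's contribution to the squared standard error
    se2_x = x.var(ddof=1) / x.size
    se2_y = y.var(ddof=1) / y.size
    t_stat = (x.mean() - y.mean()) / np.sqrt(se2_x + se2_y)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (x.size - 1) + se2_y ** 2 / (y.size - 1))
    return t_stat, df


# Illustrative data with unequal sizes and unequal spread
rng = np.random.default_rng(3)
sample_x = rng.normal(loc=72, scale=4, size=25)
sample_y = rng.normal(loc=77, scale=9, size=45)

manual_t, manual_df = welch_t_and_df(sample_x, sample_y)
manual_p = 2 * stats.t.sf(abs(manual_t), manual_df)

welch = stats.ttest_ind(sample_x, sample_y, equal_var=False)
pooled = stats.ttest_ind(sample_x, sample_y, equal_var=True)

print(f"Welch t (manual): {manual_t:.4f}, df: {manual_df:.2f}, p: {manual_p:.6g}")
print(f"Welch t (SciPy):  {welch.statistic:.4f}, p: {welch.pvalue:.6g}")
print(f"Pooled t (SciPy): {pooled.statistic:.4f}, p: {pooled.pvalue:.6g}")
```

With unequal group sizes, the pooled and Welch statistics generally differ as well as the p-values, which is why the choice of test matters most when both the sizes and the variances are unbalanced.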

### Practical Example: Comparing Two Teaching Methods

Here is a practical example comparing exam scores from two classes that used different teaching methods.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(123)

# Class A used method A, class B used method B
class_a = np.random.normal(loc=74, scale=5, size=30)
class_b = np.random.normal(loc=80, scale=5, size=30)

# Welch's t-test: does not assume the two classes have equal variance
result = stats.ttest_ind(class_a, class_b, equal_var=False)

print(f"Average score in class A: {class_a.mean():.2f}")
print(f"Average score in class B: {class_b.mean():.2f}")
print(f"Welch's t-statistic: {result.statistic:.3f}")
print(f"Welch's p-value: {result.pvalue:.6f}")

if result.pvalue < 0.05:
    print("Conclusion: the difference in average scores is statistically significant.")
else:
    print("Conclusion: the score difference is not statistically significant.")

# Overlaid histograms of the two score distributions
plt.figure(figsize=(9, 5))
plt.hist(class_a, bins=8, alpha=0.6, label="Class A")
plt.hist(class_b, bins=8, alpha=0.6, label="Class B")
plt.xlabel("Exam score")
plt.ylabel("Count")
plt.title("Score Distributions for Two Teaching Methods")
plt.legend()
plt.show()
```

Output:

```
Average score in class A: 74.22
Average score in class B: 80.71
Welch's t-statistic: -4.150
Welch's p-value: 0.000110
Conclusion: the difference in average scores is statistically significant.
```

- The histogram helps confirm that class B tends to score higher in this simulated example.
- The data were simulated with different true means (74 and 80), so the test is expected to detect a difference, and the printed conclusion should match the visual pattern.

### Conclusion

SciPy makes it easy to compare two independent groups with `ttest_ind`. By combining the t-test with a visualization and using Welch's version when variances may differ, you can produce a clearer and more defensible comparison.