Seasonality Analysis

Knowing that a retail store is busiest in December is useful; knowing *how much* busier — and whether that pattern is stable from year to year — is far more useful for planning. Seasonality analysis goes beyond noticing a cycle: it measures the amplitude of the seasonal component relative to the overall variation, tests whether the pattern is consistent across years, and extracts the seasonal profile so you can forecast or seasonally adjust the series. The tools build on time series decomposition but focus specifically on characterising the repeating structure.

### Simulating multi-year seasonal data

Three years of monthly data with a known seasonal pattern and moderate noise, stored as a Pandas Series with a DatetimeIndex.

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_months = 36  # 3 years
t = np.arange(n_months)

# Monthly seasonal pattern (peak in July, trough in January)
seasonal_pattern = np.array([
    -8, -5, -2, 2, 5, 8, 10, 8, 4, 0, -4, -8
] * 3)

trend = 100 + 0.3 * t
noise = rng.normal(0, 4, n_months)
values = trend + seasonal_pattern + noise

dates = pd.date_range("2021-01", periods=n_months, freq="MS")
ts = pd.Series(values, index=dates, name="monthly_sales")
print(ts.resample("YE").mean().round(1))

2021-12-31    101.9
2022-12-31    106.5
2023-12-31    110.7
Freq: YE-DEC, Name: monthly_sales, dtype: float64

- `seasonal_pattern` hard-codes the true monthly effect so you can verify that the analysis recovers it. Real-world seasonal patterns are estimated from data.
- `resample("YE").mean()` computes annual averages — the values should increase slightly due to the 0.3-unit monthly trend.
- The signal-to-noise ratio (seasonal amplitude ≈ 18, noise std = 4) is realistic for consumer-facing monthly data.

### Extracting the seasonal component with STL

STL decomposition isolates the seasonal component cleanly even with a trend present. The seasonal sub-series plot shows the average value for each month across all years.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(42)
n_months = 36
t = np.arange(n_months)
seasonal_pattern = np.array([-8,-5,-2,2,5,8,10,8,4,0,-4,-8] * 3)
ts = pd.Series(100 + 0.3*t + seasonal_pattern + rng.normal(0, 4, n_months),
               index=pd.date_range("2021-01", periods=n_months, freq="MS"))

stl = STL(ts, period=12, robust=True)
result = stl.fit()

# Average seasonal effect per month
monthly_avg = result.seasonal.groupby(result.seasonal.index.month).mean()
month_names = ["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]

fig, axes = plt.subplots(1, 2, figsize=(12, 4))
result.seasonal.plot(ax=axes[0], color="steelblue")
axes[0].axhline(0, color="gray", linestyle="--", linewidth=0.8)
axes[0].set_title("Seasonal component over time")
axes[0].set_ylabel("Seasonal effect")

axes[1].bar(range(1, 13), monthly_avg.values, color="steelblue", alpha=0.8)
axes[1].set_xticks(range(1, 13))
axes[1].set_xticklabels(month_names, rotation=45)
axes[1].axhline(0, color="gray", linestyle="--", linewidth=0.8)
axes[1].set_title("Average seasonal profile by month")
axes[1].set_ylabel("Average seasonal effect")

plt.tight_layout()
plt.show()

- `result.seasonal.groupby(result.seasonal.index.month).mean()` averages the extracted seasonal values for each calendar month across all years.
- Positive bars indicate months where sales are above the deseasonalised level; negative bars indicate below-average months.
- Comparing the extracted profile to the known `seasonal_pattern` array verifies that STL is recovering the true signal accurately.

### Measuring seasonal strength

Seasonal strength quantifies what fraction of the non-trend variance is due to the seasonal component — values close to 1 mean the data is highly seasonal.

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(42)
n_months = 36
t = np.arange(n_months)
seasonal_pattern = np.array([-8,-5,-2,2,5,8,10,8,4,0,-4,-8] * 3)
ts = pd.Series(100 + 0.3*t + seasonal_pattern + rng.normal(0, 4, n_months),
               index=pd.date_range("2021-01", periods=n_months, freq="MS"))

stl = STL(ts, period=12, robust=True)
result = stl.fit()

var_resid    = result.resid.var()
var_seas_plus_resid = (result.seasonal + result.resid).var()
var_trend_plus_resid = (result.trend + result.resid).var()

f_seasonal = max(0, 1 - var_resid / var_seas_plus_resid)
f_trend    = max(0, 1 - var_resid / var_trend_plus_resid)

print(f"Seasonal strength: {f_seasonal:.3f}  (1 = purely seasonal, 0 = no seasonality)")
print(f"Trend strength:    {f_trend:.3f}  (1 = purely trending)")
print(f"\nSeasonal amplitude: {result.seasonal.max() - result.seasonal.min():.2f}")
print(f"Noise std:          {result.resid.std():.2f}")

Seasonal strength: 0.887  (1 = purely seasonal, 0 = no seasonality)
Trend strength:    0.755  (1 = purely trending)

Seasonal amplitude: 27.95
Noise std:          2.18

- Seasonal strength follows Hyndman & Athanasopoulos: `F_S = max(0, 1 − Var(R) / Var(S + R))`. It ranges from 0 to 1.
- A value above 0.60 is typically considered "strong" seasonality — the series has a predictable repeating pattern worth modelling explicitly.
- The amplitude-to-noise ratio (seasonal amplitude / residual std) is a complementary practical metric: high ratios mean the seasonal pattern is easily detectable.

### Conclusion

Seasonality analysis answers two questions: how large is the seasonal effect, and how consistent is it? STL gives you the seasonal component directly; strength metrics give you a single number to compare across series or track over time. Always visualise the seasonal profile by calendar unit (month, weekday, hour) — the shape often reveals actionable insight that a single strength coefficient obscures.

For forecasting future values using the extracted components, see [exponential smoothing with statsmodels](/tutorials/exponential-smoothing-statsmodels). For the full decomposition workflow including trend and residual, see [time series decomposition](/tutorials/time-series-decomposition-statsmodels).