Real-world signals, such as sensor readings, stock prices, audio waveforms, and biological measurements, always contain noise. Filtering removes or reduces that noise so you can see the underlying trend or detect meaningful events. Python's NumPy and SciPy libraries provide several filtering approaches, each with different trade-offs: some preserve peak shape, some cut specific frequencies, and some handle isolated spikes. Choosing the right filter depends on what your signal looks like and what you need to do with it.

### Moving Average

A moving average replaces each sample with the mean of its surrounding window. It's the simplest smoothing method and works well when noise is random and the signal changes slowly. The wider the window, the smoother the result, but very wide windows blur sharp features.
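To make the window arithmetic concrete, here is a minimal sketch on an array small enough to check by hand. The sample values are made up for illustration. It also shows a side effect worth knowing about: with `mode="same"`, `np.convolve` treats samples beyond the ends of the array as zeros, so the first and last outputs are pulled toward zero.

```python
import numpy as np

# A short ramp with one outlying sample (illustrative values only).
samples = np.array([1.0, 2.0, 3.0, 10.0, 5.0, 6.0, 7.0])

window = 3
kernel = np.ones(window) / window  # equal weights that sum to 1

# mode="valid" keeps only positions where the window fits inside the data,
# so the output is shorter: the means of (1,2,3), (2,3,10), (3,10,5), ...
valid = np.convolve(samples, kernel, mode="valid")

# mode="same" keeps the input length by padding the ends with zeros,
# so the edge values are averages that include an imaginary 0.
same = np.convolve(samples, kernel, mode="same")

print(valid)
print(same)
```

If the edge bias matters for your data, one option is to use `mode="valid"` and accept a shorter output.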
```python
import numpy as np
import matplotlib.pyplot as plt

# Noisy sine wave: two full periods sampled at 300 points
rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 300)
signal = np.sin(x) + 0.5 * rng.standard_normal(300)

# Uniform kernel: every sample in the window gets weight 1/window
window = 20
kernel = np.ones(window) / window
smoothed = np.convolve(signal, kernel, mode="same")

plt.figure(figsize=(9, 3.5))
plt.plot(x, signal, color="steelblue", alpha=0.5, linewidth=1, label="Noisy signal")
plt.plot(x, smoothed, color="crimson", linewidth=2, label=f"Moving average (window={window})")
plt.title("Moving Average Smoothing")
plt.xlabel("x")
plt.ylabel("Amplitude")
plt.legend()
plt.tight_layout()
plt.show()
```

- `np.ones(window) / window` creates a uniform kernel: each sample in the window gets equal weight.
- `np.convolve(..., mode="same")` slides the kernel across the signal and returns an output the same length as the input. It pads the ends with zeros, so the first and last few smoothed values are pulled toward zero.
- The `alpha=0.5` on the raw signal makes the noisy trace semi-transparent so the smoothed line is easy to read.

### Gaussian Smoothing

A Gaussian filter weights nearby samples more heavily than distant ones, following a bell curve. Because the weights taper off gradually, the result is usually smoother than a moving average of similar width, and it avoids the "ringing" artifacts that a hard-edged window can introduce. The `sigma` parameter sets the width of the Gaussian in samples: higher values give more smoothing.
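One way to see the bell-curve weighting is to filter a single unit impulse, since the output is then the filter's own weights. The sketch below does that and compares the result with a Gaussian built by hand. The array length of 41 and the impulse position are arbitrary choices, and the hand-built kernel assumes SciPy's default `truncate=4.0`, which cuts the kernel off four standard deviations from the center.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# A unit impulse: filtering it returns the filter's weights.
impulse = np.zeros(41)
impulse[20] = 1.0

sigma = 2
weights = gaussian_filter1d(impulse, sigma=sigma)

# Weights are largest at the center, shrink with distance, and sum to 1.
print(np.round(weights[20:27], 4))
print(weights.sum())

# The same bell curve built by hand (assumes the default truncate=4.0).
radius = int(4.0 * sigma + 0.5)
offsets = np.arange(-radius, radius + 1)
manual = np.exp(-0.5 * (offsets / sigma) ** 2)
manual /= manual.sum()  # normalize so the weights sum to 1
```

Compare this with the moving average, where every sample inside the window has the same weight and every sample outside it has none.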
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 300)
signal = np.sin(x) + 0.5 * rng.standard_normal(300)

# One panel per sigma so the smoothing strength can be compared side by side
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5), sharey=True)
for ax, sigma in zip(axes, [2, 5, 10]):
    smoothed = gaussian_filter1d(signal, sigma=sigma)
    ax.plot(x, signal, color="steelblue", alpha=0.4, linewidth=1)
    ax.plot(x, smoothed, color="crimson", linewidth=2)
    ax.set_title(f"sigma = {sigma}")
    ax.set_xlabel("x")
axes[0].set_ylabel("Amplitude")
fig.suptitle("Gaussian Smoothing at Different Sigma Values", y=1.02)
plt.tight_layout()
plt.show()
```

- `gaussian_filter1d(signal, sigma=sigma)` applies a 1D Gaussian blur along the array.
- `sharey=True` links all three panels to the same y-axis scale, making the effect of increasing sigma easy to compare.
- `fig.suptitle(..., y=1.02)` places the overall title slightly above the subplots to avoid overlapping them.

### Savitzky-Golay Filter

The Savitzky-Golay filter fits a low-degree polynomial to a sliding window of points and takes the fitted value at the center of each window. Unlike simple averaging, it preserves peaks and edges in the signal rather than flattening them. This makes it popular in spectroscopy, chromatography, and any domain where peak shape matters.
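The peak-preserving behavior can be checked on a noise-free example. The sketch below uses a parabolic peak, chosen because a cubic fit can represent a parabola exactly. Under that assumption the Savitzky-Golay output should match the input, while a moving average of the same width lowers the peak.

```python
import numpy as np
from scipy.signal import savgol_filter

# Noise-free parabolic peak with height 1 at x = 0 (index 50).
x = np.linspace(-1, 1, 101)
peak = 1 - x**2

# Same window width for both filters so the comparison is fair.
sg = savgol_filter(peak, window_length=21, polyorder=3)
ma = np.convolve(peak, np.ones(21) / 21, mode="same")

# The polynomial fit reproduces the parabola; the plain mean flattens the top.
print("true peak:      ", peak[50])
print("Savitzky-Golay: ", sg[50])
print("moving average: ", ma[50])
```

Real peaks are rarely exact polynomials, so expect some distortion in practice. The point is only that the polynomial fit distorts a curved feature much less than a plain mean does.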
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import savgol_filter

rng = np.random.default_rng(1)
x = np.linspace(0, 4 * np.pi, 300)
signal = np.sin(x) + 0.4 * rng.standard_normal(300)

# Smooth the same signal two ways, using the same 21-sample window
smoothed_ma = np.convolve(signal, np.ones(21) / 21, mode="same")
smoothed_sg = savgol_filter(signal, window_length=21, polyorder=3)

plt.figure(figsize=(9, 4))
plt.plot(x, signal, color="steelblue", alpha=0.4, linewidth=1, label="Noisy signal")
plt.plot(x, smoothed_ma, color="orange", linewidth=2, linestyle="--", label="Moving average (w=21)")
plt.plot(x, smoothed_sg, color="crimson", linewidth=2, label="Savitzky-Golay (w=21, poly=3)")
plt.title("Savitzky-Golay vs Moving Average")
plt.xlabel("x")
plt.ylabel("Amplitude")
plt.legend()
plt.tight_layout()
plt.show()
```

- `window_length=21` is the number of samples in each fitting window. It should be odd so the window is centered on each sample, and it must be larger than `polyorder`.
- `polyorder=3` fits a cubic polynomial. A higher order preserves more detail but smooths less aggressively.
- Comparing the two curves shows that the Savitzky-Golay filter (red) tracks the sine peaks more faithfully than the moving average (orange dashed), which underestimates the peak heights and dips toward zero at both ends because `mode="same"` pads with zeros.

### Butterworth Low-Pass Filter

A Butterworth filter is designed in terms of frequency: it passes low-frequency components (the slow trend) and attenuates high-frequency components (the noise). You specify a cutoff frequency and the filter design handles the rest. `sosfiltfilt` applies the filter forward then backward, which cancels out any phase shift so the filtered output stays aligned with the original signal.
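To illustrate why the forward-backward pass matters, the sketch below filters a clean 3 Hz sine once with `sosfilt` (a single forward pass) and once with `sosfiltfilt`, then measures how far each output strays from the input. The filter settings are illustrative, and the comparison is restricted to the middle of the signal on the assumption that start-up transients near the edges have died out by then.

```python
import numpy as np
from scipy.signal import butter, sosfilt, sosfiltfilt

fs = 300                              # samples per second
t = np.arange(fs) / fs                # one second of time stamps
clean = np.sin(2 * np.pi * 3 * t)     # noise-free 3 Hz sine

sos = butter(N=4, Wn=8, btype="low", fs=fs, output="sos")
one_pass = sosfilt(sos, clean)        # forward only: output is delayed
zero_phase = sosfiltfilt(sos, clean)  # forward + backward: delay cancels

# Compare away from the edges, where filter transients live.
middle = slice(75, 225)
err_one_pass = np.max(np.abs(one_pass[middle] - clean[middle]))
err_zero_phase = np.max(np.abs(zero_phase[middle] - clean[middle]))

print("max error, single pass:", err_one_pass)
print("max error, zero phase: ", err_zero_phase)
```

The single-pass output has nearly the right amplitude but arrives late, so its sample-by-sample error is large. The zero-phase output lines up with the input.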
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import butter, sosfiltfilt

rng = np.random.default_rng(2)
fs = 300  # samples per second (sampling frequency)
t = np.linspace(0, 1, fs, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * rng.standard_normal(fs)

# Design a 4th-order Butterworth low-pass filter with cutoff at 8 Hz
sos = butter(N=4, Wn=8, btype="low", fs=fs, output="sos")
smoothed = sosfiltfilt(sos, signal)

plt.figure(figsize=(9, 3.5))
plt.plot(t, signal, color="steelblue", alpha=0.5, linewidth=1, label="Noisy signal (3 Hz + noise)")
plt.plot(t, smoothed, color="crimson", linewidth=2, label="Butterworth low-pass (cutoff=8 Hz)")
plt.title("Butterworth Low-Pass Filter")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.legend()
plt.tight_layout()
plt.show()
```

- `butter(N=4, Wn=8, btype="low", fs=fs, output="sos")` designs the filter: `N=4` is the order (a higher `N` gives a steeper roll-off), `Wn=8` is the cutoff frequency, interpreted in Hz because `fs` is supplied, and `output="sos"` returns second-order sections for numerical stability.
- `sosfiltfilt` applies the filter twice, once forward and once backward, so the phase delays cancel and the output has zero phase shift relative to the input. The trade-offs are that the whole signal must be available up front, the attenuation is applied twice, and the signal must be longer than the padding the function adds at each end, which grows with the filter order.
- The underlying 3 Hz sine wave is recovered cleanly because it's well below the 8 Hz cutoff.
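A quick way to sanity-check a filter design before applying it is to evaluate its gain at a few frequencies of interest. The sketch below uses `scipy.signal.sosfreqz` for that. The three probe frequencies are arbitrary picks: one in the passband, one at the cutoff, and one well into the stopband.

```python
import numpy as np
from scipy.signal import butter, sosfreqz

fs = 300
sos = butter(N=4, Wn=8, btype="low", fs=fs, output="sos")

# Evaluate the single-pass frequency response at chosen frequencies (Hz).
freqs_hz = np.array([3.0, 8.0, 50.0])
_, response = sosfreqz(sos, worN=freqs_hz, fs=fs)
gain = np.abs(response)

for f, g in zip(freqs_hz, gain):
    print(f"{f:5.1f} Hz  single pass: {g:.4f}  forward-backward: {g**2:.4f}")
```

A single pass has a gain of about 0.707 (-3 dB) at the cutoff. Because forward-backward filtering applies the magnitude response twice, the effective gain at 8 Hz is about 0.5, which is worth keeping in mind when choosing `Wn`.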
### Median Filter
A median filter replaces each sample with the median of its window rather than the mean. This makes it exceptionally effective at removing isolated spikes ("salt-and-pepper" noise) without blurring the rest of the signal, because a single outlier in a window can't shift the median the way it shifts the mean.
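The difference between the mean and the median is easiest to see on a flat signal with a single spike. This is a minimal sketch with made-up values, using a 3-sample window for both filters.

```python
import numpy as np
from scipy.signal import medfilt

# Flat signal with one isolated spike (illustrative values only).
flat = np.array([1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0])

# A 3-sample mean smears the spike across three neighboring outputs.
mean3 = np.convolve(flat, np.ones(3) / 3, mode="same")

# A 3-sample median ignores it: two of every three values are still 1.
median3 = medfilt(flat, kernel_size=3)

print(np.round(mean3, 3))
print(median3)
```

The median only breaks down when spikes fill more than half of a window, which is why the kernel should be wider than the longest run of bad samples you expect.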
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import medfilt

rng = np.random.default_rng(3)
x = np.linspace(0, 4 * np.pi, 300)
signal = np.sin(x) + 0.15 * rng.standard_normal(300)

# Add random spikes of +4 or -4 at 20 distinct positions
spike_idx = rng.choice(300, size=20, replace=False)
signal[spike_idx] += rng.choice([-4, 4], size=20)

smoothed = medfilt(signal, kernel_size=7)

plt.figure(figsize=(9, 3.5))
plt.plot(x, signal, color="steelblue", alpha=0.6, linewidth=1, label="Signal with spikes")
plt.plot(x, smoothed, color="crimson", linewidth=2, label="Median filter (kernel=7)")
plt.title("Median Filter: Spike Removal")
plt.xlabel("x")
plt.ylabel("Amplitude")
plt.legend()
plt.tight_layout()
plt.show()
```

- `rng.choice([-4, 4], size=20)` draws 20 spike amplitudes, each either +4 or -4, which are added to the signal at the positions in `spike_idx`.
- `medfilt(signal, kernel_size=7)` replaces each sample with the median of its 7-sample window; `kernel_size` must be odd.
- The spikes (visible as sharp vertical jumps in the blue trace) are almost entirely removed in the red trace, while the smooth sine baseline is well preserved.

Filtering is not one-size-fits-all: use a moving average or Gaussian filter for general noise reduction, Savitzky-Golay when you need to preserve peak shape, a Butterworth filter when you know your signal's frequency content, and a median filter when your data has isolated outlier spikes.

Next, learn about [finding peaks with SciPy](/tutorials/find-peaks-with-scipy) to detect features in a cleaned-up signal.