Most large matrices that arise in practice — finite-difference grids, graph adjacency matrices, recommendation systems, natural-language term-document matrices — are sparse: nearly all of their entries are zero. Storing them as dense NumPy arrays wastes enormous amounts of memory and slows down computation. `scipy.sparse` stores only the non-zero values and their positions, making it practical to work with matrices of millions of rows. As a bonus, sparse linear solvers like `spsolve` exploit the sparsity structure to solve systems far faster than dense methods.

### Creating Sparse Matrices

The most convenient way to build a sparse matrix from scratch is `diags` for diagonal/banded structures, or `csr_matrix` to convert from a dense array. CSR (Compressed Sparse Row) is the standard format for arithmetic and linear solves.
```python
import scipy.sparse as sp
import numpy as np

# Convert a small dense matrix to CSR
dense = np.array([[4, 0, 0, 2],
                  [0, 3, 0, 0],
                  [1, 0, 5, 0],
                  [0, 0, 0, 6]], dtype=float)
A = sp.csr_matrix(dense)
print(A)
print(f"\nShape: {A.shape}, Non-zeros: {A.nnz}, Density: {A.nnz / A.shape[0]**2:.2f}")
```

- `sp.csr_matrix(dense)` scans the dense array and stores only the 6 non-zero values — the rest are implicit zeros.
- `A.nnz` is the number of non-zero entries. Density = nnz / (rows × cols); values close to 0 mean the matrix is very sparse.
- Printing a CSR matrix shows `(row, col) value` tuples — the internal storage representation.

### Building Large Sparse Matrices Efficiently

For large structured matrices — like the finite-difference Laplacian — `diags` is the right tool. It builds a sparse matrix from diagonals without ever allocating a dense array.
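As an aside before the `diags` example that follows, the CSR layout just described can be inspected directly through its three internal arrays. This sketch reuses the same 4×4 matrix from the previous snippet:

```python
import scipy.sparse as sp
import numpy as np

dense = np.array([[4, 0, 0, 2],
                  [0, 3, 0, 0],
                  [1, 0, 5, 0],
                  [0, 0, 0, 6]], dtype=float)
A = sp.csr_matrix(dense)

# CSR keeps three flat arrays: the non-zero values, their column
# indices, and row pointers marking where each row begins.
print(A.data)     # non-zero values, stored row by row
print(A.indices)  # column index of each stored value
print(A.indptr)   # row i occupies data[indptr[i]:indptr[i+1]]
```

The row-pointer layout is what makes row slicing and matrix-vector products cheap in CSR, at the cost of slow single-entry insertion.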
```python
import scipy.sparse as sp
import numpy as np
import matplotlib.pyplot as plt

n = 50
# 1D finite-difference second-derivative matrix: [-1, 2, -1] tridiagonal
A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
             offsets=[-1, 0, 1], format="csr")
print(f"Shape: {A.shape}, Non-zeros: {A.nnz}, Density: {A.nnz/n**2:.5f}")

plt.figure(figsize=(5, 5))
plt.spy(A, markersize=3, color="steelblue")
plt.title(f"Sparsity Pattern ({n}×{n} tridiagonal)")
plt.tight_layout()
plt.show()
```

- `sp.diags([...], offsets=[...])` places each array on the specified diagonal: 0 is the main diagonal, −1 is one below, +1 is one above.
- `plt.spy(A)` visualizes the non-zero pattern — each dot represents a stored value. The banded structure is immediately visible.
- For n=50, the matrix has 50² = 2500 possible entries but only 148 non-zeros — less than 6% density. At n=10000, a dense float64 matrix would need 800 MB; the sparse version needs under 1 MB.

### Sparse Matrix Arithmetic

Sparse matrices support familiar NumPy-style operators: `+` for element-wise addition, `*` for scalar multiplication, `@` for matrix products, and `.T` for transpose. (For `csr_matrix`, avoid `*` between two matrices — it means matrix multiplication, not element-wise multiplication; use `@` to be explicit.) Operations preserve sparsity where possible.
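The element-wise and transpose operators can be sketched quickly before moving on to matrix products; this reuses the same tridiagonal `A` as elsewhere in the tutorial:

```python
import scipy.sparse as sp
import numpy as np

n = 6
A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
             offsets=[-1, 0, 1], format="csr")

S = A + A      # element-wise sum; same sparsity pattern as A
D = 0.5 * A    # scalar multiplication preserves the pattern too
T = A.T        # transpose (this A happens to be symmetric)

print("nnz of A, A+A, 0.5*A:", A.nnz, S.nnz, D.nnz)
print("A symmetric:", np.array_equal(T.toarray(), A.toarray()))
```

None of these operations change the sparsity pattern, so they run in O(nnz) time and memory.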
```python
import scipy.sparse as sp
import numpy as np

n = 6
A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
             offsets=[-1, 0, 1], format="csr")
x = np.ones(n)
b = A @ x  # sparse matrix-vector product
print("A @ ones =", b)

# Multiply two sparse matrices
B = A @ A
print(f"A² non-zeros: {B.nnz} (more fill-in than A with {A.nnz})")
```

- `A @ x` works the same as for NumPy arrays but exploits sparsity — only non-zero entries are touched, so it costs O(nnz) instead of O(n²).
- Multiplying two sparse matrices (`A @ A`) can introduce fill-in: entries that were zero in both operands become non-zero in the product. The result may be denser than the inputs.
- For the 1D Laplacian `A`, `A @ ones` gives `[1, 0, 0, ..., 0, 1]` — the boundary residuals of the finite-difference discretization.

### Solving a Sparse Linear System

`spsolve` solves Ax = b for sparse A using a direct sparse LU factorization. It's dramatically faster than `numpy.linalg.solve` for large sparse systems because it exploits the zero structure.
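That speed claim can be checked with a rough timing sketch before the main example. Absolute numbers are machine-dependent; the `time.perf_counter` calls here are purely illustrative:

```python
import time
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import numpy as np

n = 2000
A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
             offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

t0 = time.perf_counter()
x_sparse = spla.spsolve(A, b)          # sparse LU, exploits the band
t_sparse = time.perf_counter() - t0

A_dense = A.toarray()                  # 2000×2000 dense copy
t0 = time.perf_counter()
x_dense = np.linalg.solve(A_dense, b)  # dense LU, O(n³)
t_dense = time.perf_counter() - t0

print(f"spsolve: {t_sparse:.4f}s, dense solve: {t_dense:.4f}s")
print("solutions agree:", np.allclose(x_sparse, x_dense))
```

The gap widens rapidly with n: the sparse factorization of a tridiagonal system is O(n), while the dense one is O(n³).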
```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import numpy as np
import matplotlib.pyplot as plt

# Solve the 1D Poisson equation: -u'' = f, u(0) = u(1) = 0
n = 200
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)
A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
             offsets=[-1, 0, 1], format="csr") / h**2
f = np.sin(np.pi * x)                   # right-hand side
u = spla.spsolve(A, f)                  # numerical solution
u_exact = np.sin(np.pi * x) / np.pi**2  # exact: u = sin(πx)/π²

plt.figure(figsize=(8, 4))
plt.plot(x, u_exact, color="steelblue", linewidth=2.5, label="Exact")
plt.plot(x, u, color="crimson", linewidth=1.5, linestyle="--", label="spsolve")
plt.title("Sparse Linear Solve: 1D Poisson Equation")
plt.xlabel("x")
plt.ylabel("u(x)")
plt.legend()
plt.tight_layout()
plt.show()

print(f"Max error: {np.max(np.abs(u - u_exact)):.2e}")
```

- `spla.spsolve(A, f)` solves the 200×200 tridiagonal system in milliseconds; the equivalent dense solve would be slower by orders of magnitude at larger n.
- The right-hand side `f = sin(πx)` has the exact solution `u = sin(πx)/π²` — a clean benchmark for verifying accuracy.
- The printed max error is O(h²), confirming the expected second-order accuracy of the finite-difference scheme.

Sparse matrices are essential whenever the problem size exceeds what dense methods can handle. For dense linear algebra — decompositions, eigenvalues, matrix functions — see [linear algebra with scipy.linalg](/tutorials/linear-algebra-with-scipy-linalg).
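As a final sanity check on the second-order accuracy noted above, one can solve the same Poisson problem at two resolutions and compare the errors — halving h should cut the error by roughly a factor of four. A minimal sketch (the `poisson_error` helper is just for illustration):

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import numpy as np

def poisson_error(n):
    """Solve -u'' = sin(pi x), u(0)=u(1)=0 on n interior points; return max error."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    A = sp.diags([-np.ones(n-1), 2*np.ones(n), -np.ones(n-1)],
                 offsets=[-1, 0, 1], format="csr") / h**2
    u = spla.spsolve(A, np.sin(np.pi * x))
    return np.max(np.abs(u - np.sin(np.pi * x) / np.pi**2))

# Doubling the resolution should shrink the error by ~(h1/h2)² ≈ 4
e1, e2 = poisson_error(100), poisson_error(200)
print(f"error(n=100): {e1:.2e}, error(n=200): {e2:.2e}, ratio: {e1/e2:.2f}")
```

A ratio near 4 is exactly the O(h²) convergence the finite-difference stencil promises.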