Sparse arrays often need to move from one Python run to another without becoming dense arrays on disk. SciPy can write supported sparse formats to a .npz archive, which keeps the sparse structure and stored values together for later loading.
The scipy.sparse.save_npz() helper accepts sparse arrays and matrices in CSR, CSC, BSR, DIA, and COO storage formats. Convert editable construction formats such as LIL or DOK before saving so the archive stores one of the formats that load_npz() can restore directly.
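The conversion described above can be sketched as follows. This is a minimal illustration rather than a prescribed workflow: the file name editable_demo.npz and the temporary directory are assumptions chosen so the example cleans up after itself.

```python
import os
import tempfile

from scipy import sparse

# Build incrementally in LIL, which supports cheap item assignment.
editable = sparse.lil_array((3, 4))
editable[0, 0] = 10.0
editable[0, 3] = 2.5
editable[2, 2] = 8.0

# Convert to CSR, one of the formats save_npz() accepts, before writing.
to_save = editable.tocsr()

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name used only for this sketch.
    path = os.path.join(tmp_dir, "editable_demo.npz")
    sparse.save_npz(path, to_save)
    restored = sparse.load_npz(path)

print(restored.format, restored.shape, restored.nnz)
```

DOK objects can be converted the same way with tocsr(), or with tocsc() or tocoo() when a different saved layout suits the consumer better.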
A round trip should prove more than file creation. Check the loaded object type, sparse format, shape, stored-entry count, and dense-equivalent values before reusing the archive in a feature pipeline, graph workflow, or simulation checkpoint.
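One way to collect these checks in a single place is a small helper. The sketch below is only an illustration: round_trip_report() is a hypothetical name, not part of SciPy, and the file name and temporary directory are assumptions. The dense comparison inside it is suitable only for sample-sized inputs.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def round_trip_report(saved, loaded):
    """Hypothetical helper: gather the round-trip checks in one dict."""
    return {
        "is_sparse": sparse.issparse(loaded),
        "same_class": type(loaded) is type(saved),
        "same_format": loaded.format == saved.format,
        "same_shape": loaded.shape == saved.shape,
        "same_nnz": loaded.nnz == saved.nnz,
        # Dense comparison: only reasonable for small test inputs.
        "same_dense_values": np.array_equal(loaded.toarray(), saved.toarray()),
    }


sample = sparse.coo_array(
    (np.array([10.0, 3.0]), (np.array([0, 1]), np.array([0, 1]))),
    shape=(2, 3),
).tocsc()

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name used only for this sketch.
    path = os.path.join(tmp_dir, "report_demo.npz")
    sparse.save_npz(path, sample)
    report = round_trip_report(sample, sparse.load_npz(path))

for check, passed in report.items():
    print(f"{check}: {passed}")
```

Returning a dict instead of raising on the first failure makes it easier to see which property changed when a round trip goes wrong.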
Steps to save and load a SciPy sparse array:
- Create a script named sparse_save_load_demo.py.
sparse_save_load_demo.py
import numpy as np
from scipy import sparse

rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 3, 1, 0, 2])
values = np.array([10.0, 2.5, 3.0, 4.5, 8.0])

features = sparse.coo_array((values, (rows, cols)), shape=(3, 4)).tocsr()
sparse.save_npz("feature_matrix.npz", features)
loaded = sparse.load_npz("feature_matrix.npz")

print(f"saved format: {features.format}")
print(f"loaded type: {type(loaded).__name__}")
print(f"loaded format: {loaded.format}")
print(f"shape: {loaded.shape}")
print(f"stored values: {loaded.nnz}")
print("dense rows:")
for row in loaded.toarray():
    print(row)
print("same sparse class:", type(loaded) is type(features))
print("same dense values:", np.array_equal(loaded.toarray(), features.toarray()))
The sample builds the array with coo_array() and converts it with tocsr() before saving, so the archive stores a compressed sparse row (CSR) array.
- Run the script.
$ python3 sparse_save_load_demo.py
saved format: csr
loaded type: csr_array
loaded format: csr
shape: (3, 4)
stored values: 5
dense rows:
[10.   0.   0.   2.5]
[0. 3. 0. 0.]
[4.5 0.  8.  0. ]
same sparse class: True
same dense values: True
- Confirm that the loaded object stays sparse.
The loaded type csr_array and the loaded format csr show that load_npz() restored a compressed sparse row array instead of a dense NumPy array. The shape and stored-value count should match the object passed to save_npz().
- Check the value round trip before handing the archive to later code.
same dense values: True confirms that converting both objects to dense arrays for this small test gives the same values. Use this kind of dense comparison only on sample-sized sparse arrays that fit in memory.
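For archives too large to densify, the values can be compared while staying sparse. The sketch below is one possible approach, not the only one: sparse_values_equal() is a hypothetical helper name, and the file name and temporary directory are assumptions. It relies on the != comparison between two sparse objects returning a sparse result that stores entries only where the operands differ.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def sparse_values_equal(a, b):
    """Hypothetical helper: compare values without building dense arrays."""
    if a.shape != b.shape:
        return False
    # a != b is itself sparse; it stores True only where the values differ.
    return bool((a != b).nnz == 0)


rows = np.array([0, 0, 1, 2, 2])
cols = np.array([0, 3, 1, 0, 2])
values = np.array([10.0, 2.5, 3.0, 4.5, 8.0])
features = sparse.coo_array((values, (rows, cols)), shape=(3, 4)).tocsr()

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name used only for this sketch.
    path = os.path.join(tmp_dir, "compare_demo.npz")
    sparse.save_npz(path, features)
    loaded = sparse.load_npz(path)

# A modified copy shows what a failed comparison looks like.
changed = features.copy()
changed[0, 0] = 11.0  # overwrite an existing stored entry

print("round trip equal:", sparse_values_equal(features, loaded))
print("changed copy equal:", sparse_values_equal(features, changed))
```

The shape guard comes first because the comparison is only meaningful between operands of the same shape.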
- Convert older sparse matrix archives when array semantics are required.
loaded = sparse.csr_array(sparse.load_npz("feature_matrix.npz"))
Older projects may still save csr_matrix or another sparse matrix class. Load the archive first, then convert when downstream code expects sparse arrays.
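The sketch below shows why the conversion can matter, using a small csr_matrix as a stand-in for an archive written by older code. The file name legacy_matrix.npz and the sample values are assumptions made for illustration. With the matrix class, the * operator performs matrix multiplication, while with the array class it multiplies element by element.

```python
import os
import tempfile

from scipy import sparse

# Stand-in for an archive written by older code that used the matrix API.
legacy = sparse.csr_matrix([[1.0, 2.0], [0.0, 3.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name used only for this sketch.
    path = os.path.join(tmp_dir, "legacy_matrix.npz")
    sparse.save_npz(path, legacy)
    loaded_matrix = sparse.load_npz(path)

# Convert once, right after loading, when array semantics are expected.
as_array = sparse.csr_array(loaded_matrix)

print(type(loaded_matrix).__name__, type(as_array).__name__)

# Matrix class: * is matrix multiplication.
print((loaded_matrix * loaded_matrix).toarray())

# Array class: * is elementwise multiplication; use @ for matrix products.
print((as_array * as_array).toarray())
print((as_array @ as_array).toarray())
```

Using @ for matrix products gives the same result with either class, which can make code less sensitive to which class an archive happens to contain.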
- Remove the demo script and archive if they were created only for this check.
$ rm sparse_save_load_demo.py feature_matrix.npz
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.