Boolean masks turn array comparisons into a selection rule that NumPy can apply without a Python loop. They are useful when the values kept from an array depend on thresholds, validity flags, row-level quality checks, or any condition that should stay tied to the data.
A mask is an array of True and False values. When the mask has the same shape as the target array, array[mask] returns a one-dimensional copy containing the values whose positions are True. When a one-dimensional mask is applied to a two-dimensional array, it filters rows on the first axis and keeps the remaining columns for each selected row.
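The two cases above can be sketched with a small array. This is a minimal illustration, not part of the script used in the steps; the names grid, same_shape_mask, and row_mask are made up for the example.

```python
import numpy as np

# Illustrative 2 x 3 array (values chosen only for this example).
grid = np.array([[1, 8, 3],
                 [9, 2, 7]])

# Case 1: the mask has the same shape as the array.
# The result is a flat 1-D array of the selected values, in row-major order.
same_shape_mask = grid > 5
selected_values = grid[same_shape_mask]

# Case 2: a 1-D mask with one entry per row.
# Whole rows are kept, so the result is still two-dimensional.
row_mask = np.array([False, True])
selected_rows = grid[row_mask]

print("selected values:", selected_values.tolist())
print("selected rows shape:", selected_rows.shape)
```

The same-shape mask loses the original two-dimensional layout because the selected positions no longer form a rectangle, while the row mask preserves the column count.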
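Wrap each comparison in parentheses before combining them with & (element-wise AND) or | (element-wise OR), because both operators bind more tightly than comparisons such as >= in Python. The Python keywords "and" and "or" do not combine NumPy boolean arrays element by element; they raise a ValueError because the truth value of a multi-element array is ambiguous. A mask with the wrong length raises an IndexError instead of silently trimming the data.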
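A short sketch of combining masks, assuming a made-up readings array. It also shows the ~ operator for inverting a mask, which the text above does not cover, and captures the error raised when the Python keyword "and" is used on arrays.

```python
import numpy as np

# Illustrative data; the variable names are assumptions for this sketch.
readings = np.array([12, 25, 31, 8, 27])

# Each comparison is parenthesized before it is combined.
in_range = (readings >= 10) & (readings <= 30)   # element-wise AND
extreme = (readings < 10) | (readings > 30)      # element-wise OR
not_extreme = ~extreme                           # element-wise NOT

print("in range:", readings[in_range].tolist())
print("extreme:", readings[extreme].tolist())

# The Python keyword "and" asks for a single truth value per operand,
# which NumPy refuses to provide for a multi-element array.
try:
    (readings >= 10) and (readings <= 30)
    keyword_error = None
except ValueError as error:
    keyword_error = str(error)

print("keyword error:", keyword_error)
```

Here in_range and not_extreme select the same elements, which is a quick way to check that an inverted condition was written correctly.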
Related: Filter NaN values
Related: Replace values conditionally
Related: Index and slice arrays
Steps to filter a NumPy array with a boolean mask:
- Create a script that filters individual values, filters table rows, and checks a mismatched mask.
- array-filter-boolean-mask.py
import numpy as np

temperatures = np.array([18.5, 21.0, 26.5, 31.0, 24.0])
hot_day_mask = temperatures >= 24
hot_days = temperatures[hot_day_mask]

samples = np.array(
    [
        [18.5, 0.91],
        [26.5, 0.98],
        [31.0, 0.99],
        [21.0, 0.87],
    ]
)
valid_hot_rows = (samples[:, 0] >= 24) & (samples[:, 1] >= 0.95)
selected_rows = samples[valid_hot_rows]

assert hot_day_mask.shape == temperatures.shape
assert valid_hot_rows.shape == (samples.shape[0],)
assert hot_days.tolist() == [26.5, 31.0, 24.0]
assert selected_rows.shape == (2, 2)

print("temperatures:", temperatures)
print("hot day mask:", hot_day_mask)
print("hot days:", hot_days)
print("hot day count:", hot_day_mask.sum())
print("row mask shape:", valid_hot_rows.shape)
print("selected rows:")
for row in selected_rows:
    print(" ", row.tolist())

try:
    temperatures[np.array([True, False])]
except IndexError as error:
    print("shape error:", error)
hot_day_mask has one value per temperature. valid_hot_rows has one value per row, so samples[valid_hot_rows] keeps complete rows.
- Run the script and confirm the selected values, row mask shape, and indexing error.
$ python3 array-filter-boolean-mask.py
temperatures: [18.5 21.  26.5 31.  24. ]
hot day mask: [False False  True  True  True]
hot days: [26.5 31.  24. ]
hot day count: 3
row mask shape: (4,)
selected rows:
  [26.5, 0.98]
  [31.0, 0.99]
shape error: boolean index did not match indexed array along axis 0; size of axis is 5 but size of corresponding boolean axis is 2
The assertions stop the script if the value mask, row mask, filtered values, or selected row shape no longer match the intended filter.
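When the same checks are needed in more than one place, they can be moved into a small function. The sketch below is one possible arrangement, not part of the script above: filter_rows is a hypothetical helper name, and the choice of TypeError and ValueError for the failures is an assumption made for illustration.

```python
import numpy as np


def filter_rows(table, row_mask):
    """Hypothetical helper: keep the rows of `table` where `row_mask` is True."""
    table = np.asarray(table)
    row_mask = np.asarray(row_mask)

    # An integer array of 0s and 1s would be treated as positions to pick,
    # not as a mask, so require a boolean dtype explicitly.
    if row_mask.dtype != np.bool_:
        raise TypeError("row_mask must be a boolean array")

    # Require exactly one mask entry per row on the first axis.
    if row_mask.shape != (table.shape[0],):
        raise ValueError(
            f"row_mask shape {row_mask.shape} does not match "
            f"{table.shape[0]} rows"
        )

    return table[row_mask]


# Same sample data as the script in the steps above.
samples = np.array(
    [
        [18.5, 0.91],
        [26.5, 0.98],
        [31.0, 0.99],
        [21.0, 0.87],
    ]
)
kept = filter_rows(samples, (samples[:, 0] >= 24) & (samples[:, 1] >= 0.95))
print(kept.tolist())
```

Checking the dtype guards against a common slip: indexing with np.array([0, 1, 1, 0]) selects rows 0 and 1 by position, repeating row 1, instead of filtering.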
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.