How to interpolate scattered data with SciPy

Irregular point measurements often need values at coordinates where no sensor, survey, or simulation sample exists. SciPy can estimate those missing values from unstructured coordinate pairs without forcing the samples onto a rectangular grid first.

scipy.interpolate.griddata() takes an (n, D) array of sample coordinates, an array with one value per sample, and a separate set of query coordinates. The linear method builds a Delaunay triangulation of the samples and interpolates barycentrically inside each simplex, while the nearest method returns the value of the closest sample and suits cases where a nearest-sample answer is acceptable.

The important boundary is the convex hull of the measured points. With the linear and cubic methods, griddata() returns NaN for query points outside that hull unless fill_value is set to another value. The nearest method ignores fill_value, because a closest input point always exists.
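One way to see which query points fall inside the hull before interpolating is to triangulate the samples directly. The sketch below assumes the same five sample coordinates used later in this guide and uses scipy.spatial.Delaunay, which this guide does not otherwise rely on; the variable names are illustrative.

```python
import numpy as np
from scipy.spatial import Delaunay

# Assumed sample coordinates: the unit-square corners plus one interior point.
points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
query_points = np.array([
    [0.25, 0.25],  # inside the hull
    [1.20, 0.40],  # outside the hull
])

# find_simplex returns the index of the simplex containing each query point,
# or -1 when the point lies outside the triangulation (and so outside the hull).
triangulation = Delaunay(points)
inside_hull = triangulation.find_simplex(query_points) >= 0

print("inside hull:", inside_hull)
```

A mask like this makes the outside-hull cases explicit instead of leaving them to be discovered as NaN values afterwards.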

Steps to interpolate scattered data with SciPy:

Create a Python script named scattered_data_interpolate.py.

scattered_data_interpolate.py

```
import numpy as np
from scipy.interpolate import LinearNDInterpolator, griddata

np.set_printoptions(precision=2, suppress=True)

# Scattered sample coordinates: one (x, y) row per measurement.
points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
# Sample values taken from the plane z = x + 2*y.
values = points[:, 0] + 2 * points[:, 1]

# Two queries inside the convex hull and one outside it.
query_points = np.array([
    [0.25, 0.25],
    [0.75, 0.50],
    [1.20, 0.40],
])

linear = griddata(points, values, query_points, method="linear")
nearest = griddata(points, values, query_points, method="nearest")
with_fill = griddata(points, values, query_points, method="linear", fill_value=-1.0)

# Reusable interpolator object for repeated linear queries.
reusable_linear = LinearNDInterpolator(points, values)
# Exact values of z at the two inside-hull query points.
inside_expected = query_points[:2, 0] + 2 * query_points[:2, 1]

print("query points:")
for row in query_points:
    print(f"  ({row[0]:.2f}, {row[1]:.2f})")
print("linear griddata:", np.round(linear, 2))
print("nearest griddata:", np.round(nearest, 2))
print("linear fill_value:", np.round(with_fill, 2))
print("callable linear:", np.round(reusable_linear(query_points[:2]), 2))
print("expected inside:", np.round(inside_expected, 2))
print("inside match:", np.allclose(linear[:2], inside_expected))
print("outside is nan:", np.isnan(linear[2]))
```

The sample values come from z = x + 2*y, so the first two interpolated query values are easy to check.

Run the script.

```
$ python3 scattered_data_interpolate.py
query points:
  (0.25, 0.25)
  (0.75, 0.50)
  (1.20, 0.40)
linear griddata: [0.75 1.75  nan]
nearest griddata: [0.9 0.9 1. ]
linear fill_value: [ 0.75  1.75 -1.  ]
callable linear: [0.75 1.75]
expected inside: [0.75 1.75]
inside match: True
outside is nan: True
```

Keep points as one coordinate row per measured sample.

```
points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
values = points[:, 0] + 2 * points[:, 1]
```

The values array must have one value for each row in points.
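Measurements often arrive as separate coordinate and value columns rather than as a ready-made points array. The following is a minimal sketch of assembling and checking the inputs, assuming three parallel 1-D arrays named x, y, and z (hypothetical names chosen for illustration).

```python
import numpy as np

# Hypothetical raw measurements stored as three parallel 1-D arrays.
x = np.array([0.0, 1.0, 0.0, 1.0, 0.5])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.2])
z = x + 2 * y

# Stack the coordinates column-wise into the (n, D) layout griddata() expects.
points = np.column_stack([x, y])
values = np.asarray(z, dtype=float)

# Fail early if the samples and values are out of step.
if points.ndim != 2 or values.shape[0] != points.shape[0]:
    raise ValueError(
        f"expected one value per point, got {points.shape} and {values.shape}"
    )

print(points.shape, values.shape)
```

Checking the shapes up front tends to give a clearer error than letting a mismatch surface inside the interpolation call.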

Pass query coordinates as another two-column array.
```
query_points = np.array([
    [0.25, 0.25],
    [0.75, 0.50],
    [1.20, 0.40],
])
```
Each query row uses the same coordinate order as points.
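Query rows do not have to be typed by hand. As a sketch, a regular evaluation grid can be flattened into the same two-column layout and the result reshaped afterwards. The grid extent and resolution below are assumptions chosen to stay inside the sampled square.

```python
import numpy as np
from scipy.interpolate import griddata

points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
values = points[:, 0] + 2 * points[:, 1]

# Hypothetical evaluation grid: 5 columns by 3 rows inside the sampled square.
grid_x, grid_y = np.meshgrid(
    np.linspace(0.1, 0.9, 5),
    np.linspace(0.1, 0.9, 3),
)

# One (x, y) row per grid node, in the same column order as points.
query_points = np.column_stack([grid_x.ravel(), grid_y.ravel()])

# Interpolate, then restore the grid shape for plotting or further processing.
grid_values = griddata(points, values, query_points, method="linear")
grid_values = grid_values.reshape(grid_x.shape)

print(grid_values.shape)
```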

Use method="linear" for interpolation inside the measured hull.
```
linear = griddata(points, values, query_points, method="linear")
```
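The first two query points lie inside the unit square spanned by the samples. Because the sample values come from the plane z = x + 2*y, piecewise-linear interpolation reproduces them exactly: 0.75 and 1.75.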
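The exact match is a property of the planar test function, not of linear interpolation in general. The sketch below compares a plane with a curved surface on an assumed random sample set; the seed, the sample count, and the helper name max_abs_error are illustrative choices.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Hypothetical samples: the unit-square corners plus 40 random interior points.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
points = np.vstack([corners, rng.random((40, 2))])

# Queries kept away from the edges so every one lies inside the hull.
query_points = rng.uniform(0.1, 0.9, size=(50, 2))


def max_abs_error(func):
    """Largest gap between the linear interpolant and the true function."""
    values = func(points[:, 0], points[:, 1])
    estimate = griddata(points, values, query_points, method="linear")
    truth = func(query_points[:, 0], query_points[:, 1])
    return float(np.max(np.abs(estimate - truth)))


# A plane is reproduced up to rounding error; a curved surface is not.
plane_error = max_abs_error(lambda x, y: x + 2 * y)
curved_error = max_abs_error(lambda x, y: x**2 + y**2)

print(f"plane error:  {plane_error:.2e}")
print(f"curved error: {curved_error:.2e}")
```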

Set fill_value when linear results outside the hull need a sentinel value instead of NaN.
```
with_fill = griddata(points, values, query_points, method="linear", fill_value=-1.0)
```
The third query point, (1.20, 0.40), has an x coordinate beyond the largest sampled x of 1.0, so it lies outside the hull. Linear interpolation returns NaN there by default, or -1.0 with the explicit fill value.
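The default NaN can also serve as a mask for deciding what to do with outside-hull queries. The sketch below shows one possible policy, falling back to the nearest sample value, and assumes the input values themselves contain no NaN. Whether that fallback is appropriate depends on the data.

```python
import numpy as np
from scipy.interpolate import griddata

points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
values = points[:, 0] + 2 * points[:, 1]
query_points = np.array([
    [0.25, 0.25],
    [0.75, 0.50],
    [1.20, 0.40],
])

linear = griddata(points, values, query_points, method="linear")
nearest = griddata(points, values, query_points, method="nearest")

# NaN marks the queries that fell outside the convex hull.
outside = np.isnan(linear)

# Illustrative policy: keep linear results inside the hull and use the
# nearest sample value elsewhere.
combined = np.where(outside, nearest, linear)

print("outside hull:", outside)
print("combined:", combined)
```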

Use method="nearest" only when reusing the closest sample value is acceptable.
```
nearest = griddata(points, values, query_points, method="nearest")
```
Nearest-neighbor interpolation returns a value even for query points outside the measured hull, so a finite result does not prove that a query point is surrounded by samples.
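Since a nearest-neighbor result carries no information about how far away the chosen sample is, it can help to look at that distance explicitly. The sketch below uses scipy.spatial.KDTree for the lookup; the 0.3 distance threshold is a hypothetical tolerance, not a recommended value.

```python
import numpy as np
from scipy.spatial import KDTree

points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
values = points[:, 0] + 2 * points[:, 1]
query_points = np.array([
    [0.25, 0.25],
    [0.75, 0.50],
    [1.20, 0.40],
])

# Distance to, and index of, the closest sample for each query point.
tree = KDTree(points)
distances, indices = tree.query(query_points)
nearest_values = values[indices]

# Hypothetical tolerance: flag queries whose closest sample is too far away.
max_distance = 0.3
trusted = distances <= max_distance

print("nearest values:", nearest_values)
print("distances:", np.round(distances, 3))
print("trusted:", trusted)
```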

Create a reusable LinearNDInterpolator for repeated linear queries.
```
reusable_linear = LinearNDInterpolator(points, values)
reusable_linear(query_points[:2])
```
griddata() is a convenience function. LinearNDInterpolator keeps the interpolator object available when the same scattered dataset will be queried more than once.
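As a sketch of that reuse, the interpolator can be built once and then called on several query batches. The batches and the -1.0 sentinel below are illustrative; fill_value is passed to the constructor here rather than to each call.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

points = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
])
values = points[:, 0] + 2 * points[:, 1]

# Triangulate once; fill_value applies to every later outside-hull query.
interpolator = LinearNDInterpolator(points, values, fill_value=-1.0)

# Hypothetical query batches arriving at different times.
batches = [
    np.array([[0.25, 0.25], [0.75, 0.50]]),
    np.array([[0.10, 0.90], [1.20, 0.40]]),
]

# Each call reuses the existing triangulation instead of rebuilding it.
results = [interpolator(batch) for batch in batches]

for batch_result in results:
    print(batch_result)
```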

Remove the demo script if it was created only for this check.
```
$ rm scattered_data_interpolate.py
```