How to compute a bootstrap confidence interval with SciPy

Bootstrap confidence intervals estimate how much a sample statistic could move if the same measurement process were repeated. SciPy provides scipy.stats.bootstrap() for this job, which is useful when a mean, median, correlation, or custom statistic needs interval bounds without relying only on a normal approximation.

bootstrap() takes the data as a sequence of arrays, together with a callable that computes the statistic. Even for a single sample, the data argument must be a one-item tuple such as (latency_ms,), which lets SciPy use the same interface for one-sample and multi-sample statistics.

Pass a seeded NumPy Generator through the rng parameter to make the resampling repeatable; in SciPy 1.15 and later, rng replaces the older random_state keyword. The function returns a BootstrapResult object with confidence_interval, bootstrap_distribution, and standard_error attributes, so the same result object can feed printed reports or downstream Python code.

Steps to compute a bootstrap confidence interval with SciPy:

  1. Create a Python script that computes the interval for a sample mean.
    bootstrap_ci.py
    import numpy as np
    from scipy.stats import bootstrap
     
    latency_ms = np.array([
        117, 121, 113, 125,
        119, 128, 115, 122,
        120, 118, 124, 116,
    ])
    rng = np.random.default_rng(20260625)
     
     
    def mean_latency(sample, axis):
        return np.mean(sample, axis=axis)
     
     
    result = bootstrap(
        (latency_ms,),
        mean_latency,
        confidence_level=0.95,
        n_resamples=9999,
        method="BCa",
        rng=rng,
    )
     
    ci = result.confidence_interval
    sample_mean = np.mean(latency_ms)
     
    print(f"sample mean: {sample_mean:.2f} ms")
    print(f"95% BCa confidence interval: {ci.low:.2f} to {ci.high:.2f} ms")
    print(f"bootstrap standard error: {result.standard_error:.2f} ms")
    print(f"resamples: {result.bootstrap_distribution.shape[-1]}")
    print(f"sample mean inside interval: {ci.low < sample_mean < ci.high}")

    The statistic function accepts axis so bootstrap() can evaluate resamples in vectorized batches instead of calling the function once per resample.
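To make the axis contract concrete, the sketch below calls the same statistic on a hand-built batch of resamples using plain NumPy. The batch size of 5 and the manual index drawing are assumptions for illustration; bootstrap() performs its own resampling internally.

```python
import numpy as np


def mean_latency(sample, axis):
    return np.mean(sample, axis=axis)


latency_ms = np.array([117, 121, 113, 125, 119, 128, 115, 122, 120, 118, 124, 116])
rng = np.random.default_rng(0)

# Draw 5 resamples by hand: each row is one resample with replacement.
# (This only mimics a batch layout; it is not SciPy's internal code.)
indices = rng.integers(0, latency_ms.size, size=(5, latency_ms.size))
batch = latency_ms[indices]  # shape (5, 12)

# One vectorized call reduces the last axis and yields one mean per resample.
batch_means = mean_latency(batch, axis=-1)  # shape (5,)

# The same values, computed with one call per resample.
loop_means = np.array([mean_latency(row, axis=0) for row in batch])

print(batch.shape, batch_means.shape)
print(np.allclose(batch_means, loop_means))
```

A statistic that ignores axis, or always reduces over every axis, would collapse the whole batch into a single number instead of one value per resample.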

  2. Run the script to verify the interval fields and resample count.
    $ python bootstrap_ci.py
    sample mean: 119.83 ms
    95% BCa confidence interval: 117.58 to 122.25 ms
    bootstrap standard error: 1.20 ms
    resamples: 9999
    sample mean inside interval: True

  3. Read the interval bounds from the confidence_interval attribute when another script needs numeric values.
    lower = result.confidence_interval.low
    upper = result.confidence_interval.high

    bootstrap_distribution.shape[-1] matches the requested n_resamples value unless an earlier result was passed through the bootstrap_result parameter, in which case the new resamples are added to the existing distribution.

  4. Change only the statistic function when the interval should cover another statistic from the same sample.
    def median_latency(sample, axis):
        return np.median(sample, axis=axis)

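    BCa intervals can return nan bounds when the bootstrap distribution is degenerate, for example when every observation in the sample is identical and every resample therefore yields the same statistic value. SciPy issues a DegenerateDataWarning in that case.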