How to create a spectrogram with SciPy

A spectrogram turns a time-domain signal into a frequency-by-time view, which makes changing tones visible in audio, vibration, and sensor measurements. SciPy can calculate the power matrix with ShortTimeFFT.spectrogram(), while Matplotlib can save the plot for inspection or reports.

For new SciPy code, use ShortTimeFFT instead of the legacy scipy.signal.spectrogram() function. The ShortTimeFFT object stores the sample rate, window length, overlap, and scaling once, then reuses those settings for the frequency bins, time slices, and spectrogram values.

A synthetic two-second signal that changes from 50 Hz to 150 Hz makes the output easy to verify. The script checks the spectrogram dimensions, confirms the dominant frequency before and after the transition, and writes spectrogram.png for visual review.

Steps to create a spectrogram with SciPy:

  1. Create a script named spectrogram_create.py that builds a two-tone signal and saves the spectrogram plot.
    spectrogram_create.py
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.signal import ShortTimeFFT
     
    sample_rate = 800.0
    duration = 2.0
    sample_count = int(sample_rate * duration)
    t = np.arange(sample_count) / sample_rate
     
    signal = np.where(
        t < 1.0,
        np.sin(2 * np.pi * 50 * t),
        np.sin(2 * np.pi * 150 * t),
    )
     
    nperseg = 128
    noverlap = 96
     
    sft = ShortTimeFFT.from_window(
        "hann",
        fs=sample_rate,
        nperseg=nperseg,
        noverlap=noverlap,
        scale_to="psd",
    )
     
    power = sft.spectrogram(signal)
    times = sft.t(signal.size)
    freqs = sft.f
     
    early_columns = (times >= 0.1) & (times < 0.8)
    late_columns = (times > 1.2) & (times <= 1.9)
    early_profile = power[:, early_columns].mean(axis=1)
    late_profile = power[:, late_columns].mean(axis=1)
     
    power_db = 10 * np.log10(np.maximum(power, 1e-12))
     
    fig, ax = plt.subplots(figsize=(7, 4))
    mesh = ax.pcolormesh(times, freqs, power_db, shading="auto")
    ax.set_title("SciPy ShortTimeFFT spectrogram")
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Frequency (Hz)")
    ax.set_ylim(0, 220)
    fig.colorbar(mesh, ax=ax, label="PSD (dB)")
    fig.tight_layout()
    fig.savefig("spectrogram.png", dpi=150)
    plt.close(fig)
     
    print(f"signal samples: {signal.size}")
    print(f"frequency bins: {freqs.size}")
    print(f"time slices: {times.size}")
    print(f"spectrogram shape: {power.shape}")
    print(f"dominant early frequency: {freqs[np.argmax(early_profile)]:.1f} Hz")
    print(f"dominant late frequency: {freqs[np.argmax(late_profile)]:.1f} Hz")
    print("saved plot: spectrogram.png")

    nperseg sets the window length in samples, and noverlap controls how many samples adjacent windows share. scale_to=“psd” scales each column as a power spectral density before spectrogram() squares the short-time Fourier transform.

  2. Run the script and confirm the time-frequency output.
    $ python3 spectrogram_create.py
    signal samples: 1600
    frequency bins: 65
    time slices: 53
    spectrogram shape: (65, 53)
    dominant early frequency: 50.0 Hz
    dominant late frequency: 150.0 Hz
    saved plot: spectrogram.png

    The first axis of power contains frequency bins, and the last axis contains time slices. The dominant-frequency checks show the 50 Hz first half and the 150 Hz second half in the calculated spectrogram.

  3. Replace the synthetic signal with measured samples from the project.
    sample_rate = 48000.0
    signal = samples.astype(float)

    Use one channel for a one-dimensional spectrogram. For multichannel arrays, set sample_axis and pass it with sft.spectrogram(samples, axis=sample_axis) so the time dimension is transformed.

  4. Tune the window length and overlap for the signal being inspected.
    nperseg = 1024
    noverlap = 768
    sft = ShortTimeFFT.from_window(
        "hann",
        fs=sample_rate,
        nperseg=nperseg,
        noverlap=noverlap,
        scale_to="psd",
    )

    A longer window separates nearby frequencies more clearly but smears quick changes in time. More overlap creates more time slices, but it does not add new samples to the signal.

  5. Keep the decibel conversion for plots with a wide power range.
    power_db = 10 * np.log10(np.maximum(power, 1e-12))

    The np.maximum() guard prevents log10(0) from producing infinite values when a time-frequency bin has zero power.

  6. Remove the sample files after copying the pattern into the project.
    $ rm spectrogram_create.py spectrogram.png