Optimizing a Logstash pipeline keeps events moving when ingest volume rises, filters become expensive, or downstream outputs slow down. Better tuning reduces queue growth, shortens end-to-end latency, and makes backlogs easier to clear before upstream shippers start buffering aggressively or dropping data.

Current Logstash releases expose the main tuning signals through the monitoring API: worker and batch settings, flow metrics, queue state, and plugin-level worker cost. Those measurements make it possible to tell whether the real limit is CPU-bound filter work, I/O-bound outputs, undersized batching, or a queue that is absorbing downstream delay.

Performance tuning is workload-specific, so the safest path is to measure a baseline, change one variable, and compare the same metrics under similar traffic before keeping the change. Examples assume a package-based Logstash installation on Linux with systemd: pipeline settings live in /etc/logstash/logstash.yml or /etc/logstash/pipelines.yml, and queue or JVM changes require a service restart.

Steps to optimize Logstash pipeline performance:

  1. Open a terminal on the Logstash host.
  2. Capture the current pipeline settings and a tuning baseline before changing anything.
    $ curl -s 'http://localhost:9600/?pretty'
    {
      "version" : "9.3.2",
      "name" : "perf-lab",
      "status" : "green",
      "pipeline" : {
        "workers" : 2,
        "batch_size" : 125,
        "batch_delay" : 50
      }
    }
    
    $ curl -s 'http://localhost:9600/_node/stats/pipelines/main?filter_path=pipelines.main.flow,pipelines.main.queue,pipelines.main.plugins.filters.id,pipelines.main.plugins.filters.flow.worker_utilization,pipelines.main.plugins.outputs.id,pipelines.main.plugins.outputs.flow.worker_utilization&pretty'
    {
      "pipelines" : {
        "main" : {
          "flow" : {
            "worker_utilization" : {
              "current" : 18.47
            },
            "queue_backpressure" : {
              "current" : 0.1954
            }
          },
          "plugins" : {
            "filters" : [ {
              "id" : "dissect_checkout",
              "flow" : {
                "worker_utilization" : {
                  "current" : 16.95
                }
              }
            } ],
            "outputs" : [ {
              "id" : "drop_output",
              "flow" : {
                "worker_utilization" : {
                  "current" : 0.67
                }
              }
            } ]
          },
          "queue" : {
            "type" : "memory",
            "events_count" : 0
          }
        }
      }
    }

    Current releases default api.enabled to true, bind the API to the local loopback address unless configured otherwise, and listen on the first free port in the 9600-9700 range. When the API is protected with TLS or basic authentication, adjust the URL, protocol, and credentials to match /etc/logstash/logstash.yml.

  3. Decide what kind of bottleneck the numbers show before changing worker or queue settings.

    When worker_utilization approaches 100, the pipeline is saturated and the filters or outputs are keeping all workers busy. When queue_backpressure stays above zero or a persistent queue keeps growing, inputs are being blocked by downstream work. A pipeline with low worker_utilization and flat throughput is usually underfed rather than under-tuned.

    Compare plugin-level worker_utilization and worker_millis_per_event to find the specific filter or output that is consuming most of the pipeline capacity.
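    When only the cumulative counters are at hand, the per-event cost can be sketched by hand; the numbers below are assumed sample values, not live API output.

```shell
# Sketch: derive a per-event worker cost from cumulative plugin counters.
# worker_millis and events_out are assumed sample values, not read from the API.
worker_millis=52000
events_out=400000
awk -v m="$worker_millis" -v e="$events_out" \
  'BEGIN { printf "worker_millis_per_event: %.3f\n", m / e }'
```

    A filter whose per-event cost dwarfs every other plugin's is the first candidate for the optimizations in the later steps.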

  4. Increase pipeline.workers and pipeline.batch.size only for the pipeline that needs more capacity.
    - pipeline.id: main
      path.config: "/etc/logstash/conf.d/*.conf"
      pipeline.workers: 4
      pipeline.batch.size: 200
      pipeline.batch.delay: 50

    Use pipeline.workers to expose more filter and output concurrency when the host still has CPU headroom. Larger pipeline.batch.size values can improve throughput for I/O-heavy outputs because each worker sends larger batches, but they also raise memory use and per-event wait time.

    The inflight count is the product of pipeline.workers and pipeline.batch.size. Doubling both settings quadruples the maximum number of in-memory events, so raise them in small steps and leave heap headroom available.

    On single-pipeline hosts, the same keys can be set globally in /etc/logstash/logstash.yml instead of /etc/logstash/pipelines.yml.
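    The arithmetic behind that warning can be sketched in shell; the average event size here is an assumption, so substitute a measured value from the batch byte_size metric.

```shell
# Sketch: maximum in-flight event count and a rough memory estimate.
# avg_event_bytes is an assumption; measure real sizes from the batch
# byte_size metric before trusting the result.
workers=4
batch_size=200
avg_event_bytes=1024
inflight=$((workers * batch_size))
echo "max in-flight events: $inflight"
echo "approx in-flight bytes: $((inflight * avg_event_bytes))"
```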

  5. Check whether the larger batch size is actually being filled.
    $ curl -s 'http://localhost:9600/_node/stats/pipelines/main?filter_path=pipelines.main.pipeline,pipelines.main.batch&pretty'
    {
      "pipelines" : {
        "main" : {
          "pipeline" : {
            "workers" : 4,
            "batch_size" : 200,
            "batch_delay" : 50
          },
          "batch" : {
            "event_count" : {
              "average" : {
                "lifetime" : 200
              }
            },
            "byte_size" : {
              "average" : {
                "lifetime" : 29800
              }
            }
          }
        }
      }
    }

    If the average batch event count stays far below the configured pipeline.batch.size, raising the batch limit further usually adds memory pressure without improving throughput.
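    A quick sketch of that check, using an assumed lifetime average of 178 events against a configured limit of 200:

```shell
# Sketch: express batch fill as a percentage of the configured limit.
# The average value is an assumed sample; read yours from the batch metrics.
configured=200
average=178
awk -v a="$average" -v c="$configured" \
  'BEGIN { printf "batch fill: %.0f%%\n", 100 * a / c }'
```

    A fill percentage that stays well under 100 suggests the inputs, not the batch limit, are the constraint.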

    Current releases sample batch metrics with pipeline.batch.metrics.sampling_mode, which defaults to minimal. Switch it to full only when more detailed sampling is worth the extra measurement cost.

  6. Reduce hot plugin cost before adding more workers.
    filter {
      if [event][dataset] == "app.access" {
        dissect {
          id => "dissect_app_access"
          mapping => {
            "message" => "%{ts} %{[log][level]} %{[service][name]} %{msg}"
          }
        }
      }
    }

    Use plugin id values so the monitoring API can identify the exact filter or output instance that is expensive. Fixed-format text is usually faster through dissect than grok, and conditionals keep heavy filters from running on every event.
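    For contrast, an equivalent grok filter for the same fixed-format line might look like the sketch below. The id and pattern choices are assumptions for illustration; grok's regex matching typically costs more per event than dissect on fixed layouts, which is why the dissect form above is preferred.

```
filter {
  if [event][dataset] == "app.access" {
    grok {
      id => "grok_app_access"
      match => {
        "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:[log][level]} %{NOTSPACE:[service][name]} %{GREEDYDATA:msg}"
      }
    }
  }
}
```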

  7. Move buffering to a persistent queue only when crash recovery or downstream outages matter more than raw throughput.
    queue.type: persisted
    queue.max_bytes: 8gb

    Memory queues are generally faster. Persistent queues trade some throughput for disk-backed buffering and durability when Elasticsearch, Kafka, or another output slows down.

    A persistent queue can become the new bottleneck if the disk is slow or the filesystem under path.queue runs out of space. A smaller queue.max_bytes can improve queue performance when only short buffering is required.
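    One way to pick a starting queue.max_bytes is to budget for a known outage window; every input below is a workload assumption, not a measured value.

```shell
# Sketch: size queue.max_bytes from an expected downstream outage.
# events_per_sec, avg_event_bytes, and outage_secs are all assumptions
# about the workload; replace them with measured figures.
events_per_sec=5000
avg_event_bytes=600
outage_secs=1800    # buffer a 30-minute outage
needed=$((events_per_sec * avg_event_bytes * outage_secs))
echo "minimum queue.max_bytes: $needed bytes (~$((needed / 1073741824)) GiB)"
```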

  8. Raise JVM heap only when the larger inflight count or plugin workload needs more memory.
    -Xms4g
    -Xmx4g

    Current Elastic JVM guidance recommends keeping -Xms and -Xmx equal and, for typical ingestion hosts, sizing heap in the 4 GB to 8 GB range instead of leaving the package default unchanged.

    Do not size heap past the host's physical memory. Leave room for the operating system, direct memory used by network inputs, and persistent-queue page cache when queue.type: persisted is enabled.
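    A rough headroom check, assuming a hypothetical 16 GB host and the heuristic of keeping at least half of physical memory outside the heap (a conservative assumption, not Elastic guidance):

```shell
# Sketch: sanity-check a candidate heap size against assumed host memory.
# heap_gb and host_gb are assumptions; read real totals from /proc/meminfo.
heap_gb=4
host_gb=16
if [ $((heap_gb * 2)) -le "$host_gb" ]; then
  echo "heap ${heap_gb}g leaves headroom on a ${host_gb}g host"
else
  echo "heap ${heap_gb}g is too large for a ${host_gb}g host"
fi
```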

  9. Validate the configuration before applying the tuning changes.
    $ sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash --path.data /tmp/logstash-configtest --config.test_and_exit
    Using bundled JDK: /usr/share/logstash/jdk
    Configuration OK
    [2026-04-08T00:38:54,046][INFO ][logstash.runner          ] Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash

    Run the check as the logstash service account so it uses the same permissions model as the packaged service, and point --path.data at a temporary directory so the check does not touch the live state under /var/lib/logstash.

    Current Logstash releases block superuser runs by default, so the same check fails as root unless allow_superuser was intentionally changed.

    Changes in /etc/logstash/logstash.yml, /etc/logstash/pipelines.yml, or jvm.options still require a full service restart even when automatic pipeline reload is enabled.

  10. Restart the Logstash service so the new settings take effect.
    $ sudo systemctl restart logstash.service
    $ sudo systemctl status logstash.service --no-pager --lines=0
    ● logstash.service - logstash
         Loaded: loaded (/usr/lib/systemd/system/logstash.service; enabled; preset: enabled)
         Active: active (running) since Tue 2026-04-08 00:47:18 UTC; 6s ago

    Restarting the service briefly stops every active pipeline. On busy production hosts, make the change during a controlled window or after confirming that upstream senders can buffer.

  11. Measure the same API metrics again under comparable traffic and keep only the changes that improve the original bottleneck.
    $ curl -s 'http://localhost:9600/_node/stats/pipelines/main?filter_path=pipelines.main.flow,pipelines.main.batch&pretty'
    {
      "pipelines" : {
        "main" : {
          "flow" : {
            "worker_utilization" : {
              "current" : 6.566
            },
            "queue_backpressure" : {
              "current" : 0.05706
            }
          },
          "batch" : {
            "event_count" : {
              "average" : {
                "lifetime" : 200
              }
            }
          }
        }
      }
    }

    Compare like with like: the same traffic shape, the same downstream cluster state, and only one changed tuning variable at a time. A lower queue_backpressure, a fuller average batch, or a lower plugin worker_millis_per_event is meaningful only if the workload is similar.

    If the output plugin still dominates worker time after worker and batch tuning, the real limit is usually downstream index throughput, storage latency, or network round-trip time rather than the Logstash filter chain.
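    Using the two queue_backpressure samples shown in the baseline and follow-up responses above, the relative improvement works out to:

```shell
# Sketch: quantify the change between the baseline and follow-up samples.
# Values are the queue_backpressure readings from the two API responses above.
before=0.1954
after=0.05706
awk -v b="$before" -v a="$after" \
  'BEGIN { printf "queue_backpressure reduced by %.0f%%\n", 100 * (b - a) / b }'
```

    Record the same calculation for each tuning change so regressions are obvious when traffic shifts.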