Simulating an Elasticsearch ingest pipeline reveals how processors transform a document before indexing, helping catch grok mismatches, date parsing issues, and unexpected field names before they turn into mapping errors.
Ingest pipelines run on ingest-capable nodes and apply a sequence of processors (such as grok, set, rename, and date) to each incoming document. The _simulate API sends one or more sample documents through a stored pipeline and returns the transformed documents as JSON without writing anything to an index.
Simulation output shows the post-processor document, not the final indexed result, so mapping and analysis are not applied. Many environments also enable TLS and authentication, so the same request may require HTTPS, credentials, and a CA certificate depending on cluster security settings.
Steps to simulate an Elasticsearch ingest pipeline:
- Run a basic simulation against the pipeline ID with a sample document.
$ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{ "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } } ] }'
{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "msg" : "app started",
          "message" : "2025-01-22T10:15:00Z INFO app started",
          "level" : "INFO",
          "timestamp" : "2025-01-22T10:15:00Z"
        }
      }
    }
  ]
}
Secured clusters typically require `https://` endpoints, authentication (basic auth or an API key), and `--cacert` when using a private CA.
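On a secured cluster, only the transport details change. A minimal sketch, assuming basic auth with placeholder credentials (`elastic:changeme`) and a hypothetical CA certificate path:

```shell
# Request body for the simulation, identical to the plain-HTTP example.
BODY='{ "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } } ] }'

# --cacert trusts the private CA; -u supplies basic-auth credentials.
# Credentials and the CA path below are placeholders for this sketch.
curl -s --connect-timeout 5 \
  --cacert /etc/elasticsearch/certs/http_ca.crt \
  -u elastic:changeme \
  -H "Content-Type: application/json" \
  -X POST "https://localhost:9200/_ingest/pipeline/parse-logs/_simulate?pretty" \
  -d "$BODY" \
  || echo "request failed (no reachable cluster in this sketch)"
```

An API key can replace basic auth by swapping `-u` for an `Authorization: ApiKey …` header.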
- Simulate multiple documents in one request to compare edge cases.
$ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{ "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } } ], "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }, { "_source": { "message": "2025-01-22T10:16:12Z WARN cache miss" } } ] }'
{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "msg" : "app started",
          "message" : "2025-01-22T10:15:00Z INFO app started",
          "level" : "INFO",
          "timestamp" : "2025-01-22T10:15:00Z"
        }
      }
    },
    {
      "doc" : {
        "_source" : {
          "msg" : "cache miss",
          "message" : "2025-01-22T10:16:12Z WARN cache miss",
          "level" : "WARN",
          "timestamp" : "2025-01-22T10:16:12Z"
        }
      }
    }
  ]
}
- Run the simulation with verbose processor output to show intermediate results for each processor.
$ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.status,docs.processor_results.doc._source&pretty" -d '{ "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } } ] }'
{
  "docs" : [
    {
      "processor_results" : [
        {
          "processor_type" : "grok",
          "status" : "success",
          "doc" : {
            "_source" : {
              "msg" : "app started",
              "message" : "2025-01-22T10:15:00Z INFO app started",
              "level" : "INFO",
              "timestamp" : "2025-01-22T10:15:00Z"
            }
          }
        }
      ]
    }
  ]
}
Verbose output highlights the processor responsible for each change and isolates the first failing processor when a document errors.
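To exercise the failure path, feed the same verbose request a line the grok pattern cannot match; the `status` and error details in the response then point at the failing processor. A sketch (output depends on the cluster, so none is shown):

```shell
# A line that will not match %{TIMESTAMP_ISO8601} at the start,
# forcing the grok processor to fail in the verbose results.
curl -s -H "Content-Type: application/json" \
  -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&pretty" \
  -d '{ "docs": [ { "_source": { "message": "not a structured log line" } } ] }' \
  || echo "request failed (no reachable cluster in this sketch)"
```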
- Fetch the pipeline definition before adjusting processors.
$ curl -s "http://localhost:9200/_ingest/pipeline/parse-logs?pretty"
{
  "parse-logs" : {
    "description" : "Parse simple timestamp/level/message log lines",
    "processors" : [
      {
        "grok" : {
          "field" : "message",
          "patterns" : [
            "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
          ]
        }
      }
    ]
  }
}
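An edited copy of the definition can then be tested without overwriting the stored pipeline, because `POST /_ingest/pipeline/_simulate` also accepts the pipeline inline in the request body. A sketch that appends a `lowercase` processor for the extracted `level` field (an experimental addition, not part of the stored pipeline):

```shell
# Simulate an in-request pipeline definition; nothing is stored.
curl -s -H "Content-Type: application/json" \
  -X POST "http://localhost:9200/_ingest/pipeline/_simulate?pretty" \
  -d '{
    "pipeline": {
      "description": "parse-logs plus a lowercase step (experiment only)",
      "processors": [
        { "grok": { "field": "message", "patterns": [ "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" ] } },
        { "lowercase": { "field": "level" } }
      ]
    },
    "docs": [ { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } } ]
  }' \
  || echo "request failed (no reachable cluster in this sketch)"
```

Once the inline version behaves as expected, the same definition can be saved with `PUT /_ingest/pipeline/parse-logs`.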
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
