Simulating an Elasticsearch ingest pipeline shows how each processor rewrites a document before it is indexed, which is the safest way to catch broken grok patterns, wrong field names, and unexpected output before live traffic starts failing.
Ingest pipelines are stored in cluster state and run on ingest-capable nodes during indexing. The _ingest/pipeline/<id>/_simulate endpoint tests an existing pipeline by ID, while _ingest/pipeline/_simulate accepts a pipeline definition in the request body so draft processor changes can be tested without saving them first.
Simulation returns the document as it stands after the last processor runs, not the final indexed result, so mappings, analyzers, and index templates are not applied. On secured clusters, simulate requests go to an authenticated HTTPS endpoint with the same credentials or API key used for other read requests.
$ curl -sS "http://localhost:9200/_ingest/pipeline/parse-app-logs?filter_path=*.description,*.processors&pretty"
{
  "parse-app-logs" : {
    "description" : "Parse application log lines and normalize level",
    "processors" : [
      {
        "grok" : {
          "field" : "message",
          "patterns" : [
            "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:level} %{GREEDYDATA:event.original}"
          ]
        }
      },
      {
        "lowercase" : {
          "field" : "level"
        }
      },
      {
        "set" : {
          "field" : "service.name",
          "value" : "payments-api"
        }
      }
    ]
  }
}
Fetching the stored pipeline first confirms the pipeline ID, processor order, and field names that the simulation response should reflect.
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-app-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
  "docs": [
    { "_source": { "message": "2026-04-02T08:15:00Z INFO card authorized" } }
  ]
}'
{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "@timestamp" : "2026-04-02T08:15:00Z",
          "message" : "2026-04-02T08:15:00Z INFO card authorized",
          "event" : {
            "original" : "card authorized"
          },
          "level" : "info",
          "service" : {
            "name" : "payments-api"
          }
        }
      }
    }
  ]
}
On secured clusters, replace http://localhost:9200 with the real HTTPS endpoint and add authentication such as --user username:password or -H "Authorization: ApiKey BASE64_KEY".
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-app-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
  "docs": [
    { "_source": { "message": "2026-04-02T08:15:00Z INFO card authorized" } },
    { "_source": { "message": "2026-04-02T08:16:12Z WARN retry queued" } }
  ]
}'
{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "@timestamp" : "2026-04-02T08:15:00Z",
          "message" : "2026-04-02T08:15:00Z INFO card authorized",
          "event" : {
            "original" : "card authorized"
          },
          "level" : "info",
          "service" : {
            "name" : "payments-api"
          }
        }
      }
    },
    {
      "doc" : {
        "_source" : {
          "@timestamp" : "2026-04-02T08:16:12Z",
          "message" : "2026-04-02T08:16:12Z WARN retry queued",
          "event" : {
            "original" : "retry queued"
          },
          "level" : "warn",
          "service" : {
            "name" : "payments-api"
          }
        }
      }
    }
  ]
}
Use the same request to test normal, borderline, and malformed payloads so processor behavior can be compared without attaching the pipeline to live indexing traffic.
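For larger test corpora, the docs array can be generated rather than typed by hand. The sketch below builds the simulate request body from a list of normal, borderline, and deliberately malformed messages; the sample lines are invented for illustration:

```python
import json

# Sample corpus: one normal line, one borderline line (a less common but
# valid LOGLEVEL), and one malformed line the grok pattern will not match.
samples = [
    "2026-04-02T08:15:00Z INFO card authorized",
    "2026-04-02T08:16:12Z TRACE retry queued",
    "not a log line at all",
]

# Request body for POST /_ingest/pipeline/parse-app-logs/_simulate.
body = {"docs": [{"_source": {"message": m}} for m in samples]}
print(json.dumps(body, indent=2))
```

Documents that fail a processor come back with an error object in place of a rewritten _source, so all three cases can be compared in a single response.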
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/_simulate?filter_path=docs.doc._source&pretty" -d '{
  "pipeline": {
    "description": "Draft pipeline test",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:level} %{GREEDYDATA:event.original}"
          ]
        }
      },
      {
        "lowercase": {
          "field": "level"
        }
      },
      {
        "set": {
          "field": "labels.stage",
          "value": "staging"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "2026-04-02T08:20:00Z ERROR db timeout" } }
  ]
}'
{
  "docs" : [
    {
      "doc" : {
        "_source" : {
          "@timestamp" : "2026-04-02T08:20:00Z",
          "message" : "2026-04-02T08:20:00Z ERROR db timeout",
          "event" : {
            "original" : "db timeout"
          },
          "level" : "error",
          "labels" : {
            "stage" : "staging"
          }
        }
      }
    }
  ]
}
The request-body pipeline object is useful for trying a processor change before storing it with PUT /_ingest/pipeline/<id>.
If the request also includes a pipeline ID in the URL path, Elasticsearch uses the path pipeline and ignores the body pipeline definition.
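One way to keep the draft and the eventual stored pipeline in sync is to hold the definition in a single object and reuse it for both requests. A minimal sketch, mirroring the draft-pipeline example above:

```python
import json

# Draft pipeline definition, matching the request-body example above.
draft_pipeline = {
    "description": "Draft pipeline test",
    "processors": [
        {
            "grok": {
                "field": "message",
                "patterns": [
                    "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:level} %{GREEDYDATA:event.original}"
                ],
            }
        },
        {"lowercase": {"field": "level"}},
        {"set": {"field": "labels.stage", "value": "staging"}},
    ],
}

# Body for POST /_ingest/pipeline/_simulate: the definition plus sample docs.
simulate_body = {
    "pipeline": draft_pipeline,
    "docs": [{"_source": {"message": "2026-04-02T08:20:00Z ERROR db timeout"}}],
}

# Body for PUT /_ingest/pipeline/<id> once the draft behaves as expected:
# the same definition, without the docs wrapper.
put_body = json.dumps(draft_pipeline)
```

Serializing the identical object for both calls removes one common source of drift between what was tested and what was stored.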
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-app-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.doc._source&pretty" -d '{
  "docs": [
    { "_source": { "message": "2026-04-02T08:15:00Z INFO card authorized" } }
  ]
}'
{
  "docs" : [
    {
      "processor_results" : [
        {
          "processor_type" : "grok",
          "doc" : {
            "_source" : {
              "@timestamp" : "2026-04-02T08:15:00Z",
              "message" : "2026-04-02T08:15:00Z INFO card authorized",
              "event" : {
                "original" : "card authorized"
              },
              "level" : "INFO"
            }
          }
        },
        {
          "processor_type" : "lowercase",
          "doc" : {
            "_source" : {
              "@timestamp" : "2026-04-02T08:15:00Z",
              "message" : "2026-04-02T08:15:00Z INFO card authorized",
              "event" : {
                "original" : "card authorized"
              },
              "level" : "info"
            }
          }
        },
        {
          "processor_type" : "set",
          "doc" : {
            "_source" : {
              "@timestamp" : "2026-04-02T08:15:00Z",
              "message" : "2026-04-02T08:15:00Z INFO card authorized",
              "event" : {
                "original" : "card authorized"
              },
              "level" : "info",
              "service" : {
                "name" : "payments-api"
              }
            }
          }
        }
      ]
    }
  ]
}
Verbose output follows the processor order from the pipeline definition, which makes it easier to isolate the processor that changed a field unexpectedly or introduced a failure.
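Because each processor_results entry carries the full intermediate _source, a verbose response can also be diffed mechanically to find the step that touched a field. A sketch, assuming the JSON has already been parsed; the results list below is the trimmed verbose output from the example above:

```python
def changed_fields(processor_results):
    """Yield (processor_type, top-level keys added or changed) per step."""
    previous = {}
    for step in processor_results:
        current = step["doc"]["_source"]
        changed = {k for k, v in current.items() if previous.get(k) != v}
        yield step["processor_type"], sorted(changed)
        previous = current

# Trimmed processor_results from the verbose example above.
results = [
    {"processor_type": "grok",
     "doc": {"_source": {"@timestamp": "2026-04-02T08:15:00Z",
                         "message": "2026-04-02T08:15:00Z INFO card authorized",
                         "event": {"original": "card authorized"},
                         "level": "INFO"}}},
    {"processor_type": "lowercase",
     "doc": {"_source": {"@timestamp": "2026-04-02T08:15:00Z",
                         "message": "2026-04-02T08:15:00Z INFO card authorized",
                         "event": {"original": "card authorized"},
                         "level": "info"}}},
    {"processor_type": "set",
     "doc": {"_source": {"@timestamp": "2026-04-02T08:15:00Z",
                         "message": "2026-04-02T08:15:00Z INFO card authorized",
                         "event": {"original": "card authorized"},
                         "level": "info",
                         "service": {"name": "payments-api"}}}},
]

for ptype, fields in changed_fields(results):
    print(ptype, fields)
# grok ['@timestamp', 'event', 'level', 'message']
# lowercase ['level']
# set ['service']
```

Comparing only top-level keys is deliberate here; a deeper recursive diff would pinpoint nested changes such as event.original, at the cost of a longer helper.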