Simulating an Elasticsearch ingest pipeline reveals how processors transform a document before indexing, helping catch grok mismatches, date parsing issues, and unexpected field names before they turn into mapping errors.

Ingest pipelines run on ingest-capable nodes and apply a sequence of processors (such as grok, set, rename, and date) to each incoming document. The _simulate API sends one or more sample documents through a stored pipeline and returns the transformed documents as JSON without writing anything to an index.

Simulation output shows the document as it leaves the last processor, not the final indexed result, so index mappings and analyzers are never applied. Many environments also enable TLS and authentication, so the same request may require HTTPS, credentials, and a CA certificate depending on cluster security settings.
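
The examples below use a stored pipeline named parse-logs, whose full definition appears in step 4. To follow along, a matching pipeline can be created first (a minimal sketch of the definition shown later):

    $ curl -s -H "Content-Type: application/json" -X PUT "http://localhost:9200/_ingest/pipeline/parse-logs?pretty" -d '{
      "description": "Parse simple timestamp/level/message log lines",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": [
              "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
            ]
          }
        }
      ]
    }'
    {
      "acknowledged" : true
    }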

Steps to simulate an Elasticsearch ingest pipeline:

  1. Run a basic simulation against the pipeline ID with a sample document.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'
    {
      "docs" : [
        {
          "doc" : {
            "_source" : {
              "msg" : "app started",
              "message" : "2025-01-22T10:15:00Z INFO app started",
              "level" : "INFO",
              "timestamp" : "2025-01-22T10:15:00Z"
            }
          }
        }
      ]
    }

    The filter_path query parameter trims the response to each document's _source; drop it to see the full simulation metadata. Secured clusters typically require https:// endpoints, authentication (basic auth or an API key), and `--cacert` when using a private CA.
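
    For example, a request against a secured cluster might look like the following; the credentials and CA path are placeholders, and an API key header (-H "Authorization: ApiKey <key>") can replace -u.

    $ curl -s --cacert /path/to/ca.crt -u elastic:password -H "Content-Type: application/json" -X POST "https://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'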

  2. Simulate multiple documents in one request to compare edge cases.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } },
        { "_source": { "message": "2025-01-22T10:16:12Z WARN cache miss" } }
      ]
    }'
    {
      "docs" : [
        {
          "doc" : {
            "_source" : {
              "msg" : "app started",
              "message" : "2025-01-22T10:15:00Z INFO app started",
              "level" : "INFO",
              "timestamp" : "2025-01-22T10:15:00Z"
            }
          }
        },
        {
          "doc" : {
            "_source" : {
              "msg" : "cache miss",
              "message" : "2025-01-22T10:16:12Z WARN cache miss",
              "level" : "WARN",
              "timestamp" : "2025-01-22T10:16:12Z"
            }
          }
        }
      ]
    }
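
    A line that fails the grok pattern is a useful edge case to include. Instead of a transformed _source, the response carries an error object for that document (output abridged; exact wording varies by version), and step 3's verbose mode pinpoints which processor failed.

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?pretty" -d '{
      "docs": [
        { "_source": { "message": "malformed line without timestamp" } }
      ]
    }'
    {
      "docs" : [
        {
          "error" : {
            "type" : "illegal_argument_exception",
            "reason" : "Provided Grok expressions do not match field value: [malformed line without timestamp]"
          }
        }
      ]
    }
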
  3. Run the simulation with verbose output to show each processor's intermediate result.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.status,docs.processor_results.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'
    {
      "docs" : [
        {
          "processor_results" : [
            {
              "processor_type" : "grok",
              "status" : "success",
              "doc" : {
                "_source" : {
                  "msg" : "app started",
                  "message" : "2025-01-22T10:15:00Z INFO app started",
                  "level" : "INFO",
                  "timestamp" : "2025-01-22T10:15:00Z"
                }
              }
            }
          ]
        }
      ]
    }

    Verbose output highlights the processor responsible for each change and isolates the first failing processor when a document errors.
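
    Re-running the non-matching line from step 2 with verbose=true shows the grok processor reporting an error status (response trimmed with filter_path):

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.status&pretty" -d '{
      "docs": [
        { "_source": { "message": "malformed line without timestamp" } }
      ]
    }'
    {
      "docs" : [
        {
          "processor_results" : [
            {
              "processor_type" : "grok",
              "status" : "error"
            }
          ]
        }
      ]
    }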

  4. Fetch the pipeline definition before adjusting processors.
    $ curl -s "http://localhost:9200/_ingest/pipeline/parse-logs?pretty"
    {
      "parse-logs" : {
        "description" : "Parse simple timestamp/level/message log lines",
        "processors" : [
          {
            "grok" : {
              "field" : "message",
              "patterns" : [
                "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
              ]
            }
          }
        ]
      }
    }
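
    The _simulate endpoint also accepts an inline pipeline definition in the request body, so an edited definition can be validated before overwriting the stored parse-logs pipeline. A sketch reusing the same grok pattern; its output matches step 1:

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "pipeline": {
        "processors": [
          {
            "grok": {
              "field": "message",
              "patterns": [
                "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
              ]
            }
          }
        ]
      },
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'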