Simulating an Elasticsearch ingest pipeline reveals how processors transform a document before indexing, helping catch grok mismatches, date parsing issues, and unexpected field names before they turn into mapping errors.

Ingest pipelines run on ingest-capable nodes and apply a sequence of processors (such as grok, set, rename, and date) to each incoming document. The _simulate API sends one or more sample documents through a stored pipeline and returns the transformed documents as JSON without writing anything to an index.

Simulation output shows the document as it leaves the last processor, not the final indexed result, so index mappings and analyzers are never applied. Many environments also enable TLS and authentication, so the same request may require HTTPS, credentials, and a CA certificate depending on cluster security settings.
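
The examples below use a stored pipeline named parse-logs, whose full definition appears in step 4. To follow along, a matching pipeline can be created first (a minimal sketch of the definition shown later):

    $ curl -s -H "Content-Type: application/json" -X PUT "http://localhost:9200/_ingest/pipeline/parse-logs?pretty" -d '{
      "description": "Parse simple timestamp/level/message log lines",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": [
              "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
            ]
          }
        }
      ]
    }'
    {
      "acknowledged" : true
    }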

Steps to simulate an Elasticsearch ingest pipeline:

  1. Run a basic simulation against the pipeline ID with a sample document.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'
    {
      "docs" : [
        {
          "doc" : {
            "_source" : {
              "msg" : "app started",
              "message" : "2025-01-22T10:15:00Z INFO app started",
              "level" : "INFO",
              "timestamp" : "2025-01-22T10:15:00Z"
            }
          }
        }
      ]
    }

    The filter_path query parameter trims the response to each document's _source; drop it to see the full simulation metadata. Secured clusters typically require https:// endpoints, authentication (basic auth or an API key), and `--cacert` when using a private CA.
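
    For example, a request against a secured cluster might look like the following; the credentials and CA path are placeholders, and an API key header (-H "Authorization: ApiKey <key>") can replace -u.

    $ curl -s --cacert /path/to/ca.crt -u elastic:password -H "Content-Type: application/json" -X POST "https://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'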

  2. Simulate multiple documents in one request to compare edge cases.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } },
        { "_source": { "message": "2025-01-22T10:16:12Z WARN cache miss" } }
      ]
    }'
    {
      "docs" : [
        {
          "doc" : {
            "_source" : {
              "msg" : "app started",
              "message" : "2025-01-22T10:15:00Z INFO app started",
              "level" : "INFO",
              "timestamp" : "2025-01-22T10:15:00Z"
            }
          }
        },
        {
          "doc" : {
            "_source" : {
              "msg" : "cache miss",
              "message" : "2025-01-22T10:16:12Z WARN cache miss",
              "level" : "WARN",
              "timestamp" : "2025-01-22T10:16:12Z"
            }
          }
        }
      ]
    }
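
    A line that fails the grok pattern is a useful edge case to include. Instead of a transformed _source, the response carries an error object for that document (output abridged; exact wording varies by version), and step 3's verbose mode pinpoints which processor failed.

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?pretty" -d '{
      "docs": [
        { "_source": { "message": "malformed line without timestamp" } }
      ]
    }'
    {
      "docs" : [
        {
          "error" : {
            "type" : "illegal_argument_exception",
            "reason" : "Provided Grok expressions do not match field value: [malformed line without timestamp]"
          }
        }
      ]
    }
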
  3. Run the simulation with verbose output to show each processor's intermediate result.
    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.status,docs.processor_results.doc._source&pretty" -d '{
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'
    {
      "docs" : [
        {
          "processor_results" : [
            {
              "processor_type" : "grok",
              "status" : "success",
              "doc" : {
                "_source" : {
                  "msg" : "app started",
                  "message" : "2025-01-22T10:15:00Z INFO app started",
                  "level" : "INFO",
                  "timestamp" : "2025-01-22T10:15:00Z"
                }
              }
            }
          ]
        }
      ]
    }

    Verbose output highlights the processor responsible for each change and isolates the first failing processor when a document errors.
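
    Re-running the non-matching line from step 2 with verbose=true shows the grok processor reporting an error status (response trimmed with filter_path):

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/parse-logs/_simulate?verbose=true&filter_path=docs.processor_results.processor_type,docs.processor_results.status&pretty" -d '{
      "docs": [
        { "_source": { "message": "malformed line without timestamp" } }
      ]
    }'
    {
      "docs" : [
        {
          "processor_results" : [
            {
              "processor_type" : "grok",
              "status" : "error"
            }
          ]
        }
      ]
    }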

  4. Fetch the pipeline definition before adjusting processors.
    $ curl -s "http://localhost:9200/_ingest/pipeline/parse-logs?pretty"
    {
      "parse-logs" : {
        "description" : "Parse simple timestamp/level/message log lines",
        "processors" : [
          {
            "grok" : {
              "field" : "message",
              "patterns" : [
                "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
              ]
            }
          }
        ]
      }
    }
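
    The _simulate endpoint also accepts an inline pipeline definition in the request body, so an edited definition can be validated before overwriting the stored parse-logs pipeline. A sketch reusing the same grok pattern; its output matches step 1:

    $ curl -s -H "Content-Type: application/json" -X POST "http://localhost:9200/_ingest/pipeline/_simulate?filter_path=docs.doc._source&pretty" -d '{
      "pipeline": {
        "processors": [
          {
            "grok": {
              "field": "message",
              "patterns": [
                "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"
              ]
            }
          }
        ]
      },
      "docs": [
        { "_source": { "message": "2025-01-22T10:15:00Z INFO app started" } }
      ]
    }'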