Bulk indexing reduces HTTP overhead when a large batch of events, application records, or one-time imports must be written into Elasticsearch quickly. Grouping documents into one request keeps ingestion faster and more predictable than sending each document with a separate indexing call.
The Bulk API accepts NDJSON where each action line (such as index) is followed by the source document line for that action. A single request can write many documents, and the response returns item-level statuses so successful and failed writes can be checked separately instead of assuming the whole batch behaved the same way.
Current self-managed clusters typically expose an authenticated HTTPS endpoint for these API calls; the examples below target a plain local endpoint to keep the commands short. When the next action is a search, refresh=wait_for is preferred over forcing an immediate refresh, since it makes the batch searchable at the next scheduled refresh without adding refresh load to the cluster. Stock 9.x nodes can also reserve patterns such as logs-* for built-in data stream templates, so use an application-specific index name unless the target is intentionally a data stream.
Steps to bulk index documents into Elasticsearch:
- Create a target index with mappings that match the bulk payload.
$ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/app-events-bulk-2026.01?pretty" -d '{ "settings": { "number_of_shards": 1, "number_of_replicas": 0 }, "mappings": { "properties": { "timestamp": { "type": "date" }, "level": { "type": "keyword" }, "message": { "type": "text" } } } }' { "acknowledged" : true, "shards_acknowledged" : true, "index" : "app-events-bulk-2026.01" }Stock Elasticsearch nodes can reject plain indices named logs-* because built-in templates reserve those patterns for data streams. Use an application-specific index name unless the target is intentionally a data stream.
- Create a bulk request file in NDJSON format.
$ cat > bulk.ndjson <<'BULK'
{ "index": { "_id": "evt-1001" } }
{ "timestamp": "2026-04-02T06:15:00Z", "level": "INFO", "message": "service started" }
{ "index": { "_id": "evt-1002" } }
{ "timestamp": "2026-04-02T06:16:12Z", "level": "ERROR", "message": "connection timeout" }
BULK
Because the target index is already in the request path, each action line only needs the operation and optional document ID. Each JSON object must be a single line, and the file must end with a trailing newline for the bulk parser to accept the final action.
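A heredoc always ends with a newline, but files produced by other tools sometimes drop it, which the bulk parser treats as a truncated final action. As a quick sanity check, assuming the standard od utility is available, inspect the last byte of the file; it should print \n:
$ tail -c 1 bulk.ndjson | od -An -c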
- Submit the bulk request to Elasticsearch.
$ curl -sS -H "Content-Type: application/x-ndjson" -X POST "http://localhost:9200/app-events-bulk-2026.01/_bulk?refresh=wait_for&filter_path=took,errors,items.*.status,items.*.result&pretty" --data-binary @bulk.ndjson { "errors" : false, "took" : 1022, "items" : [ { "index" : { "result" : "created", "status" : 201 } }, { "index" : { "result" : "created", "status" : 201 } } ] }Use --data-binary so curl preserves newlines exactly. Current Elasticsearch accepts either application/json or application/x-ndjson for Bulk API requests, and an HTTP 200 response can still contain failed items, so inspect errors and each item status before assuming the batch succeeded.
- Check the document count for the target index.
$ curl -sS "http://localhost:9200/app-events-bulk-2026.01/_count?pretty" { "count" : 2, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 } } - Run a simple search to confirm the indexed documents are queryable with the expected IDs and fields.
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/app-events-bulk-2026.01/_search?filter_path=hits.total,hits.hits._id,hits.hits._source&pretty" -d ' { "size": 2, "sort": [ { "timestamp": "asc" } ], "query": { "match_all": {} } }' { "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "hits" : [ { "_id" : "evt-1001", "_source" : { "timestamp" : "2026-04-02T06:15:00Z", "level" : "INFO", "message" : "service started" } }, { "_id" : "evt-1002", "_source" : { "timestamp" : "2026-04-02T06:16:12Z", "level" : "ERROR", "message" : "connection timeout" } } ] } }
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
