Bulk indexing reduces HTTP overhead when a large batch of events, application records, or one-time imports must be written into Elasticsearch quickly. Grouping documents into one request keeps ingestion faster and more predictable than sending each document with a separate indexing call.
The Bulk API accepts NDJSON where each action line (such as index) is followed by the source document line for that action. A single request can write many documents, and the response returns item-level statuses so successful and failed writes can be checked separately instead of assuming the whole batch behaved the same way.
Current self-managed clusters typically expose an authenticated HTTPS endpoint for these API calls; the examples below use an unsecured local test node over plain HTTP for brevity. When the next action is a search, refresh=wait_for is preferred over forcing an immediate refresh, because it makes the new documents visible to the follow-up query without adding extra refresh load.
$ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/app-events-bulk-2026.01?pretty" -d '{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"timestamp": { "type": "date" },
"level": { "type": "keyword" },
"message": { "type": "text" }
}
}
}'
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "app-events-bulk-2026.01"
}
Stock Elasticsearch nodes can reject plain indices named logs-* because built-in templates reserve those patterns for data streams. Use an application-specific index name unless the target is intentionally a data stream.
$ cat > bulk.ndjson <<'BULK'
{ "index": { "_id": "evt-1001" } }
{ "timestamp": "2026-04-02T06:15:00Z", "level": "INFO", "message": "service started" }
{ "index": { "_id": "evt-1002" } }
{ "timestamp": "2026-04-02T06:16:12Z", "level": "ERROR", "message": "connection timeout" }
BULK
Because the target index is already in the request path, each action line only needs the operation and optional document ID. Each JSON object must be a single line, and the file must end with a trailing newline for the bulk parser to accept the final action.
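The same action/source pairing can be assembled programmatically. This is a minimal Python sketch, not an official client API; the helper name build_bulk_body is illustrative. It serializes each action/document pair onto its own line and appends the trailing newline the bulk parser requires.

```python
import json

def build_bulk_body(docs):
    """Serialize (doc_id, source) pairs into a Bulk API NDJSON body.

    Each action line and each source document occupies exactly one
    line, and the body ends with the trailing newline the bulk parser
    requires before it will accept the final action.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("evt-1001", {"timestamp": "2026-04-02T06:15:00Z",
                  "level": "INFO", "message": "service started"}),
    ("evt-1002", {"timestamp": "2026-04-02T06:16:12Z",
                  "level": "ERROR", "message": "connection timeout"}),
])
```

Sending this string with --data-binary (or an HTTP library that does not rewrite newlines) is equivalent to posting the bulk.ndjson file above.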
$ curl -sS -H "Content-Type: application/x-ndjson" -X POST "http://localhost:9200/app-events-bulk-2026.01/_bulk?refresh=wait_for&filter_path=took,errors,items.*.status,items.*.result&pretty" --data-binary @bulk.ndjson
{
"errors" : false,
"took" : 1022,
"items" : [
{
"index" : {
"result" : "created",
"status" : 201
}
},
{
"index" : {
"result" : "created",
"status" : 201
}
}
]
}
Use --data-binary so curl preserves newlines exactly. Current Elasticsearch accepts either application/json or application/x-ndjson for Bulk API requests, and an HTTP 200 response can still contain failed items, so inspect errors and each item status before assuming the batch succeeded.
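That item-level check can be sketched in Python. The helper name failed_items and the sample response below are illustrative (the 429 rejection is a hand-written example, not captured output), but the response shape follows the Bulk API: each item wraps its result under the action name, with a per-item status code.

```python
def failed_items(bulk_response):
    """Collect bulk items whose status is not a 2xx success code.

    The Bulk API can return HTTP 200 even when individual writes
    fail, so each item's status must be checked in addition to the
    top-level "errors" flag.
    """
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action ("index",
        # "create", "update", or "delete") with the result beneath it.
        (action, detail), = item.items()
        if detail.get("status", 0) >= 300:
            failures.append((action, detail))
    return failures

# Hand-written example of a partially failed batch.
response = {
    "took": 8,
    "errors": True,
    "items": [
        {"index": {"_id": "evt-1001", "status": 201, "result": "created"}},
        {"index": {"_id": "evt-1002", "status": 429,
                   "error": {"type": "es_rejected_execution_exception"}}},
    ],
}
```

A fast path is available when "errors" is false: every item succeeded and the per-item scan can be skipped entirely.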
$ curl -sS "http://localhost:9200/app-events-bulk-2026.01/_count?pretty"
{
"count" : 2,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
}
}
$ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/app-events-bulk-2026.01/_search?filter_path=hits.total,hits.hits._id,hits.hits._source&pretty" -d '
{
"size": 2,
"sort": [
{ "timestamp": "asc" }
],
"query": {
"match_all": {}
}
}'
{
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"hits" : [
{
"_id" : "evt-1001",
"_source" : {
"timestamp" : "2026-04-02T06:15:00Z",
"level" : "INFO",
"message" : "service started"
}
},
{
"_id" : "evt-1002",
"_source" : {
"timestamp" : "2026-04-02T06:16:12Z",
"level" : "ERROR",
"message" : "connection timeout"
}
}
]
}
}
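For the one-time imports mentioned at the start, a single giant request is worse than several moderate ones, since an oversized body risks hitting the node's HTTP payload limit and magnifies the cost of any retry. A minimal batching sketch, assuming an in-memory list of documents; the batch size of 500 is an illustrative default, not a tuned value:

```python
def chunked(docs, batch_size=500):
    """Yield fixed-size batches of documents for separate bulk requests.

    Practical batch sizes depend on document size and cluster capacity,
    so the default here is only a starting point to measure against.
    """
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]

# 1200 placeholder documents split into batches of 500.
batches = list(chunked(list(range(1200)), batch_size=500))
```

Each yielded batch would then be serialized to NDJSON and posted to _bulk, checking the per-item statuses of every response before moving on.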