Production-grade Elasticsearch configuration keeps indexing latency, search latency, and cluster stability predictable by preventing swap thrashing, accidental split-cluster formation, and unprotected network exposure.
Self-managed Elasticsearch reads /etc/elasticsearch/elasticsearch.yml plus any JVM overrides in /etc/elasticsearch/jvm.options.d during startup, while systemd unit limits and kernel tunables such as vm.max_map_count must satisfy bootstrap checks once the node binds to a non-loopback network.host.
Heap, discovery, and security changes are disruptive and typically require a full service restart; incorrect seed hosts or bootstrap master lists can elect the wrong nodes, and exposing the transport port (default 9300) outside a trusted network can allow unauthorized cluster traffic.
Steps to configure Elasticsearch for production:
- Set JVM heap sizes in /etc/elasticsearch/jvm.options.d/heap.options.
-Xms4g
-Xmx4g
Keep Xms and Xmx equal, set the heap to no more than 50% of system RAM, and keep it below the compressed-pointer threshold, which sits somewhat under 32 GB on typical JVMs.
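The sizing rule can be sketched as a small shell helper. The function name suggest_heap_mb and the 31 GiB cap are assumptions made for illustration; the exact compressed-pointer threshold depends on the JVM in use.

```shell
# Hypothetical helper: suggest a heap size in MiB from total RAM in MiB.
# Assumes a cap of 31 GiB (31744 MiB) to stay under the compressed-pointer limit.
suggest_heap_mb() {
  total_mb=$1
  half_mb=$(( total_mb / 2 ))
  cap_mb=31744
  if [ "$half_mb" -gt "$cap_mb" ]; then
    echo "$cap_mb"
  else
    echo "$half_mb"
  fi
}

# Example on a Linux host (MemTotal is reported in kB):
# total_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
# heap_mb=$(suggest_heap_mb "$total_mb")
# printf -- '-Xms%sm\n-Xmx%sm\n' "$heap_mb" "$heap_mb"
```

Treat the result as a starting point only; nodes that share the host with other services usually need a smaller heap.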
- Create a sysctl drop-in file for vm.max_map_count at /etc/sysctl.d/99-elasticsearch.conf.
vm.max_map_count = 262144
Low vm.max_map_count is a common cause of Elasticsearch bootstrap check failures.
- Reload sysctl settings.
$ sudo sysctl --system
* Applying /etc/sysctl.d/99-elasticsearch.conf ...
vm.max_map_count = 262144
##### snipped #####
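After the reload, the effective value can be compared against the minimum used in this guide. The following is a minimal sketch; check_max_map_count is a hypothetical helper name, not part of Elasticsearch.

```shell
# Hypothetical preflight check: compare a vm.max_map_count value with the
# 262144 minimum used in this guide.
check_max_map_count() {
  current=$1
  required=262144
  if [ "$current" -ge "$required" ]; then
    echo "ok: vm.max_map_count=$current"
  else
    echo "too low: vm.max_map_count=$current (need >= $required)"
    return 1
  fi
}

# Example on a Linux host:
# check_max_map_count "$(cat /proc/sys/vm/max_map_count)"
```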
- Create a systemd override for Elasticsearch limits at /etc/systemd/system/elasticsearch.service.d/override.conf.
[Service]
LimitMEMLOCK=infinity
LimitNOFILE=65535
LimitMEMLOCK allows memory locking, and LimitNOFILE reduces the risk of file descriptor exhaustion under shard load.
- Reload systemd unit files.
$ sudo systemctl daemon-reload
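Once the unit files are reloaded, the configured limits can be read back with systemctl show. The sketch below parses that output; limits_ok is a hypothetical helper, and the accepted values are assumptions based on the override used in this guide.

```shell
# Hypothetical helper: read "Name=value" lines (as printed by `systemctl show`)
# from stdin and confirm the limits match the override used in this guide.
limits_ok() {
  nofile=""
  memlock=""
  while IFS='=' read -r key value; do
    case "$key" in
      LimitNOFILE) nofile=$value ;;
      LimitMEMLOCK) memlock=$value ;;
    esac
  done
  [ -n "$nofile" ] || return 1
  case "$nofile" in
    infinity) ;;
    *[!0-9]*) return 1 ;;
    *) [ "$nofile" -ge 65535 ] || return 1 ;;
  esac
  # Some systemd versions may print the numeric maximum instead of "infinity".
  case "$memlock" in
    infinity|18446744073709551615) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
# systemctl show elasticsearch -p LimitNOFILE -p LimitMEMLOCK | limits_ok \
#   && echo "limits look fine"
```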
- Enable memory locking in /etc/elasticsearch/elasticsearch.yml.
bootstrap.memory_lock: true
Memory locking fails when the service lacks LimitMEMLOCK permissions, and swap usage can still occur if locking is not active.
- Disable swap for the current boot session.
$ sudo swapoff --all
Swap returns after reboot unless removed from /etc/fstab, and swap thrashing can cause long GC pauses during heavy indexing.
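To keep swap disabled across reboots, the swap entries in /etc/fstab can be commented out. The following sketch only prints the edited content so it can be reviewed first; comment_out_swap is a hypothetical helper, and it assumes swap entries use the usual whitespace-separated fstab layout.

```shell
# Hypothetical helper: comment out active swap entries in an fstab-style file.
# Prints the result to stdout and leaves the input file untouched.
comment_out_swap() {
  sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Example: review the result before replacing the real file.
# comment_out_swap /etc/fstab > /tmp/fstab.new
# diff /etc/fstab /tmp/fstab.new
```

Back up /etc/fstab before copying any edited version into place.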
- Set the production cluster name, node name, node roles, and network binding in /etc/elasticsearch/elasticsearch.yml.
cluster.name: prod-cluster
node.name: es-1
node.roles: [ master, data, ingest ]
network.host: 10.10.10.11
http.port: 9200
transport.port: 9300
Binding network.host to a non-loopback address enables bootstrap checks, and the node can refuse to start until OS limits plus discovery settings are correct.
- Define discovery settings in /etc/elasticsearch/elasticsearch.yml.
discovery.seed_hosts: ["10.10.10.11", "10.10.10.12", "10.10.10.13"]
cluster.initial_master_nodes: ["es-1", "es-2", "es-3"]
cluster.initial_master_nodes is only needed when bootstrapping a new cluster and should be removed after the first successful master election.
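Removing the bootstrap setting after the cluster has formed can be sketched as below. disable_initial_master_nodes is a hypothetical helper, and it assumes the setting is written as a single-line list, as in the example above.

```shell
# Hypothetical helper: comment out cluster.initial_master_nodes in a YAML file.
# Prints the edited content to stdout; the input file is not modified.
disable_initial_master_nodes() {
  sed -E 's/^([[:space:]]*)(cluster\.initial_master_nodes:)/\1# \2/' "$1"
}

# Example: review the result before replacing the real file.
# disable_initial_master_nodes /etc/elasticsearch/elasticsearch.yml
```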
- Apply default shard and replica counts with an index template.
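One way to do this is a composable index template sent to the _index_template API. The sketch below only builds the request body; the helper name index_template_body, the template name prod-defaults, and the app-* pattern are assumptions chosen for illustration.

```shell
# Hypothetical helper: print an index template body.
# $1 = index pattern, $2 = primary shard count, $3 = replica count.
index_template_body() {
  cat <<EOF
{
  "index_patterns": ["$1"],
  "template": {
    "settings": {
      "number_of_shards": $2,
      "number_of_replicas": $3
    }
  }
}
EOF
}

# Example (template name and pattern are placeholders):
# curl -s -X PUT 'http://localhost:9200/_index_template/prod-defaults' \
#   -H 'Content-Type: application/json' \
#   -d "$(index_template_body 'app-*' 3 1)"
```

The template applies only to indices created after it is registered; existing indices keep their current settings.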
- Enable TLS with authentication for the HTTP endpoint.
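A minimal sketch of the elasticsearch.yml settings involved is shown below as a helper that prints them. security_settings is a hypothetical name, and the keystore file name is an assumption; it must point to a real PKCS#12 keystore. Multi-node clusters with security enabled also need transport TLS, which this sketch leaves out.

```shell
# Hypothetical helper: print HTTP security settings for elasticsearch.yml.
# $1 = path to a PKCS#12 keystore (placeholder; relative paths are resolved
# against the Elasticsearch config directory).
security_settings() {
  cat <<EOF
xpack.security.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: $1
EOF
}

# Example: review the lines, then add them to elasticsearch.yml by hand.
# security_settings http.p12
```

If the keystore is password protected, the password belongs in the Elasticsearch keystore under xpack.security.http.ssl.keystore.secure_password rather than in elasticsearch.yml.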
- Register a snapshot repository for backups.
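For a shared-filesystem repository, registration might look like the sketch below. The helper name snapshot_repo_body, the repository name prod-backups, and the mount path are assumptions; the location must also be listed under path.repo in elasticsearch.yml on every node.

```shell
# Hypothetical helper: print a request body for a shared-filesystem repository.
# $1 = repository location (placeholder path, must be allowed by path.repo).
snapshot_repo_body() {
  cat <<EOF
{
  "type": "fs",
  "settings": {
    "location": "$1"
  }
}
EOF
}

# Example (repository name and path are placeholders):
# curl -s -X PUT 'http://localhost:9200/_snapshot/prod-backups' \
#   -H 'Content-Type: application/json' \
#   -d "$(snapshot_repo_body /mnt/es-backups)"
```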
- Restart Elasticsearch to apply configuration changes.
$ sudo systemctl restart elasticsearch
- Check the Elasticsearch service status for active state.
$ sudo systemctl status elasticsearch --no-pager
● elasticsearch.service - Elasticsearch
     Loaded: loaded (/lib/systemd/system/elasticsearch.service; enabled; vendor preset: enabled)
     Active: active (running) since Fri 2026-01-02 09:14:32 UTC; 9s ago
       Docs: https://www.elastic.co
##### snipped #####
- Confirm memory locking is active from the nodes info API.
$ curl -s 'http://localhost:9200/_nodes/process?filter_path=nodes.*.process.mlockall&pretty'
{
  "nodes" : {
    "u7PpQ1u9Qe6y8iQmQb5m_w" : {
      "process" : {
        "mlockall" : true
      }
    }
  }
}
Security-enabled clusters require the appropriate curl authentication options and a trusted CA for local API calls.
- Check cluster health before opening the service to production traffic.
$ curl -s 'http://localhost:9200/_cluster/health?pretty'
{
  "cluster_name" : "prod-cluster",
  "status" : "green",
  "number_of_nodes" : 3
}
Replace http with https and add the required authentication options when the HTTP endpoint is secured.
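For scripted checks, the status field can be pulled out of the health response. The following sketch uses sed rather than a JSON parser to avoid extra dependencies; health_status is a hypothetical helper name.

```shell
# Hypothetical helper: read a cluster health JSON response from stdin and
# print the value of its "status" field (green, yellow, or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Example: ask the API to wait up to 30 seconds for green, then inspect the result.
# status=$(curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=30s' | health_status)
# [ "$status" = "green" ] || echo "cluster is not green: $status"
```

A dedicated JSON tool is a more robust choice where one is available.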
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
