Removing a node from an Elasticsearch cluster is a common maintenance task for decommissioning, hardware replacement, or capacity reshaping. A clean removal drains shards first so the cluster stays available and avoids unassigned primaries during the shutdown window.
Shard placement is controlled by the cluster allocator, which assigns shard copies to eligible data nodes based on allocation rules and available capacity. Excluding a node from allocation marks it ineligible for new shard placements, prompting Elasticsearch to relocate shards away while the node remains online.
Replica availability and cluster quorum determine how safe the removal is. Proceed only when shard copies can be relocated to other nodes, and ensure enough master-eligible nodes remain to keep the cluster stable after the node is stopped.
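Before starting, it helps to verify that no index relies on a single shard copy. As a quick pre-check (a sketch using the _cat/indices API; the awk filter just prints indices whose replica count is 0):
$ curl -s "http://localhost:9200/_cat/indices?h=index,rep" | awk '$2 == 0'
Any index listed has no replicas, so its data exists only on one node; raise number_of_replicas for those indices before draining.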
Steps to remove a node from an Elasticsearch cluster:
- List cluster nodes and note the exact node name to remove.
$ curl -s "http://localhost:9200/_cat/nodes?v" ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name 192.0.2.41 70 88 2 2.84 2.92 2.80 cdfhilmrstw * node-02 192.0.2.42 28 88 2 2.84 2.92 2.80 cdfhilmrstw - node-03 192.0.2.40 43 88 2 2.84 2.92 2.80 cdfhilmrstw - node-01 192.0.2.43 58 88 2 2.84 2.92 2.80 cdfhilmrstw - node-04
Node names map to node.name in /etc/elasticsearch/elasticsearch.yml on package-based installs.
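For example, to confirm the configured name on the host itself (assuming a package-based install; the line may be commented out in a default config, in which case Elasticsearch falls back to the hostname):
$ grep "node.name" /etc/elasticsearch/elasticsearch.yml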
- Check cluster health before starting the drain.
$ curl -s "http://localhost:9200/_cluster/health?pretty" { "cluster_name" : "search-cluster", "status" : "green", "timed_out" : false, "number_of_nodes" : 4, "number_of_data_nodes" : 4, "active_primary_shards" : 3, "active_shards" : 6, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }Proceeding while status is red can leave primary shards unassigned, making indices unavailable.
- Exclude the node from shard allocation to trigger relocation.
$ curl -s -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{ "persistent": { "cluster.routing.allocation.exclude._name": "node-04" } }' { "acknowledged" : true, "persistent" : { "cluster" : { "routing" : { "allocation" : { "exclude" : { "_name" : "node-04" } } } } }, "transient" : { } }Use cluster.routing.allocation.exclude._ip when excluding by address is more reliable than node.name.
- Wait for shard relocation to complete before stopping the node.
$ curl -s "http://localhost:9200/_cluster/health?wait_for_no_relocating_shards=true&timeout=30m&pretty" { "cluster_name" : "search-cluster", "status" : "green", "timed_out" : false, "number_of_nodes" : 4, "number_of_data_nodes" : 4, "active_primary_shards" : 3, "active_shards" : 6, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 1, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }Increase timeout for large nodes or slow storage.
- Confirm the excluded node is hosting zero shards.
$ curl -s "http://localhost:9200/_cat/allocation?v" | grep -E "shards|node-04" shards disk.indices disk.used disk.avail disk.total disk.percent host ip node 0 0b 114.9gb 1.6tb 1.7tb 6 192.0.2.43 192.0.2.43 node-04
If the shard count never reaches 0, some indices may lack replicas or allocation filters may be blocking relocation.
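The cluster allocation explain API reports why a particular shard copy stays put. A minimal query (my-index is a placeholder; substitute an index still showing shards on node-04):
$ curl -s -H "Content-Type: application/json" -X GET "http://localhost:9200/_cluster/allocation/explain?pretty" -d '{
  "index": "my-index",
  "shard": 0,
  "primary": true
}'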
- Stop and disable the Elasticsearch service on the node being removed.
$ sudo systemctl disable --now elasticsearch
Removed "/etc/systemd/system/multi-user.target.wants/elasticsearch.service".
Stopping a master-eligible node without enough remaining masters can cause loss of quorum, making the cluster unavailable.
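To check how many master-eligible nodes would remain, list each node's roles and keep only entries whose role string contains m (a quick filter; node.role is a standard _cat/nodes column):
$ curl -s "http://localhost:9200/_cat/nodes?h=name,node.role" | awk '$2 ~ /m/'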
- Verify the node is no longer listed after the service is stopped.
$ curl -s "http://localhost:9200/_cat/nodes?v" ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name 192.0.2.40 46 76 2 2.94 2.95 2.82 cdfhilmrstw - node-01 192.0.2.41 21 76 2 2.94 2.95 2.82 cdfhilmrstw * node-02 192.0.2.42 31 76 2 2.94 2.95 2.82 cdfhilmrstw - node-03
- Clear the allocation exclusion after the node is offline.
$ curl -s -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{ "persistent": { "cluster.routing.allocation.exclude._name": null } }' { "acknowledged" : true, "persistent" : { }, "transient" : { } }Clearing prevents the exclusion from affecting a replacement node that reuses the same node.name.
- Verify the cluster is healthy after the removal completes.
$ curl -s "http://localhost:9200/_cluster/health?pretty" { "cluster_name" : "search-cluster", "status" : "green", "timed_out" : false, "number_of_nodes" : 3, "number_of_data_nodes" : 3, "active_primary_shards" : 3, "active_shards" : 6, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }
