Removing a node from an Elasticsearch cluster is routine during host replacement, hardware retirement, and capacity changes. Draining shards before the service stops keeps primary and replica copies available so searches and indexing can continue through the maintenance window.
Shard placement is controlled by the cluster allocator. Applying a temporary cluster.routing.allocation.exclude rule tells Elasticsearch to move shard copies away from the departing data node while the node stays online, giving the cluster time to rebalance before the process is stopped.
Elasticsearch 8 and later enables TLS and authentication by default on self-managed deployments, so the API calls may require https, credentials, and the cluster CA file instead of plain http. When the departing node is master-eligible, confirm that the remaining master-eligible nodes can still form a quorum; removing half or more of them in a short period requires a temporary voting configuration exclusion before the node is stopped.
$ curl -sS "http://localhost:9200/_cat/nodes?v&h=ip,name,node.role,master" ip name node.role master 192.0.2.41 node-02 cdfhilmrstw * 192.0.2.43 node-03 cdfhilmrstw - 192.0.2.40 node-01 cdfhilmrstw -
If the cluster uses the default security setup, replace http with https and add credentials plus the cluster CA; the CA path below is the package-install default:
$ curl -sS --cacert /etc/elasticsearch/certs/http_ca.crt -u "elastic:%%password%%" "https://localhost:9200/_cat/nodes?v&h=ip,name,node.role,master"
$ curl -sS "http://localhost:9200/_cluster/health?pretty"
{
"cluster_name" : "search-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 3,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Proceeding while status is red, or while primary shards are unassigned, can make indices unavailable during the removal.
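One way to gate the next step on cluster health is to let the health API block until the status is green; a simple pre-flight guard, assuming a 60-second wait is acceptable:
$ curl -sS "http://localhost:9200/_cluster/health?wait_for_status=green&timeout=60s&pretty"
If the timeout expires first, the response sets "timed_out" : true and reports the current status, which makes the command usable in a pre-flight script.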
$ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{
"persistent": {
"cluster.routing.allocation.exclude._name": "node-03"
}
}'
{
"acknowledged" : true,
"persistent" : {
"cluster" : {
"routing" : {
"allocation" : {
"exclude" : {
"_name" : "node-03"
}
}
}
}
},
"transient" : { }
}
Use cluster.routing.allocation.exclude._id or cluster.routing.allocation.exclude._ip when a persistent node identifier or IP address is more reliable than node.name. Dedicated master-only and coordinating-only nodes with no shards can usually skip this step.
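As a sketch, excluding node-03 by its IP address from the earlier _cat/nodes output instead of by name:
$ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": "192.0.2.43"
  }
}'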
$ curl -sS "http://localhost:9200/_cluster/health?wait_for_no_relocating_shards=true&timeout=30m&pretty"
{
"cluster_name" : "search-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 3,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
Increase the timeout for large nodes or slower storage. Nodes without data can move to the service stop step once the cluster health and master-eligibility checks are complete.
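If the wait does time out, re-running the same request is safe; a rough polling alternative, assuming a POSIX shell with grep available, checks the compact health output every 30 seconds:
$ until curl -sS "http://localhost:9200/_cluster/health" | grep -q '"relocating_shards":0,'; do sleep 30; done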
$ curl -sS "http://localhost:9200/_cat/allocation?v&h=shards,disk.indices,disk.used,disk.avail,disk.total,disk.percent,host,ip,node" shards disk.indices disk.used disk.avail disk.total disk.percent host ip node 3 10kb 40gb 18.2gb 58.3gb 68 192.0.2.40 192.0.2.40 node-01 0 0b 40gb 18.2gb 58.3gb 68 192.0.2.43 192.0.2.43 node-03 3 10kb 40gb 18.2gb 58.3gb 68 192.0.2.41 192.0.2.41 node-02
If the departing node's shard count does not reach 0, relocation can be blocked by another allocation rule (an index-level filter or a disk watermark), insufficient free capacity on the remaining nodes, or an index whose replica count requires a copy on every remaining node.
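The allocation explain API reports the concrete blocker for a given shard; "my-index" and shard 0 below are placeholders for a shard still shown on the departing node:
$ curl -sS -H "Content-Type: application/json" "http://localhost:9200/_cluster/allocation/explain?pretty" -d '{
  "index": "my-index",
  "shard": 0,
  "primary": false
}'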
If the departing node is master-eligible and its removal takes half or more of the master-eligible nodes out of service, add a voting configuration exclusion first:
$ curl -sS -X POST "http://localhost:9200/_cluster/voting_config_exclusions?node_names=node-03"
With the shards drained, and the voting exclusion in place where one was needed, stop the service on the departing node:
$ sudo systemctl stop elasticsearch
Clear the voting exclusion after the node has left the cluster:
$ curl -sS -X DELETE "http://localhost:9200/_cluster/voting_config_exclusions"
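To verify that no exclusions remain, the list can be read back from the cluster state; filter_path trims the response to the relevant field:
$ curl -sS "http://localhost:9200/_cluster/state?filter_path=metadata.cluster_coordination.voting_config_exclusions&pretty"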
If the host is being permanently decommissioned, also disable the service so the node does not rejoin on the next boot:
$ sudo systemctl disable elasticsearch
$ curl -sS "http://localhost:9200/_cat/nodes?v&h=ip,name,node.role,master" ip name node.role master 192.0.2.41 node-02 cdfhilmrstw * 192.0.2.40 node-01 cdfhilmrstw -
$ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{
"persistent": {
"cluster.routing.allocation.exclude._name": null
}
}'
{
"acknowledged" : true,
"persistent" : { },
"transient" : { }
}
Clearing the exclusion prevents it from blocking a replacement node that reuses the same node.name or address. A voting configuration exclusion, if one was added, is cleared separately with the DELETE call shown earlier, once the node has fully left the cluster.
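A fresh read of the cluster settings confirms nothing was left behind:
$ curl -sS "http://localhost:9200/_cluster/settings?pretty"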
$ curl -sS "http://localhost:9200/_cluster/health?pretty"
{
"cluster_name" : "search-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 3,
"active_shards" : 6,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}