How to configure data tier nodes in Elasticsearch

Separating nodes into data tiers keeps indexing and search responsive on recent data while moving older indices onto cheaper storage to control cost.

Elasticsearch places shards based on node roles, including tier roles such as data_hot, data_warm, data_cold, and data_frozen. Index Lifecycle Management (ILM) uses these roles alongside the index routing setting index.routing.allocation.include._tier_preference to migrate indices between tiers without custom node attributes.
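As a rough illustration of how the tier preference setting can be applied by hand, the sketch below builds the settings body and sends it to the index settings API. The function names, the `ES_URL` variable, and the unauthenticated endpoint are assumptions for this example; ILM normally manages this setting on its own.

```shell
#!/usr/bin/env bash
# Assumption: unauthenticated cluster at http://localhost:9200, as in this guide.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the settings body from an ordered list of tiers (most preferred first).
tier_preference_body() {
  local IFS=,
  printf '{"index.routing.allocation.include._tier_preference":"%s"}\n' "$*"
}

# Apply the preference to one index; the index name is passed as the first argument.
set_tier_preference() {
  local index="$1"
  shift
  curl -s -X PUT "$ES_URL/$index/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(tier_preference_body "$@")"
}
```

Calling `set_tier_preference my-index data_warm data_hot` (with `my-index` as a placeholder name) would ask Elasticsearch to prefer warm nodes and fall back to hot nodes when no warm nodes exist.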

Changing node.roles replaces the node's default role set and requires a restart to take effect. Ensure each tier has enough nodes to satisfy number_of_replicas for the indices that will land there; otherwise replica shards remain unassigned and cluster health stays yellow (red indicates that a primary shard is unassigned).
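The replica requirement comes down to simple arithmetic: because Elasticsearch does not place two copies of the same shard on one node, a tier needs at least number_of_replicas + 1 nodes. The helper below is a minimal sketch of that check; the function name and the sample values are made up for illustration, and it assumes every copy of the index must land in that single tier.

```shell
#!/usr/bin/env bash
# Succeeds when a tier with $1 nodes can hold one primary plus $2 replicas,
# given that two copies of the same shard never share a node.
tier_can_hold_replicas() {
  local nodes="$1" replicas="$2"
  [ "$nodes" -ge $((replicas + 1)) ]
}

# Illustrative values only: one warm node, number_of_replicas set to 1.
if tier_can_hold_replicas 1 1; then
  echo "replicas can be assigned"
else
  echo "replicas stay unassigned: add nodes or lower number_of_replicas"
fi
```

With these sample values the check fails, which matches the yellow health state described above.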

Steps to configure data tier nodes in Elasticsearch:

List current node names to map each host to an intended data tier.

```
$ curl -s "http://localhost:9200/_nodes?filter_path=nodes.*.name&pretty"
{
  "nodes" : {
    "S1nZfU1MQPqgJ8a2bWvH9Q" : {
      "name" : "es-hot-1"
    },
    "K9b3pH2yR4u7tX8wZ0aBcD" : {
      "name" : "es-warm-1"
    }
##### snipped #####
  }
}
```

Set tier roles for each node in /etc/elasticsearch/elasticsearch.yml.
```
node.roles: [ data_hot, ingest ]
```
Setting node.roles overrides the default role set on that node. Use tier roles that match the storage intent (data_hot, data_warm, data_cold, data_frozen), include data_content on nodes intended for non-time-series content indices, and retain any required non-tier roles such as master, ingest, ml, transform, or remote_cluster_client.

Restart the Elasticsearch service.
```
$ sudo systemctl restart elasticsearch
```
Restarting multiple nodes at the same time can cause service disruption or trigger extensive shard relocation. Apply role changes using a rolling restart in multi-node clusters.

Related: How to manage the Elasticsearch service with systemctl in Linux
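One common way to carry out the rolling restart mentioned above is to limit shard allocation to primaries while a node is down and restore the default afterwards. The sketch below follows that pattern for a single node. The function names and `ES_URL` are assumptions, the endpoint is assumed to be unauthenticated, and the timeouts are arbitrary.

```shell
#!/usr/bin/env bash
# Assumption: unauthenticated cluster at http://localhost:9200, as in this guide.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a cluster-settings body; pass "null" to reset the setting to its default.
allocation_body() {
  local value="$1"
  if [ "$value" != "null" ]; then
    value="\"$value\""
  fi
  printf '{"persistent":{"cluster.routing.allocation.enable":%s}}\n' "$value"
}

# Send the body to the cluster settings API.
put_allocation() {
  curl -s -X PUT "$ES_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(allocation_body "$1")"
}

# Restart the local node, then wait up to 120s for the cluster to report green.
restart_this_node() {
  put_allocation primaries                  # avoid replica reshuffling while the node is down
  sudo systemctl restart elasticsearch
  until curl -s -o /dev/null "$ES_URL"; do  # wait for the HTTP port to answer again
    sleep 5
  done
  put_allocation null                       # restore the default allocation behaviour
  curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=120s&pretty"
}
```

Run `restart_this_node` on one node at a time and move to the next node only after the health call returns. If the cluster was already yellow for unrelated reasons, the final call simply times out.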

Verify tier roles in node details.

```
$ curl -s "http://localhost:9200/_nodes/es-hot-1?filter_path=nodes.*.roles&pretty"
{
  "nodes" : {
    "S1nZfU1MQPqgJ8a2bWvH9Q" : {
      "roles" : [ "data_hot", "ingest" ]
    }
  }
}
```

Verify tier roles across the cluster to confirm each tier is represented.

```
$ curl -s "http://localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.roles&pretty"
{
  "nodes" : {
    "S1nZfU1MQPqgJ8a2bWvH9Q" : {
      "name" : "es-hot-1",
      "roles" : [ "data_hot", "ingest" ]
    },
    "K9b3pH2yR4u7tX8wZ0aBcD" : {
      "name" : "es-warm-1",
      "roles" : [ "data_warm" ]
    }
##### snipped #####
  }
}
```
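Scanning this output by eye gets tedious on larger clusters. The following sketch reads the JSON from standard input and prints any tier role that no node advertises. The function name is hypothetical, and it relies on plain string matching rather than a JSON parser.

```shell
#!/usr/bin/env bash
# Print each tier role that does not appear anywhere in the JSON read from stdin.
missing_tiers() {
  local json tier
  json="$(cat)"
  for tier in data_hot data_warm data_cold data_frozen; do
    case "$json" in
      *"\"$tier\""*) ;;        # role found on at least one node
      *) echo "$tier" ;;       # no node carries this role
    esac
  done
}
```

A possible invocation is `curl -s "http://localhost:9200/_nodes?filter_path=nodes.*.roles" | missing_tiers`. A tier listed here is only a problem if indices or ILM policies are expected to use it; many clusters run without cold or frozen nodes.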

Check cluster health to confirm nodes rejoined with shards allocated.

```
$ curl -s "http://localhost:9200/_cluster/health?pretty"
{
  "cluster_name" : "elasticsearch",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 42,
  "active_shards" : 74,
  "unassigned_shards" : 10
##### snipped #####
}
```

A yellow status typically indicates unassigned replicas. Unassigned shards that appear after tiering changes often mean that a tier has too few nodes to satisfy number_of_replicas for the indices targeting it.
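To see which shards are affected and why, the cat shards and allocation explain APIs can help. The sketch below is one way to wrap them; the function names and `ES_URL` are assumptions, and the endpoint is assumed to be unauthenticated.

```shell
#!/usr/bin/env bash
# Assumption: unauthenticated cluster at http://localhost:9200, as in this guide.
ES_URL="${ES_URL:-http://localhost:9200}"

# Keep only rows whose fourth column (state) is UNASSIGNED.
unassigned_only() {
  awk '$4 == "UNASSIGNED"'
}

# List unassigned shards with the reason Elasticsearch recorded for each.
list_unassigned() {
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | unassigned_only
}

# Ask the cluster to explain the first unassigned shard it finds.
explain_unassigned() {
  curl -s "$ES_URL/_cluster/allocation/explain?pretty"
}
```

Running `list_unassigned` shows the affected indices, and `explain_unassigned` reports the allocation decisions for one of them, including any tier-related decider messages. The explain call returns an error when nothing is unassigned.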

Author: Mohd Shakir Zakaria