Separating Elasticsearch data nodes into hot, warm, cold, frozen, and content tiers keeps recent indexing and search traffic on faster hardware while moving older or lower-priority data onto cheaper storage. This makes it easier to scale retention without forcing every shard onto the same class of disk.
Tier placement is controlled by the node.roles setting on each node and by the index-level index.routing.allocation.include._tier_preference setting on indices. Index Lifecycle Management (ILM) and data streams use these built-in tier roles instead of older custom node attributes, and a tier-preference list can fall back to a later tier when the preferred tier has no available nodes.
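As a concrete sketch of that fallback behavior, the settings body below asks for the warm tier first and the hot tier second; the index name logs-archive-000001 and the localhost endpoint are assumptions, and on a live cluster the body would be sent as a PUT to the index's _settings endpoint.

```shell
# Hypothetical request body: prefer warm nodes, fall back to hot nodes
# when no warm node is available.
cat > /tmp/tier-pref.json <<'EOF'
{
  "index.routing.allocation.include._tier_preference": "data_warm,data_hot"
}
EOF

# Validate the JSON locally; against a live cluster it would be sent with:
#   curl -sS -X PUT "http://localhost:9200/logs-archive-000001/_settings" \
#        -H 'Content-Type: application/json' --data-binary @/tmp/tier-pref.json
python3 -m json.tool /tmp/tier-pref.json
```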
Setting node.roles explicitly replaces the default all-roles node profile, so every required role must be listed on purpose. Current Elastic guidance also warns not to mix the generic data role with specialized tier roles on the same node, and clusters that use specialized roles still need coverage for both data_content and data_hot. Role changes require restarts, so apply them with a rolling restart and recheck shard allocation after each node rejoins. Secured clusters may also require https:// URLs and authentication for the API calls shown here.
Steps to configure data tier nodes in Elasticsearch:
- Record the current node names and advertised roles before changing the tier layout.
$ curl -sS --fail "http://localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.roles&pretty"
{
  "nodes" : {
    "tV7XQ2fKQ1a8YpL6sJm3Hg" : {
      "name" : "es-hot-1",
      "roles" : [ "data_content", "data_hot", "ingest", "master" ]
    },
    "B2dPW4rMTuq9gL1nYc7VeA" : {
      "name" : "es-warm-1",
      "roles" : [ "data_warm" ]
    },
    "p8Nf2CqSSeO5mR4kTz1UwQ" : {
      "name" : "es-cold-1",
      "roles" : [ "data_cold" ]
    }
  }
}

Use the actual node names from this output when mapping hosts to hot, warm, cold, frozen, or content responsibilities.
- Decide which nodes will provide the required hot and content coverage before editing the configuration.
With specialized roles, keep at least one data_hot node for time-series writes and at least one data_content node for system indices and other non-data-stream indices. A node may belong to multiple tiers when the hardware and workload justify it.
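One way to confirm that coverage exists is to count how many nodes advertise each data role in the _nodes response. This is a hedged local sketch: the sample JSON below mirrors the output shown earlier (one node per line, so grep -c counts nodes), not a live cluster query.

```shell
# Sample _nodes response standing in for the real API output.
cat > /tmp/nodes.json <<'EOF'
{ "nodes": {
    "a": { "name": "es-hot-1",  "roles": ["data_content", "data_hot"] },
    "b": { "name": "es-warm-1", "roles": ["data_warm"] } } }
EOF

# Count nodes per data role; a zero next to data_hot or data_content
# signals missing required coverage.
for role in data_content data_hot data_warm data_cold data_frozen; do
  printf '%s: %s\n' "$role" "$(grep -c "\"$role\"" /tmp/nodes.json)"
done
```

With the sample above, data_content, data_hot, and data_warm each report one node while data_cold and data_frozen report zero.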
- Set explicit tier roles in /etc/elasticsearch/elasticsearch.yml on each affected node.
# Hot/content node
node.roles: [ master, ingest, data_content, data_hot ]

# Warm node
node.roles: [ data_warm ]

# Cold node
node.roles: [ data_cold ]
Do not combine data with data_hot, data_warm, data_cold, data_frozen, or data_content on the same node. Keep any additional non-tier roles the node still needs, such as master, ingest, ml, transform, or remote_cluster_client.
Use data_frozen only for frozen searchable-snapshot nodes. The frozen tier stores partially mounted indices and requires a snapshot repository.
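A quick lint for the mixed-role mistake warned about above can be done with grep. This is an ad-hoc sketch, not an official tool, and the sample file stands in for /etc/elasticsearch/elasticsearch.yml.

```shell
# Sample config that deliberately mixes the generic data role with a tier role.
cat > /tmp/elasticsearch.yml <<'EOF'
node.roles: [ master, ingest, data, data_hot ]
EOF

# Flag configs that list both the generic "data" role and any specialized
# tier role on the same node.
roles_line=$(grep '^node\.roles' /tmp/elasticsearch.yml)
if echo "$roles_line" | grep -qw 'data' \
   && echo "$roles_line" | grep -qE 'data_(hot|warm|cold|frozen|content)'; then
  echo "WARNING: generic data role mixed with specialized tier roles"
fi
```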
- Restart each affected Elasticsearch node one at a time so the new tier roles are loaded.
$ sudo systemctl restart elasticsearch
Use a rolling restart in production. Restarting multiple shard-holding nodes at the same time can reduce availability and trigger heavy shard-relocation work.
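A common rolling-restart guard is to pause shard reallocation before stopping a node and re-enable it after the node rejoins. The bodies below follow the documented cluster.routing.allocation.enable setting; the localhost endpoint is an assumption, and this sketch only validates the JSON locally.

```shell
# Pause reallocation (primaries still allocate) before the restart.
cat > /tmp/pause-allocation.json <<'EOF'
{ "persistent": { "cluster.routing.allocation.enable": "primaries" } }
EOF

# Restore the default behavior after the node rejoins.
cat > /tmp/resume-allocation.json <<'EOF'
{ "persistent": { "cluster.routing.allocation.enable": null } }
EOF

# On a live cluster each body would be sent with:
#   curl -sS -X PUT "http://localhost:9200/_cluster/settings" \
#        -H 'Content-Type: application/json' --data-binary @/tmp/pause-allocation.json
python3 -m json.tool /tmp/pause-allocation.json
python3 -m json.tool /tmp/resume-allocation.json
```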
- Confirm the restarted node now advertises the intended tier roles.
$ curl -sS --fail "http://localhost:9200/_nodes/_local?filter_path=nodes.*.name,nodes.*.roles&pretty"
{
  "nodes" : {
    "Wfu7UyTbSw2Oieua8A0mGQ" : {
      "name" : "es-hot-content-1",
      "roles" : [ "data_content", "data_hot", "ingest", "master" ]
    }
  }
}

The node info API returns explicit role names, which are easier to audit than the condensed role letters used by the CAT node views.
- Verify the full cluster still exposes the tier layout that ILM policies and templates expect.
$ curl -sS --fail "http://localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.roles&pretty"
{
  "nodes" : {
    "tV7XQ2fKQ1a8YpL6sJm3Hg" : {
      "name" : "es-hot-1",
      "roles" : [ "data_content", "data_hot", "ingest", "master" ]
    },
    "B2dPW4rMTuq9gL1nYc7VeA" : {
      "name" : "es-warm-1",
      "roles" : [ "data_warm" ]
    },
    "p8Nf2CqSSeO5mR4kTz1UwQ" : {
      "name" : "es-cold-1",
      "roles" : [ "data_cold" ]
    }
  }
}

If a tier referenced by an ILM phase or index template is missing, shards targeting only that tier can stay unassigned until matching nodes return.
- Check the tier preference on a representative index after the node-role change.
$ curl -sS --fail "http://localhost:9200/tier-check-000001/_settings?filter_path=*.settings.index.routing.allocation.include._tier_preference,*.settings.index.number_of_replicas&pretty"
{
  "tier-check-000001" : {
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_replicas" : "0"
      }
    }
  }
}

Replace tier-check-000001 with a real index or backing index from the workload being tiered. Directly created indices default to data_content. Data stream backing indices enter the hot tier instead.
- Check cluster health and confirm a representative shard is allocating to an available tier node.
$ curl -sS --fail "http://localhost:9200/_cluster/health/tier-check-000001?filter_path=cluster_name,status,number_of_nodes,number_of_data_nodes,active_primary_shards,active_shards,unassigned_shards,active_shards_percent_as_number&pretty"
{
  "cluster_name" : "sg-tier-verify",
  "status" : "green",
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 1,
  "active_shards" : 1,
  "unassigned_shards" : 0,
  "active_shards_percent_as_number" : 100.0
}

$ curl -sS --fail "http://localhost:9200/_cat/shards/tier-fallback-000001?v&h=index,shard,prirep,state,node"
index                shard prirep state   node
tier-fallback-000001 0     p      STARTED es-hot-content-1
An index with _tier_preference set to data_warm,data_hot can still allocate to a hot node when no warm node is available. If health stays yellow or red after a tier change, recheck replica counts, tier coverage, and any competing allocation filters.
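When a shard stays unassigned after a tier change, the cluster allocation explain API reports which allocation rule is blocking it. The sketch below reuses the tier-check-000001 name from the earlier examples (an assumption) and only validates the request body locally.

```shell
# Request body asking why the primary of shard 0 is (un)assigned.
cat > /tmp/explain.json <<'EOF'
{ "index": "tier-check-000001", "shard": 0, "primary": true }
EOF

# On a live cluster this would be sent with:
#   curl -sS -X POST "http://localhost:9200/_cluster/allocation/explain" \
#        -H 'Content-Type: application/json' --data-binary @/tmp/explain.json
python3 -m json.tool /tmp/explain.json
```

The response includes per-node deciders, so a missing tier shows up as an explicit "no" decision against the index's _tier_preference filter.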
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
