Shard allocation awareness keeps primary and replica copies of the same shard in different racks or availability zones, reducing the chance that a single infrastructure failure removes every copy of the data at once.

Each shard-holding node advertises a custom attribute such as node.attr.zone, and Elasticsearch uses cluster.routing.allocation.awareness.attributes to treat that attribute as a placement boundary during shard assignment and relocation. When enough nodes exist across the planned values, replicas can be placed on a different attribute value from their primaries instead of concentrating all copies in one fault domain.
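
Condensed into configuration form, the two pieces look like the sketch below, using the zone attribute and az1 value that the walkthrough also uses. The cluster-wide line can be placed in elasticsearch.yml, but the steps below apply it through the cluster settings API instead.

    # per-node location, set in elasticsearch.yml on every shard-holding node
    node.attr.zone: az1

    # cluster-wide awareness attribute (applied via the cluster settings API in step 5)
    cluster.routing.allocation.awareness.attributes: zone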

Examples here target self-managed Elasticsearch nodes where /etc/elasticsearch/elasticsearch.yml is editable. Managed Elastic Cloud Hosted and Elastic Cloud Enterprise deployments handle zone awareness through deployment configuration instead, and awareness alone does not keep a cluster available after a zone outage unless master-eligible nodes are also laid out with quorum in mind.

Steps to configure shard allocation awareness in Elasticsearch:

  1. Identify the awareness attribute name and the full set of location values that the cluster should use.

    Common attributes are zone and rack_id. Every node that can hold shards must report one value for the chosen attribute, and at least one replica is needed if shard copies must survive the loss of one location.
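
    If a small test index helps while planning, one with a single replica can be created up front. The name logs-awareness matches the index checked in step 9, the shard and replica counts are only an example, and the request assumes the same unsecured localhost:9200 endpoint as the other examples.
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/logs-awareness?pretty" -d '{
      "settings" : {
        "number_of_shards" : 2,
        "number_of_replicas" : 1
      }
    }'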

  2. Set the node attribute in /etc/elasticsearch/elasticsearch.yml on each shard-holding node.
    node.attr.zone: az1

    Use one value per fault domain, such as az1 and az2, and keep the attribute name identical on every node.
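
    For a node started by hand from an archive installation rather than a package, the same attribute can be supplied on the command line; -E sets any elasticsearch.yml setting for that process only.
    $ ./bin/elasticsearch -Enode.attr.zone=az1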

  3. Restart the affected Elasticsearch nodes one at a time so the new node attribute is loaded.
    $ sudo systemctl restart elasticsearch

    Use a rolling restart in production. Restarting multiple shard-holding nodes at once can reduce availability and trigger heavy recovery work.
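
    A common extra precaution during the rolling restart, sketched here with the same unsecured localhost endpoint as the other examples, is to pause replica allocation while a node is down and reset the setting to null once it is back.
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{
      "persistent" : { "cluster.routing.allocation.enable" : "primaries" }
    }'
    $ sudo systemctl restart elasticsearch
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings" -d '{
      "persistent" : { "cluster.routing.allocation.enable" : null }
    }'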

  4. Confirm that the nodes are reporting the awareness attribute to the cluster.
    $ curl -sS "http://localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.attributes.zone&pretty"
    {
      "nodes" : {
        "zNG68H58QQOqIBYj3IIg3g" : {
          "name" : "es-aware-1",
          "attributes" : {
            "zone" : "az1"
          }
        },
        "841h8EalRuKlBY-laJXtSg" : {
          "name" : "es-aware-2",
          "attributes" : {
            "zone" : "az2"
          }
        }
      }
    }

    Elasticsearch only allocates awareness-managed shards to nodes that have a value for the configured awareness attribute.
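
    The _cat/nodeattrs API shows the same information in a more compact form; each attribute of every node appears as one row, so look for the zone entries among whatever other attributes the nodes report.
    $ curl -sS "http://localhost:9200/_cat/nodeattrs?v&h=node,attr,value"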

  5. Enable shard allocation awareness for the chosen attribute.
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings?pretty" -d '{
      "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "zone"
      }
    }'
    {
      "acknowledged" : true,
      "persistent" : {
        "cluster" : {
          "routing" : {
            "allocation" : {
              "awareness" : {
                "attributes" : "zone"
              }
            }
          }
        }
      },
      "transient" : { }
    }

    Add TLS and authentication options when security is enabled, for example https://... with --user or Authorization: ApiKey ....
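
    Against a secured cluster the same request might look like the sketch below; the CA path is the default for recent package installs but should be treated as a placeholder, and --user elastic makes curl prompt for the password.
    $ curl -sS --cacert /etc/elasticsearch/certs/http_ca.crt --user elastic \
        -H "Content-Type: application/json" -X PUT "https://localhost:9200/_cluster/settings?pretty" -d '{
      "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "zone"
      }
    }'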

  6. Force awareness when missing locations should leave replicas unassigned instead of collapsing all copies into the surviving location.
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings?pretty" -d '{
      "persistent" : {
        "cluster.routing.allocation.awareness.force.zone.values" : "az1,az2"
      }
    }'
    {
      "acknowledged" : true,
      "persistent" : {
        "cluster" : {
          "routing" : {
            "allocation" : {
              "awareness" : {
                "force" : {
                  "zone" : {
                    "values" : "az1,az2"
                  }
                }
              }
            }
          }
        }
      },
      "transient" : { }
    }

    With forced awareness, replicas that belong in a missing location stay UNASSIGNED until nodes from that location return, rather than being squeezed onto the nodes that remain.
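
    If strict separation later turns out to be the wrong trade-off, the forced list can be removed by resetting the setting to null; ordinary awareness from the previous step stays in effect.
    $ curl -sS -H "Content-Type: application/json" -X PUT "http://localhost:9200/_cluster/settings?pretty" -d '{
      "persistent" : {
        "cluster.routing.allocation.awareness.force.zone.values" : null
      }
    }'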

  7. Verify the current awareness settings in cluster state.
    $ curl -sS "http://localhost:9200/_cluster/settings?flat_settings=true&pretty"
    {
      "persistent" : {
        "cluster.routing.allocation.awareness.attributes" : "zone",
        "cluster.routing.allocation.awareness.force.zone.values" : "az1,az2"
      },
      "transient" : { }
    }
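
    To see default values alongside the explicit settings, add include_defaults=true and filter the flat output for awareness.
    $ curl -sS "http://localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty" | grep awareness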

  8. Check cluster health during relocation or after enabling forced awareness.
    $ curl -sS "http://localhost:9200/_cluster/health/logs-force-awareness?wait_for_status=yellow&timeout=120s&pretty"
    {
      "cluster_name" : "sg-force-awareness",
      "status" : "yellow",
      "timed_out" : false,
      "number_of_nodes" : 2,
      "number_of_data_nodes" : 2,
      "active_primary_shards" : 2,
      "active_shards" : 2,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 2,
      "unassigned_primary_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 41,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 1810,
      "active_shards_percent_as_number" : 50.0
    }

    Yellow status is expected when forced awareness is enabled but no live node carries one of the configured values. The replicas are assigned automatically once nodes from the missing location rejoin.
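
    While the cluster is yellow, _cat/shards can also list the individual unassigned copies together with the event that left them unassigned (the unassigned.reason column).
    $ curl -sS "http://localhost:9200/_cat/shards/logs-force-awareness?v&h=index,shard,prirep,state,unassigned.reason,node"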

  9. Confirm that primary and replica shards are placed on nodes in different locations when all planned values are available.
    $ curl -sS "http://localhost:9200/_cat/shards/logs-awareness?v&h=index,shard,prirep,state,node"
    index          shard prirep state   node
    logs-awareness 0     p      STARTED es-aware-2
    logs-awareness 0     r      STARTED es-aware-1
    logs-awareness 1     r      STARTED es-aware-2
    logs-awareness 1     p      STARTED es-aware-1

    Compare the node names in the shard list with the zone mapping from the earlier _nodes output. If both copies of the same shard still land in one location, recheck the node attributes, replica count, and any competing allocation rules.
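
    When the cause is not obvious, the cluster allocation explain API gives decider-level detail for a single shard copy: why an unassigned copy cannot be placed, or whether an assigned copy may stay where it is. This sketch asks about the replica of shard 0 of the example index.
    $ curl -sS -H "Content-Type: application/json" -X POST "http://localhost:9200/_cluster/allocation/explain?pretty" -d '{
      "index" : "logs-awareness",
      "shard" : 0,
      "primary" : false
    }'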