Shard allocation decisions determine whether Elasticsearch can start a primary, place a replica, or move a shard during rebalancing. When a shard stays UNASSIGNED or refuses to move, the allocation explain API identifies the exact decider blocking progress so recovery work targets the real constraint instead of trial-and-error setting changes.

The /_cluster/allocation/explain API evaluates one shard against the current cluster state and returns whether Elasticsearch can allocate it, whether it may remain on its current node, and which deciders produced NO, THROTTLE, or YES results on each candidate node. It can explain an unassigned primary or replica, and with current_node it can also explain why an assigned shard stays where it is or whether it can rebalance elsewhere.

Responses can be large on busy clusters, especially when include_yes_decisions or include_disk_info is enabled, so filter_path is useful for the first pass. On secured clusters, switch the endpoint to https:// and add authentication. A call with no request body explains an arbitrary unassigned shard and returns 400 when no unassigned shards exist. In the response, a can_allocate value of allocation_delayed or throttled usually indicates an operational wait state rather than a setting mismatch.
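The request rules above can be sketched in Python. This is a minimal illustration, assuming the body is either absent (arbitrary unassigned shard) or names index, shard, and primary together, with current_node optional. The helper name build_explain_request and its parameters are hypothetical and not part of any Elasticsearch client.

```python
import json
from urllib.parse import urlencode


def build_explain_request(base_url, index=None, shard=None, primary=None,
                          current_node=None, filter_path=None):
    """Return (url, body) for POST /_cluster/allocation/explain.

    Hypothetical helper: body is None when no shard is named, which asks
    Elasticsearch to explain an arbitrary unassigned shard.
    """
    named = [index is not None, shard is not None, primary is not None]
    if any(named) and not all(named):
        raise ValueError("index, shard, and primary must be given together")
    if current_node is not None and not all(named):
        raise ValueError("current_node requires index, shard, and primary")

    body = None
    if all(named):
        body = {"index": index, "shard": shard, "primary": primary}
        if current_node is not None:
            body["current_node"] = current_node

    url = base_url.rstrip("/") + "/_cluster/allocation/explain"
    if filter_path:
        # filter_path is a comma-separated list of response fields;
        # keep "," and "*" readable instead of percent-encoding them.
        query = urlencode({"filter_path": ",".join(filter_path)}, safe=",*")
        url += "?" + query
    return url, (json.dumps(body) if body is not None else None)


url, body = build_explain_request(
    "http://localhost:9200", "logs-2026.05", 0, True,
    filter_path=["can_allocate", "node_allocation_decisions.deciders.*"],
)
print(url)
print(body)
```

Returning the body as a JSON string keeps the helper independent of whichever HTTP library sends the request.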

Steps to explain shard allocation decisions in Elasticsearch:

  1. List unassigned shards and capture the target index, shard number, and primary or replica role.
    $ curl -sS --fail "http://localhost:9200/_cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state"
    index        shard prirep state      node unassigned.reason
    logs-2026.05 0     p      UNASSIGNED      INDEX_CREATED

    Replace http://localhost:9200 with the cluster endpoint. On secured clusters, use https:// plus authentication such as --user or an Authorization: ApiKey header.

    The prirep column shows p for primaries and r for replicas. A yellow cluster with only replica rows unassigned often points to a capacity or single-node issue rather than lost primary data.
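When the listing feeds a script instead of being read by eye, the same _cat/shards call can be requested with format=json, which avoids splitting on whitespace when the node column is empty. The sketch below assumes rows shaped like that JSON output (string values, a null node for unassigned shards, and column names as keys). pick_unassigned is a hypothetical helper, and the sample rows are invented for illustration.

```python
def pick_unassigned(rows):
    """Return explain targets for UNASSIGNED rows, primaries first."""
    targets = []
    for row in rows:
        if row.get("state") != "UNASSIGNED":
            continue
        targets.append({
            "index": row["index"],
            "shard": int(row["shard"]),       # _cat returns numbers as strings
            "primary": row["prirep"] == "p",  # p = primary, r = replica
            "reason": row.get("unassigned.reason"),
        })
    # Primaries sort first because False orders before True.
    targets.sort(key=lambda t: (not t["primary"], t["index"], t["shard"]))
    return targets


# Invented sample rows, shaped like _cat/shards?format=json output.
rows = [
    {"index": "logs-2026.05", "shard": "0", "prirep": "r", "state": "UNASSIGNED",
     "node": None, "unassigned.reason": "INDEX_CREATED"},
    {"index": "logs-2026.05", "shard": "0", "prirep": "p", "state": "UNASSIGNED",
     "node": None, "unassigned.reason": "INDEX_CREATED"},
    {"index": "logs-2026.04", "shard": "0", "prirep": "p", "state": "STARTED",
     "node": "node-01", "unassigned.reason": None},
]
for target in pick_unassigned(rows):
    print(target)
```

Each returned dictionary carries the index, shard, and primary values that the explain request body expects.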

  2. Request a focused allocation explanation for the selected shard.
    $ curl -sS --fail -H "Content-Type: application/json" -X POST \
      "http://localhost:9200/_cluster/allocation/explain?filter_path=index,shard,primary,current_state,unassigned_info.reason,unassigned_info.last_allocation_status,can_allocate,allocate_explanation,node_allocation_decisions.node_name,node_allocation_decisions.node_decision,node_allocation_decisions.deciders.*&pretty" \
      -d '{
      "index": "logs-2026.05",
      "shard": 0,
      "primary": true
    }'
    {
      "index" : "logs-2026.05",
      "shard" : 0,
      "primary" : true,
      "current_state" : "unassigned",
      "unassigned_info" : {
        "reason" : "INDEX_CREATED",
        "last_allocation_status" : "no"
      },
      "can_allocate" : "no",
      "allocate_explanation" : "Elasticsearch isn't allowed to allocate this shard to any of the nodes in the cluster. Choose a node to which you expect this shard to be allocated, find this node in the node-by-node explanation, and address the reasons which prevent Elasticsearch from allocating this shard there.",
      "node_allocation_decisions" : [
        {
          "node_name" : "node-01",
          "node_decision" : "no",
          "deciders" : [
            {
              "decider" : "filter",
              "decision" : "NO",
              "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
            }
          ]
        }
      ]
    }

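    Treat the first NO or THROTTLE decider on the intended node as the blocker to fix first. Its explanation usually names the exact setting, tier rule, watermark, or retry condition involved. If several deciders return NO for that node, each one must be resolved before the shard can allocate there.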

    Omitting the request body explains an arbitrary unassigned primary or replica shard, returning any unassigned primary shards first. Add current_node in the body when the shard is already assigned and the question is why it remains or cannot rebalance.
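Picking out the blocking deciders for one node can also be done programmatically. The following is a small sketch that assumes the response has already been parsed into a Python dictionary with the fields shown in the sample output of this step. blocking_deciders is a hypothetical helper name.

```python
def blocking_deciders(response, node_name):
    """List (decider, decision, explanation) for NO and THROTTLE on one node."""
    for node in response.get("node_allocation_decisions", []):
        if node.get("node_name") != node_name:
            continue
        return [
            (d["decider"], d["decision"], d.get("explanation", ""))
            for d in node.get("deciders", [])
            if d["decision"] in ("NO", "THROTTLE")
        ]
    raise KeyError(f"node {node_name!r} not found in node_allocation_decisions")


# Trimmed copy of the sample response from this step.
response = {
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "node-01",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "filter",
                    "decision": "NO",
                    "explanation": "node does not match index setting "
                                   "[index.routing.allocation.include] filters "
                                   "[_name:\"nonexistent_node\"]",
                }
            ],
        }
    ],
}
for decider, decision, explanation in blocking_deciders(response, "node-01"):
    print(decider, decision, explanation)
```

Raising KeyError for an unknown node makes a typo in the node name visible instead of silently reporting no blockers.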

  3. Re-run the request with include_yes_decisions and include_disk_info when the first pass does not show enough context.
    $ curl -sS --fail -H "Content-Type: application/json" -X POST \
      "http://localhost:9200/_cluster/allocation/explain?include_yes_decisions=true&include_disk_info=true&filter_path=cluster_info.nodes.*.least_available.*,node_allocation_decisions.node_name,node_allocation_decisions.node_decision,node_allocation_decisions.deciders.*&pretty" \
      -d '{
      "index": "logs-2026.05",
      "shard": 0,
      "primary": true
    }'
    {
      "cluster_info" : {
        "nodes" : {
          "node-id-01" : {
            "least_available" : {
              "path" : "/usr/share/elasticsearch/data",
              "total_bytes" : 62671097856,
              "used_bytes" : 43620786176,
              "free_bytes" : 19050311680,
              "free_disk_percent" : 30.4,
              "used_disk_percent" : 69.6
            }
          }
        }
      },
      "node_allocation_decisions" : [
        {
          "node_name" : "node-01",
          "node_decision" : "no",
          "deciders" : [
            {
              "decider" : "enable",
              "decision" : "YES",
              "explanation" : "all allocations are allowed"
            },
            {
              "decider" : "filter",
              "decision" : "NO",
              "explanation" : "node does not match index setting [index.routing.allocation.include] filters [_name:\"nonexistent_node\"]"
            },
            {
              "decider" : "disk_threshold",
              "decision" : "YES",
              "explanation" : "enough disk for shard on node, free: [17.7gb], used: [69.6%], shard size: [0b], free after allocating shard: [17.7gb]"
            }
    ##### snipped #####
          ]
        }
      ]
    }

    Search the expanded response for the first "decision" : "NO" or "decision" : "THROTTLE" on the node that should receive the shard. The YES lines confirm what is already permitted, while the disk section shows whether watermarks are part of the decision.
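The disk section can be compared against watermarks with a few lines of Python. This sketch assumes percentage watermarks at the stock defaults of 85%, 90%, and 95% for low, high, and flood stage. Clusters that override those settings or use absolute byte values need different thresholds, and the real disk_threshold decider also accounts for shard size, so treat the result as an approximation. disk_pressure is a hypothetical helper name.

```python
def disk_pressure(least_available, low=85.0, high=90.0, flood=95.0):
    """Classify a node's least_available data path against percentage watermarks.

    The default thresholds are an assumption of this sketch (stock
    cluster.routing.allocation.disk.watermark.* percentages).
    """
    used_pct = 100.0 * least_available["used_bytes"] / least_available["total_bytes"]
    if used_pct >= flood:
        level = "flood_stage"
    elif used_pct >= high:
        level = "high"
    elif used_pct >= low:
        level = "low"
    else:
        level = "ok"
    return round(used_pct, 1), level


# Values copied from the cluster_info sample in this step.
least_available = {
    "path": "/usr/share/elasticsearch/data",
    "total_bytes": 62671097856,
    "used_bytes": 43620786176,
    "free_bytes": 19050311680,
}
print(disk_pressure(least_available))
```

With the sample values the node sits well below the low watermark, which agrees with the YES result from the disk_threshold decider in the response.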

  4. Apply the smallest fix that matches the first blocking decider.
    $ curl -sS --fail -H "Content-Type: application/json" -X PUT "http://localhost:9200/logs-2026.05/_settings?pretty" -d '{
      "index.routing.allocation.include._name": null
    }'
    {
      "acknowledged" : true
    }

    A filter decision usually means correcting or clearing index-level or cluster-level allocation filters. A same_shard decision on a single-node cluster means a replica cannot live on the same node as its primary, so either add another data node or reduce number_of_replicas. A disk_threshold decision means freeing disk space or revisiting watermarks carefully. The allocation_delayed and throttled states usually clear on their own once the departed node rejoins or the recovery backlog subsides.

    If the blocking decider is max_retry, fix the underlying problem first and then call POST /_cluster/reroute?retry_failed&metric=none to retry allocation.

    Changing allocation rules can trigger relocations and recovery traffic across disk, CPU, and network resources.
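The guidance in this step can be condensed into a small lookup. The table below only paraphrases the hints given in this step and is not exhaustive. DECIDER_HINTS, WAIT_STATES, and next_action are hypothetical names introduced for illustration.

```python
# Hints paraphrase the guidance in this step; the table is illustrative only.
DECIDER_HINTS = {
    "filter": "correct or clear index-level or cluster-level allocation filters",
    "same_shard": "add a data node or reduce number_of_replicas",
    "disk_threshold": "free disk space or review watermarks carefully",
    "max_retry": "fix the root cause, then retry via /_cluster/reroute?retry_failed",
}
# can_allocate values that normally resolve without a settings change.
WAIT_STATES = {"allocation_delayed", "throttled"}


def next_action(can_allocate, blockers):
    """Suggest a next step from can_allocate and an ordered list of decider names."""
    if can_allocate in WAIT_STATES:
        return "wait: " + can_allocate
    if not blockers:
        return "no blocking decider reported"
    first = blockers[0]
    return first + ": " + DECIDER_HINTS.get(first, "read the decider explanation")


print(next_action("no", ["filter"]))
print(next_action("throttled", []))
```

Only the first blocker is reported because fixing it and rerunning the explain API is the workflow this guide follows.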

  5. Confirm the shard leaves UNASSIGNED and reaches STARTED.
    $ curl -sS --fail "http://localhost:9200/_cat/shards/logs-2026.05?v=true&h=index,shard,prirep,state,node&s=state"
    index        shard prirep state   node
    logs-2026.05 0     p      STARTED node-01

    If the shard still shows UNASSIGNED, rerun the explain API immediately. The new response often changes from the original blocker to the next remaining one.

  6. Verify the index or cluster returns to the expected health state after the fix.
    $ curl -sS --fail "http://localhost:9200/_cluster/health/logs-2026.05?filter_path=status,active_primary_shards,active_shards,unassigned_shards,initializing_shards,relocating_shards&pretty"
    {
      "status" : "green",
      "active_primary_shards" : 1,
      "active_shards" : 1,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0
    }

    A single-node cluster stays yellow when replicas are configured because Elasticsearch will never place a replica on the same node as its primary. In that case, reduce number_of_replicas or add another data node instead of chasing unrelated allocation settings.
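The health check can be scripted as well. The sketch below assumes the health response has been parsed into a dictionary and uses the wait_for_status and timeout query parameters of the cluster health API so that the call blocks until the status is reached or the timeout expires. health_wait_url and is_settled are hypothetical helper names, and the 30s timeout is an arbitrary choice.

```python
from urllib.parse import urlencode

STATUS_RANK = {"red": 0, "yellow": 1, "green": 2}


def health_wait_url(base_url, index, status="green", timeout="30s"):
    """Build a cluster health URL that waits for the given status."""
    query = urlencode({"wait_for_status": status, "timeout": timeout})
    return f"{base_url.rstrip('/')}/_cluster/health/{index}?{query}"


def is_settled(health, expected="green"):
    """True when status is at least `expected` and no shards are still moving."""
    moving = health.get("initializing_shards", 0) + health.get("relocating_shards", 0)
    return STATUS_RANK[health["status"]] >= STATUS_RANK[expected] and moving == 0


# Sample response from this step.
health = {
    "status": "green",
    "active_primary_shards": 1,
    "active_shards": 1,
    "relocating_shards": 0,
    "initializing_shards": 0,
    "unassigned_shards": 0,
}
print(health_wait_url("http://localhost:9200", "logs-2026.05"))
print(is_settled(health))
```

Passing expected="yellow" suits a single-node cluster that keeps replicas configured, since green is not reachable there.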

  7. Explain an assigned shard with current_node when the issue is placement or rebalancing instead of initial allocation.
    $ curl -sS --fail -H "Content-Type: application/json" -X POST \
      "http://localhost:9200/_cluster/allocation/explain?filter_path=index,shard,primary,current_state,current_node.name,can_remain_on_current_node,can_rebalance_cluster,can_rebalance_to_other_node,rebalance_explanation,node_allocation_decisions.node_name,node_allocation_decisions.node_decision&pretty" \
      -d '{
      "index": "logs-2026.05",
      "shard": 0,
      "primary": true,
      "current_node": "node-01"
    }'
    {
      "index" : "logs-2026.05",
      "shard" : 0,
      "primary" : true,
      "current_state" : "started",
      "current_node" : {
        "name" : "node-01"
      },
      "can_remain_on_current_node" : "yes",
      "can_rebalance_cluster" : "yes",
      "can_rebalance_to_other_node" : "no",
      "rebalance_explanation" : "This shard is in a well-balanced location and satisfies all allocation rules so it will remain on this node. Elasticsearch cannot improve the cluster balance by moving it to another node. If you expect this shard to be rebalanced to another node, find the other node in the node-by-node explanation and address the reasons which prevent Elasticsearch from rebalancing this shard there."
    }

    When can_remain_on_current_node is no, rerun without a restrictive filter_path and inspect can_remain_decisions plus node_allocation_decisions to find the mismatched filter, tier preference, or balance constraint.
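The fields of a current_node explanation can be reduced to a short verdict. This is a sketch under the assumption that the response is a parsed dictionary containing the fields shown in this step, plus can_remain_decisions when the shard may not stay. placement_verdict is a hypothetical helper, and the second sample below is invented to show the must-move case.

```python
def placement_verdict(response):
    """Return (verdict, deciders) for an assigned shard's explanation."""
    stay_blockers = [
        d["decider"]
        for d in response.get("can_remain_decisions", [])
        if d.get("decision") == "NO"
    ]
    if response.get("can_remain_on_current_node") == "no":
        # The shard violates a rule on its current node and has to move.
        return "must_move", stay_blockers
    if response.get("can_rebalance_to_other_node") == "yes":
        return "may_rebalance", []
    return "stays", []


# Fields from the sample response in this step.
stable = {
    "can_remain_on_current_node": "yes",
    "can_rebalance_cluster": "yes",
    "can_rebalance_to_other_node": "no",
}
# Invented example of a shard that may not stay where it is.
evicted = {
    "can_remain_on_current_node": "no",
    "can_remain_decisions": [{"decider": "filter", "decision": "NO"}],
}
print(placement_verdict(stable))
print(placement_verdict(evicted))
```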