Removing a DataNode without decommissioning it first can leave blocks under-replicated and trigger missing-block alerts. HDFS decommissioning marks the node as leaving service and gives the NameNode time to copy its blocks to other nodes before the host is shut down.

The include and exclude files (set by dfs.hosts and dfs.hosts.exclude) define which DataNodes may serve the cluster. The procedure adds the node to the exclude file, tells the NameNode to re-read its host lists, waits for the node to reach the Decommissioned state, and only then stops the daemon.

Decommissioning needs enough remaining capacity on the other nodes to absorb the blocks from the departing host. Check cluster health before starting, and avoid decommissioning several large nodes at once unless the capacity plan already covers the data movement.
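
The capacity check can be reduced to a small comparison. The sketch below is illustrative only: has_headroom is a hypothetical helper, the 20 percent default margin is an assumption rather than an HDFS recommendation, and both byte counts are expected to be read by hand from the "DFS Used" and "DFS Remaining" lines of hdfs dfsadmin -report.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the nodes that stay in service have room
# for the departing node's data plus a safety margin.
#   $1 = DFS Used on the node being decommissioned, in bytes
#   $2 = DFS Remaining summed over the nodes that stay, in bytes
#   $3 = required margin in percent (assumed default: 20)
has_headroom() {
    node_used=$1
    others_remaining=$2
    margin_pct=${3:-20}
    # Require remaining >= used * (100 + margin) / 100, in integer arithmetic.
    needed=$(( node_used * (100 + margin_pct) / 100 ))
    [ "$others_remaining" -ge "$needed" ]
}

# Example call (values are placeholders, not measurements):
#   has_headroom 53687091200 214748364800 && echo "enough room"
```

The cluster-wide "DFS Remaining" figure includes the departing node's own free space, so sum the per-node values of the remaining hosts rather than using the summary line directly.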

Steps to decommission a Hadoop DataNode:

  1. Check HDFS health before removing the node.
    $ hdfs fsck / -blocks -locations
    Status: HEALTHY
     Total size: 184320000 B
     Total blocks (validated): 42
  2. Confirm the active exclude file path from the Hadoop configuration.
    $ hdfs getconf -confKey dfs.hosts.exclude
    /etc/hadoop/dfs.exclude
  3. Add the DataNode hostname to that exclude file, one hostname per line.
    /etc/hadoop/dfs.exclude
    worker02.example.net
  4. Refresh the NameNode node list.
    $ hdfs dfsadmin -refreshNodes
    Refresh nodes successful
  5. Watch the decommission state in the NameNode report.
    $ hdfs dfsadmin -report
    Name: worker02.example.net:9866
    Decommission Status : Decommission in progress
    Configured Capacity: 107374182400 (100 GB)
  6. Wait until the node reports Decommissioned.
    $ hdfs dfsadmin -report
    Name: worker02.example.net:9866
    Decommission Status : Decommissioned
    Under replicated blocks: 0
  7. Stop the DataNode daemon on the decommissioned host.
    $ hdfs --daemon stop datanode
    Stopping datanode
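
Steps 3 through 6 lend themselves to scripting. The following is a minimal sketch, not a hardened tool: exclude_host, decom_status, and wait_for_decommission are hypothetical helper names, the 60-second polling interval is an arbitrary default, and the parser assumes the report prints a "Name:" line containing the hostname followed by a "Decommission Status" line, as in the sample output above.

```shell
#!/bin/sh
# Append a hostname to the exclude file unless it is already listed.
#   $1 = exclude file path, $2 = hostname
exclude_host() {
    file=$1
    host=$2
    # -x matches the whole line, -F treats the hostname as a literal string.
    grep -qxF "$host" "$file" 2>/dev/null || printf '%s\n' "$host" >> "$file"
}

# Print the "Decommission Status" value for one DataNode, reading the text of
# `hdfs dfsadmin -report` on stdin. Prints nothing if the host is not found.
decom_status() {
    awk -v host="$1" '
        # Each node entry starts with a Name: line; remember whether it is ours.
        /^Name:/ { in_node = (index($0, host) > 0) }
        in_node && /^Decommission Status/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the "Decommission Status : " label
            print
            exit
        }
    '
}

# Poll the NameNode report until the node reports Decommissioned.
#   $1 = hostname, $2 = seconds between polls (assumed default: 60)
wait_for_decommission() {
    host=$1
    interval=${2:-60}
    while :; do
        status=$(hdfs dfsadmin -report | decom_status "$host")
        echo "$host: ${status:-not found in report}"
        [ "$status" = "Decommissioned" ] && return 0
        sleep "$interval"
    done
}

# Example sequence (run on a host with HDFS admin rights):
#   exclude_host /etc/hadoop/dfs.exclude worker02.example.net
#   hdfs dfsadmin -refreshNodes
#   wait_for_decommission worker02.example.net
```

The hostname match is a plain substring test, so a name such as worker02 would also match worker020; pass the fully qualified name to avoid ambiguity. The loop has no timeout, which suits a long rebalancing run but means it should be interrupted manually if the node never appears in the report.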