A multi-node Apache Hadoop cluster fails in uneven ways when hostnames, worker lists, and daemon roles do not agree. The NameNode, ResourceManager, DataNode, and NodeManager processes need the same logical cluster settings before HDFS formatting or service startup begins.
The active configuration is the set of files under $HADOOP_CONF_DIR on every node: the *-site.xml files and the workers list. Treat the first master host as the configuration source, copy the same files to the workers, and verify that each node resolves the same names before starting daemons.
The example uses one NameNode and one ResourceManager with two workers. Production clusters usually add high availability, Kerberos, rack awareness, and service supervision after this baseline is already working.
$ getent hosts master01 worker01 worker02
10.10.20.11 master01.example.net master01
10.10.20.21 worker01.example.net worker01
10.10.20.22 worker02.example.net worker02
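The lookup above can be repeated on every node with a small helper. The following is a minimal sketch, not part of Hadoop: `check_hosts` and the `RESOLVE_CMD` override are hypothetical names introduced here, and the resolver defaults to `getent hosts`.

```shell
#!/bin/sh
# Sketch: report which cluster hostnames fail to resolve on this node.
# check_hosts and RESOLVE_CMD are illustrative names, not Hadoop tooling.

check_hosts() {
    # Usage: check_hosts host1 host2 ...
    # Prints one "unresolved: <host>" line per failure.
    # Returns 0 only when every name resolves.
    missing=0
    for host in "$@"; do
        # RESOLVE_CMD defaults to "getent hosts"; override it for testing.
        if ! ${RESOLVE_CMD:-getent hosts} "$host" >/dev/null 2>&1; then
            echo "unresolved: $host"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Running `check_hosts master01 worker01 worker02` on the master and on each worker should print nothing and return 0 when name resolution is consistent. It does not compare the returned addresses between nodes, so that comparison is still a manual step.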
Set the default file system URI in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master01.example.net:9000</value>
</property>
Set the NameNode and DataNode storage directories and the replication factor in hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/hadoop/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/hadoop/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
Point the NodeManagers at the ResourceManager and enable the shuffle service in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master01.example.net</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
List the worker hostnames, one per line, in the workers file:
worker01.example.net
worker02.example.net
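A replication factor larger than the number of DataNodes leaves every block under-replicated, so it can help to compare the two before startup. The sketch below assumes one hostname per line in the workers file. `count_workers` and `check_replication` are hypothetical helper names, not Hadoop commands.

```shell
#!/bin/sh
# Sketch: check that dfs.replication does not exceed the number of workers.
# Assumes one hostname per line in the workers file.

count_workers() {
    # Usage: count_workers <workers-file>
    # Counts entries, ignoring blank lines and "#" comments.
    sed 's/#.*$//; /^[[:space:]]*$/d' "$1" | wc -l | tr -d ' '
}

check_replication() {
    # Usage: check_replication <workers-file> <replication-factor>
    workers=$(count_workers "$1")
    if [ "$2" -gt "$workers" ]; then
        echo "dfs.replication=$2 exceeds $workers DataNode(s)"
        return 1
    fi
    echo "dfs.replication=$2 fits $workers DataNode(s)"
}
```

One way to call it is `check_replication "$HADOOP_CONF_DIR/workers" "$(hdfs getconf -confKey dfs.replication)"`, which reads the value that Hadoop loads from the configuration directory. For the two-worker example it should report that a factor of 2 fits.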
$ rsync -av "$HADOOP_CONF_DIR"/ worker01.example.net:"$HADOOP_CONF_DIR"/
core-site.xml
hdfs-site.xml
yarn-site.xml
mapred-site.xml
workers
Repeat the command for worker02.example.net so that both workers hold the same files.
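With more than a couple of workers, the copy is easier to drive from the workers file itself. The following sketch assumes passwordless SSH from the master and the same configuration path on every node. `sync_conf` and the `RSYNC` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: push the master's configuration directory to every listed worker.
# Assumes passwordless SSH and an identical conf path on all nodes.

sync_conf() {
    # Usage: sync_conf <conf-dir> <workers-file>
    conf_dir=$1
    workers_file=$2
    failed=0
    # Skip blank lines and "#" comments in the workers file.
    for host in $(sed 's/#.*$//; /^[[:space:]]*$/d' "$workers_file"); do
        # RSYNC defaults to rsync; override it for a dry run or a test.
        if ${RSYNC:-rsync} -a "$conf_dir"/ "$host:$conf_dir"/; then
            echo "synced: $host"
        else
            echo "failed: $host"
            failed=1
        fi
    done
    return "$failed"
}
```

A typical call would be `sync_conf "$HADOOP_CONF_DIR" "$HADOOP_CONF_DIR/workers"`. The function keeps going after a failed host and returns non-zero at the end, so a single unreachable worker does not hide the state of the others.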
$ hdfs namenode -format -clusterId hadoop-lab01
Storage directory /data/hadoop/hdfs/name has been successfully formatted.
Formatting an existing NameNode destroys the namespace metadata for that cluster. Run the command only before the first startup or as part of a tested recovery plan.
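One way to reduce the risk of an accidental reformat is to wrap the command in a guard. The sketch below relies on the fact that a formatted NameNode storage directory contains a `current/VERSION` file. `is_formatted`, `safe_format`, and the `HDFS` override are hypothetical names, and the directory argument is a plain path such as /data/hadoop/hdfs/name, not the file:// URI from the configuration.

```shell
#!/bin/sh
# Sketch: refuse to format a NameNode directory that already holds metadata.
# Pass the plain filesystem path, not the file:// URI.

is_formatted() {
    # A formatted NameNode storage directory contains current/VERSION.
    [ -f "$1/current/VERSION" ]
}

safe_format() {
    # Usage: safe_format <name-dir> <cluster-id>
    if is_formatted "$1"; then
        echo "refusing to format: $1 already holds NameNode metadata"
        return 1
    fi
    # HDFS defaults to the hdfs launcher; override it for testing.
    ${HDFS:-hdfs} namenode -format -clusterId "$2"
}
```

For the example cluster the call would be `safe_format /data/hadoop/hdfs/name hadoop-lab01`. The guard only inspects the local directory, so it is a convenience and not a substitute for a recovery plan.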
$ start-dfs.sh
Starting namenodes on [master01.example.net]
Starting datanodes
Starting secondary namenodes [master01.example.net]
Related: How to restart Hadoop services
$ hdfs dfsadmin -report
Configured Capacity: 214748364800 (200 GB)
Present Capacity: 209715200000 (195 GB)
Live datanodes (2):
Name: worker01.example.net:9866
Name: worker02.example.net:9866
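The live DataNode count in the report can also be checked from a script. This sketch assumes the report contains a line of the form `Live datanodes (N):` as in the output above. `live_datanodes` and `check_live` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: compare the live DataNode count in a dfsadmin report
# with the number of workers that are expected to be running.

live_datanodes() {
    # Reads report text on stdin and prints N from "Live datanodes (N):".
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p' | head -n 1
}

check_live() {
    # Usage: hdfs dfsadmin -report | check_live <expected-count>
    live=$(live_datanodes)
    live=${live:-0}   # treat a missing line as zero live DataNodes
    if [ "$live" -ne "$1" ]; then
        echo "expected $1 live DataNodes, found $live"
        return 1
    fi
    echo "all $1 DataNodes are live"
}
```

For the two-worker example, `hdfs dfsadmin -report | check_live 2` should succeed once both DataNodes have registered. DataNodes can take a short time to register after `start-dfs.sh` returns, so an early failure may only mean the check ran too soon.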