Pseudo-distributed mode runs each Hadoop daemon as a separate Java process on one host, so HDFS and YARN still talk across their normal service boundaries. It is the smallest setup that exercises NameNode, DataNode, ResourceManager, and NodeManager behavior together.

The configuration uses service addresses bound to localhost, local storage directories, and the same XML files used by larger clusters. Format HDFS only after the XML files are written and the storage directories exist and are writable by the user that runs the daemons.
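
The directory preparation mentioned above can be sketched as a small shell helper. This is a minimal sketch, not part of Hadoop: the function name `prepare_dirs` is hypothetical, and the `name` and `data` subdirectories are assumed to match the storage paths used in the hdfs-site.xml step.

```shell
# Sketch only: prepare_dirs is a hypothetical helper, not a Hadoop command.
# It creates the NameNode and DataNode directories under a base path and
# confirms the current user can write to them.
prepare_dirs() {
  base="$1"
  for dir in "$base/name" "$base/data"; do
    mkdir -p "$dir"
    if [ ! -w "$dir" ]; then
      echo "not writable: $dir" >&2
      return 1
    fi
    echo "ready: $dir"
  done
}
```

Run it as the user that will start the daemons, for example `prepare_dirs /var/lib/hadoop/hdfs`, and format HDFS only if it reports both directories as ready.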

Use pseudo-distributed mode for development and training. It does not model multi-host failures, network partitions, rack awareness, or production security boundaries.

Steps to configure Hadoop pseudo-distributed mode:

  1. Set the default filesystem to a local HDFS NameNode.
    core-site.xml
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
    </property>
  2. Set the replication factor to 1 and set local NameNode and DataNode storage directories.
    hdfs-site.xml
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///var/lib/hadoop/hdfs/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///var/lib/hadoop/hdfs/data</value>
    </property>
  3. Set MapReduce to use YARN.
    mapred-site.xml
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  4. Enable the MapReduce shuffle auxiliary service on the NodeManager.
    yarn-site.xml
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
  5. Format the local HDFS namespace.
    $ hdfs namenode -format -clusterId pseudo-local
    Storage directory /var/lib/hadoop/hdfs/name has been successfully formatted.
  6. Start HDFS and YARN.
    $ start-dfs.sh
    Starting namenodes on [localhost]
    Starting datanodes
    Starting secondary namenodes [localhost]
    $ start-yarn.sh
    Starting resourcemanager
    Starting nodemanagers
  7. Create the user home directory in HDFS.
    $ hdfs dfs -mkdir -p /user/hadoop
  8. Verify HDFS responds.
    $ hdfs dfs -ls /
    Found 1 items
    drwxr-xr-x   - hadoop supergroup          0 2026-06-17 03:00 /user
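
After the steps above, one further check is to confirm that every expected daemon is running. The sketch below assumes the JDK's `jps` tool is on the PATH and prints one `<pid> <MainClass>` pair per line. The function name `missing_daemons` is hypothetical, and the daemon list reflects the services named in this guide.

```shell
# Sketch only: missing_daemons is a hypothetical helper, not a Hadoop command.
# It reads jps-style lines ("<pid> <MainClass>") on stdin and prints each
# expected daemon that is absent from that listing.
missing_daemons() {
  listing="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second field exactly so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$listing" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Typical usage is `jps | missing_daemons`. Empty output means all five daemons were found. Any name printed points to a daemon whose log is worth checking.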