How to configure MapReduce to run on YARN

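MapReduce jobs run on YARN only when mapreduce.framework.name is set to yarn and every NodeManager runs the MapReduce shuffle auxiliary service. If the framework name is left at its default of local, jobs run in a single JVM on the submitting host rather than on the cluster. If the shuffle service is missing, jobs fail before reducers start.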

The configuration spans mapred-site.xml and yarn-site.xml. Set the framework name, configure the application classpath if the distribution requires it, enable the shuffle service, copy both files to every node, and restart the YARN daemons.
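
Before restarting anything, it can help to confirm that both files actually carry the expected values. The sketch below is one way to do that with plain awk. get_prop is a hypothetical helper, not a Hadoop command, and it assumes each <name> and <value> element sits on a single line, as in the snippets in this guide.

```shell
#!/bin/sh
# get_prop FILE NAME
# Hypothetical helper: print the <value> of the first property called NAME.
# Assumes <name>...</name> and <value>...</value> are each on a single line.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>.*<\/value>/) {
      # Strip the 7-character <value> and 8-character </value> tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}
```

For example, get_prop "$HADOOP_CONF_DIR/mapred-site.xml" mapreduce.framework.name should print yarn once the framework name has been set. An empty result means the property is absent from that file.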

Keep the MapReduce and YARN configuration identical across clients, ResourceManagers, and NodeManagers. A job submitted from a stale client can fail even after the cluster daemons are fixed.
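
One way to catch a stale client is to compare its copies of the two files against a reference copy. The following is a minimal sketch, assuming both directories are readable locally (for example, after fetching a node's configuration directory with rsync). conf_drift is a hypothetical helper name.

```shell
#!/bin/sh
# conf_drift DIR_A DIR_B
# Hypothetical helper: report which of the two site files differ between
# two configuration directories. A missing file also counts as a difference.
# Returns 0 when both files match and 1 otherwise.
conf_drift() {
  drift=0
  for f in mapred-site.xml yarn-site.xml; do
    if ! cmp -s "$1/$f" "$2/$f"; then
      echo "DIFFERS: $f"
      drift=1
    fi
  done
  return "$drift"
}
```

A call such as conf_drift "$HADOOP_CONF_DIR" /tmp/worker01-conf (the second path is only an illustration) prints nothing and returns 0 when the client and the node agree.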

Steps to configure MapReduce on YARN:

  1. Set MapReduce to use YARN.
    mapred-site.xml
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  2. Configure the NodeManager shuffle service.
    yarn-site.xml
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
  3. Distribute the updated configuration to every YARN node, repeating the command for each host.
    $ rsync -av $HADOOP_CONF_DIR/ worker01.example.net:$HADOOP_CONF_DIR/
    mapred-site.xml
    yarn-site.xml
  4. Restart YARN daemons so NodeManagers load the shuffle service.
    $ stop-yarn.sh
    Stopping nodemanagers
    Stopping resourcemanager
    $ start-yarn.sh
    Starting resourcemanager
    Starting nodemanagers
  5. Run a small MapReduce job.
    $ yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 1000
    INFO mapreduce.Job: map 100% reduce 100%
    Estimated value of Pi is 3.14800000000000000000
  6. Confirm YARN tracked the job.
    $ yarn application -list -appStates FINISHED
    Total number of applications (application-types: [] and states: [FINISHED]):1
    Application-Id                  Application-Name   Application-Type   User    Queue     State     Final-State
    application_1720000000000_0042  QuasiMonteCarlo    MAPREDUCE          alice   default   FINISHED  SUCCEEDED
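
The check in step 6 can be scripted. The sketch below reads the output of yarn application -list on standard input and prints the Final-State column for one application ID. final_state is a hypothetical helper, and it assumes the column order shown above and an application name without spaces.

```shell
#!/bin/sh
# final_state APP_ID
# Hypothetical helper: read `yarn application -list` output on stdin and
# print the seventh column (Final-State) of the row whose first column
# equals APP_ID. Assumes the application name contains no whitespace.
final_state() {
  awk -v id="$1" '$1 == id { print $7; exit }'
}
```

Piping the listing from step 6 through final_state application_1720000000000_0042 would print SUCCEEDED for the job shown above. An empty result means the ID was not in the listing.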