How to install Apache Hadoop on CentOS, Fedora, or Red Hat

On Red Hat family systems, Hadoop usually comes from Apache binary archives or a vendor distribution rather than the base OS repositories. A clean install needs Java 17, a dedicated Hadoop user, an unpacked Hadoop tree, and shell profile settings that put the Hadoop commands on PATH.
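Since the install assumes Java 17, it can help to confirm which major version the java command actually reports before going further. The following is a minimal sketch, not part of the original procedure: java_major is a hypothetical helper name, and it assumes the usual first line of `java -version` output, with the version in double quotes.

```shell
#!/bin/sh
# java_major (hypothetical helper): print the major Java version found in
# the text of `java -version`, passed in as the first argument.
java_major() {
    # Extract the quoted version string from the first line of the output.
    ver=$(printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        1.*) ver=${ver#1.} ;;   # legacy numbering: 1.8.0_392 becomes 8.0_392
    esac
    # Keep only the leading digits, which are the major version.
    printf '%s\n' "${ver%%[!0-9]*}"
}
```

A typical call is `java_major "$(java -version 2>&1)"`; the redirect is needed because java writes its version banner to standard error.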

Use package-managed Java from dnf where available, then install the Hadoop archive under /opt so upgrades can be staged by switching a symlink. Keep service configuration separate from the installation directory.
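The symlink approach to staging upgrades can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: PREFIX is a scratch directory standing in for /opt so the example runs without root, and hadoop-3.5.1 is a hypothetical later release used only to show the switch.

```shell
#!/bin/sh
# Sketch: stage an upgrade by repointing a stable symlink.
# PREFIX is a scratch directory for illustration; a real install would use /opt.
PREFIX="${PREFIX:-$(mktemp -d)}"

# Two unpacked release trees side by side (3.5.1 is hypothetical).
mkdir -p "$PREFIX/hadoop-3.5.0" "$PREFIX/hadoop-3.5.1"

# Point the stable name at the current release.
ln -sfn "$PREFIX/hadoop-3.5.0" "$PREFIX/hadoop"

# Stage the upgrade: repoint the link. The old tree stays in place,
# so rolling back is the same command with the old path.
ln -sfn "$PREFIX/hadoop-3.5.1" "$PREFIX/hadoop"

# Show where the stable name points now.
readlink "$PREFIX/hadoop"
```

Because HADOOP_HOME refers to the stable name rather than a versioned directory, the profile settings do not need to change when the link is repointed.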

Install the local Hadoop runtime and command-line client first. Cluster daemon configuration, HDFS formatting, and YARN startup are covered in the follow-on cluster guides.

Steps to install Apache Hadoop on Red Hat family Linux:

  1. Install Java 17 and the download and archive tools.
    $ sudo dnf install java-17-openjdk-headless tar curl
    Installed:
      java-17-openjdk-headless
      tar
      curl
  2. Create a Hadoop service user.
    $ sudo useradd --system --home-dir /var/lib/hadoop --create-home --shell /bin/bash hadoop
  3. Download the Apache Hadoop binary archive.
    $ curl -fLO https://downloads.apache.org/hadoop/common/hadoop-3.5.0/hadoop-3.5.0.tar.gz
  4. Install the archive under /opt.
    $ sudo tar -xzf hadoop-3.5.0.tar.gz -C /opt
  5. Create a stable symlink and set ownership.
    $ sudo ln -sfn /opt/hadoop-3.5.0 /opt/hadoop
    $ sudo chown -R hadoop: /opt/hadoop-3.5.0
  6. Set Hadoop environment variables in a system-wide profile script.
    /etc/profile.d/hadoop.sh
    export JAVA_HOME=/usr/lib/jvm/jre-17-openjdk
    export HADOOP_HOME=/opt/hadoop
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
  7. Verify the command runtime.
    $ /opt/hadoop/bin/hadoop version
    Hadoop 3.5.0
    Source code repository https://github.com/apache/hadoop -r 000000000000
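
If the version check fails, the usual cause is that JAVA_HOME or HADOOP_HOME does not point where expected. A small check like the one below can narrow that down. It is a sketch only: check_hadoop_env is a hypothetical helper name, and it assumes the layout used in the steps above, with bin/java under JAVA_HOME and bin/hadoop under HADOOP_HOME.

```shell
#!/bin/sh
# check_hadoop_env (hypothetical helper): report whether JAVA_HOME and
# HADOOP_HOME are set and contain the expected executables.
check_hadoop_env() {
    rc=0
    # An unset or empty variable is treated as a failure, so a stray
    # /bin/java on the system cannot make the check pass by accident.
    if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME has no executable bin/java: ${JAVA_HOME:-unset}" >&2
        rc=1
    fi
    if [ -z "${HADOOP_HOME:-}" ] || [ ! -x "$HADOOP_HOME/bin/hadoop" ]; then
        echo "HADOOP_HOME has no executable bin/hadoop: ${HADOOP_HOME:-unset}" >&2
        rc=1
    fi
    return "$rc"
}
```

In a new login shell, or after sourcing /etc/profile.d/hadoop.sh, running `check_hadoop_env && hadoop version` exercises both the variables and the PATH entry.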