Disk health checks in Linux help catch failing media, worn flash, and firmware-recorded read errors before they become filesystem damage, failed backups, or an unplanned replacement. smartctl reads the drive's S.M.A.R.T. health data while the disk is online, so it is the first check to run when a physical disk starts looking suspicious.

Most ATA, SATA, SAS, and NVMe devices store health counters, error logs, and self-test history in firmware. A quick --health check reports whether the drive has crossed its own failure threshold, while --xall exposes counters such as reallocated sectors, pending sectors, media errors, and spare capacity, which usually start to move before the single health line changes.

These checks usually require root privileges through sudo, and they only work when the running system can address the physical disk or a controller that passes the data through. USB bridges, hardware RAID controllers, cloud volumes, and guest-visible virtual disks can hide S.M.A.R.T. data, so treat an unavailable result as a storage-path limitation until smartctl --scan-open or the host or controller layer confirms what can be checked.

Steps to check disk health in Linux:

  1. List the whole-disk device nodes.
    $ lsblk -d -e 7,11 -o NAME,PATH,SIZE,MODEL,TRAN
    NAME    PATH             SIZE MODEL                TRAN
    sda     /dev/sda         477G Samsung SSD 870 EVO  sata
    nvme0n1 /dev/nvme0n1     1.8T Samsung SSD 980     nvme

    Use a whole disk such as /dev/sda or /dev/nvme0n1 in the next commands. Do not use a partition such as /dev/sda1 because S.M.A.R.T. data belongs to the physical disk or controller-visible disk.
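
    To feed the same whole-disk list into a loop, the PATH and TYPE columns can be filtered. The following is a minimal sketch, assuming an lsblk recent enough to offer the PATH column; whole_disk_paths is a hypothetical helper name, not part of lsblk or smartctl.

```shell
# whole_disk_paths (hypothetical helper): read "PATH TYPE" pairs on stdin
# and print only the paths whose TYPE is "disk".
whole_disk_paths() {
    awk '$2 == "disk" { print $1 }'
}
```

    One possible invocation, using -n to drop the header line:

    $ lsblk -d -n -e 7,11 -o PATH,TYPE | whole_disk_paths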

  2. Read the device identity and health summary.
    $ sudo smartctl --info --health /dev/sda
    === START OF INFORMATION SECTION ===
    Device Model:     Samsung SSD 870 EVO 500GB
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    PASSED means the drive firmware is not reporting a current threshold failure. It does not rule out growing error counters, so continue with the full report before deciding the disk is safe to keep in service.
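
    For scripted checks, smartctl also reports problems through its exit status, which is a bitmask documented in the smartctl man page. The sketch below decodes that value; decode_smartctl_status is a hypothetical helper name, and the bit descriptions are shortened paraphrases of the man page wording.

```shell
# decode_smartctl_status (hypothetical helper): print the meaning of each
# bit set in a smartctl exit status. Bit meanings follow the smartctl man page.
decode_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no problems flagged"
        return 0
    fi
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or other ATA command failed, or a checksum error was found" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        # Test one bit at a time, lowest bit first.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$(( bit + 1 ))
    done
    return 0
}
```

    One way to use it is to run the health check quietly and decode the result:

    $ sudo smartctl --quiet=silent --health /dev/sda; decode_smartctl_status $?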

  3. Review the full S.M.A.R.T. report.
    $ sudo smartctl --xall /dev/sda
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    SMART Attributes Data Structure revision number: 16
    ID# ATTRIBUTE_NAME          VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   100   100   010    Pre-fail  Always       -       0
    194 Temperature_Celsius     069   058   000    Old_age   Always       -       31
    197 Current_Pending_Sector  100   100   000    Old_age   Always       -       0
    198 Offline_Uncorrectable   100   100   000    Old_age   Offline      -       0
    ##### snipped #####
    SMART Error Log Version: 1
    No Errors Logged

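    On ATA disks, watch the raw values of Reallocated_Sector_Ct, Current_Pending_Sector, and Offline_Uncorrectable; on NVMe devices, watch Media and Data Integrity Errors and Available Spare. A count that keeps growing, or Available Spare falling to its threshold, calls for copying the data off and replacing the disk rather than running more checks.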
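
    The ATA counters named above can be pulled out of the attribute table with a small filter. This is a sketch only: flag_nonzero_counters is a hypothetical helper name, it locates the RAW_VALUE column from the table header instead of assuming a fixed position, and a nonzero value is just a starting point because growth only shows when readings are compared over time.

```shell
# flag_nonzero_counters (hypothetical helper): read smartctl attribute output
# on stdin and print attributes 5, 197, and 198 whose raw value is above zero.
flag_nonzero_counters() {
    awk '
        # Header row: remember which field holds RAW_VALUE.
        $1 == "ID#" { for (i = 1; i <= NF; i++) if ($i == "RAW_VALUE") col = i; next }
        # A blank line ends the attribute table.
        NF == 0 { col = 0 }
        col && ($1 == 5 || $1 == 197 || $1 == 198) && $col + 0 > 0 { print $2, "raw=" $col }
    '
}
```

    A possible invocation, using --attributes to print only the attribute table:

    $ sudo smartctl --attributes /dev/sda | flag_nonzero_counters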

  4. Start a short self-test when the disk supports it.
    $ sudo smartctl --test=short /dev/sda
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 2 minutes for test to complete.

    A short self-test is non-destructive, but it still reads the device in the background and can add latency on a busy disk. Run it during a maintenance window when storage latency matters.

  5. Read the self-test log after the reported wait time.
    $ sudo smartctl --log=selftest /dev/sda
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error       00%          18423         -

    Completed without error confirms that the latest built-in diagnostic finished without recording a failing block. If the status shows a read failure, or LBA_of_first_error holds a block address instead of -, copy data off the disk and plan replacement.
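
    The same check can be scripted by looking only at the newest log entry. The sketch below assumes the ATA self-test log layout shown above, where the most recent test is listed as # 1; latest_selftest_passed is a hypothetical helper name, and NVMe self-test logs use a different layout that it does not handle.

```shell
# latest_selftest_passed (hypothetical helper): read an ATA self-test log on
# stdin; succeed only if entry "# 1" exists and reads "Completed without error".
latest_selftest_passed() {
    awk '
        /^[[:space:]]*# *1[[:space:]]/ {
            found = 1
            ok = ($0 ~ /Completed without error/)
        }
        END { exit !(found && ok) }
    '
}
```

    One way to call it:

    $ sudo smartctl --log=selftest /dev/sda | latest_selftest_passed && echo "latest self-test passed"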

  6. Discover the usable device type when smartctl cannot open the disk normally.
    $ sudo smartctl --scan-open
    /dev/sda -d sat # /dev/sda [SAT], ATA device
    /dev/nvme0 -d nvme # /dev/nvme0, NVMe device

    --scan-open opens detected devices and prints the type smartctl can use. Common examples include --device=sat for many USB-to-SATA bridges, --device=nvme for NVMe devices, and controller-specific forms such as --device=megaraid,N.

  7. Rerun the report with the detected device type.
    $ sudo smartctl --xall --device=sat /dev/sda
    === START OF INFORMATION SECTION ===
    Device Model:     Samsung SSD 870 EVO 500GB
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    ##### snipped #####

    If a guest-visible VM disk, cloud volume, or hardware controller still reports unavailable S.M.A.R.T. support, run the health check on the host, storage controller, or provider surface that owns the physical disk.
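
    On hosts with several disks, the scan in step 6 and the rerun in step 7 can be chained. The following is a minimal sketch that relies on the "device -d type # comment" line layout shown in the --scan-open output above; scan_to_device_args is a hypothetical helper name.

```shell
# scan_to_device_args (hypothetical helper): read `smartctl --scan-open`
# output on stdin and print "<device> <type>" for each usable entry.
# Lines whose second field is not -d, such as commented-out entries, are skipped.
scan_to_device_args() {
    awk '$2 == "-d" { print $1, $3 }'
}
```

    A possible loop that runs the health check with each detected type:

    $ sudo smartctl --scan-open | scan_to_device_args | while read -r dev type; do sudo smartctl --health --device="$type" "$dev"; done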