Disk errors in Linux often appear as I/O error messages, slow reads, or files that become inaccessible without warning. Prompt correction reduces the risk of prolonged downtime, unexpected crashes, and difficult data recovery on critical systems.

At a technical level, problems can exist in several layers: filesystem metadata, block-device mapping, or the physical storage medium. Each layer has its own tool: fsck repairs journal and metadata inconsistencies, smartctl queries the drive's S.M.A.R.T. attributes, and badblocks performs sector-level read tests to uncover unreadable regions.
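
Before starting, it can help to confirm that the three tools are actually installed. The sketch below is one way to do that; missing_tools is a hypothetical helper name, not part of any package. On most distributions fsck comes from util-linux, badblocks from e2fsprogs, and smartctl from smartmontools.

```shell
#!/bin/sh

# missing_tools NAME...
# Print each command name that cannot be found on the current PATH.
# (Hypothetical helper for illustration.)
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}
```

After sourcing the function, running missing_tools fsck smartctl badblocks prints nothing when all three are available. These tools are often installed under /usr/sbin, which may not be on an unprivileged user's PATH, so a reported miss is worth confirming as root.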

Safe repair depends on working against unmounted filesystems, reliable power, and current backups for important data. Root filesystems or busy volumes typically require booting from a live Linux environment before running checks, and long scans on large disks can take hours, so maintenance windows and log monitoring are essential.
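
The unmounted-filesystem requirement can be checked mechanically before any repair starts. The following is a minimal sketch, assuming a Linux system where /proc/mounts lists active mounts; is_mounted is a hypothetical helper name. It compares device names literally, so a filesystem mounted under a different name for the same device (for example a /dev/mapper alias) would not be detected.

```shell
#!/bin/sh

# is_mounted DEVICE [MOUNTS_FILE]
# Return 0 if DEVICE appears as a mount source in MOUNTS_FILE
# (default: /proc/mounts), 1 otherwise. Hypothetical helper.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # Field 1 of each line is the mount source; require an exact match.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}
```

A repair script could call is_mounted /dev/loop2 and refuse to continue when it returns success. The optional second argument exists only so the function can be exercised against a saved copy of the mount table.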

Steps to repair disk errors in Linux:

  1. List the block devices and partitions present on the system.
    $ lsblk
    NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
    loop0    7:0    0   512M  0 loop /mnt/bench
    loop1    7:1    0   256M  0 loop /mnt/errorcheck
    loop2    7:2    0   256M  0 loop /mnt/repair
    ##### snipped #####
    vda    254:0    0   1.8T  0 disk 
    `-vda1 254:1    0   1.8T  0 part /etc/hosts
                                     /etc/hostname
                                     /etc/resolv.conf
    vdb    254:16   0 606.5M  1 disk 
    ##### snipped #####

    The lsblk output helps identify the device and partition names to target during repair.

  2. Unmount the target filesystem that requires repair.
    $ sudo umount /dev/loop2

    Replace loop2 with the correct partition identifier and ensure the filesystem is fully unmounted before continuing; if unmount fails because the filesystem is in use, run the repair from a live Linux environment instead.

  3. Run fsck on the unmounted filesystem to detect and repair logical errors.
    $ sudo fsck -f -y /dev/loop2
    fsck from util-linux 2.39.3
    e2fsck 1.47.0 (5-Feb-2023)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    repair: 11/65536 files (0.0% non-contiguous), 8268/65536 blocks

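    For ext2, ext3, and ext4 filesystems, which fsck hands to e2fsck as in the output above, -f forces a check even when the filesystem is marked clean and -y answers yes to every repair prompt. Running fsck against a mounted filesystem risks immediate data corruption, so run the check only after the target partition has been unmounted successfully.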

  4. Display the disk S.M.A.R.T. health summary to identify hardware faults.
    $ sudo smartctl -H /dev/loop2
    smartctl 7.4 2023-08-01 r5530 [aarch64-linux-6.12.54-linuxkit] (local build)
    Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
    
    /dev/loop2: Unable to detect device type
    Please specify device type with the -d option.
    
    Use smartctl -h to get a usage summary

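    S.M.A.R.T. data comes from the physical drive, so on real hardware point smartctl at the underlying disk rather than a loop device; the loop-backed device in this example exposes no hardware health data, which is why the command reports "Unable to detect device type". Many virtual disks behave the same way. If smartctl is unavailable, install the smartmontools package from the distribution repositories.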

  5. Scan the device for unreadable blocks using badblocks in its default read-only mode.
    $ sudo badblocks -v /dev/loop2
    Checking blocks 0 to 262143
    Checking for bad blocks (read-only test): done
    Pass completed, 0 bad blocks found. (0/0/0 errors)

    The default badblocks test only reads data, and -v adds verbose progress output. Write-mode testing with -w overwrites existing contents and should be used only after backups have been verified and during a planned maintenance window.

  6. Mount the repaired filesystem, either at its original mount point or at a temporary one for inspection.
    $ sudo mount /dev/loop2 /mnt/repair

    Use the original mount point or adjust /etc/fstab entries as needed before returning the filesystem to production workloads.

  7. Verify that the filesystem mounts cleanly and that space usage looks normal.
    $ df -h /mnt/repair
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/loop2      224M   24K  206M   1% /mnt/repair

    A successful mount, expected capacity values, and no new I/O error messages in dmesg or the system logs together indicate that the filesystem errors have been repaired.
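
The repair and verification steps above lend themselves to scripting. The following is a minimal sketch rather than a definitive implementation: describe_fsck_status and count_io_errors are hypothetical helper names, the exit-status bits are those documented in the fsck(8) manual page, and the grep patterns are assumed examples of kernel messages that may need extending for other filesystems or drivers.

```shell
#!/bin/sh

# describe_fsck_status STATUS
# Decode fsck's exit status, which fsck(8) documents as a bitmask.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "filesystem errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "system should be rebooted"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "filesystem errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "checking canceled by user request"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "shared-library error"; fi
    return 0
}

# count_io_errors
# Read log text on stdin and print how many lines look like disk or
# filesystem errors. The patterns are illustrative assumptions.
count_io_errors() {
    # grep -c prints the match count but exits 1 when the count is zero;
    # "|| true" keeps that case from being treated as a failure.
    grep -c -E 'I/O error|EXT4-fs error' || true
}
```

With the functions sourced into the current shell, one possible use with the example device is:

    $ sudo fsck -f -y /dev/loop2; describe_fsck_status $?
    $ sudo dmesg | count_io_errors

A status that includes "filesystem errors left uncorrected", or an error count that keeps rising after the repair, suggests that the problem is not limited to filesystem metadata.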