How to fix disk errors in Linux

Disk errors on a Linux filesystem can flip a writable mount read-only, leave directory metadata inconsistent, or stop applications when the kernel cannot complete writes. Repairing the affected filesystem quickly limits secondary damage such as truncated data, failed backups, or repeated recovery attempts during boot.

On a typical ext4 data volume, the repair path is to identify the filesystem device with lsblk -f, unmount that filesystem, and run fsck against the filesystem device instead of the whole disk. The fsck front end hands ext2, ext3, and ext4 repairs to e2fsck, while disk-level health checks use smartctl on the parent disk such as /dev/sdb or /dev/nvme0n1.

Back up important data before running any repair that writes to the disk, especially when the same volume is logging repeated I/O error messages or remounting itself read-only. Busy root filesystems are safer to repair from live or rescue media, and filesystems such as XFS and Btrfs need their own repair tools instead of an ext4-style fsck sequence.
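
Because the right tool depends on the filesystem type, a small lookup can keep a script from running the wrong checker. The sketch below is illustrative only: the function name repair_tool_for is hypothetical, and the mapping assumes the usual e2fsprogs, xfsprogs, and btrfs-progs utilities are the ones installed.

```shell
# Hypothetical helper: print the usual offline check/repair command
# for a filesystem type. The mapping is an assumption for illustration;
# confirm it against each tool's man page before relying on it.
repair_tool_for() {
    case "$1" in
        ext2|ext3|ext4) echo "e2fsck -f" ;;    # what the fsck front end calls for ext*
        xfs)            echo "xfs_repair" ;;   # XFS has its own repair tool
        btrfs)          echo "btrfs check" ;;  # read-only unless --repair is given
        *)              echo "unknown"; return 1 ;;
    esac
}
```

For the volume used in this guide, repair_tool_for "$(lsblk -no FSTYPE /dev/sdb1)" would be expected to print e2fsck -f. An unrecognized type returns a non-zero status so a calling script can stop instead of guessing.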

Steps to fix disk errors in Linux:

  1. Identify the filesystem device and parent disk for the affected mount.
    $ lsblk -f
    NAME   FSTYPE FSVER LABEL UUID                                 FSAVAIL FSUSE% MOUNTPOINTS
    sdb
    `-sdb1 ext4   1.0   data  064e84d3-b7e4-45d0-89ba-6b28df345687  452M      1% /srv/data

    Run fsck and e2fsck against the filesystem device, such as /dev/sdb1. Run smartctl against the whole disk, such as /dev/sdb.

  2. Unmount the affected filesystem before repairing it.
    $ sudo umount /srv/data

    When the damaged filesystem is /, /boot, or another busy production mount, stop here and continue from live or rescue media instead of forcing a repair on a mounted filesystem.

  3. Repair the ext filesystem metadata on the unmounted device.
    $ sudo fsck -f -y /dev/sdb1
    fsck from util-linux 2.39.3
    e2fsck 1.47.0 (5-Feb-2023)
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    /dev/sdb1: 12/32768 files (0.0% non-contiguous), 6354/131072 blocks

    fsck is a front end. On ext2, ext3, and ext4 it calls e2fsck automatically. -f forces a full check even when the filesystem is marked clean, and -y answers yes to every repair prompt so the run can complete non-interactively.

  4. Check the parent disk health after the filesystem repair.
    $ sudo smartctl -H /dev/sdb
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    If smartctl -H reports a failing health result, or smartctl -A shows growing reallocated or pending sector counts, copy the data off the disk and replace the hardware instead of repeating filesystem repairs.

    Some USB bridges, RAID controllers, and guest-visible virtual disks need an explicit device type such as -d sat or expose no SMART data from inside the running system.

  5. Run a bad-block scan through e2fsck when I/O error messages continue after the first repair pass.
    $ sudo e2fsck -c -f -y /dev/sdb1
    e2fsck 1.47.0 (5-Feb-2023)
    Checking for bad blocks (read-only test): done
    /dev/sdb1: Updating bad block inode.
    Pass 1: Checking inodes, blocks, and sizes
    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    
    /dev/sdb1: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sdb1: 12/32768 files (0.0% non-contiguous), 6354/131072 blocks

    -c runs a read-only badblocks scan and records any discovered bad blocks in the ext filesystem, which is safer than trying to hand-calculate block sizes for a separate badblocks run.

    Using -c twice switches to a non-destructive read-write media test that takes much longer and puts extra load on a failing disk. Keep the single -c scan unless the extended test is intentional.

  6. Mount the repaired filesystem at its normal path or a temporary check mount.
    $ sudo mkdir -p /mnt/repair
    $ sudo mount /dev/sdb1 /mnt/repair

    Use a temporary mount such as /mnt/repair when the normal service should stay stopped until the final validation is complete.

  7. Confirm that the filesystem mounts cleanly and is readable.
    $ findmnt /mnt/repair
    TARGET      SOURCE    FSTYPE OPTIONS
    /mnt/repair /dev/sdb1 ext4   rw,relatime
    
    $ df -h /mnt/repair
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdb1       488M   28K  452M   1% /mnt/repair

    If the remount succeeds and dmesg -T or journalctl -k -b stays clear of fresh I/O error, Buffer I/O error, or filesystem error lines, the volume can return to service.
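
When the steps above are scripted, two small checks help decide whether to continue: decoding the exit status of fsck, and scanning kernel log text for fresh error lines. The sketch below is one possible approach, not a definitive implementation. The function names describe_fsck_status and kernel_log_is_clean are hypothetical, the status bits follow the exit codes documented in fsck(8), and the grep pattern is an assumption about how the kernel words ext4 and block-layer errors.

```shell
# Hypothetical helper: turn an fsck exit status into short labels.
# fsck(8) documents the status as a sum of bits:
# 1 = errors corrected, 2 = reboot suggested, 4 = errors left uncorrected,
# 8 and above = operational, usage, or library problems.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    msg=""
    if [ $(( $status & 1 )) -ne 0 ]; then msg="$msg corrected"; fi
    if [ $(( $status & 2 )) -ne 0 ]; then msg="$msg reboot-needed"; fi
    if [ $(( $status & 4 )) -ne 0 ]; then msg="$msg uncorrected"; fi
    if [ $(( $status & ~7 )) -ne 0 ]; then msg="$msg tool-failure"; fi
    echo "${msg# }"   # drop the leading space
}

# Hypothetical helper: read kernel log text on stdin and succeed only
# when no line matches the assumed error patterns. "Buffer I/O error"
# is covered by the plain "I/O error" match.
kernel_log_is_clean() {
    ! grep -Eq 'I/O error|EXT4-fs error'
}
```

A script might run sudo fsck -f -y /dev/sdb1, pass $? to describe_fsck_status, and stop if the result contains uncorrected. After the remount, journalctl -k -b | kernel_log_is_clean gives a simple pass or fail, although it checks the whole current boot, so errors logged before the repair will also count as failures.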