Large files are a common cause of sudden disk usage spikes, and identifying the worst offenders early helps target cleanup before a filesystem fills up or goes read-only.
Linux filesystems store file metadata (size, timestamps, permissions) that the find command can query while traversing a directory tree. Pairing size filters such as -size with structured output from -printf produces a sortable list of candidate files without relying on interactive tools.
Size scans can take time on large trees, and results often include files actively used by services (databases, containers, virtual machines, and application logs). Keeping searches scoped to a single filesystem with -xdev avoids crossing into other mounts, and -mtime filters on modification time (find has no portable creation-time test on Linux) to focus on recently changed files.
Steps to find the largest files with find in Linux:
- List the largest files under a target path such as /var without crossing into other filesystems.
$ sudo find /var -xdev -type f -size +100M -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 8
314572800 /var/log/nginx/large.log
+100M matches files larger than 100 MiB. Adjust the threshold (for example +1G). Rerun on the mountpoint that is filling up.
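The size test can be checked safely in a scratch directory before running it on a live filesystem. A minimal sketch, assuming GNU find and coreutils; the file names and sizes are illustrative only. Sparse files created with truncate occupy almost no real disk space, but GNU find compares the reported file length, which is enough for the demonstration:

```shell
# Scratch demo of the -size threshold; sparse files keep it cheap.
dir=$(mktemp -d)
truncate -s 50M  "$dir/small.bin"
truncate -s 200M "$dir/big.bin"

# Only big.bin exceeds +100M, so only it is printed.
find "$dir" -type f -size +100M -printf '%s %p\n'

rm -rf "$dir"
```

Note that -size rounds up to whole units before comparing, so a file of exactly 100 MiB does not match +100M.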
- Reformat the size column into human-readable units for quicker triage.
$ sudo find /var -xdev -type f -size +100M -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 8 | numfmt --field=1 --to=iec-i --suffix=B
300MiB /var/log/nginx/large.log
numfmt comes from GNU coreutils. Omit the final pipe to keep raw byte counts for exact comparisons.
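The conversion can be tried in isolation on a single line of sample output, which makes it easier to see what --field=1 does. A small sketch using the byte count from the earlier example:

```shell
# numfmt converts only field 1 (the byte count) to IEC units and
# passes the rest of the line -- the path -- through untouched.
printf '314572800 /var/log/nginx/large.log\n' \
    | numfmt --field=1 --to=iec-i --suffix=B
```

Sorting must happen before numfmt in the pipeline, since sort -n no longer works once the numbers carry unit suffixes.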
- Scan the root filesystem when the full partition is unknown while staying on a single mount.
$ sudo find / -xdev -type f -size +250M -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 8
-xdev skips other mounts (such as /home, /var, /proc, and network shares). Repeat the scan on other mountpoints when they are separate filesystems.
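Repeating the scan per filesystem can be scripted by reading mountpoints from /proc/mounts. A hedged sketch: the filesystem-type list here is an assumption, so extend it for whatever your systems actually use (zfs, ext3, and so on):

```shell
# One -xdev scan per real filesystem instead of repeating by hand.
# Field 2 of /proc/mounts is the mountpoint, field 3 the fs type;
# the type whitelist below is an assumption -- adjust as needed.
awk '$3 ~ /^(ext4|xfs|btrfs)$/ {print $2}' /proc/mounts |
while read -r mnt; do
    find "$mnt" -xdev -type f -size +250M -printf '%s %p\n' 2>/dev/null
done | sort -nr | head -n 8
```

findmnt from util-linux offers a similar listing (findmnt -rn -o TARGET -t ext4,xfs,btrfs) where available.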
- Search for recently modified large files when disk usage grew in the last day.
$ sudo find / -xdev -type f -mtime -1 -size +50M -printf '%TY-%Tm-%Td %TH:%TM %s %p\n' 2>/dev/null | sort -k3,3nr | head -n 8
-mtime -1 filters by modification time (last 24 hours). Use -mmin for finer windows (for example -mmin -60).
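The time filter can also be verified in a scratch directory. A minimal sketch, assuming GNU find and coreutils; the names and ages are illustrative:

```shell
# Scratch demo: only the freshly created file falls inside -mmin -60.
dir=$(mktemp -d)
truncate -s 60M "$dir/recent.bin"
truncate -s 60M "$dir/old.bin"
touch -d '2 hours ago' "$dir/old.bin"   # backdate one file's mtime

# Prints recent.bin only; old.bin fails the -mmin -60 test.
find "$dir" -type f -mmin -60 -size +50M -printf '%p\n'

rm -rf "$dir"
```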
- Confirm ownership, permissions, and timestamps for a candidate file before cleanup.
$ ls -lh --time-style=long-iso /var/log/nginx/large.log
-rw-r--r-- 1 root root 300M 2026-01-10 12:15 /var/log/nginx/large.log
Removing files owned by active services (databases, containers, VM images, and logs) can cause data loss or outages. Prefer application-specific cleanup methods when unsure.
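One reason deletion is risky: unlinking a log a daemon still has open frees no space until the process closes its file descriptor. Truncating in place avoids that, since the inode and any open descriptors survive. A scratch-file sketch of the behavior (on a real host this would be something like sudo truncate -s 0 on the log, with log rotation as the durable fix):

```shell
# Scratch demo of truncate-in-place: the file shrinks to zero bytes
# while its inode (and any open descriptors) remain valid.
f=$(mktemp)
head -c 1M /dev/zero > "$f"
truncate -s 0 "$f"
stat -c '%s bytes' "$f"   # 0 bytes
rm -f "$f"
```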
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
