HDFS snapshots preserve a point-in-time view of a directory without copying the full dataset. They are useful before application releases, bulk updates, and cleanup jobs, where a rollback needs a read-only reference to the directory tree as it was before the change.
A directory must be marked snapshottable before a snapshot can be created. Snapshot names should identify the change or date so operators can find the right restore point later.
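The naming advice above can be sketched as a small shell helper. The function name `snapshot_name` and the date-first layout are assumptions made for illustration, not an HDFS convention. The commands are printed rather than executed, so the sketch can be tried without an HDFS client.

```shell
# Hypothetical helper: build a snapshot name from a date and a change label.
# Putting the date first is an assumption, chosen so names sort chronologically.
snapshot_name() {
  label="$1"
  day="${2:-$(date +%Y-%m-%d)}"   # default to today when no date is given
  printf '%s-%s\n' "$day" "$label"
}

# Print the commands instead of running them (dry run).
name="$(snapshot_name retention-change 2026-06-17)"
echo "hdfs dfsadmin -allowSnapshot /data/events"
echo "hdfs dfs -createSnapshot /data/events $name"
```

Dropping the `echo` wrappers would run the same two commands against a cluster; the first one needs to be issued by an HDFS superuser.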
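Snapshots guard against accidental deletion or overwrite inside the cluster, but they are stored on the same cluster as the data they reference. They do not replace off-cluster backups, and blocks referenced by a snapshot keep consuming space until that snapshot is deleted, so retained snapshots need capacity planning.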
Related: How to create a Hadoop NameNode checkpoint
Related: How to download a file from HDFS
$ hdfs dfs -ls /data/events
Found 2 items
drwxr-x---   - alice analytics          0 2026-06-17 03:00 /data/events/day=2026-06-16
drwxr-x---   - alice analytics          0 2026-06-17 03:00 /data/events/day=2026-06-17

$ hdfs dfsadmin -allowSnapshot /data/events
Allowing snapshot on /data/events succeeded

$ hdfs dfs -createSnapshot /data/events before-retention-change
Created snapshot /data/events/.snapshot/before-retention-change

$ hdfs dfs -ls /data/events/.snapshot/before-retention-change
Found 2 items
drwxr-x---   - alice analytics          0 2026-06-17 03:00 /data/events/.snapshot/before-retention-change/day=2026-06-16
drwxr-x---   - alice analytics          0 2026-06-17 03:00 /data/events/.snapshot/before-retention-change/day=2026-06-17

$ hdfs snapshotDiff /data/events before-retention-change after-retention-change
Difference between snapshot before-retention-change and snapshot after-retention-change under directory /data/events:
M       .
-       ./day=2026-05-01
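The diff above compares two named snapshots, so it assumes a second snapshot called after-retention-change was taken once the change had run. Passing `.` as the second name compares against the current state of the directory instead. As a rough sketch of how such a report could drive a restore, the helper below picks out the deleted entries and prints one copy command per path. The function name `deleted_paths` is hypothetical, and the parsing assumes paths contain no whitespace.

```shell
# Hypothetical helper: read snapshotDiff output on stdin and print the
# relative paths reported as deleted (lines whose first field is "-").
# Assumes paths contain no whitespace.
deleted_paths() {
  awk '$1 == "-" { print $2 }'
}

# Sample input copied from the report above; on a cluster, pipe the real
# `hdfs snapshotDiff` output in instead.
report='M .
- ./day=2026-05-01'

# Print (rather than run) one copy command per deleted path.
# -ptopax preserves timestamps, ownership, permissions, ACLs, and XAttrs.
printf '%s\n' "$report" | deleted_paths | while read -r rel; do
  echo "hdfs dfs -cp -ptopax /data/events/.snapshot/before-retention-change/${rel#./} /data/events/${rel#./}"
done
```

Printing the commands first leaves room to review what would be restored before anything is copied back into the live directory.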