How to download a directory recursively with wget

Recursive retrieval is useful when one published directory needs to be copied locally with its files still arranged in a usable tree. That makes it practical to review exports offline, pass the result to another tool, or preserve one bounded part of a site without mirroring everything.

GNU wget follows links from the starting directory page when -r is enabled. -np prevents it from ascending above the starting path, and -l sets how many link levels wget may follow (the default is 5 when -l is omitted). -l 1 covers a single directory, while -l 2 also reaches files inside its immediate subdirectories.

The starting URL should end with a trailing slash so -np treats it as a directory boundary, and recursive runs still fetch the HTML listing page and robots.txt while discovering links. Raise the level only as far as the remote tree needs, and add request delays before pointing the same pattern at a shared or rate-limited host.
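As a small illustration of the trailing-slash point, the sketch below normalizes a directory URL before it is handed to wget. The function name ensure_trailing_slash is hypothetical, and the sketch assumes the URL carries no query string or fragment. Without the slash, wget is documented to treat the last path component as a file name, which moves the -np boundary up one level.

```shell
# Hypothetical helper: append a trailing slash to a directory URL if missing.
# Assumes the URL has no query string or fragment.
ensure_trailing_slash() {
  case "$1" in
    */) printf '%s\n' "$1" ;;    # already ends with a slash
    *)  printf '%s/\n' "$1" ;;   # add the missing slash
  esac
}

url=$(ensure_trailing_slash "https://archive.example.net/exports/records")
printf '%s\n' "$url"

# The normalized URL can then be passed to wget, for example:
#   wget -r -np -l 2 -P mirror "$url"
```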

Steps to download a directory recursively with wget:

Run wget against the directory URL with recursion deep enough to reach the nested files that belong in the local copy.

```
$ wget -r -np -l 2 -P mirror https://archive.example.net/exports/records/
--2026-04-22 06:24:45--  https://archive.example.net/exports/records/
Resolving archive.example.net (archive.example.net)... 203.0.113.50
Connecting to archive.example.net (archive.example.net)|203.0.113.50|:443... connected.
HTTP request sent, awaiting response... 200 OK
Saving to: 'mirror/archive.example.net/exports/records/index.html'

##### snipped #####

Saving to: 'mirror/archive.example.net/robots.txt'
Saving to: 'mirror/archive.example.net/exports/records/reports/daily-summary.csv'
Saving to: 'mirror/archive.example.net/exports/records/reports/monthly-summary.csv'
Saving to: 'mirror/archive.example.net/exports/records/assets/storage-trend.png'

FINISHED --2026-04-22 06:24:46--
Downloaded: 5 files, 34K in 0.02s (1.68 MB/s)
```

-l 1 is enough when the starting directory page links directly to every file that needs to be saved, while -l 2 is the practical next step when the listing links to immediate subdirectories first.
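One rough way to pick the level is to count path segments between the starting directory and the deepest file that should be saved. The sketch below does that with a hypothetical helper named levels_needed. It assumes every listing page links only to its immediate children, so each path segment costs one link level; a site that links deeper files directly from the start page would need fewer levels.

```shell
# Hypothetical helper: count the path segments in a path that is relative
# to the starting directory. Under the assumption that each listing links
# only to its immediate children, the count equals the -l value needed.
levels_needed() {
  rel=${1#/}     # drop a leading slash, if present
  rel=${rel%/}   # drop a trailing slash, if present
  printf '%s\n' "$rel" | awk -F/ '{ print NF }'
}

levels_needed "daily-summary.csv"           # file linked from the start page
levels_needed "reports/daily-summary.csv"   # file inside a first-level subdirectory
```

For the example tree, the first call prints 1 and the second prints 2, matching the -l 1 and -l 2 cases described above.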

Remove the host directory and leading path segments when the local tree should start at the remote directory itself.
```
$ wget -r -np -l 2 -nH --cut-dirs=2 -P mirror https://archive.example.net/exports/records/
```
-nH drops archive.example.net from the local path, and --cut-dirs=2 strips the exports/records/ components so the saved tree starts directly under mirror/.
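To preview where a given remote file would land, the path mapping can be approximated offline. The sketch below is not wget's own logic, only an approximation of -nH combined with --cut-dirs=N and -P for a plain http:// or https:// URL. The function name local_path and its argument order are hypothetical.

```shell
# Hypothetical helper: local_path URL CUT_DIRS PREFIX
# Approximates the local path produced by -nH --cut-dirs=CUT_DIRS -P PREFIX.
local_path() {
  path=${1#*://}    # strip the scheme, e.g. https://
  path=${path#*/}   # strip the host (the -nH effect)
  n=$2
  while [ "$n" -gt 0 ]; do
    case "$path" in
      */*) path=${path#*/} ;;   # drop one leading directory component
      *)   break ;;             # only the file name is left; stop cutting
    esac
    n=$((n - 1))
  done
  printf '%s/%s\n' "$3" "$path"
}

local_path "https://archive.example.net/exports/records/reports/daily-summary.csv" 2 mirror
```

With a cut count of 2 the call prints mirror/reports/daily-summary.csv. Trying the same URL with other counts shows quickly whether too many or too few components are being removed.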
Add an accept list when the directory listing contains extra formats that do not belong in the local copy.
```
$ wget -r -np -l 2 -nH --cut-dirs=2 -A csv,png -P mirror https://archive.example.net/exports/records/
```
-A filters the saved files by suffix. wget still fetches robots.txt and the HTML listing pages it needs for link discovery, then removes the local copies that do not match the accept list.
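The suffix matching can be illustrated with a short offline check. The sketch below uses a hypothetical function named accepted and handles plain suffixes only; -A also allows wildcard patterns, which this sketch does not attempt to reproduce. The file names in the loop are illustrative.

```shell
# Hypothetical helper: accepted FILENAME ACCEPT_LIST
# Returns 0 when FILENAME ends with one of the comma-separated suffixes.
# Wildcard patterns, which -A also allows, are not handled here.
accepted() {
  name=$1
  old_ifs=$IFS
  IFS=,                      # split the accept list on commas
  for suffix in $2; do
    case "$name" in
      *"$suffix") IFS=$old_ifs; return 0 ;;
    esac
  done
  IFS=$old_ifs
  return 1
}

for f in daily-summary.csv storage-trend.png index.html notes.txt; do
  if accepted "$f" "csv,png"; then
    echo "keep   $f"
  else
    echo "remove $f"
  fi
done
```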
Add pacing before running the same recursive pattern against a shared or rate-limited origin.
```
$ wget -r -np -l 2 -nH --cut-dirs=2 --wait=2 --random-wait -P mirror https://archive.example.net/exports/records/
```
Related: How to add wait and random delays between wget requests
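To judge how much time pacing adds, the total delay can be bounded from the wait value and the expected number of requests. The wget manual describes --random-wait as varying each delay between 0.5 and 1.5 times the --wait value. The sketch below applies that range; the function name wait_bounds is hypothetical, and it assumes one delay between each pair of consecutive requests and none before the first.

```shell
# Hypothetical helper: wait_bounds WAIT_SECONDS REQUESTS
# Prints the minimum and maximum total delay in seconds, assuming each gap
# between requests falls between 0.5 and 1.5 times WAIT_SECONDS.
wait_bounds() {
  LC_ALL=C awk -v w="$1" -v n="$2" 'BEGIN {
    gaps = (n > 1) ? n - 1 : 0   # no delay before the first request
    printf "%.1f %.1f\n", 0.5 * w * gaps, 1.5 * w * gaps
  }'
}

wait_bounds 2 5
```

For --wait=2 and five requests the call prints 4.0 12.0, so the run would spend roughly 4 to 12 seconds waiting on top of transfer time.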
List the downloaded files and confirm nothing above the target directory was pulled into the local tree.
```
$ find mirror -type f
mirror/assets/storage-trend.png
mirror/reports/daily-summary.csv
mirror/reports/monthly-summary.csv
```
The absence of parent-path content is the signal that -np kept retrieval inside the intended directory boundary.
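When the host directory is kept, as in the first command, the same check can be scripted. The sketch below lists any saved file that sits outside the expected subtree, ignoring robots.txt. The function name outside_boundary is hypothetical, and the layout is assumed to be the one produced by -P mirror without -nH or --cut-dirs.

```shell
# Hypothetical helper: outside_boundary ROOT EXPECTED_SUBTREE
# Lists files under ROOT that are not inside ROOT/EXPECTED_SUBTREE,
# ignoring robots.txt, which wget fetches from the host root.
outside_boundary() {
  find "$1" -type f ! -path "$1/$2/*" ! -name 'robots.txt'
}

# Run the check only when a local copy exists in the current directory.
if [ -d mirror ]; then
  extra=$(outside_boundary mirror archive.example.net/exports/records)
  if [ -z "$extra" ]; then
    echo "no files outside the target directory"
  else
    printf 'files outside the target directory:\n%s\n' "$extra"
  fi
fi
```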

Author: Mohd Shakir Zakaria