An offline mirror keeps a documentation site, migration source, or point-in-time site copy readable without network access. GNU wget can crawl the linked HTML and CSS tree, save page requisites such as stylesheets and images, and rewrite links so the saved entry page opens from disk.
In the GNU wget manual, --mirror is shorthand for -r -N -l inf --no-remove-listing: it enables recursive retrieval and timestamping, sets unlimited recursion depth, and keeps FTP directory listings. --convert-links rewrites links in downloaded HTML and CSS for local browsing, --backup-converted keeps the original of each rewritten file with a .orig suffix, and --adjust-extension appends .html when the server returns HTML from an extensionless or script-shaped URL.
Keep the crawl bounded to approved hostnames, and expect weaker results on pages that depend on browser-side rendering, authenticated sessions, or live APIs. Running the same mirror command again later rechecks timestamps and, when the server provides usable freshness metadata such as a Last-Modified header, refreshes only the files that changed on the origin.
$ wget --mirror --convert-links --backup-converted --adjust-extension --page-requisites http://docs.example.net/
--2026-06-06 02:21:05-- http://docs.example.net/
Resolving docs.example.net (docs.example.net)... 203.0.113.50
Connecting to docs.example.net (docs.example.net)|203.0.113.50|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 338 [text/html]
Saving to: 'docs.example.net/index.html'
##### snipped #####
Saving to: 'docs.example.net/assets/site.css'
Saving to: 'docs.example.net/image/logo.svg'
Saving to: 'docs.example.net/docs/index.html'
Saving to: 'docs.example.net/docs/overview.html'
FINISHED --2026-06-06 02:21:05--
Total wall clock time: 0.007s
Downloaded: 6 files, 841 in 0s (47.1 MB/s)
Converting links in docs.example.net/index.html... 4.
##### snipped #####
Converted links in 4 files in 0 seconds.
--page-requisites brings in files needed to display the pages, such as stylesheets and images. If approved assets live on another host such as cdn.docs.example.net, add --span-hosts --domains=docs.example.net,cdn.docs.example.net so host spanning stays limited to the named domains.
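The host-spanning variant can be sketched as a small dry-run helper. The function name mirror_command is hypothetical, and cdn.docs.example.net is only the example host from the paragraph above; the wget flags are the ones already named in this guide. The helper prints the command for review and does not start a crawl.

```shell
# Sketch: print the mirror command with host spanning limited to named domains.
# mirror_command is a hypothetical helper, not part of wget.
#   $1 = comma-separated list of approved domains
#   $2 = start URL
mirror_command() {
  echo wget --mirror --convert-links --backup-converted --adjust-extension \
    --page-requisites --span-hosts --domains="$1" "$2"
}

# Dry run: review the printed command, then run it by hand if it looks right.
mirror_command "docs.example.net,cdn.docs.example.net" "http://docs.example.net/"
```

Printing first makes it easy to confirm that --span-hosts never appears without a matching --domains list.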
$ find docs.example.net -name '*.orig'
docs.example.net/docs/index.html.orig
docs.example.net/index.html.orig
The .orig files are the pre-conversion copies for files whose links were rewritten. HTML files that already needed no link changes may not receive a .orig backup.
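To see exactly what the conversion changed, each .orig backup can be compared with its rewritten counterpart. The following is a minimal sketch; show_conversions is a hypothetical helper name, and it assumes file names without embedded newlines.

```shell
# Sketch: diff every .orig backup against the converted file next to it.
# show_conversions is a hypothetical helper; $1 is the mirror directory.
show_conversions() {
  find "$1" -type f -name '*.orig' | while IFS= read -r orig; do
    converted="${orig%.orig}"   # strip the .orig suffix to get the rewritten file
    echo "== $converted"
    # diff exits with status 1 when files differ, which is the expected case here.
    diff "$orig" "$converted" || true
  done
}
```

Calling show_conversions docs.example.net prints one diff per rewritten file; lines starting with < come from the original download and lines starting with > from the converted copy.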
$ find docs.example.net -type f
docs.example.net/image/logo.svg
docs.example.net/robots.txt
docs.example.net/index.html
docs.example.net/docs/overview.html
docs.example.net/docs/index.html
docs.example.net/docs/index.html.orig
docs.example.net/assets/site.css
docs.example.net/index.html.orig
A usable mirror needs the HTML entry points and the files those pages reference, not just the first document that started the crawl. The order of this listing comes from the local filesystem; check for the expected paths, not a specific sort order.
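Because the listing order is not stable, a path-by-path check is more reliable than comparing listings. A minimal sketch follows; check_mirror_paths is a hypothetical helper, and the list of expected paths is whatever the mirror is supposed to contain.

```shell
# Sketch: confirm that each expected file exists inside the mirror directory.
# check_mirror_paths is a hypothetical helper.
#   $1     = mirror directory
#   $2...  = paths expected inside it, relative to $1
check_mirror_paths() {
  root="$1"
  shift
  missing=0
  for rel in "$@"; do
    if [ -f "$root/$rel" ]; then
      echo "ok      $rel"
    else
      echo "MISSING $rel"
      missing=1
    fi
  done
  # Exit status is 0 only when every expected path was found.
  return "$missing"
}
```

For the example crawl, the call would be check_mirror_paths docs.example.net index.html docs/index.html docs/overview.html assets/site.css image/logo.svg, and a non-zero exit status signals at least one missing file.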
$ cat docs.example.net/index.html
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Docs Example</title>
<link rel="stylesheet" href="assets/site.css">
</head>
<body>
<img src="image/logo.svg" alt="Docs Example">
<nav>
<a href="docs/index.html">Documentation</a>
<a href="docs/overview.html">Overview</a>
</nav>
</body>
</html>
Relative links such as docs/index.html and assets/site.css are the signal that the mirror can be browsed from disk without falling back to the remote host.
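The same check can be run across the whole mirror by searching for links that still name the origin host. With --convert-links, wget rewrites links to files it did not download as absolute URLs, so any hit usually marks a page or asset missing from the mirror. This is a rough sketch: find_remote_links is a hypothetical helper, and the plain-text search also matches the host name in comments or visible text.

```shell
# Sketch: list lines in mirrored HTML and CSS files that still point at the origin.
# find_remote_links is a hypothetical helper.
#   $1 = mirror directory
#   $2 = origin host name
find_remote_links() {
  dir="$1"
  host="$2"
  # The .orig backups are skipped because their names do not end in .html or .css.
  # /dev/null forces grep to print file names even when only one file is searched.
  find "$dir" -type f \( -name '*.html' -o -name '*.css' \) \
    -exec grep -nF -- "//$host/" /dev/null {} +
}
```

Running find_remote_links docs.example.net docs.example.net prints file:line:text for each remaining remote reference; no output means no link to that host was found.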