Repeated downloads from the same URL are routine in scripted backups, mirrored websites, and scheduled synchronization jobs. When an existing file is silently replaced by a truncated or corrupted download, a reliable dataset can disappear without an obvious error. Protecting local files from unintended overwrites keeps automation idempotent and preserves known-good artifacts for later verification or rollback.

GNU wget controls filename collisions through options such as --no-clobber (-nc), --timestamping (-N), and recursive retrieval flags like -r and -p. Without these options, single-file downloads typically keep the original and create numbered siblings like index.html.1, while recursive mirroring and explicit output paths can replace existing content when the same URLs are revisited in the same directory tree. Understanding how these switches interact makes it possible to decide when to skip, update, or safely replace local files.
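
Numbered siblings left behind by earlier runs are easy to miss. The following sketch lists them for review; the function name list_numbered_siblings is hypothetical, and it assumes a duplicate is any file whose name ends in a purely numeric suffix while the un-numbered original sits in the same directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files that look like wget's numbered duplicates
# (name.1, name.2, ...) and whose un-numbered original still exists.
list_numbered_siblings() {
  local dir="${1:-.}" f base
  for f in "$dir"/*.[0-9]*; do
    [ -e "$f" ] || continue              # the glob matched nothing
    case "${f##*.}" in
      ''|*[!0-9]*) continue ;;           # last suffix is not purely numeric
    esac
    base="${f%.*}"                       # strip the trailing ".N"
    [ -e "$base" ] && printf '%s\n' "$f"
  done
  return 0
}
```

Calling list_numbered_siblings ~/Downloads prints one path per line and deletes nothing, so the output can be reviewed before any cleanup.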

On Linux, this behavior comes from the GNU wget implementation distributed by common package repositories, so command-line options are consistent across recent releases. The --no-clobber flag refuses to download a resource when a file with the same name is already present, whereas --timestamping retrieves the remote copy only when it is newer than the local file, judged by modification time and size. The two options cannot be combined. Additional care is required when using -O to force a specific output filename or when network failures occur mid-transfer, so safety usually depends on conservative flags, temporary files, and explicit post-download checks instead of a single magic option.
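
One possible shape for an explicit post-download check is sketched below. The function name promote_download and its argument order are hypothetical, and the sketch assumes sha256sum from GNU coreutils is available and that an empty file is never a valid download.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move a finished download into place only if it passes
# basic checks. Arguments: temp file, final path, optional SHA-256 hex digest.
promote_download() {
  local tmp="$1" target="$2" expected="${3:-}" actual
  if [ ! -s "$tmp" ]; then               # missing or zero-length file
    printf 'refusing empty download: %s\n' "$tmp" >&2
    return 1
  fi
  if [ -n "$expected" ]; then            # verify the digest only when one is given
    actual="$(sha256sum "$tmp" | cut -d' ' -f1)"
    if [ "$actual" != "$expected" ]; then
      printf 'checksum mismatch for %s\n' "$tmp" >&2
      return 1
    fi
  fi
  mv -f -- "$tmp" "$target"              # replace the target only after all checks pass
}
```

A script would download to a temporary path first and then call promote_download with that path and the final filename; on any failed check the existing target is left untouched.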

Steps to prevent overwriting existing files with wget:

  1. Change into the directory where downloaded files should be stored.
    $ cd ~/Downloads
    $ pwd
    /home/example/Downloads
  2. Display the first line of the wget version banner to confirm the GNU build and version.
    $ wget --version | head -n 1
    GNU Wget 1.21.2 built on linux-gnu.

    Version output confirms that the expected GNU wget implementation is in use and allows cross-checking behavior with the installed manual page.

  3. Fetch a test file once so a local copy exists for later checks.
    $ wget https://example.com/index.html
    --2025-12-08 10:00:00--  https://example.com/index.html
    Resolving example.com (example.com)... 93.184.216.34
    Connecting to example.com (example.com)|93.184.216.34|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1256 [text/html]
    Saving to: ‘index.html’
    
    index.html          100%[===================>]   1.23K  --.-KB/s    in 0s
    
    2025-12-08 10:00:00 (200 MB/s) - ‘index.html’ saved [1256/1256]
  4. Repeat the download with --no-clobber enabled so the existing file is skipped instead of overwritten.
    $ wget --no-clobber https://example.com/index.html
    File ‘index.html’ already there; not retrieving.

    The --no-clobber (-nc) option refuses to download when a file with the same name already exists, which keeps idempotent scripts from replacing known-good data with unexpected content.

  5. Mirror a remote documentation tree while keeping previously downloaded files by combining recursive options with --no-clobber.
    $ wget --recursive --no-parent --no-clobber https://example.com/docs/
    --2025-12-08 10:02:00--  https://example.com/docs/
    ##### snipped #####
    File ‘docs/guide.html’ already there; not retrieving.
    File ‘docs/api.html’ already there; not retrieving.
    ##### snipped #####

    Recursive downloads with -r or -p without --no-clobber may replace existing files inside the same directory tree, which can turn a temporary network error into a partially updated mirror.

  6. Use --timestamping when updates to individual files should occur only if the remote copy is newer rather than on every run.
    $ wget --timestamping https://example.com/archive.tar.gz
    --2025-12-08 10:05:00--  https://example.com/archive.tar.gz
    Resolving example.com (example.com)... 93.184.216.34
    Connecting to example.com (example.com)|93.184.216.34|:443... connected.
    HTTP request sent, awaiting response... 304 Not Modified
    File ‘archive.tar.gz’ not modified on server. Omitting download.

    The --timestamping (-N) option compares the remote modification time with the local file and replaces only older copies. It cannot be combined with --no-clobber, so choose -N when selective updates are desired and -nc when existing files must never change.

  7. Create a small shell script that downloads to a temporary file before replacing a fixed output path.
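    $ cat > download-safe.sh <<'EOF'
    #!/usr/bin/env bash
    set -euo pipefail
    
    # Usage: download-safe.sh URL TARGET
    url="${1:?usage: download-safe.sh URL TARGET}"
    target="${2:?usage: download-safe.sh URL TARGET}"
    
    # Create the temporary file next to the target so the final mv is a
    # same-filesystem rename rather than a copy.
    tmp="$(mktemp "${target}.XXXXXX")"
    # Remove the temporary file on any exit; after a successful mv this is a no-op.
    trap 'rm -f "$tmp"' EXIT
    
    if wget --no-verbose --output-document="$tmp" "$url"; then
      mv -f "$tmp" "$target"
    else
      printf 'Download failed; preserving existing file: %s\n' "$target" >&2
      exit 1
    fi
    EOF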

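    Writing to a temporary file and running mv only after wget succeeds keeps the target intact when a server returns an error or a connection drops mid-transfer, cases in which -O pointed directly at the target would leave it truncated. Because mktemp creates the file with mode 0600, the replaced target takes on those permissions; adjust them with chmod afterwards if other users need read access.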

  8. Grant execute permission on the wrapper script.
    $ chmod +x download-safe.sh
  9. Call the wrapper script with a URL and target filename so the target is replaced only if the download succeeds.
    $ ./download-safe.sh "https://example.com/index.html" index.html
    2025-12-08 10:06:30 URL:https://example.com/index.html [1256/1256] -> "index.html.ABC123" [1]

    Pointing -O directly at the target makes wget truncate the file as soon as the transfer starts and write data as it streams in, so a failure can leave a zero-length or partial file unless a wrapper script or similar pattern is used.

  10. Inspect the modification time and size of the target file before and after protected downloads to verify that unintended overwrites did not occur.
    $ stat -c '%y %s' index.html
    2025-12-08 10:06:30.000000000 +0000 1256
    $ ./download-safe.sh "https://example.com/index.html" index.html
    2025-12-08 10:07:00 URL:https://example.com/index.html [1256/1256] -> "index.html.DEF456" [1]
    $ stat -c '%y %s' index.html
    2025-12-08 10:07:00.000000000 +0000 1256

    Unchanged or intentionally updated size, expected timestamps, and messages such as “already there; not retrieving” or “not modified” indicate that wget has respected existing data and avoided accidental overwrites.
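
Size and modification time can both stay plausible while content changes, so a content hash gives a stricter before-and-after comparison. The sketch below is one way to automate the check from the last step; the function names fingerprint and check_unchanged are hypothetical, and it assumes GNU stat and sha256sum are installed.

```shell
#!/usr/bin/env bash
# Print "<size> <sha256>" for a file, used to compare state before and after a run.
fingerprint() {
  printf '%s %s\n' "$(stat -c '%s' "$1")" "$(sha256sum "$1" | cut -d' ' -f1)"
}

# Run a command and report whether the given file's content changed.
# Usage: check_unchanged FILE COMMAND [ARGS...]
check_unchanged() {
  local file="$1" before after
  shift
  before="$(fingerprint "$file")"
  "$@" || true                           # the command's own exit status is ignored here
  after="$(fingerprint "$file")"
  if [ "$before" = "$after" ]; then
    printf 'unchanged: %s\n' "$file"
  else
    printf 'CHANGED: %s\n' "$file" >&2
    return 1
  fi
}
```

For example, check_unchanged index.html wget --no-clobber https://example.com/index.html should report the file as unchanged, because wget skips the download when index.html already exists.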
