How to check broken links on your website

Broken links interrupt the path between the pages, downloads, and references that keep a website useful. They waste crawl effort on dead URLs, send visitors into 404 pages, and quietly break conversion paths such as contact forms, policy pages, and product details.

A crawler such as LinkChecker starts at the supplied URL, follows internal links recursively, and records where each failing URL was found. Adding –check-extern extends the audit to outbound links referenced from the site without recursively crawling other domains.

The steps below assume linkchecker is already available in the shell and that the crawl starts from the public hostname for the site. Login-only areas, carts, logout URLs, and staging hosts can create noisy results, so limit the scope or ignore those paths before treating the report as a repair list.

Steps to check broken links on your website with LinkChecker:

Crawl the live site from its public root URL and include outbound links in the report.

$ linkchecker --check-extern https://www.example.com/

URL        `/missing-page/'
Name       `Missing internal page'
Parent URL https://www.example.com/, line 10, col 8
Real URL   https://www.example.com/missing-page/
Result     Error: 404 File not found

URL        `http://www.example.org/missing-resource'
Name       `Missing external link'
Parent URL https://www.example.com/, line 12, col 8
Real URL   http://www.example.org/missing-resource
Result     Error: 404 Not Found

That's it. 5 links in 5 URLs checked. 0 warnings found. 2 errors found.

LinkChecker only follows internal links by default, so keep –check-extern when the audit also needs outbound references. Use –recursion-level 3 for a quicker first pass on a very large site, and use –ignore-url for session-only paths such as login, checkout, or logout URLs that should not be audited like public content.

Group each failure by its Parent URL and repair the source page by updating the link, restoring the destination, or redirecting the old internal path to the correct replacement.

Do not send every broken internal URL to the home page. A broad home-page redirect hides the missing destination, weakens internal navigation, and makes later audits less useful.
Recheck a repaired page or section URL directly until the summary returns zero errors.
```
$ linkchecker https://www.example.com/about/

That's it. 2 links in 2 URLs checked. 0 warnings found. 0 errors found.
```
Use page-level rechecks while fixing links, then repeat the site-root crawl for the final audit pass. Prioritize same-domain 404 and 410 failures first, and recheck third-party 403, 429, or timeout results before removing the link because many external services rate-limit crawlers.

Author: Mohd Shakir Zakaria
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.