Broken links send visitors to missing pages, weaken internal navigation, and leave search crawlers spending time on dead URLs. A useful check needs to show both the failing destination and the page that linked to it so the repair happens in the right menu, template, or content page.
LinkChecker is a recursive site crawler, so one command can walk the live site and report each broken destination together with the linking source page. Internal links are checked by default, and --check-extern adds outbound links to the same report. The Parent URL field shows where the broken link was found, and Real URL shows the target that failed.
LinkChecker follows robots.txt, so blocked paths can disappear from the crawl by design. Large sites can also take time because recursion is unlimited by default, and session-only paths such as login, logout, cart, or checkout URLs often add noisy failures that do not belong on a public repair list. Exclude those patterns before treating the results as the site's real broken-link backlog.
Steps to check broken links on your website with LinkChecker:
- Crawl the public site root with external checks enabled and ignore session-only paths that should not be audited with normal content pages.
$ linkchecker --check-extern --ignore-url=/login --ignore-url=/logout https://www.example.com/
URL        `/missing-page/'
Name       `Missing internal page'
Parent URL https://www.example.com/, line 10, col 6
Real URL   https://www.example.com/missing-page/
Result     Error: 404 File not found

URL        `https://example.org/missing-resource'
Name       `Missing external link'
Parent URL https://www.example.com/, line 13, col 6
Real URL   https://example.org/missing-resource
Result     Error: 404 Not Found

That's it. 6 links in 6 URLs checked. 0 warnings found. 2 errors found.
Without --check-extern, LinkChecker checks internal URLs only. Add more --ignore-url patterns for cart, checkout, account, or logout paths when those URLs are expected to fail outside a live session, and use --recursion-level 3 for a quick first pass on a large site.
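When the exclusion list grows, it can help to assemble the command from a list of patterns and review it before running it. The following is a minimal sketch, not part of LinkChecker itself: the helper name build_linkchecker_cmd and the example patterns are assumptions for illustration. It only prints the command, so nothing is crawled until the printed line is run by hand.

```shell
#!/bin/sh
# Sketch: print a linkchecker command for a quick first pass.
# build_linkchecker_cmd is a hypothetical helper; the flags it emits
# (--check-extern, --recursion-level, --ignore-url) are the ones used above.

build_linkchecker_cmd() {
  site_url=$1
  shift
  cmd="linkchecker --check-extern --recursion-level=3"
  # Each remaining argument becomes one --ignore-url pattern.
  for pattern in "$@"; do
    cmd="$cmd --ignore-url=$pattern"
  done
  printf '%s %s\n' "$cmd" "$site_url"
}

# Example patterns are assumptions; replace them with the site's own paths.
build_linkchecker_cmd https://www.example.com/ /login /logout /cart /checkout /account
```

The patterns are pasted into the command unquoted, so this sketch suits simple path fragments only. A pattern containing spaces or shell metacharacters would need quoting before the printed command is run.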
- Repair the page or shared template named in Parent URL, then either restore the missing destination, update the link target, or remove the dead reference.
Do not redirect every missing internal URL to the home page. A broad catch-all redirect hides the real defect, weakens navigation signals, and makes later audits harder to trust.
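To turn a long report into a repair list, the Parent URL and Real URL fields can be paired for each failing entry. The following is a rough sketch that assumes the text report uses the field labels shown in the sample output above. The function name summarize_broken_links is hypothetical, and the parsing may need adjusting if the report layout differs.

```shell
#!/bin/sh
# Sketch: read a linkchecker text report on stdin and print
# "<page holding the link> -> <failing target>" for each error entry.
# Assumes the "Parent URL", "Real URL", and "Result" labels shown above.

summarize_broken_links() {
  awk '
    /^URL /       { parent = ""; real = "" }             # a new entry starts
    /^Parent URL/ { parent = $3; sub(/,$/, "", parent) }  # drop the trailing comma
    /^Real URL/   { real = $3 }
    /^Result/ && /Error/ { print parent " -> " real }
  '
}

# Possible usage (not run here):
#   linkchecker --check-extern https://www.example.com/ | summarize_broken_links | sort
```

Sorting the output groups the failures by source page, which makes it easier to see when one shared template accounts for many of the errors.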
- Rerun the affected section or the full site until the summary returns zero errors for the scope being closed.
$ linkchecker --check-extern https://www.example.com/about/

That's it. 2 links in 2 URLs checked. 0 warnings found. 0 errors found.
LinkChecker returns exit status 1 when invalid links are found, or when warnings are enabled and a warning occurs, so scheduled checks and CI jobs should treat a non-zero result as a failed audit.
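For a scheduled job, the exit status can be mapped to a short message before the job fails. This is a minimal sketch under the exit-status behaviour described above. The function name describe_linkchecker_status is hypothetical, and any status other than 0 or 1 is simply reported as unexpected rather than interpreted.

```shell
#!/bin/sh
# Sketch: map a linkchecker exit status to a one-line audit message.
# describe_linkchecker_status is a hypothetical helper for CI logs.

describe_linkchecker_status() {
  case "$1" in
    0) echo "pass: no invalid links reported" ;;
    1) echo "fail: invalid links or enabled warnings reported" ;;
    *) echo "error: unexpected linkchecker exit status $1" ;;
  esac
}

# Possible usage in a CI step (not run here):
#   linkchecker --check-extern https://www.example.com/
#   status=$?
#   describe_linkchecker_status "$status"
#   exit "$status"
```

Passing the original status through to exit keeps the job's result tied to the audit itself, not to the wrapper.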
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
