Setting a download timeout keeps a Scrapy crawl from stalling on slow or unresponsive endpoints, freeing downloader slots for healthy targets and keeping runtimes predictable.

Scrapy enforces the time limit in the downloader through DownloadTimeoutMiddleware, which applies the DOWNLOAD_TIMEOUT setting to every outgoing request; a request that exceeds the limit fails with a timeout error, so retries and error handling can proceed.
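
When the limit trips, the failure reaches the request's errback as a Twisted timeout error, so a spider can react to timed-out requests directly. A minimal sketch, with the spider name and URL as illustrative placeholders:

    import scrapy
    from twisted.internet.error import TCPTimedOutError, TimeoutError


    class TimeoutHandlingSpider(scrapy.Spider):
        name = "timeout_handling"

        def start_requests(self):
            # route failures, including download timeouts, to the errback
            yield scrapy.Request("https://example.net/",
                                 callback=self.parse,
                                 errback=self.on_error)

        def parse(self, response):
            self.logger.info("fetched %s", response.url)

        def on_error(self, failure):
            # an exceeded DOWNLOAD_TIMEOUT surfaces as a Twisted timeout error
            if failure.check(TimeoutError, TCPTimedOutError):
                self.logger.warning("timed out: %s", failure.request.url)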

The default DOWNLOAD_TIMEOUT is 180 seconds. Lowering it can increase timeout failures and retries on slow sites, while raising it can tie up concurrency on bad connections, so mixed-latency targets are better served by per-spider or per-request download_timeout overrides than by a single global value.
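
Besides the download_timeout attribute shown in the steps below, a spider can also carry its own limit through custom_settings, which overrides the project-wide value for that spider only. A minimal sketch, with the spider name as an illustrative placeholder:

    import scrapy


    class SlowSiteSpider(scrapy.Spider):
        name = "slow_site"
        # overrides the project-wide DOWNLOAD_TIMEOUT for this spider only
        custom_settings = {
            "DOWNLOAD_TIMEOUT": 60,
        }

When these mechanisms are combined, the per-request meta value takes precedence over the spider attribute, which in turn takes precedence over custom_settings and the project-wide setting.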

Steps to set a download timeout in Scrapy:

  1. Open the Scrapy project settings.py file.
    $ vi simplifiedguide/settings.py
  2. Set DOWNLOAD_TIMEOUT to the preferred value in seconds.
    DOWNLOAD_TIMEOUT = 20

    The default is 180 seconds, so a value set too low can trigger false timeouts and excessive retries on slow targets.

  3. Set the download_timeout spider attribute to override the timeout for a single spider.
    import scrapy
     
     
    class ExampleSpider(scrapy.Spider):
        name = "example"
        download_timeout = 30

    Use a spider-level override when one spider consistently targets slower endpoints than the rest of the project.

  4. Set download_timeout in Request.meta to override the timeout for a single request.
    import scrapy
     
     
    class ExampleSpider(scrapy.Spider):
        name = "example"
     
        def start_requests(self):
            url = "https://example.net/"
            yield scrapy.Request(url, meta={"download_timeout": 10})
  5. Confirm the global timeout value that Scrapy loads from the project settings.
    $ scrapy settings --get DOWNLOAD_TIMEOUT
    20

    Per-spider and per-request download_timeout overrides apply at runtime and do not change the global settings value shown here.
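
DownloadTimeoutMiddleware copies the effective timeout into each request's meta, so a callback can log the value a request actually ran with. A minimal sketch, with the spider name and URL as illustrative placeholders:

    import scrapy


    class TimeoutCheckSpider(scrapy.Spider):
        name = "timeout_check"
        download_timeout = 30

        def start_requests(self):
            yield scrapy.Request("https://example.net/")

        def parse(self, response):
            # the middleware stores the effective timeout in request meta
            self.logger.info("effective timeout: %s",
                             response.meta.get("download_timeout"))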