AutoThrottle keeps a Scrapy crawl responsive when a target site speeds up or slows down, reducing the chance of accidental overload, bans, or timeouts.
The AutoThrottle extension measures download latency and dynamically adjusts per-host delays to keep request pressure aligned with real-world response times. Rather than relying on a fixed DOWNLOAD_DELAY, it continuously recalculates delays for each download slot (typically per domain) to reach a configured target concurrency.
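The adjustment described above can be approximated in a few lines. This is a simplified model based on the rules in the Scrapy AutoThrottle documentation, not Scrapy's actual implementation; the function name `next_delay` and its parameter names are hypothetical, and the defaults are illustrative only.

```python
def next_delay(current_delay, latency, status=200,
               target_concurrency=1.0, min_delay=0.0, max_delay=60.0):
    """Return the next per-slot delay in seconds (simplified model).

    min_delay stands in for DOWNLOAD_DELAY and max_delay for
    AUTOTHROTTLE_MAX_DELAY; both names are assumptions for this sketch.
    """
    # A server that needs `latency` seconds per response should receive a
    # request every latency / N seconds to keep N requests in flight.
    target_delay = latency / target_concurrency

    # Move halfway from the current delay toward the target delay.
    new_delay = (current_delay + target_delay) / 2.0

    # When the target is above the average, use the target directly so the
    # crawl backs off quickly (assumed to mirror Scrapy's behaviour).
    new_delay = max(target_delay, new_delay)

    # Clamp the result between the minimum and maximum delay.
    new_delay = min(max(min_delay, new_delay), max_delay)

    # Non-200 responses are not allowed to decrease the delay.
    if status != 200 and new_delay <= current_delay:
        return current_delay
    return new_delay


fast = next_delay(1.0, 0.5)                # server faster than current delay
slow = next_delay(1.0, 4.0)                # server slower than current delay
error = next_delay(1.0, 0.2, status=404)   # quick error page
capped = next_delay(1.0, 500.0)            # extremely slow response
```

In this model a fast response only halves the gap to the target, a slow response raises the delay to the target at once, and a quick error page leaves the delay unchanged.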
AutoThrottle changes delays, not hard concurrency limits, so aggressive concurrency settings can still generate heavy load even when delays increase. Keep AUTOTHROTTLE_TARGET_CONCURRENCY conservative for unknown or fragile endpoints, and pair throttling with explicit concurrency limits when predictable behavior matters.
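One way to pair throttling with explicit limits is to set Scrapy's standard concurrency settings alongside AutoThrottle. The fragment below is a sketch for settings.py; the numeric values are illustrative assumptions, not recommendations for any particular site.

```python
# settings.py fragment (illustrative values; tune for the target site)

# Let AutoThrottle adjust the delay toward one in-flight request per site.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

# Hard ceilings that apply regardless of the delay AutoThrottle chooses.
CONCURRENT_REQUESTS = 8              # across all domains
CONCURRENT_REQUESTS_PER_DOMAIN = 2   # per download slot
```

Keeping CONCURRENT_REQUESTS_PER_DOMAIN close to the target concurrency bounds the worst-case load on a single host even if the computed delay becomes very short.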
Steps to enable AutoThrottle in Scrapy:
- Open the Scrapy project settings file.
$ vi simplifiedguide/settings.py
- Set AutoThrottle settings in the project configuration.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 60.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
AUTOTHROTTLE_START_DELAY sets the initial per-host delay in seconds, AUTOTHROTTLE_MAX_DELAY caps how far the delay can grow when responses are slow, and AUTOTHROTTLE_TARGET_CONCURRENCY sets the average number of in-flight requests Scrapy aims for per host. The adjusted delay never drops below DOWNLOAD_DELAY.
Raising AUTOTHROTTLE_TARGET_CONCURRENCY increases pressure on a single host and can trigger rate limiting or blocks.
- Run the spider with AutoThrottle debug output enabled.
$ scrapy crawl products -s AUTOTHROTTLE_DEBUG=True -s LOG_LEVEL=DEBUG
2026-01-01 08:24:28 [scrapy.extensions.throttle] INFO: slot: app.internal.example | conc: 1 | delay: 2000 ms (+0) | latency: 2 ms | size: 34 bytes
2026-01-01 08:24:30 [scrapy.extensions.throttle] INFO: slot: app.internal.example | conc: 1 | delay: 2000 ms (+0) | latency: 3 ms | size: 595 bytes
##### snipped #####
Using -s overrides settings for the current run without changing settings.py.
AUTOTHROTTLE_DEBUG logs one line for every response received, so logs grow quickly on large crawls. Turn it off once the throttling parameters are tuned.
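Because the debug output is verbose, it can help to condense it into per-slot averages. The sketch below parses lines in the format shown in the sample output above; the function name `summarize_throttle_log` and the regular expression are assumptions written for this guide, and the pattern may need adjusting if the log format differs between Scrapy versions.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the AutoThrottle debug fields; whitespace between fields is flexible.
THROTTLE_LINE = re.compile(
    r"slot:\s*(?P<slot>\S+)\s*\|\s*conc:\s*(?P<conc>\d+)\s*\|"
    r"\s*delay:\s*(?P<delay>\d+)\s*ms\s*\((?P<diff>[+-]\d+)\)\s*\|"
    r"\s*latency:\s*(?P<latency>\d+)\s*ms\s*\|\s*size:\s*(?P<size>\d+)\s*bytes"
)


def summarize_throttle_log(lines):
    """Return per-slot response count, mean delay, and mean latency (ms)."""
    samples = defaultdict(list)
    for line in lines:
        match = THROTTLE_LINE.search(line)
        if match is None:
            continue  # skip log lines that are not AutoThrottle debug output
        samples[match["slot"]].append(
            (int(match["delay"]), int(match["latency"]))
        )
    return {
        slot: {
            "responses": len(values),
            "mean_delay_ms": mean(delay for delay, _ in values),
            "mean_latency_ms": mean(latency for _, latency in values),
        }
        for slot, values in samples.items()
    }


# The two log lines from the sample run above.
sample_log = [
    "2026-01-01 08:24:28 [scrapy.extensions.throttle] INFO: slot: app.internal.example | conc: 1 | delay: 2000 ms (+0) | latency: 2 ms | size: 34 bytes",
    "2026-01-01 08:24:30 [scrapy.extensions.throttle] INFO: slot: app.internal.example | conc: 1 | delay: 2000 ms (+0) | latency: 3 ms | size: 595 bytes",
]
summary = summarize_throttle_log(sample_log)
print(summary)
```

For a real crawl, the same function could be fed the lines of a saved log file instead of the in-memory sample list.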
