How to set concurrent requests in Scrapy

Concurrent requests control how many downloads Scrapy runs in parallel, which directly affects crawl speed, resource usage, and how much load gets placed on target sites. Concurrency tuning is a direct way to trade throughput for politeness and stability during crawls.

Scrapy uses an asynchronous downloader (built on Twisted) and keeps multiple requests in flight at the same time. The CONCURRENT_REQUESTS setting (default 16) caps total in-progress requests across the crawler, while CONCURRENT_REQUESTS_PER_DOMAIN (default 8) limits parallelism per hostname so that a single site cannot consume all downloader slots. CONCURRENT_REQUESTS_PER_IP (default 0, disabled) caps parallelism by IP address instead and, when set to a non-zero value, overrides the per-domain limit.
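The interaction between these limits can be sketched in plain Python. This is a simplified model of how the per-slot cap is chosen, not Scrapy's internal API; the function name is illustrative:

```python
# Simplified model of Scrapy's per-slot concurrency cap selection.
# slot_concurrency() is a hypothetical name for illustration only.

def slot_concurrency(settings: dict) -> int:
    """Return the per-slot limit: a non-zero per-IP value wins over per-domain."""
    per_ip = settings.get("CONCURRENT_REQUESTS_PER_IP", 0)
    if per_ip:  # non-zero per-IP limit overrides the per-domain one
        return per_ip
    return settings.get("CONCURRENT_REQUESTS_PER_DOMAIN", 8)

# With the Scrapy defaults (16 total, 8 per domain, per-IP disabled):
defaults = {"CONCURRENT_REQUESTS": 16,
            "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
            "CONCURRENT_REQUESTS_PER_IP": 0}
print(slot_concurrency(defaults))  # 8

# Enabling a per-IP limit replaces the per-domain cap:
print(slot_concurrency({**defaults, "CONCURRENT_REQUESTS_PER_IP": 2}))  # 2
```

The global CONCURRENT_REQUESTS ceiling still applies on top of whichever per-slot limit is in effect.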

High concurrency can trigger throttling, temporary blocks, and more retries or timeouts when the remote side cannot keep up. Concurrency settings are typically tuned alongside DOWNLOAD_DELAY or AutoThrottle to keep the crawl responsive and reduce 429 responses.
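As a sketch, concurrency limits are often paired with a fixed delay or AutoThrottle in settings.py. The values below are illustrative starting points, not recommendations for every site:

```python
# settings.py -- pairing concurrency limits with throttling (example values).

CONCURRENT_REQUESTS = 8             # total in-flight requests
CONCURRENT_REQUESTS_PER_DOMAIN = 4  # cap per hostname

DOWNLOAD_DELAY = 0.5                # seconds between requests to the same slot

# AutoThrottle adjusts the delay dynamically from observed response latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 10.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 4.0
```

With AutoThrottle enabled, DOWNLOAD_DELAY acts as a floor while the extension raises or lowers the effective delay to track the target concurrency.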

Steps to set concurrent requests in Scrapy:

  1. Open the Scrapy project settings file.
    $ vi simplifiedguide/settings.py
  2. Set global and per-domain concurrency limits in settings.py.
    CONCURRENT_REQUESTS = 8
    CONCURRENT_REQUESTS_PER_DOMAIN = 4

    Excessive concurrency can cause throttling, increased retry rates, or IP blocks from target sites.

  3. Print the effective concurrency values from the project.
    $ scrapy settings --get CONCURRENT_REQUESTS
    8
    $ scrapy settings --get CONCURRENT_REQUESTS_PER_DOMAIN
    4
  4. Run a spider with the updated settings.
    $ scrapy crawl products
    2026-01-01 08:22:14 [scrapy.crawler] INFO: Overridden settings:
    {'BOT_NAME': 'simplifiedguide',
     'CONCURRENT_REQUESTS': 8,
     'CONCURRENT_REQUESTS_PER_DOMAIN': 4,
     'NEWSPIDER_MODULE': 'simplifiedguide.spiders',
     'SPIDER_MODULES': ['simplifiedguide.spiders']}
    ##### snipped #####
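Project-wide settings can also be overridden for a single spider via the custom_settings class attribute. A minimal configuration sketch, where the spider module path and start URL are assumptions based on the example project:

```python
# simplifiedguide/spiders/products.py -- per-spider override sketch.
# The start URL here is a placeholder for illustration.
import scrapy

class ProductsSpider(scrapy.Spider):
    name = "products"
    start_urls = ["https://example.com/products"]

    # custom_settings takes precedence over the project's settings.py
    custom_settings = {
        "CONCURRENT_REQUESTS": 16,
        "CONCURRENT_REQUESTS_PER_DOMAIN": 8,
    }

    def parse(self, response):
        yield {"url": response.url}
```

For a one-off run, settings passed with -s on the command line take precedence over both custom_settings and settings.py, e.g. scrapy crawl products -s CONCURRENT_REQUESTS=2.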