Per-spider settings keep a single Scrapy project workable across targets with different rate limits, authentication requirements, or anti-bot sensitivity. Isolating crawl behavior per spider avoids project-wide tweaks that accidentally change every crawl.

A spider can define a custom_settings dictionary as a class attribute. When a crawl starts, Scrapy merges those values into the crawler settings for that spider, overriding the project settings module for the duration of that run. The dictionary must be a class attribute because it is read before the spider is instantiated, and settings passed on the command line with -s still take precedence over it.
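
Inside the spider, the merged result is visible on self.settings, which makes it easy to confirm that an override actually took effect. A minimal sketch, assuming a hypothetical spider name and an illustrative target URL:

    import scrapy

    class ThrottledSpider(scrapy.Spider):
        name = "throttled"
        # Class attribute: Scrapy reads it before the spider is instantiated.
        custom_settings = {"DOWNLOAD_DELAY": 2.0}
        start_urls = ["http://app.internal.example:8000/"]

        def parse(self, response):
            # self.settings holds the merged project + spider values.
            self.logger.info("DOWNLOAD_DELAY in effect: %s",
                             self.settings.getfloat("DOWNLOAD_DELAY"))
            yield {"url": response.url}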

Overriding too many values makes runs inconsistent and harder to compare between spiders. Aggressive concurrency or minimal delays can trigger rate limiting or blocks, so keep overrides small, track them in version control, and check the Overridden settings line in the startup log before leaving a crawl unattended.

Steps to use per-spider settings in Scrapy:

  1. Open the spider file that needs custom settings.
    $ vi catalog_demo/spiders/catalog.py
  2. Add a custom_settings dictionary on the spider class.
    import scrapy

    class CatalogSpider(scrapy.Spider):
        name = "catalog"
        allowed_domains = ["app.internal.example"]
        start_urls = ["http://app.internal.example:8000/products/"]
        # Applied on top of settings.py for this spider only.
        custom_settings = {
            "DOWNLOAD_DELAY": 1.0,
            "CONCURRENT_REQUESTS": 8,
            "LOG_LEVEL": "INFO",
        }

        def parse(self, response):
            # Placeholder callback; replace with real extraction logic.
            yield {"url": response.url}

    Common per-target overrides include DOWNLOAD_DELAY, CONCURRENT_REQUESTS, AUTOTHROTTLE_ENABLED, HTTPCACHE_ENABLED, and USER_AGENT; a sketch combining several of them follows the note below.

    High CONCURRENT_REQUESTS or low DOWNLOAD_DELAY can trigger rate limiting or blocks on sensitive targets.
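
    A sketch of a more conservative per-target block; the values and user-agent string are illustrative, not recommendations:

    custom_settings = {
        "DOWNLOAD_DELAY": 2.0,          # pause between requests to the same site
        "CONCURRENT_REQUESTS": 4,       # cap on simultaneous downloader requests
        "AUTOTHROTTLE_ENABLED": True,   # adapt the delay to observed latency
        "HTTPCACHE_ENABLED": True,      # cache responses while developing
        "USER_AGENT": "catalog-demo-bot/0.1 (internal testing)",
    }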

  3. Run the spider by its name value.
    $ scrapy crawl catalog
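
    For a one-off tweak without editing the spider, a setting can also be overridden for a single run with -s, which Scrapy applies at command-line priority, above custom_settings:

    $ scrapy crawl catalog -s LOG_LEVEL=DEBUG
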
  4. Confirm the crawl starts with the expected Overridden settings line.
    2026-01-01 09:38:57 [scrapy.crawler] INFO: Overridden settings:
    {'BOT_NAME': 'catalog_demo',
     'CONCURRENT_REQUESTS': 8,
     'DOWNLOAD_DELAY': 1.0,
     'LOG_LEVEL': 'INFO',
    ##### snipped #####

    The Overridden settings block also lists project-level values (such as BOT_NAME), so it appears even without custom_settings; check that the spider's own keys are present. If DOWNLOAD_DELAY, CONCURRENT_REQUESTS, or LOG_LEVEL is missing from the block, custom_settings is probably not defined on the spider class or one of its keys is misspelled.