Controlling the Referer header in Scrapy helps mimic real navigation flows, satisfy endpoints that validate referrers, and keep crawling behavior consistent when a site changes responses based on the perceived source page.

The Referer request header typically contains the URL of the page that initiated the navigation. Scrapy's RefererMiddleware sets it automatically on requests yielded from a callback, using the URL of the response being processed. An explicit Referer set on a request takes precedence for that request.
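To make the header concrete, the sketch below assembles the head of an HTTP/1.1 GET request by hand and shows where Referer sits on the wire. It uses only the standard library; build_request_head is a hypothetical helper written for illustration, not part of Scrapy, and the URLs reuse the placeholder host from the steps that follow.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_request_head(url: str, referer: Optional[str] = None) -> str:
    """Return the request line and headers of a GET request as plain text."""
    parts = urlsplit(url)
    # The request target is the path plus the query string, if any.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [f"GET {target} HTTP/1.1", f"Host: {parts.netloc}"]
    if referer is not None:
        # The header name keeps its historical spelling, "Referer".
        lines.append(f"Referer: {referer}")
    # Header lines end with CRLF; an empty line closes the head.
    return "\r\n".join(lines) + "\r\n\r\n"


head = build_request_head(
    "http://app.internal.example:8000/headers?step=2",
    referer="http://app.internal.example:8000/catalog",
)
print(head)
```

A server that validates referrers reads exactly this one header line, which is why a single wrong value can change the response.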

Custom referrers can expose sensitive URLs and can be used to bypass access controls, so only set them with permission and avoid copying internal or authenticated URLs into logs, exported items, or shared code samples.
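One way to keep sensitive referrers out of logs and exported items is to redact them before they are written. The following is a minimal sketch using only the standard library; redact_url is a hypothetical helper, and the choice to drop credentials, query string, and fragment is an assumption about what counts as sensitive in a given project.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Drop credentials, query string, and fragment from a URL before logging."""
    parts = urlsplit(url)
    # hostname excludes any "user:password@" prefix and is lowercased.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Rebuild the URL with an empty query and fragment.
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


print(redact_url("http://user:secret@app.internal.example:8000/account?token=abc#top"))
# -> http://app.internal.example:8000/account
```

In a spider, the redacted value could be logged or stored in place of the raw header while the request itself still carries the full Referer.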

Steps to set a custom Referer header in Scrapy:

  1. Set a custom Referer on individual requests by passing it in the request's headers argument.
    import scrapy


    class RefererDemoSpider(scrapy.Spider):
        name = "referer_demo"
        start_urls = ["http://app.internal.example:8000/headers?step=1"]

        def start_requests(self):
            # Start requests have no parent response, so RefererMiddleware
            # adds no Referer to them; set one explicitly.
            for url in self.start_urls:
                yield scrapy.Request(
                    url=url,
                    headers={"Referer": "http://app.internal.example:8000/landing"},
                    callback=self.parse,
                )

        def parse(self, response):
            # The explicit Referer takes precedence over the automatic value,
            # which would be the URL of this response.
            yield response.follow(
                url="http://app.internal.example:8000/headers?step=2",
                headers={"Referer": "http://app.internal.example:8000/catalog"},
                callback=self.parse_step_2,
            )

        def parse_step_2(self, response):
            # Header values are stored as bytes; decode before logging or exporting.
            referer = response.request.headers.get(b"Referer", b"").decode()
            self.logger.info("Sent Referer: %s", referer)
            yield {"url": response.url, "referer": referer}

    The header name is spelled Referer, with a single r in the middle, unlike the English word "referrer". An explicit Referer value prevents Scrapy from auto-populating the header for that request.

  2. Add a fallback Referer header in DEFAULT_REQUEST_HEADERS when a common referrer should apply to requests that do not set one.
    DEFAULT_REQUEST_HEADERS = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en",
        "Referer": "http://app.internal.example:8000/",
    }

    DEFAULT_REQUEST_HEADERS only fills in headers that are missing; it does not override a Referer already present on the request.

  3. Disable Scrapy's automatic Referer generation when the fixed default must also apply to requests yielded from callbacks, where RefererMiddleware would otherwise fill in the header first.
    REFERER_ENABLED = False

    Disabling RefererMiddleware stops automatic referrers for requests returned from callbacks, which can change crawl behavior on sites that rely on navigation context.

  4. Run the spider from the project directory.
    $ scrapy crawl referer_demo -s LOG_LEVEL=INFO -s HTTPCACHE_ENABLED=False

  5. Confirm the log output contains the intended Referer value.
    2026-01-01 08:48:33 [referer_demo] INFO: Sent Referer: http://app.internal.example:8000/catalog
    2026-01-01 08:48:33 [scrapy.core.engine] INFO: Spider closed (finished)
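
The way steps 1 to 3 interact can be summarized as a small decision function. This is a simplified model written for illustration, not Scrapy's actual code: resolve_referer and its parameters are hypothetical names, and the model ignores the referrer policy, which can shorten or omit the automatic value.

```python
from typing import Optional


def resolve_referer(
    explicit: Optional[str],
    parent_url: Optional[str],
    default: Optional[str],
    referer_enabled: bool = True,
) -> Optional[str]:
    """Model which Referer value a request ends up with.

    explicit:        value passed in the request's own headers (step 1)
    parent_url:      URL of the response whose callback yielded the request,
                     or None for a start request
    default:         Referer entry in DEFAULT_REQUEST_HEADERS (step 2)
    referer_enabled: the REFERER_ENABLED setting (step 3)
    """
    # An explicit header always wins.
    if explicit is not None:
        return explicit
    # Otherwise the automatic referrer applies to requests from callbacks.
    if referer_enabled and parent_url is not None:
        return parent_url
    # The default is only a fallback for requests that still have no Referer.
    return default


page = "http://app.internal.example:8000/headers?step=1"
fallback = "http://app.internal.example:8000/"

print(resolve_referer(None, page, fallback))                         # automatic value
print(resolve_referer(None, page, fallback, referer_enabled=False))  # fixed default
print(resolve_referer(None, None, fallback))                         # start request
```

Under this model the default in DEFAULT_REQUEST_HEADERS reaches callback-generated requests only when automatic generation is disabled, which is the situation step 3 addresses.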