Setting a custom Referer header in Scrapy makes a request look like it came from a specific page. Some sites check that field before returning normal content, API data, or a secondary request flow, so sending the expected referrer can keep the crawl on the path the endpoint already accepts.

In current Scrapy releases, the direct way to do that is to put Referer in the request headers mapping on the exact Request object being yielded. The same pattern works on both scrapy.Request() and response.follow(), and a manual value stays in place because DefaultHeadersMiddleware and RefererMiddleware only fill missing headers.
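
The "only fill missing headers" behavior can be pictured with a plain dictionary. The block below is a simplified model, not Scrapy's own code: fill_missing is a hypothetical helper, the parent URL is a placeholder, and a plain dict is case-sensitive where real request headers are not.

```python
# Simplified model of "fill only if missing"; this is NOT Scrapy's implementation.
# fill_missing and the example URLs are hypothetical and used for illustration only.
def fill_missing(headers: dict[str, str], name: str, value: str) -> dict[str, str]:
    """Return a copy of headers where name is set only if it was absent."""
    merged = dict(headers)
    merged.setdefault(name, value)  # an existing value is left untouched
    return merged


parent_url = "https://parent.example/page"

# A manually set Referer survives the fill step.
manual = fill_missing({"Referer": "https://app.example/cat"}, "Referer", parent_url)
# A request without a Referer receives the fallback value.
inherited = fill_missing({}, "Referer", parent_url)

print(manual["Referer"])     # https://app.example/cat
print(inherited["Referer"])  # https://parent.example/page
```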

A fixed referrer can leak full paths, search terms, or signed URLs if it is copied from live traffic without review. When a request should inherit the parent page instead, leave the header unset and let RefererMiddleware apply REFERRER_POLICY. Review that setting as well, because the default policy can still send a non-empty cross-origin referrer on HTTP and HTTPS requests.
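
When inherited referrers are kept, the policy can be tightened in the project settings. The fragment below is a minimal sketch that assumes a standard project settings.py; "same-origin" is only one possible choice, and the right policy depends on what the target site expects.

```python
# settings.py (sketch; the policy value is an example, not a recommendation)

# RefererMiddleware is enabled by default; shown here only to make the dependency explicit.
REFERER_ENABLED = True

# Send an inherited Referer only when parent and child share the same origin.
REFERRER_POLICY = "same-origin"
```

A manually set Referer header is still sent as written, because the policy only governs the value that the middleware would fill in.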

Steps to set a custom Referer header in Scrapy:

  1. Open the spider file that yields the requests.
    $ vi referer_demo.py
  2. Set Referer in the headers mapping when the initial scrapy.Request() must send a fixed parent page value.
    ref = "https://app.example/cat"
    yield scrapy.Request(
        url,
        headers={
            "Referer": ref,
        },
        callback=self.parse,
    )

    Use the exact header name Referer with that historical spelling; on older spiders that still override start_requests(), add the same headers={…} mapping to the yielded Request objects there.

  3. Set Referer in the headers mapping on response.follow() when a follow-up request must not inherit the parent response URL.
    ref = "https://app.example/list"
    yield response.follow(
        next_url,
        headers={
            "Referer": ref,
        },
        callback=self.parse_detail,
    )
  4. Open the Scrapy shell from the project directory.
    $ scrapy shell --nolog
  5. Fetch a request with the fixed Referer value and inspect the header that the endpoint received.
    >>> import json, scrapy
    >>> echo = "https://httpbin.org/get"
    >>> ref = "https://app.example/cat"
    >>> req = scrapy.Request(
    ...     echo,
    ...     headers={"Referer": ref},
    ... )
    >>> fetch(req)
    >>> data = json.loads(response.text)
    >>> headers = data["headers"]
    >>> headers["Referer"]
    'https://app.example/cat'
  6. Build a follow-up request with its own Referer value when the child request must override the parent page URL.
    >>> ref = "https://app.example/list"
    >>> req = response.follow(
    ...     echo,
    ...     headers={"Referer": ref},
    ... )
    >>> fetch(req)
    >>> data = json.loads(response.text)
    >>> headers = data["headers"]
    >>> headers["Referer"]
    'https://app.example/list'

    In a running spider, if response.follow() receives no manual Referer header, RefererMiddleware derives the value from the parent response URL according to REFERRER_POLICY.

  7. Replace the echo endpoint and sample referrer values with the real target URL and referrer once the echoed header matches what the site expects.

    Do not hard-code authenticated URLs, private paths, or signed query strings into a fixed Referer header, because upstream servers and local logs may record them.
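
One way to follow that warning is to trim a captured URL before it becomes a fixed Referer value. The helper below is a sketch using only the standard library; safe_referer and its origin_only parameter are hypothetical names, not part of Scrapy, and the sample URL is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def safe_referer(url: str, origin_only: bool = False) -> str:
    """Drop credentials, query string, and fragment from a URL.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:  # restore the brackets around an IPv6 literal
        host = f"[{host}]"
    # Rebuild the network location without any user:password@ prefix.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Keep the path unless only the origin is wanted or the path is empty.
    path = "/" if origin_only or not parts.path else parts.path
    return urlunsplit((parts.scheme, netloc, path, "", ""))


captured = "https://user:pw@app.example:8443/list?sig=abc#top"
print(safe_referer(captured))                    # https://app.example:8443/list
print(safe_referer(captured, origin_only=True))  # https://app.example:8443/
```

The returned string can then be passed as the Referer value in the headers mapping. Paths can still be sensitive, so the origin-only form is the more conservative option when the site accepts it.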