Setting a custom Referer header in Scrapy helps reproduce browser-like navigation, satisfy endpoints that only respond when a request appears to come from a specific page, and keep a crawl stable when the upstream site checks request origin before returning data.

Scrapy can set Referer in three different ways: an explicit value in a Request headers mapping, a project-wide fallback in DEFAULT_REQUEST_HEADERS, or an automatic value added by RefererMiddleware when a request is yielded from a callback. Requests yielded from a callback, including those built with response.follow(), receive the current response URL as Referer unless the header is already set on the request.
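The precedence between the three sources described above can be sketched in plain Python. This is a simplified model, not Scrapy code: resolve_referer and its parameters are hypothetical names, and the sketch assumes both middlewares only fill in Referer when the header is missing. It also ignores the referrer policy, which can suppress the automatic value.

```python
def resolve_referer(explicit=None, parent_url=None, default=None, referer_enabled=True):
    """Return the Referer a request would carry under this simplified model."""
    headers = {}

    # 1. A value passed in Request(headers=...) is already on the request.
    if explicit is not None:
        headers["Referer"] = explicit

    # 2. RefererMiddleware fills the header from the parent response URL,
    #    but only if it is still missing (assumed setdefault-style behavior).
    if referer_enabled and parent_url is not None:
        headers.setdefault("Referer", parent_url)

    # 3. DEFAULT_REQUEST_HEADERS is applied last and also never overwrites.
    if default is not None:
        headers.setdefault("Referer", default)

    return headers.get("Referer")
```

Under this model an explicit header always wins, the parent response URL comes second, and the project default only applies when neither of the other two is available.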

A fixed referrer copied blindly from a browser session can expose internal paths, tokens, or authenticated URLs. Scrapy's default referrer policy, a variant of no-referrer-when-downgrade, also sends the full parent URL to other domains on both HTTP and HTTPS requests (it is withheld only on an HTTPS-to-HTTP downgrade), so tighten REFERRER_POLICY when a full cross-site URL is not appropriate, and keep secrets out of code samples, logs, and exports.
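One way to reduce the exposure risk is to strip sensitive parts from a URL before hardcoding it as a Referer. The helper below is a minimal sketch using only the standard library; sanitize_referer and its keep_query flag are hypothetical names, and the choice to drop the query string by default is an assumption that may not suit endpoints that check it.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_referer(url, keep_query=False):
    """Drop credentials, fragment, and (by default) the query string from a URL."""
    parts = urlsplit(url)

    # Rebuild the network location from host and port only, so any
    # "user:password@" prefix is discarded.
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literals need their brackets restored
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"

    # Query strings often carry tokens or session IDs; keep them only on request.
    query = parts.query if keep_query else ""

    # The fragment is never sent in a Referer header, so it is always dropped.
    return urlunsplit((parts.scheme, netloc, parts.path, query, ""))
```

For example, sanitize_referer("http://user:secret@app.internal.example:8000/landing?token=abc#top") returns "http://app.internal.example:8000/landing".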

Steps to set a custom Referer header in Scrapy:

  1. Open the spider file that creates the requests.
    $ vi tutorial/spiders/referer_demo.py
  2. Add explicit Referer headers to the requests that need a fixed value.
    import json

    import scrapy


    class RefererDemoSpider(scrapy.Spider):
        name = "referer_demo"
        start_urls = ["http://app.internal.example:8000/headers?step=start"]

        def start_requests(self):
            # Start requests have no parent response, so set Referer explicitly.
            for url in self.start_urls:
                yield scrapy.Request(
                    url,
                    headers={"Referer": "http://app.internal.example:8000/landing"},
                    callback=self.parse_start,
                )

        def parse_start(self, response):
            # The endpoint is assumed to echo the request headers back as JSON.
            payload = json.loads(response.text)
            self.logger.info("start Referer: %s", payload["headers"].get("Referer"))
            # An explicit Referer here takes precedence over the automatic value.
            yield response.follow(
                url="http://app.internal.example:8000/headers?step=follow",
                headers={"Referer": "http://app.internal.example:8000/catalog"},
                callback=self.parse_follow,
            )

        def parse_follow(self, response):
            payload = json.loads(response.text)
            self.logger.info("follow Referer: %s", payload["headers"].get("Referer"))
            # Export the Referer that the server actually received.
            yield {
                "url": response.url,
                "referer": payload["headers"].get("Referer"),
            }

    Header names are case-insensitive, but the wire-level header keeps the historical single-r spelling Referer, while REFERRER_POLICY and meta["referrer_policy"] use the double-r spelling.

  3. Set a project-wide fallback Referer in settings.py for requests that do not define one explicitly.
    DEFAULT_REQUEST_HEADERS = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en",
        "Referer": "http://app.internal.example:8000/default-referer",
    }

    Per-request header values override matching keys from DEFAULT_REQUEST_HEADERS while leaving the other default headers in place.

  4. Disable automatic referer generation only when callback-generated requests must keep that project default instead of the parent response URL.
    REFERER_ENABLED = False

    With REFERER_ENABLED left on, RefererMiddleware populates Referer on requests yielded from callbacks (including those built with response.follow()) before DEFAULT_REQUEST_HEADERS are merged, so a project default Referer mainly acts as a fallback for start requests or other requests without a parent response.

  5. Run the spider from the project directory.
    $ scrapy crawl referer_demo -s LOG_LEVEL=INFO -s HTTPCACHE_ENABLED=False
  6. Confirm the log shows the exact Referer values that were sent.
    2026-04-16 12:18:09 [referer_demo] INFO: start Referer: http://app.internal.example:8000/landing
    2026-04-16 12:18:09 [referer_demo] INFO: follow Referer: http://app.internal.example:8000/catalog
    2026-04-16 12:18:09 [scrapy.core.engine] INFO: Spider closed (finished)

    Keep RefererMiddleware enabled and set REFERRER_POLICY = "same-origin" in settings.py, or meta["referrer_policy"] on individual requests, when automatic referrers are still useful but full cross-domain URLs should not be sent.
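As a sketch of that last note, a settings.py fragment that keeps automatic referrers but restricts them to same-origin requests could look like the following. The per-request form is shown only as a comment, and the choice of "no-referrer" there is illustrative.

```python
# settings.py (fragment)

# Keep RefererMiddleware active so callback-generated requests still get
# an automatic Referer.
REFERER_ENABLED = True

# Send the parent URL only when the target has the same scheme, host, and port.
REFERRER_POLICY = "same-origin"

# Per-request override, set where the request is created in the spider:
#   scrapy.Request(url, meta={"referrer_policy": "no-referrer"})
```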