Cookies let a Scrapy spider reuse a server-issued session or inject a known session value when the target site keeps account access, region choice, or consent state in cookies. Supplying the right cookie often turns a redirect loop or 403 response into a normal page fetch.

Current Scrapy releases send request cookies through CookiesMiddleware when they are passed with the Request.cookies parameter. The middleware also stores any Set-Cookie response values in the active cookie jar, and separate sessions can be isolated with the cookiejar request meta key.

Use the cookies= parameter instead of setting a raw Cookie header, because CookiesMiddleware does not treat that header as managed request cookies. Two cautions apply: cookie values are sensitive and COOKIES_DEBUG logs them in plain text, and setting meta["dont_merge_cookies"]=True prevents Scrapy from sending request cookies or merging response cookies into the jar.

Steps to use cookies in Scrapy:

  1. Open the spider file that requests the protected page.
    $ vi simplifiedguide/spiders/account.py
  2. Export the session cookie value as an environment variable before running the spider.
    $ export SCRAPY_SESSIONID='abc123'

    Real session cookies grant account access, so do not commit them to source control or leave them in shell history, logs, screenshots, or shared crash output.

  3. Replace the request logic with a spider that passes the cookie through Request.cookies.
    import os
     
    import scrapy
    from scrapy.exceptions import CloseSpider
     
    class AccountSpider(scrapy.Spider):
        name = "account"
        account_url = "http://app.internal.example:8000/account"
     
        async def start(self):
            session_id = os.environ.get("SCRAPY_SESSIONID")
            if not session_id:
                raise CloseSpider("Set SCRAPY_SESSIONID before running.")
     
            yield scrapy.Request(
                self.account_url,
                cookies={"sessionid": session_id},
                meta={"cookiejar": "account"},
                callback=self.parse_account,
            )
     
        def parse_account(self, response):
            yield {
                "account_name": response.css("h1::text").get(default="").strip(),
                "url": response.url,
            }

Use cookies=[{"name":"sessionid","value":"abc123","domain":".example.com","path":"/"}] when the target cookie needs explicit Domain or Path attributes. Copy response.meta["cookiejar"] into follow-up requests, because the cookiejar key is not sticky.
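When several cookies need the same explicit attributes, a small helper can expand a plain name-to-value dict into that list-of-dicts form. The helper below is hypothetical, not part of Scrapy:

```python
def as_cookie_entries(cookies, domain, path="/"):
    """Expand a name -> value mapping into Scrapy's list-of-dicts cookie
    form, attaching the same Domain and Path attributes to every entry.
    (Hypothetical helper, not a Scrapy API.)"""
    return [
        {"name": name, "value": value, "domain": domain, "path": path}
        for name, value in cookies.items()
    ]
```

The result can be passed directly as the cookies= argument, e.g. scrapy.Request(url, cookies=as_cookie_entries({"sessionid": session_id}, domain=".example.com")).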

  4. Run the spider with cookie debugging enabled to confirm the cookie is sent and the response is accepted.
    $ scrapy crawl account --overwrite-output account.json -s COOKIES_DEBUG=True -s LOG_LEVEL=DEBUG
    2026-04-16 05:33:22 [scrapy.downloadermiddlewares.cookies] DEBUG: Sending cookies to: <GET http://app.internal.example:8000/account>
    Cookie: sessionid=abc123
    
    2026-04-16 05:33:22 [scrapy.downloadermiddlewares.cookies] DEBUG: Received cookies from: <200 http://app.internal.example:8000/account>
    Set-Cookie: seen=1; Path=/
    2026-04-16 05:33:22 [scrapy.extensions.feedexport] INFO: Stored json feed (1 items) in: account.json

    Leave meta["dont_merge_cookies"] unset for these requests, because a true value stops Scrapy from sending request cookies and from storing response cookies in the jar.

  5. Inspect the exported items to confirm the protected page content was returned.
    $ python3 -m json.tool account.json
    [
        {
            "account_name": "Example Account",
            "url": "http://app.internal.example:8000/account"
        }
    ]