Cookies let a Scrapy spider reuse a server-issued session or inject a known session value when the target site keeps account access, region choice, or consent state in cookies. Supplying the right cookie often turns a redirect loop or 403 response into a normal page fetch.
Current Scrapy releases send request cookies through CookiesMiddleware when they are passed with the Request.cookies parameter. The middleware also stores any Set-Cookie response values in the active cookie jar, and separate sessions can be isolated with the cookiejar request meta key.
Use the cookies= parameter instead of setting a raw Cookie header, because CookiesMiddleware does not treat that header as managed request cookies. Cookie values are sensitive, and COOKIES_DEBUG logs them in plain text, so enable it only while debugging. Setting meta["dont_merge_cookies"]=True prevents Scrapy from sending request cookies or merging response cookies into the jar.
Steps to use cookies in Scrapy:
- Open the spider file that requests the protected page.
$ vi simplifiedguide/spiders/account.py
- Export the session cookie value as an environment variable before running the spider.
$ export SCRAPY_SESSIONID='abc123'
Real session cookies grant account access, so do not commit them to source control or leave them in shell history, logs, screenshots, or shared crash output.
- Replace the request logic with a spider that passes the cookie through Request.cookies.
import os

import scrapy
from scrapy.exceptions import CloseSpider


class AccountSpider(scrapy.Spider):
    name = "account"
    account_url = "http://app.internal.example:8000/account"

    async def start(self):
        session_id = os.environ.get("SCRAPY_SESSIONID")
        if not session_id:
            raise CloseSpider("Set SCRAPY_SESSIONID before running.")
        yield scrapy.Request(
            self.account_url,
            cookies={"sessionid": session_id},
            meta={"cookiejar": "account"},
            callback=self.parse_account,
        )

    def parse_account(self, response):
        yield {
            "account_name": response.css("h1::text").get(default="").strip(),
            "url": response.url,
        }
Use cookies=[{"name": "sessionid", "value": "abc123", "domain": ".example.com", "path": "/"}] when the target cookie needs explicit Domain or Path attributes. The cookiejar meta key is not sticky, so copy response.meta["cookiejar"] into each follow-up request.
- Run the spider with cookie debugging enabled to confirm the cookie is sent and the response is accepted.
$ scrapy crawl account --overwrite-output account.json -s COOKIES_DEBUG=True -s LOG_LEVEL=DEBUG
2026-04-16 05:33:22 [scrapy.downloadermiddlewares.cookies] DEBUG: Sending cookies to: <GET http://app.internal.example:8000/account>
Cookie: sessionid=abc123
2026-04-16 05:33:22 [scrapy.downloadermiddlewares.cookies] DEBUG: Received cookies from: <200 http://app.internal.example:8000/account>
Set-Cookie: seen=1; Path=/
2026-04-16 05:33:22 [scrapy.extensions.feedexport] INFO: Stored json feed (1 items) in: account.json
Leave meta["dont_merge_cookies"] unset for these requests, because a true value stops Scrapy from sending request cookies and from storing response cookies in the jar.
- Inspect the exported items to confirm the protected page content was returned.
$ python3 -m json.tool account.json
[
    {
        "account_name": "Example Account",
        "url": "http://app.internal.example:8000/account"
    }
]
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
