Request meta keeps request-scoped state attached to a Scrapy request so a later callback can still tell which page, crawl branch, or request setting produced that response. It fits detail-page crawls where a followed request needs context from the listing page without moving that state into spider-level globals.
When a spider yields scrapy.Request() or response.follow() with a meta dictionary, Scrapy exposes the same state in the later callback through response.meta. Current Scrapy documentation also notes that response.meta is propagated across redirects and retries, which makes it useful for values such as a source URL, a trace label, or request-specific component keys like cookiejar and download_timeout.
Keep meta narrow and owned by the new request. Current Scrapy guidance prefers cb_kwargs for values that only need to become callback arguments, and it warns against copying an entire response.meta dictionary into unrelated follow-up requests because Scrapy stores internal keys there, including retry bookkeeping.
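That division of labor can be sketched as a small plain-Python helper that sorts listing-page context into the two channels. The helper name and the key lists below are illustrative for this article, not part of Scrapy's API.

```python
# Illustrative sketch: decide which context keys travel as cb_kwargs
# (callback-only arguments) and which belong in meta (request-scoped
# state that Scrapy components may also read). Key lists are assumptions.

# Keys that only the callback needs as arguments.
CALLBACK_ONLY = {"source_title", "list_price"}

# Keys that should stay attached to the request itself.
REQUEST_SCOPED = {"source_url", "cookiejar", "download_timeout"}


def split_context(context):
    """Split a flat context dict into (cb_kwargs, meta) dictionaries."""
    cb_kwargs = {k: v for k, v in context.items() if k in CALLBACK_ONLY}
    meta = {k: v for k, v in context.items() if k in REQUEST_SCOPED}
    return cb_kwargs, meta


cb_kwargs, meta = split_context({
    "source_title": "Tipping the Velvet",
    "list_price": "£53.74",
    "source_url": "https://books.toscrape.com/catalogue/page-1.html",
})
print(cb_kwargs)  # callback-only values
print(meta)       # request-scoped values
```

Anything not in either set is dropped, which keeps both dictionaries narrow by construction.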
Related: How to use request callbacks in Scrapy
Related: How to use cookies in Scrapy
$ cd catalogdemo
$ vi catalogdemo/spiders/catalog.py
import scrapy


class CatalogSpider(scrapy.Spider):
    name = "catalog"
    allowed_domains = ["books.toscrape.com"]
    start_urls = ["https://books.toscrape.com/catalogue/page-1.html"]

    def parse(self, response):
        for book in response.css("article.product_pod")[:2]:
            detail_href = book.css("h3 a::attr(href)").get()
            source_title = book.css("h3 a::attr(title)").get(default="").strip()
            list_price = book.css("p.price_color::text").get(default="").strip()
            if detail_href:
                yield response.follow(
                    detail_href,
                    callback=self.parse_detail,
                    cb_kwargs={
                        "source_title": source_title,
                        "list_price": list_price,
                    },
                    meta={"source_url": response.url},
                )

    def parse_detail(self, response, source_title, list_price):
        yield {
            "source_title": source_title,
            "list_price": list_price,
            "source_url": response.meta["source_url"],
            "detail_title": response.css("div.product_main h1::text").get(default="").strip(),
            "upc": response.css("table tr:nth-child(1) td::text").get(default="").strip(),
            "detail_url": response.url,
        }
source_url stays on the request through meta, while source_title and list_price stay in cb_kwargs because only the callback needs them.
yield response.follow(
    next_href,
    callback=self.parse_more,
    meta={"source_url": response.meta["source_url"]},
    cb_kwargs={"source_title": source_title},
)
Do not pass meta=response.meta in ordinary spider callbacks, because Scrapy components store internal keys there that should not leak into unrelated follow-up requests.
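When some inherited meta state genuinely must continue into a follow-up request, one defensive pattern is to copy an explicit whitelist of keys instead of the whole dictionary. The helper below is a plain-Python sketch of that idea; the forwarded key names are assumptions for this example, and the internal keys shown (retry and depth counters) stand in for the bookkeeping the copy deliberately leaves behind.

```python
# Sketch: forward only spider-owned meta keys into a new request's meta,
# leaving Scrapy's internal bookkeeping behind. FORWARDED_KEYS is an
# assumption for this example, not a Scrapy constant.

FORWARDED_KEYS = ("source_url", "trace_label")


def forward_meta(old_meta, extra=None):
    """Build a fresh meta dict from a whitelist of keys plus new values."""
    new_meta = {k: old_meta[k] for k in FORWARDED_KEYS if k in old_meta}
    if extra:
        new_meta.update(extra)
    return new_meta


old = {"source_url": "https://example.com/page-1", "retry_times": 2, "depth": 3}
print(forward_meta(old))  # only the whitelisted key survives
```

The spider would then pass meta=forward_meta(response.meta) instead of meta=response.meta, so retry and depth counters never leak into the new request.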
$ scrapy crawl catalog -O meta-items.json
2026-04-22 06:39:10 [scrapy.utils.log] INFO: Scrapy 2.15.0 started (bot: catalogdemo)
2026-04-22 06:39:13 [scrapy.core.engine] INFO: Spider opened
##### snipped #####
2026-04-22 06:39:17 [scrapy.extensions.feedexport] INFO: Stored json feed (2 items) in: meta-items.json
2026-04-22 06:39:17 [scrapy.core.engine] INFO: Spider closed (finished)
-O replaces the previous export file so the verification step only shows the current crawl.
$ python3 -m json.tool meta-items.json
[
    {
        "source_title": "Tipping the Velvet",
        "list_price": "£53.74",
        "source_url": "https://books.toscrape.com/catalogue/page-1.html",
        "detail_title": "Tipping the Velvet",
        "upc": "90fa61229261140a",
        "detail_url": "https://books.toscrape.com/catalogue/tipping-the-velvet_999/index.html"
    },
##### snipped #####
]
If source_url is missing while source_title and list_price still appear, the callback kept its cb_kwargs but the request did not keep the expected meta key.
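That failure mode can be checked mechanically by scanning the exported feed for items that carry the cb_kwargs-derived fields but lack the meta-derived one. This is a standalone sketch; the field names mirror the spider above, and the helper itself is not part of Scrapy.

```python
import json

# Sketch: flag exported items whose cb_kwargs-derived fields arrived
# but whose meta-derived source_url did not.


def missing_meta_items(items):
    """Return items that have title/price context but no source_url."""
    return [
        item for item in items
        if "source_title" in item and "list_price" in item
        and not item.get("source_url")
    ]


with_gap = json.loads('[{"source_title": "A", "list_price": "£1.00"}]')
print(missing_meta_items(with_gap))  # the lone item is flagged
```

An empty result means every item kept its meta key; any flagged item points at a request that was built without the expected meta entry.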