Request callbacks decide which spider method handles each downloaded page, so listing pages, detail pages, and follow-up pages do not all fall back into one generic parsing path. That keeps extraction logic readable when one response needs to schedule a follow-up request that deserves a different parser.
In current Scrapy releases, both scrapy.Request() and response.follow() accept a callback argument that points to the method that should receive the downloaded Response. If callback is omitted, Scrapy sends the response to parse(), and cb_kwargs passes extra keyword arguments into the chosen callback without turning them into request metadata.
Callback wiring usually fails in three places: the link selector returns no URL, the follow-up request omits the intended callback and falls back to parse(), or the callback signature does not accept response as its first parameter followed by every declared cb_kwargs key. Keep callback-only values in cb_kwargs, and use meta only when request state must stay attached to the request itself for later callbacks or middleware.
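As a minimal sketch of those rules, assuming a hypothetical spider with a parse_detail() method, a page_label value that only the callback needs, and a crawl_run flag that middleware should also see:

import scrapy


class SketchSpider(scrapy.Spider):
    name = "sketch"
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        # Without callback=, the next response would come back to parse();
        # an explicit callback routes it to parse_detail() instead.
        yield scrapy.Request(
            response.urljoin("/page/2/"),
            callback=self.parse_detail,
            cb_kwargs={"page_label": "page 2"},  # callback-only value
            meta={"crawl_run": "demo"},          # also visible to middleware
        )

    def parse_detail(self, response, page_label):
        # cb_kwargs arrive as keyword arguments; meta stays on response.meta.
        yield {"label": page_label, "run": response.meta.get("crawl_run")}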
Steps to use request callbacks in Scrapy:
- Change to the root of the Scrapy project that contains the spider.
$ cd callbackdemo
- Replace the spider with one callback for the quotes listing page and a second callback for the followed author page.
$ vi callbackdemo/spiders/quotes_callback.py
import scrapy


class QuotesCallbackSpider(scrapy.Spider):
    name = "quotes_callback"
    allowed_domains = ["quotes.toscrape.com"]
    start_urls = ["http://quotes.toscrape.com/"]

    def parse(self, response):
        # Listing page: schedule one follow-up request per author link.
        for quote_card in response.css("div.quote"):
            author_href = quote_card.css("span a::attr(href)").get()
            quote_text = quote_card.css("span.text::text").get(default="").strip()
            if author_href:
                yield response.follow(
                    author_href,
                    callback=self.parse_author,
                    cb_kwargs={"quote_text": quote_text},
                )

    def parse_author(self, response, quote_text):
        # Detail page: combine the carried quote text with author fields.
        yield {
            "quote": quote_text,
            "author": response.css("h3.author-title::text").get(default="").strip(),
            "born": response.css("span.author-born-date::text").get(default="").strip(),
            "url": response.url,
        }
response.follow() resolves the relative author URL against the current response automatically, while cb_kwargs carries the quote text into parse_author() without copying request metadata.
Keep response as the first parameter in parse_author() and match every cb_kwargs key in the method signature; otherwise Python raises a TypeError about an unexpected keyword argument when the followed response arrives.
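For comparison, a sketch of the same listing callback written with scrapy.Request() instead of response.follow(), as a drop-in replacement for parse() in quotes_callback.py above; the only extra work is joining the relative href manually:

    def parse(self, response):
        for quote_card in response.css("div.quote"):
            author_href = quote_card.css("span a::attr(href)").get()
            quote_text = quote_card.css("span.text::text").get(default="").strip()
            if author_href:
                yield scrapy.Request(
                    response.urljoin(author_href),  # follow() does this join for you
                    callback=self.parse_author,
                    cb_kwargs={"quote_text": quote_text},
                )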
- Run the spider and overwrite the previous export file for the current callback test.
$ scrapy crawl quotes_callback -O authors.json
2026-04-22 06:37:14 [scrapy.utils.log] INFO: Scrapy 2.15.0 started (bot: callbackdemo)
2026-04-22 06:37:20 [scrapy.core.engine] INFO: Spider opened
##### snipped #####
2026-04-22 06:37:41 [scrapy.extensions.feedexport] INFO: Stored json feed (8 items) in: authors.json
2026-04-22 06:37:41 [scrapy.core.engine] INFO: Spider closed (finished)
-O overwrites the previous feed file on each run, which keeps callback checks aligned with the current spider code.
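To keep items from earlier runs instead, the lowercase -o flag appends to the feed; appending to a plain .json file leaves the file as something other than one valid JSON document, so an append-style run is better pointed at a JSON Lines file. A hypothetical append run (the authors.jsonl name is an assumption, not part of the steps above):

$ scrapy crawl quotes_callback -o authors.jsonl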
- Print the exported items to confirm the author responses were handled by parse_author() and the original quote text survived through cb_kwargs.
$ cat authors.json
[
{"quote": "“A day without sunshine is like, you know, night.”", "author": "Steve Martin", "born": "August 14, 1945", "url": "http://quotes.toscrape.com/author/Steve-Martin/"},
{"quote": "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”", "author": "Eleanor Roosevelt", "born": "October 11, 1884", "url": "http://quotes.toscrape.com/author/Eleanor-Roosevelt/"},
{"quote": "“I have not failed. I've just found 10,000 ways that won't work.”", "author": "Thomas A. Edison", "born": "February 11, 1847", "url": "http://quotes.toscrape.com/author/Thomas-A-Edison/"},
##### snipped #####
]

If the export is empty or the born field never appears, the followed request is not reaching parse_author() or the detail-page selectors no longer match the author page.
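When that happens, scrapy shell is a quick way to test the detail-page selectors against a live author page before editing the spider. A sketch using the Steve Martin author URL from the export above; if either selector returns None, the selector is at fault, and if both return text, the callback wiring is the more likely suspect:

$ scrapy shell http://quotes.toscrape.com/author/Steve-Martin/
>>> response.css("h3.author-title::text").get()
>>> response.css("span.author-born-date::text").get()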
