Stock chart data drives candlestick charts, indicators, alerts, and backtests, so consistent time-series bars keep analytics repeatable and easier to troubleshoot.
Most market data providers expose an API endpoint that returns OHLC (often OHLCV) bars per symbol, interval, and date range. A Scrapy spider can request that endpoint, parse each bar from JSON, and emit one item per bar so the feed exporter writes clean rows into a chart-friendly format such as .csv.
Market data APIs differ in licensing, rate limits, symbol formats, and timestamp conventions (ISO dates vs epoch seconds or milliseconds). Conservative throttling reduces gaps and HTTP 429 responses, while careful field mapping avoids subtle errors like swapped timestamps or adjusted prices that change after splits and dividends.
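Epoch-based feeds are easiest to chart once every bar carries the same date format. A minimal normalization sketch; the helper name `normalize_bar_date` and the 1e11 milliseconds cutoff are illustrative assumptions, not part of any particular provider's API:

```python
from datetime import datetime, timezone

def normalize_bar_date(value):
    """Normalize a provider timestamp (ISO string, epoch seconds,
    or epoch milliseconds) to a YYYY-MM-DD string in UTC."""
    if isinstance(value, str):
        # Assume ISO-8601; keep just the date part.
        return value[:10]
    ts = float(value)
    if ts > 1e11:  # values this large are almost certainly milliseconds
        ts /= 1000.0
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")

print(normalize_bar_date("2025-12-29T00:00:00Z"))  # 2025-12-29
print(normalize_bar_date(1767225600))              # 2026-01-01
```

Converting in UTC keeps daily bars stable regardless of the machine's local timezone.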
Related: How to scrape a JSON API with Scrapy
Related: How to export Scrapy items to CSV
$ curl -s 'http://api.example.net:8000/api/stock?symbol=EXMPL&interval=1d&start=2025-12-29&end=2025-12-31' | head -n 20
{
  "symbol": "EXMPL",
  "prices": [
    {
      "date": "2025-12-29",
      "close": 128.4
    }
##### snipped #####
Confirm the bar keys (such as date/close) and timestamp format before mapping fields in the spider.
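One quick way to do that check is to paste a captured response body into a short script and inspect the first bar; a sketch assuming the payload shape shown above (in practice, paste the full body or load it from a file):

```python
import json

# Sample payload captured from the curl call above (truncated here).
payload = json.loads(
    '{"symbol": "EXMPL", "prices": [{"date": "2025-12-29", "close": 128.4}]}'
)

bars = payload.get("prices", [])
first = bars[0] if bars else {}
print("bar keys:", sorted(first.keys()))
print("date type:", type(first.get("date")).__name__)  # str -> ISO date, int -> epoch
```

If `date type` prints `int` instead of `str`, plan on an epoch-to-date conversion step before export.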
$ scrapy startproject stock_chart
New Scrapy project 'stock_chart', using template directory '##### snipped #####', created in:
/root/sg-work/stock_chart
You can start your first spider with:
cd stock_chart
scrapy genspider example example.com
$ cd stock_chart
$ scrapy genspider chart api.example.net
Created spider 'chart' using template 'basic' in module:
  stock_chart.spiders.chart
$ export MARKET_API_TOKEN="replace-with-token"
Using an environment variable avoids hard-coding credentials in the spider.
Pasting tokens into a shell can leak them through history or logs on shared systems.
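If your provider requires authentication, it can help to fail fast with a clear message instead of sending unauthenticated requests that come back as HTTP 401. A sketch; `require_token` is a hypothetical helper, not part of Scrapy:

```python
import os

def require_token(name="MARKET_API_TOKEN"):
    """Return the API token from the environment, or raise with a clear hint."""
    token = os.getenv(name)
    if not token:
        raise RuntimeError(f"set {name} in the environment before crawling")
    return token
```

Calling this once in the spider's `__init__` surfaces a missing token before any requests are scheduled.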
import json
import os
import re
from urllib.parse import urlencode

import scrapy

_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


class ChartSpider(scrapy.Spider):
    name = "chart"
    allowed_domains = ["api.example.net"]
    base_url = "http://api.example.net:8000/api/stock"

    def __init__(
        self,
        symbol="EXMPL",
        interval="1d",
        start="2025-12-29",
        end="2025-12-31",
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.symbol = self._clean_symbol(symbol)
        self.interval = self._clean_interval(interval)
        self.start_date = self._clean_date(start, "start")
        self.end_date = self._clean_date(end, "end")

    def _clean_symbol(self, symbol):
        cleaned = ("" if symbol is None else str(symbol)).strip().upper()
        if not cleaned:
            raise ValueError("symbol must not be empty")
        return cleaned

    def _clean_interval(self, interval):
        cleaned = ("" if interval is None else str(interval)).strip()
        if not cleaned:
            raise ValueError("interval must not be empty")
        return cleaned

    def _clean_date(self, value, label):
        cleaned = ("" if value is None else str(value)).strip()
        if not _DATE_RE.match(cleaned):
            raise ValueError(f"{label} must be in YYYY-MM-DD format")
        return cleaned

    def start_requests(self):
        params = {
            "symbol": self.symbol,
            "interval": self.interval,
            "start": self.start_date,
            "end": self.end_date,
        }
        url = f"{self.base_url}?{urlencode(params)}"
        headers = {
            "Accept": "application/json",
            "User-Agent": "stock_chart (+http://app.internal.example:8000/)",
        }
        api_token = os.getenv("MARKET_API_TOKEN")
        if api_token:
            headers["Authorization"] = f"Bearer {api_token}"
        yield scrapy.Request(url=url, headers=headers, callback=self.parse)

    def parse(self, response):
        if response.status >= 400:
            self.logger.error("Chart endpoint returned HTTP %s", response.status)
            return
        try:
            payload = json.loads(response.text)
        except json.JSONDecodeError:
            self.logger.error(
                "Non-JSON response from chart endpoint (status=%s)", response.status
            )
            return
        prices = payload.get("prices")
        if not isinstance(prices, list):
            self.logger.error("Missing or invalid 'prices' array in chart payload")
            return
        symbol = payload.get("symbol") or self.symbol
        for bar in prices:
            if not isinstance(bar, dict):
                continue
            yield {
                "symbol": symbol,
                "date": bar.get("date"),
                "close": bar.get("close"),
            }
Spider arguments symbol, interval, start, and end override defaults at runtime without code edits.
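Because Scrapy passes `-a` values to `__init__` as strings, the validation helpers above are the only guard against malformed arguments; a standalone sketch of the YYYY-MM-DD check, mirroring the spider's `_clean_date`:

```python
import re

_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def clean_date(value, label):
    # Trim whitespace, then require a strict YYYY-MM-DD shape.
    cleaned = ("" if value is None else str(value)).strip()
    if not _DATE_RE.match(cleaned):
        raise ValueError(f"{label} must be in YYYY-MM-DD format")
    return cleaned

print(clean_date(" 2025-12-29 ", "start"))  # 2025-12-29
try:
    clean_date("12/29/2025", "start")
except ValueError as exc:
    print(exc)  # start must be in YYYY-MM-DD format
```

Raising in `__init__` makes a bad argument abort the crawl immediately rather than produce an empty feed.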
DOWNLOAD_DELAY = 1.0
CONCURRENT_REQUESTS_PER_DOMAIN = 2
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 10.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
Aggressive request rates commonly trigger HTTP 429 throttling or temporary blocks on market data APIs.
Related: How to set a download delay in Scrapy
Related: How to enable AutoThrottle in Scrapy
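If the API still returns occasional HTTP 429 responses despite throttling, Scrapy's built-in RetryMiddleware can re-queue them. A settings sketch; `RETRY_ENABLED`, `RETRY_TIMES`, and `RETRY_HTTP_CODES` are standard Scrapy settings, but the specific values here are illustrative and should be tuned to the provider's documented limits:

```python
# settings.py (fragment)
RETRY_ENABLED = True
RETRY_TIMES = 3  # per-request retry budget beyond the first attempt
RETRY_HTTP_CODES = [429, 500, 502, 503, 504]  # include 429 so throttled requests retry
```

Combined with AutoThrottle, retried 429s are spaced out automatically as the observed latency rises.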
$ scrapy crawl chart -a symbol=EXMPL -a interval=1d -a start=2025-12-29 -a end=2025-12-31 -O exmpl-close.csv
##### snipped #####
[scrapy.extensions.feedexport] INFO: Stored csv feed (3 items) in: exmpl-close.csv
[scrapy.core.engine] INFO: Closing spider (finished)
{'downloader/request_count': 2, 'item_scraped_count': 3}
Use -O to overwrite an existing output file, or -o to append.
$ head -n 5 exmpl-close.csv
symbol,date,close
EXMPL,2025-12-29,128.4
EXMPL,2025-12-30,129.9
EXMPL,2025-12-31,131.2
$ wc -l exmpl-close.csv
4 exmpl-close.csv
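Downstream charting code can consume the export with the standard library alone; a minimal sketch using `csv.DictReader` (`load_closes` is a hypothetical helper, and the skip-bad-rows policy is one choice among several):

```python
import csv

def load_closes(path):
    """Read the exported CSV into (date, close) pairs, skipping malformed rows."""
    series = []
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            try:
                series.append((row["date"], float(row["close"])))
            except (KeyError, ValueError):
                continue  # missing column or non-numeric close; log in real code
    return series

# Example: series = load_closes("exmpl-close.csv")
```

The resulting list of `(date, close)` tuples plugs directly into most plotting libraries' x/y inputs.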