Stock chart data gives Scrapy a clean way to collect OHLCV bars for candlestick charts, indicators, and backtests without scraping rendered tables. Exporting one row per bar keeps the result easy to reload into Python, spreadsheets, or charting tools.
Most market-data providers expose chart history through a JSON endpoint keyed by symbol, interval, and date range. A Scrapy spider can request that endpoint, read the returned bar array, and send the rows straight to a .csv feed so the crawl produces a reusable time-series file in one run.
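The query string for such an endpoint can be assembled with the standard library before any Scrapy code exists; a quick sketch using the example host from this guide (the URL and parameter names are the ones assumed throughout, not a real provider's contract):

```python
from urllib.parse import urlencode

# Build the chart-history URL the way the spider later will.
params = urlencode({
    "symbol": "MSFT",
    "interval": "1d",
    "start": "2026-04-01",
    "end": "2026-04-03",
})
url = f"https://data.example.net/v1/charts?{params}"
print(url)
```

urlencode also percent-escapes any characters that are unsafe in a query string, so symbols with dots or carets survive intact.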
Providers differ in auth headers, array names, adjusted-price rules, and whether timestamps arrive as ISO dates or Unix time, so the request URL and parse logic should match the current API contract before you reuse the crawl. Scrapy 2.13 and later use async def start() for initial requests; projects that still target older releases should keep a start_requests() version for compatibility.
Related: How to scrape a JSON API with Scrapy
Related: How to export Scrapy items to CSV
$ curl -s 'https://data.example.net/v1/charts?symbol=MSFT&interval=1d&start=2026-04-01&end=2026-04-03'
{
"symbol": "MSFT",
"interval": "1d",
"bars": [
{
"date": "2026-04-01",
"open": 381.1,
"high": 384.5,
"low": 380.6,
"close": 383.9,
"volume": 20311452
},
{
"date": "2026-04-02",
"open": 384.0,
"high": 386.4,
"low": 382.3,
"close": 385.7,
"volume": 18744106
},
{
"date": "2026-04-03",
"open": 385.9,
"high": 388.1,
"low": 384.8,
"close": 387.6,
"volume": 19180234
}
]
}
Confirm whether the response uses bars, candles, or another array name, whether the date is already text or a Unix timestamp, and whether the endpoint serves adjusted or raw prices.
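If you need to support more than one provider, the array-name check can be made defensive in code rather than by hand; a sketch of one way to do it (the helper name extract_bars and the candidate key list are assumptions, not part of any API):

```python
def extract_bars(payload):
    """Return the first list-valued field among common bar-array names."""
    for key in ("bars", "candles", "data"):
        value = payload.get(key)
        if isinstance(value, list):
            return value
    return []  # nothing recognizable: let the caller log and skip

# Works whether the provider says "bars" or "candles".
print(extract_bars({"candles": [{"close": 383.9}]}))
```

A helper like this slots into parse() in place of the direct payload.get("bars") lookup.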
$ scrapy startproject market_chart
New Scrapy project 'market_chart', using template directory '##### snipped #####', created in:
/srv/market_chart
You can start your first spider with:
cd market_chart
scrapy genspider example example.com
$ cd market_chart
$ scrapy genspider chart data.example.net
Created spider 'chart' using template 'basic' in module:
  market_chart.spiders.chart
from datetime import date
import os
from urllib.parse import urlencode

import scrapy


class ChartSpider(scrapy.Spider):
    name = "chart"
    allowed_domains = ["data.example.net"]
    api_url = "https://data.example.net/v1/charts"

    def __init__(
        self,
        symbol="MSFT",
        interval="1d",
        start="2026-04-01",
        end="2026-04-03",
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.symbol = symbol.strip().upper()
        self.interval = interval.strip()
        self.start_date = self._parse_day(start, "start")
        self.end_date = self._parse_day(end, "end")

    def _parse_day(self, value, label):
        try:
            return date.fromisoformat(str(value)).isoformat()
        except ValueError as exc:
            raise ValueError(f"{label} must use YYYY-MM-DD") from exc

    def _headers(self):
        headers = {"Accept": "application/json"}
        api_token = os.getenv("CHART_API_TOKEN")
        if api_token:
            headers["Authorization"] = f"Bearer {api_token}"
        return headers

    async def start(self):
        params = urlencode(
            {
                "symbol": self.symbol,
                "interval": self.interval,
                "start": self.start_date,
                "end": self.end_date,
            }
        )
        yield scrapy.Request(
            url=f"{self.api_url}?{params}",
            headers=self._headers(),
            callback=self.parse,
        )

    def parse(self, response):
        payload = response.json()
        bars = payload.get("bars")
        if not isinstance(bars, list):
            self.logger.error("Missing bars list")
            return
        symbol = payload.get("symbol") or self.symbol
        for bar in bars:
            if not isinstance(bar, dict):
                continue
            yield {
                "symbol": symbol,
                "date": bar.get("date"),
                "open": bar.get("open"),
                "high": bar.get("high"),
                "low": bar.get("low"),
                "close": bar.get("close"),
                "volume": bar.get("volume"),
            }
symbol, interval, start, and end stay as normal spider arguments, so you can switch symbols or date windows at crawl time without editing the file. Convert Unix timestamps inside parse() before yielding the item when the provider does not already return an ISO-style date.
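One hedged way to do that conversion with only the standard library (the helper name bar_date is illustrative; the epoch value below is 2026-04-01T00:00:00Z):

```python
from datetime import datetime, timezone

def bar_date(value):
    # Providers that send Unix seconds need converting; ISO text like
    # "2026-04-01" passes through unchanged.
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc).date().isoformat()
    return str(value)

print(bar_date(1775001600))    # Unix seconds -> "2026-04-01"
print(bar_date("2026-04-01"))  # already ISO text, unchanged
```

Converting against UTC keeps daily bars stable regardless of the machine's local timezone; call bar_date(bar.get("date")) in place of the raw bar.get("date") when yielding the item.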
CONCURRENT_REQUESTS_PER_DOMAIN = 1
DOWNLOAD_DELAY = 1.0
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 10.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
FEED_EXPORT_FIELDS = [
    "symbol",
    "date",
    "open",
    "high",
    "low",
    "close",
    "volume",
]
FEED_EXPORT_ENCODING = "utf-8"
Market-data endpoints often enforce burst limits, so raising concurrency or removing the delay usually leads to HTTP 429 responses, shortened date ranges, or both.
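Recent Scrapy releases already retry 429 responses by default through RETRY_HTTP_CODES; if you want an explicit, tighter retry budget in settings.py, a sketch might look like this (the values are illustrative, not recommendations):

```python
# Sketch: explicit retry policy for a rate-limited market-data host.
RETRY_ENABLED = True
RETRY_TIMES = 3  # per-request retry budget after the first attempt
RETRY_HTTP_CODES = [429, 500, 502, 503, 504]
```

Combined with AutoThrottle, retries then happen at the already-reduced crawl rate rather than hammering the endpoint again immediately.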
Related: How to set a download delay in Scrapy
Related: How to enable AutoThrottle in Scrapy
$ export CHART_API_TOKEN="replace-with-token"
Skip this step for public endpoints, and avoid leaving real tokens in saved shell transcripts or screenshots.
Related: How to set request headers in Scrapy
$ scrapy crawl chart -a symbol=MSFT -a interval=1d -a start=2026-04-01 -a end=2026-04-03 -O msft-1d.csv
##### snipped #####
2026-04-22 05:49:16 [scrapy.extensions.feedexport] INFO: Stored csv feed (3 items) in: msft-1d.csv
2026-04-22 05:49:16 [scrapy.core.engine] INFO: Spider closed (finished)
-O overwrites the output file, while -o appends only when the chosen feed format supports appending.
$ cat msft-1d.csv
symbol,date,open,high,low,close,volume
MSFT,2026-04-01,381.1,384.5,380.6,383.9,20311452
MSFT,2026-04-02,384.0,386.4,382.3,385.7,18744106
MSFT,2026-04-03,385.9,388.1,384.8,387.6,19180234
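As promised in the introduction, the exported rows reload cleanly into Python with the standard library alone; a quick check, with the three rows inlined here so it runs without re-crawling:

```python
import csv
import io

# The same rows msft-1d.csv contains, inlined for a standalone check.
exported = """symbol,date,open,high,low,close,volume
MSFT,2026-04-01,381.1,384.5,380.6,383.9,20311452
MSFT,2026-04-02,384.0,386.4,382.3,385.7,18744106
MSFT,2026-04-03,385.9,388.1,384.8,387.6,19180234
"""

rows = list(csv.DictReader(io.StringIO(exported)))
closes = [float(row["close"]) for row in rows]
print(len(rows), closes[-1])  # 3 387.6
```

Against the real file, replace io.StringIO(exported) with open("msft-1d.csv", newline="") and the rest is unchanged.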