Stock chart data drives candlestick charts, indicators, alerts, and backtests; consistent time-series bars keep those analytics repeatable and easier to troubleshoot.

Most market data providers expose an API endpoint that returns OHLC (often OHLCV) bars per symbol, interval, and date range. A Scrapy spider can request that endpoint, parse each bar from JSON, and emit one item per bar so the feed exporter writes clean rows into a chart-friendly format such as .csv.

Market data APIs differ in licensing, rate limits, symbol formats, and timestamp conventions (ISO dates vs epoch seconds or milliseconds). Conservative throttling reduces gaps and HTTP 429 responses, while careful field mapping avoids subtle errors like swapped timestamps or adjusted prices that change after splits and dividends.
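
Timestamp handling in particular benefits from one canonical form. A minimal sketch (the to_iso_date helper is illustrative, not part of the spider below) that normalizes ISO strings, epoch seconds, and epoch milliseconds to YYYY-MM-DD:

    from datetime import datetime, timezone

    def to_iso_date(value):
        """Normalize an ISO date string or epoch seconds/milliseconds to YYYY-MM-DD."""
        if isinstance(value, str):
            return value[:10]  # "2025-12-29" or "2025-12-29T00:00:00Z"
        ts = float(value)
        if ts > 1e11:  # anything this large is epoch milliseconds, not seconds
            ts /= 1000.0
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")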

Steps to get stock chart data with Scrapy:

  1. Fetch a sample OHLC response from the chart endpoint.
    $ curl -s 'http://api.example.net:8000/api/stock?symbol=EXMPL&interval=1d&start=2025-12-29&end=2025-12-31' | head -n 20
    {
      "symbol": "EXMPL",
      "prices": [
        {
          "date": "2025-12-29",
          "close": 128.4
        }
    ##### snipped #####

    Confirm the bar keys (such as date/close) and timestamp format before mapping fields in the spider.
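
    A stdlib-only Python check (a sketch; it reuses the sample URL and field names from above) prints the keys of the first bar so nothing is guessed during field mapping:

    import json
    from urllib.request import urlopen

    url = (
        "http://api.example.net:8000/api/stock"
        "?symbol=EXMPL&interval=1d&start=2025-12-29&end=2025-12-31"
    )
    with urlopen(url) as resp:  # stdlib only, no extra dependencies
        payload = json.load(resp)

    bars = payload.get("prices", [])
    print(sorted(bars[0]) if bars else "no bars returned")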

  2. Create a new Scrapy project for the chart spider.
    $ scrapy startproject stock_chart
    New Scrapy project 'stock_chart', using template directory '##### snipped #####', created in:
        /root/sg-work/stock_chart
    
    You can start your first spider with:
        cd stock_chart
        scrapy genspider example example.com
  3. Change into the project directory.
    $ cd stock_chart
  4. Generate a spider for the market data host.
    $ scrapy genspider chart api.example.net
    Created spider 'chart' using template 'basic' in module:
      stock_chart.spiders.chart
  5. Export an API token to the environment when authentication is required.
    $ export MARKET_API_TOKEN="replace-with-token"

    Using an environment variable avoids hard-coding credentials in the spider.

    Pasting tokens into a shell can leak them through history or logs on shared systems.

  6. Edit the spider to request chart data and emit one item per bar.
    stock_chart/spiders/chart.py
    import json
    import os
    import re
    from urllib.parse import urlencode
     
    import scrapy
     
     
    # Strict YYYY-MM-DD; rejects timestamps and other date formats up front.
    _DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
     
     
    class ChartSpider(scrapy.Spider):
        name = "chart"
        allowed_domains = ["api.example.net"]
        base_url = "http://api.example.net:8000/api/stock"
     
        def __init__(
            self,
            symbol="EXMPL",
            interval="1d",
            start="2025-12-29",
            end="2025-12-31",
            *args,
            **kwargs,
        ):
            super().__init__(*args, **kwargs)
            self.symbol = self._clean_symbol(symbol)
            self.interval = self._clean_interval(interval)
            self.start_date = self._clean_date(start, "start")
            self.end_date = self._clean_date(end, "end")
     
        def _clean_symbol(self, symbol):
            cleaned = ("" if symbol is None else str(symbol)).strip().upper()
            if not cleaned:
                raise ValueError("symbol must not be empty")
            return cleaned
     
        def _clean_interval(self, interval):
            cleaned = ("" if interval is None else str(interval)).strip()
            if not cleaned:
                raise ValueError("interval must not be empty")
            return cleaned
     
        def _clean_date(self, value, label):
            cleaned = ("" if value is None else str(value)).strip()
            if not _DATE_RE.match(cleaned):
                raise ValueError(f"{label} must be in YYYY-MM-DD format")
            return cleaned
     
        def start_requests(self):
            params = {
                # The endpoint expects the symbol alongside the date range
                # (see the curl example in step 1).
                "symbol": self.symbol,
                "interval": self.interval,
                "start": self.start_date,
                "end": self.end_date,
            }
            url = f"{self.base_url}?{urlencode(params)}"
            headers = {
                "Accept": "application/json",
                "User-Agent": "stock_chart (+http://app.internal.example:8000/)",
            }
            api_token = os.getenv("MARKET_API_TOKEN")
            if api_token:
                headers["Authorization"] = f"Bearer {api_token}"
            yield scrapy.Request(url=url, headers=headers, callback=self.parse)
     
        def parse(self, response):
            # Defensive: Scrapy's HttpErrorMiddleware normally filters non-2xx
            # responses before they reach this callback.
            if response.status >= 400:
                self.logger.error("Chart endpoint returned HTTP %s", response.status)
                return
     
            try:
                payload = json.loads(response.text)
            except json.JSONDecodeError:
                self.logger.error(
                    "Non-JSON response from chart endpoint (status=%s)", response.status
                )
                return
     
            prices = payload.get("prices", [])
            if not isinstance(prices, list):
                self.logger.error("Missing or invalid 'prices' array in chart payload")
                return
     
            symbol = payload.get("symbol") or self.symbol
            for bar in prices:
                if not isinstance(bar, dict):
                    continue
                yield {
                    "symbol": symbol,
                    "date": bar.get("date"),
                    "close": bar.get("close"),
                }

    Spider arguments symbol, interval, start, and end override defaults at runtime without code edits.
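
    If the endpoint also returns open, high, low, and volume (an assumption; the sample payload above shows only date and close), the item mapping extends naturally. A sketch of a fuller per-bar mapping:

    def bar_to_item(symbol, bar):
        """Map one OHLCV bar dict to a flat item row; the extra keys are assumptions."""
        return {
            "symbol": symbol,
            "date": bar.get("date"),
            "open": bar.get("open"),
            "high": bar.get("high"),
            "low": bar.get("low"),
            "close": bar.get("close"),
            "volume": bar.get("volume"),
        }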

  7. Set conservative throttling values in settings.py.
    stock_chart/settings.py
    DOWNLOAD_DELAY = 1.0
    CONCURRENT_REQUESTS_PER_DOMAIN = 2
     
    AUTOTHROTTLE_ENABLED = True
    AUTOTHROTTLE_START_DELAY = 1.0
    AUTOTHROTTLE_MAX_DELAY = 10.0
    AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

    Aggressive request rates commonly trigger HTTP 429 throttling or temporary blocks on market data APIs.
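
    Scrapy's retry middleware pairs well with these settings; recent Scrapy versions already include 429 in the default RETRY_HTTP_CODES, so the sketch below only makes that explicit:

    stock_chart/settings.py
    RETRY_ENABLED = True
    RETRY_TIMES = 3
    RETRY_HTTP_CODES = [429, 500, 502, 503, 504]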

  8. Run the spider to export chart data to a CSV file.
    $ scrapy crawl chart -a symbol=EXMPL -a interval=1d -a start=2025-12-29 -a end=2025-12-31 -O exmpl-close.csv
    ##### snipped #####
    [scrapy.extensions.feedexport] INFO: Stored csv feed (3 items) in: exmpl-close.csv
    [scrapy.core.engine] INFO: Closing spider (finished)
    {'downloader/request_count': 2, 'item_scraped_count': 3}

    Use -O to overwrite an existing output file, or -o to append.

  9. Verify the CSV output contains chart rows.
    $ head -n 5 exmpl-close.csv
    symbol,date,close
    EXMPL,2025-12-29,128.4
    EXMPL,2025-12-30,129.9
    EXMPL,2025-12-31,131.2
    $ wc -l exmpl-close.csv
    4 exmpl-close.csv
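
    For a stronger check than head and wc, a short pandas script (a sketch; pandas is an extra dependency not used elsewhere here) confirms the dates parse and the bars arrive in order:

    import pandas as pd

    df = pd.read_csv("exmpl-close.csv", parse_dates=["date"])
    assert df["date"].is_monotonic_increasing, "bars out of order"
    assert df["close"].notna().all(), "missing close values"
    print(df.tail())  # eyeball the most recent bars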