Scrapy automates web crawling and data extraction, making repeatable collection jobs practical for large sites, internal dashboards, or scheduled ETL pipelines.

Ubuntu and Debian ship Scrapy as a system package (python3-scrapy) that installs via apt, pulls in required dependencies (parsers, TLS libraries, and the Twisted networking stack), and places the scrapy command on the standard system PATH.

Repository-packaged versions track the distribution release, so the available Scrapy version may lag behind upstream. Prefer the pip method inside a virtual environment when newer releases or isolated dependencies are required, and avoid mixing pip installs into the system Python, which can cause package conflicts.

Steps to install Scrapy on Ubuntu or Debian:

  1. Open a terminal with sudo privileges.
  2. Update the apt package index.
    $ sudo apt update
    Get:1 http://ports.ubuntu.com/ubuntu-ports noble InRelease [256 kB]
    Get:2 http://ports.ubuntu.com/ubuntu-ports noble-updates InRelease [126 kB]
    Get:3 http://ports.ubuntu.com/ubuntu-ports noble-backports InRelease [126 kB]
    Get:4 http://ports.ubuntu.com/ubuntu-ports noble-security InRelease [126 kB]
    ##### snipped #####
    Reading package lists...
    Building dependency tree...
    Reading state information...
    9 packages can be upgraded. Run 'apt list --upgradable' to see them.
  3. Install the python3-scrapy package.
    $ sudo apt install --assume-yes python3-scrapy
    Reading package lists...
    Building dependency tree...
    Reading state information...
    The following additional packages will be installed:
      binutils binutils-aarch64-linux-gnu binutils-common blt ca-certificates cpp
    ##### snipped #####
    The following NEW packages will be installed:
      python3-scrapy python3-attr python3-automat python3-constantly
    ##### snipped #####
    0 upgraded, 211 newly installed, 0 to remove and 9 not upgraded.
    Need to get 158 MB of archives.
    After this operation, 620 MB of additional disk space will be used.
    ##### snipped #####
  4. Run scrapy without arguments to confirm the command is available.
    $ scrapy
    Scrapy 2.11.1 - no active project
    
    Usage:
      scrapy <command> [options] [args]
    
    Available commands:
      bench         Run quick benchmark test
      fetch         Fetch a URL using the Scrapy downloader
      genspider     Generate new spider using pre-defined templates
      runspider     Run a self-contained spider (without creating a project)
      settings      Get settings values
      shell         Interactive scraping console
      startproject  Create new project
      version       Print Scrapy version
      view          Open URL in browser, as seen by Scrapy
    
      [ more ]      More commands available when run from project directory
    
    Use "scrapy <command> -h" to see more info about a command