Scrapy is a Python-based web scraping and crawling framework that is generally available as a pip package. Some Linux distributions, such as Ubuntu and Debian, also carry Scrapy in their default package repositories, so it can be installed via apt.
The Ubuntu version of Scrapy is more tightly integrated with the operating system: it installs to the default application path, and you don't need additional tools such as pip just to have Scrapy installed.
The installed version is, however, normally tied to the distribution release, so you won't get the latest version of Scrapy unless you also upgrade your Ubuntu or Debian version.
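Since the apt package follows the distribution release, it can help to check whether the installed Scrapy is recent enough for a given project. The following is a minimal sketch, not part of the original guide: the helper names (parse_version, installed_scrapy_version, meets_minimum) and the minimum version "2.0" are hypothetical and chosen only for illustration. It relies on the scrapy.__version__ attribute and does nothing harmful if Scrapy is absent.

```python
import importlib.util
from itertools import takewhile


def parse_version(text):
    """Turn a dotted version such as '1.7.3' into a tuple of integers."""
    numbers = []
    for piece in text.strip().split("."):
        # Keep only the leading digits of each piece (so '0rc1' becomes 0).
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        numbers.append(int(digits))
    return tuple(numbers)


def installed_scrapy_version():
    """Return the Scrapy version visible to this interpreter, or None."""
    if importlib.util.find_spec("scrapy") is None:
        return None
    import scrapy
    return scrapy.__version__


def meets_minimum(installed, minimum):
    """Compare two dotted versions after padding them to the same length."""
    have, want = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have >= want


if __name__ == "__main__":
    MINIMUM = "2.0"  # hypothetical requirement, for illustration only
    version = installed_scrapy_version()
    if version is None:
        print("Scrapy is not installed for this Python interpreter")
    elif meets_minimum(version, MINIMUM):
        print(f"Scrapy {version} satisfies the minimum of {MINIMUM}")
    else:
        print(f"Scrapy {version} is older than {MINIMUM}")
```

If the check reports a version that is too old, installing Scrapy with pip instead of apt is the usual way to get a newer release without upgrading the distribution.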
Scrapy can be installed on Ubuntu and Debian using apt at the terminal.
Update apt's package list from the repository.
$ sudo apt update
[sudo] password for user:
Hit:1 http://jp.archive.ubuntu.com/ubuntu focal InRelease
Get:2 http://jp.archive.ubuntu.com/ubuntu focal-updates InRelease [107 kB]
Get:3 http://jp.archive.ubuntu.com/ubuntu focal-backports InRelease [98.3 kB]
Get:4 http://jp.archive.ubuntu.com/ubuntu focal-security InRelease [107 kB]
Fetched 312 kB in 7s (47.5 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
All packages are up to date.
Install the python3-scrapy package using apt.
$ sudo apt install --assume-yes python3-scrapy
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  ipython3 libimagequant0 libjbig0 libjpeg-turbo8 libjpeg8 liblcms2-2
  libmysqlclient21 libtiff5 libwebp6 libwebpdemux2 libwebpmux3 mysql-common
  python3-backcall python3-boto python3-bs4 python3-cssselect
  python3-decorator python3-html5lib python3-ipython python3-ipython-genutils
  python3-jedi python3-libxml2 python3-lxml python3-mysqldb python3-olefile
  python3-parsel python3-parso python3-pexpect python3-pickleshare python3-pil
  python3-prompt-toolkit python3-ptyprocess python3-pydispatch
  python3-pygments python3-queuelib python3-soupsieve python3-traitlets
  python3-w3lib python3-wcwidth python3-webencodings
Suggested packages:
  liblcms2-utils python3-genshi python-ipython-doc python3-lxml-dbg
  python-lxml-doc default-mysql-server | virtual-mysql-server
  python3-mysqldb-dbg python-pexpect-doc python-pil-doc python3-pil-dbg
  python-pydispatch-doc python-pygments-doc ttf-bitstream-vera
  python-scrapy-doc
The following NEW packages will be installed:
  ipython3 libimagequant0 libjbig0 libjpeg-turbo8 libjpeg8 liblcms2-2
  libmysqlclient21 libtiff5 libwebp6 libwebpdemux2 libwebpmux3 mysql-common
  python3-backcall python3-boto python3-bs4 python3-cssselect
  python3-decorator python3-html5lib python3-ipython python3-ipython-genutils
  python3-jedi python3-libxml2 python3-lxml python3-mysqldb python3-olefile
  python3-parsel python3-parso python3-pexpect python3-pickleshare python3-pil
  python3-prompt-toolkit python3-ptyprocess python3-pydispatch
  python3-pygments python3-queuelib python3-scrapy python3-soupsieve
  python3-traitlets python3-w3lib python3-wcwidth python3-webencodings
0 upgraded, 41 newly installed, 0 to remove and 0 not upgraded.
Need to get 7,080 kB of archives.
After this operation, 37.6 MB of additional disk space will be used.
##### snipped
Verify that Scrapy is installed by running the scrapy command at the terminal.
$ scrapy
Scrapy 1.7.3 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command
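The same verification can be scripted, for example as part of a provisioning check. This is a rough sketch under stated assumptions: the function names (extract_version, check_scrapy_command) are hypothetical, and it assumes that the scrapy version subcommand listed in the help output prints a line containing "Scrapy" followed by the version number, as the banner does.

```python
import re
import shutil
import subprocess


def extract_version(banner):
    """Pull the dotted version out of text like 'Scrapy 1.7.3 - no active project'."""
    match = re.search(r"Scrapy\s+(\d+(?:\.\d+)*)", banner)
    return match.group(1) if match else None


def check_scrapy_command():
    """Return the version reported by the scrapy command, or None if unavailable."""
    executable = shutil.which("scrapy")
    if executable is None:
        # The command is not on PATH, so the package is probably not installed.
        return None
    result = subprocess.run(
        [executable, "version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return extract_version(result.stdout)


if __name__ == "__main__":
    version = check_scrapy_command()
    if version is None:
        print("The scrapy command was not found or did not report a version")
    else:
        print(f"The scrapy command reports version {version}")
```

Looking the command up with shutil.which first keeps the script from raising an error on machines where the package has not been installed yet.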