The user agent is a string that browsers use to identify themselves to the web server. It is sent in the request headers of every HTTP request. By default, Scrapy identifies itself with a string of the form Scrapy/VERSION (+https://scrapy.org).
The web server can then be configured to respond differently depending on the user agent string. A request from a mobile device, for example, could be served mobile-specific content. However, some web servers are configured to block web scraping traffic altogether, which is a problem when using Scrapy.
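To make the server side concrete, here is a minimal sketch of how a server might branch on the user agent string. The function name, the categories, and the token lists are all assumptions made for illustration; real servers apply their own, usually more elaborate, rules.

```python
def classify_user_agent(user_agent: str) -> str:
    """Return a coarse category for a User-Agent header value.

    Hypothetical rules for illustration only; not taken from any real server.
    """
    ua = user_agent.lower()
    # Assumed block list of tokens that commonly appear in scraper user agents.
    if any(token in ua for token in ("scrapy", "python-requests", "curl")):
        return "blocked"
    # Assumed tokens that hint at a mobile browser.
    if any(token in ua for token in ("mobile", "android", "iphone")):
        return "mobile"
    return "desktop"


if __name__ == "__main__":
    samples = [
        "Scrapy/2.11.0 (+https://scrapy.org)",
        "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 "
        "Mobile/15E148 Safari/604.1",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/61.0.3163.100 Safari/537.36",
    ]
    for ua in samples:
        print(classify_user_agent(ua), "<-", ua)
```

A server using rules like these would reject the default Scrapy string but treat the same request as ordinary desktop traffic once the user agent is changed.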
One way to avoid the issue is to change Scrapy's user agent string so that it identifies itself as a regular browser.
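The user agent is simply a request header chosen by the client, which is why it can be changed freely. The sketch below shows this with Python's standard library rather than Scrapy; the build_request helper is a hypothetical name used only for this example, and no network request is actually sent.

```python
from urllib.request import Request

# Browser-style user agent string (the same one used in settings.py in this article).
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/61.0.3163.100 Safari/537.36"
)


def build_request(url: str, user_agent: str) -> Request:
    """Build (but do not send) a request carrying a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("https://www.example.com", BROWSER_UA)
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Scrapy does the equivalent internally: whatever value the USER_AGENT setting holds is what ends up in this header.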
$ scrapy fetch https://www.example.com
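To override the user agent for a single run, pass the USER_AGENT setting on the command line with the --set option. This also works with scrapy shell and other Scrapy commands.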
$ scrapy fetch https://www.example.com --set=USER_AGENT="custom user agent string"
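To make the change permanent for a project, set the USER_AGENT option in the project's settings.py file.
$ vi scrapyproject/settings.py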
# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'scraper (+http://www.yourdomain.com)'
USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'