Changing the User-Agent header in Scrapy changes the identifier sent with each outgoing request, which helps a spider match the browser profile, device class, or client signature that a target site expects to see. That matters when the site varies content by client type or blocks the default crawler signature before the request ever reaches the normal page flow.
Scrapy still defaults to a User-Agent string in the form Scrapy/<version> (+https://scrapy.org) through UserAgentMiddleware. Setting USER_AGENT in the project changes that default for spiders that inherit project settings, and scrapy fetch is useful for verification because it downloads a URL the way the spider would download it.
Changing only the User-Agent header does not bypass rate limits, JavaScript challenges, or more advanced anti-bot checks, and a mobile browser string paired with desktop-only headers can still look suspicious. Test the new value with one request first, keep the rest of the request profile consistent, and stay within the target site's published access policy.
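Keeping the rest of the profile consistent usually means sending the companion headers a real browser would send alongside the new User-Agent. The snippet below is a rough sketch for settings.py; DEFAULT_REQUEST_HEADERS is a standard Scrapy setting, but the header values shown are illustrative rather than a capture from a real browser.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
DEFAULT_REQUEST_HEADERS = {
    # Align these values with the browser the User-Agent string claims to be.
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}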
Related: How to set request headers in Scrapy
Related: How to use an HTTP proxy in Scrapy
Steps to change the user agent for Scrapy spiders:
- Change to the Scrapy project directory before reading or changing project settings.
$ cd /srv/catalog_demo
- Fetch a header echo endpoint to confirm the default User-Agent that Scrapy is sending now.
$ scrapy fetch --nolog http://app.internal.example:8000/headers
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Scrapy/2.15.0 (+https://scrapy.org)",
    "Accept-Encoding": "gzip, deflate",
    "Host": "app.internal.example:8000"
  }
}
- Choose the browser User-Agent string that matches the client profile you want the site to see.
Use the exact browser or device family you need for the target flow, because a desktop string and a mobile string can trigger different layouts, redirects, or anti-bot checks.
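For illustration, a desktop profile and a mobile profile differ across the whole string, not just one token. Both examples below are representative formats, not values captured from specific browser installs.
Desktop Chrome: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36
Mobile Safari: Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1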
- Test the new value on one command run before making it the project default.
$ scrapy fetch --nolog http://app.internal.example:8000/headers --set USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Accept-Encoding": "gzip, deflate",
    "Host": "app.internal.example:8000"
  }
}
Command-line settings have the highest precedence for that one run only. Related: How to override Scrapy settings from the command line
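The same one-off override also works on a full crawl, and -s is the short form of --set; the spider name below is a placeholder for whatever spider the project defines.
$ scrapy crawl catalog -s USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"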
- Open the project settings file in a text editor.
$ vi catalog_demo/settings.py
- Find the scaffolded USER_AGENT line or add the setting if the file no longer contains it.
#USER_AGENT = "catalog_demo (+http://www.yourdomain.com)"
- Set USER_AGENT in settings.py to the value that every spider in the project should inherit by default.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
Use the project setting for a shared default, and keep request-specific exceptions in request headers or spider-specific overrides instead of rewriting the whole project around one target.
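A request-specific exception can be expressed directly on the Request, because UserAgentMiddleware only fills in the header when it is missing. The sketch below uses a hypothetical spider and the same header echo endpoint as the examples above.
# Sketch only: per-request User-Agent override inside a hypothetical spider.
import scrapy


class CatalogSpider(scrapy.Spider):
    name = "catalog"

    def start_requests(self):
        # This request inherits the project-wide USER_AGENT from settings.py.
        yield scrapy.Request("http://app.internal.example:8000/headers")
        # This request carries its own User-Agent header; the middleware only
        # sets the header when absent, so the explicit value wins here.
        # dont_filter stops the duplicate filter from dropping the repeated URL.
        yield scrapy.Request(
            "http://app.internal.example:8000/headers",
            headers={"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Mobile/15E148 Safari/604.1"},
            dont_filter=True,
        )

    def parse(self, response):
        self.logger.info(response.request.headers.get("User-Agent"))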
Invalid Python syntax in settings.py prevents Scrapy commands, spiders, and scheduled runs from starting.
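A quick syntax check from the project directory catches this before the next run; py_compile only verifies that the file parses, not that the values make sense.
$ python -m py_compile catalog_demo/settings.py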
- Read the resolved setting from the project directory to confirm Scrapy is loading the new value.
$ scrapy settings --get USER_AGENT
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36
If one spider still sends a different value, inspect that spider for custom_settings = {"USER_AGENT": "..."} or a spider-level user_agent attribute, because spider-level overrides outrank the project default.
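A spider-level override looks like the sketch below; the spider name, URL, and value are illustrative, and removing the custom_settings entry makes the spider fall back to the project-wide USER_AGENT.
# Sketch only: a hypothetical spider that pins its own User-Agent.
import scrapy


class LegacyPortalSpider(scrapy.Spider):
    name = "legacy_portal"
    # custom_settings outranks settings.py for this spider only.
    custom_settings = {
        "USER_AGENT": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    }
    start_urls = ["http://app.internal.example:8000/headers"]

    def parse(self, response):
        self.logger.info(response.text)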
- Fetch the header echo endpoint again to verify the updated User-Agent leaves the crawler.
$ scrapy fetch --nolog http://app.internal.example:8000/headers
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Accept-Encoding": "gzip, deflate",
    "Host": "app.internal.example:8000"
  }
}
