How to change the user agent for Scrapy spiders

Changing the User-Agent header lets a Scrapy spider present itself as a regular browser rather than with Scrapy's default crawler signature. This helps avoid simple user-agent-based blocks and can trigger the same layout or device-specific responses that real clients receive.

A web server receives the user-agent string on every HTTP request and may vary content or apply filters based on it. Scrapy sends a default value (Scrapy/<version> (+https://scrapy.org)) via its UserAgentMiddleware unless a request provides its own User-Agent header.

User-agent spoofing does not bypass more advanced anti-bot controls, and a mismatched header set (for example, a mobile user agent with desktop-only headers) can still be flagged. Keep crawler behavior within site policy, and prefer testing the string with a single request before making it the project default.

Steps to change the user agent for Scrapy spiders:

Fetch an endpoint that echoes the received user agent to confirm Scrapy's default header.

```
$ scrapy fetch --nolog http://app.internal.example:8000/headers
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Scrapy/2.11.1 (+https://scrapy.org)",
    "Accept-Encoding": "gzip, deflate, br",
    "Host": "app.internal.example:8000"
  }
}
```
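
If no echo endpoint is available, a small local one can stand in for it. The following is a rough sketch using only the Python standard library. The names `HeaderEchoHandler` and `make_server`, and the default host and port, are assumptions made for illustration and are not part of Scrapy or of the endpoint shown above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Reply to any GET request with the received headers as JSON."""

    def do_GET(self):
        payload = {"headers": dict(self.headers.items())}
        body = json.dumps(payload, indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging to keep the terminal clean.
        pass


def make_server(host="127.0.0.1", port=8000):
    # Host and port are assumed defaults; port 0 asks the OS for a free port.
    return HTTPServer((host, port), HeaderEchoHandler)
```

Calling `make_server().serve_forever()` starts the server, after which `scrapy fetch --nolog http://127.0.0.1:8000/headers` should print the headers Scrapy sent.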

Choose the browser user agent string to send in requests.

Override the USER_AGENT setting for a single command run using the --set option.

```
$ scrapy fetch --nolog http://app.internal.example:8000/headers --set=USER_AGENT="Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148"
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148",
    "Accept-Encoding": "gzip, deflate, br",
    "Host": "app.internal.example:8000"
  }
}
```

The override applies only to this command invocation.

Open the Scrapy project settings file in a text editor.
```
$ vi simplifiedguide/settings.py
```

Locate the USER_AGENT setting line in settings.py.

```
# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = "simplifiedguide (+http://app.internal.example)"
```

Set the USER_AGENT value in settings.py to permanently change the user agent for the project.
```
USER_AGENT = 'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148'
```
Invalid Python syntax in settings.py prevents spiders and scheduled jobs from starting.
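
One way to catch such a mistake before running a spider is to parse the file without executing it. The sketch below uses the standard library's `ast` module. The helper names `settings_syntax_error` and `check_settings_file` are made up for this example, and the check covers syntax only, not whether the setting values are sensible.

```python
import ast
from pathlib import Path


def settings_syntax_error(source, filename="settings.py"):
    """Return None if the source parses as Python, else a short error message."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None


def check_settings_file(path):
    # Read and parse the file without importing or executing it.
    source = Path(path).read_text(encoding="utf-8")
    return settings_syntax_error(source, filename=path)
```

For the project in this guide, `check_settings_file("simplifiedguide/settings.py")` would return `None` when the file parses cleanly and a message with the offending line number otherwise.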

Fetch the echo endpoint again to verify the project default user agent is in effect.

```
$ scrapy fetch --nolog http://app.internal.example:8000/headers
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en",
    "User-Agent": "Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148",
    "Accept-Encoding": "gzip, deflate, br",
    "Host": "app.internal.example:8000"
  }
}
```

Author: Mohd Shakir Zakaria
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.