Benchmarking Nginx with wrk turns tuning work into numbers that can be compared across changes, making regressions obvious and improvements defensible. A repeatable baseline is especially useful when adjusting worker counts, buffers, caching, TLS settings, or upstream timeouts.

wrk is a multi-threaded load generator for HTTP/1.1 that keeps many connections open and issues requests as fast as possible, reporting throughput and average latency, plus a full percentile distribution when --latency is enabled. Results are sensitive to request shape (path, headers, body size), connection reuse (keep-alive), and network conditions, so a single well-defined URL and a fixed option set matter more than absolute numbers.

Synthetic load can overwhelm a server and also mislead when the load generator becomes CPU- or network-bound, so runs belong in controlled test windows with server-side metrics visible. Prefer a dedicated client host close to the server, keep the target endpoint stable (avoid redirects), and treat each run as a comparison between configurations rather than a promise of production capacity.
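Pinning the option set once helps keep runs comparable across configuration changes. A minimal sketch, wrapping wrk in a small shell function (the bench name and the wrk-*.txt naming are illustrative, not part of wrk):

```shell
# Fix the wrk option set once so every run measures the same workload.
# bench() is a hypothetical helper; wrk itself provides no such wrapper.
WRK_OPTS="--threads 2 --connections 50 --duration 30s --timeout 10s --latency"

bench() {
  # $1 = target URL, $2 = label for the saved output file
  # WRK_OPTS is intentionally unquoted so it word-splits into options.
  wrk $WRK_OPTS "$1" | tee "wrk-$2.txt"
}

# Usage: bench http://example.com/ baseline
```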

Steps to benchmark Nginx with wrk:

  1. Confirm the benchmark URL returns 200 with no redirects.
    $ curl --head http://example.com/
    HTTP/1.1 200 OK
    Server: nginx
    Date: Sun, 14 Dec 2025 12:00:00 GMT
    Content-Type: text/html
    Content-Length: 648
    Connection: keep-alive

    A 301/308 redirect adds an extra request per hit, so benchmark the final URL (often by adding a trailing /).

  2. Record the Nginx version plus the exact configuration under test before capturing a baseline.
    $ nginx -v 2>&1
    nginx version: nginx/1.24.0

    Capturing effective config with nginx -T improves repeatability, but the output may include sensitive paths and upstream details.
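One lightweight way to tie each saved result to the configuration it measured is to hash the effective config rather than store it verbatim. A sketch, using a placeholder string since nginx -T needs a live install (in practice: sudo nginx -T 2>/dev/null | sha256sum):

```shell
# Fingerprint the configuration under test (sketch). The literal below
# stands in for real `sudo nginx -T` output, which needs a live install.
config_dump='worker_processes auto;'   # placeholder for nginx -T output
fingerprint=$(printf '%s\n' "$config_dump" | sha256sum | awk '{print $1}')
echo "config fingerprint: $fingerprint"
```

Matching fingerprints across two result files prove the config did not drift between runs, without exposing the sensitive contents.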

  3. Run a short warm-up from the load generator to stabilize caches before measuring.
    $ wrk --threads 2 --connections 50 --duration 10s --timeout 10s http://example.com/
    Running 10s test @ http://example.com/
      2 threads and 50 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    11.42ms    8.77ms  96.10ms   86.55%
        Req/Sec     2.06k   142.18     2.41k    88.00%
      41232 requests in 10.02s, 49.34MB read
    Requests/sec:   4116.52
    Transfer/sec:      4.92MB

    If the client CPU is saturated, results reflect the load generator rather than Nginx.
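A quick way to check for this on a Linux client is to compare the load average against the core count. A rough heuristic sketch, not a wrk feature:

```shell
# Rough client-side headroom check (Linux): if the 1-minute load average
# approaches the core count, the generator itself is likely the bottleneck.
cores=$(nproc)
load=$(awk '{print $1}' /proc/loadavg)
if awk -v l="$load" -v c="$cores" 'BEGIN { exit !(l < c * 0.8) }'; then
  echo "client has headroom"
else
  echo "client near saturation: treat results as suspect"
fi
```

The 0.8 threshold is arbitrary; the point is to check the client during the run, not after it.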

  4. Run a baseline benchmark with --latency enabled while saving the output for comparison.
    $ wrk --threads 2 --connections 50 --duration 30s --timeout 10s --latency http://example.com/ | tee wrk-baseline.txt
    Running 30s test @ http://example.com/
      2 threads and 50 connections
      Thread Stats   Avg      Stdev     Max   +/- Stdev
        Latency    12.08ms   10.91ms 189.82ms   92.41%
        Req/Sec     2.12k   155.23     2.55k    84.00%
      Latency Distribution
         50%   10.21ms
         75%   14.96ms
         90%   22.88ms
         99%   94.73ms
      127456 requests in 30.10s, 152.40MB read
    Requests/sec:   4234.64
    Transfer/sec:      5.06MB

    Do not run load tests against production endpoints without capacity planning, approval, and a rollback plan.

  5. Query stub_status during the run to spot saturation signals such as climbing active connections or a growing gap between accepts and handled (dropped connections).
    $ curl -s http://127.0.0.1/nginx_status
    Active connections: 12
    server accepts handled requests
      108930 108930 217860
    Reading: 0 Writing: 4 Waiting: 8

    For interactive monitoring, auto-refresh the status page:
    $ watch -n 1 'curl -s http://127.0.0.1/nginx_status'
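The counters can also be scraped with awk for logging alongside each run. A sketch parsing a captured sample (mirroring the output above); the accepts-minus-handled gap is nonzero exactly when Nginx dropped connections:

```shell
# Parse stub_status output (sketch). A nonzero accepts-handled gap means
# Nginx dropped connections, a strong saturation signal under load.
status='Active connections: 12
server accepts handled requests
 108930 108930 217860
Reading: 0 Writing: 4 Waiting: 8'

active=$(printf '%s\n' "$status" | awk '/Active connections/ {print $3}')
dropped=$(printf '%s\n' "$status" | awk 'NR == 3 {print $1 - $2}')
echo "active=$active dropped=$dropped"
```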

  6. Check /var/log/nginx/error.log immediately after the run for timeouts, upstream errors, or worker crashes.
    $ sudo tail -n 50 /var/log/nginx/error.log
    ##### snipped #####

    Latency improvements are not meaningful if the run introduces 5xx errors or upstream failures.
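Counting entries by severity makes this check scriptable. A sketch over a single sample line; in practice, pipe the tail of /var/log/nginx/error.log through the same awk:

```shell
# Tally error-log lines by severity (sketch). The sample line stands in
# for real /var/log/nginx/error.log content.
log_sample='2025/12/14 12:00:31 [error] 1234#0: *567 upstream timed out'
severity_counts=$(printf '%s\n' "$log_sample" | awk '
  match($0, /\[(emerg|alert|crit|error|warn)\]/) {
    counts[substr($0, RSTART + 1, RLENGTH - 2)]++
  }
  END { for (s in counts) print s, counts[s] }')
echo "$severity_counts"
```

A clean benchmark run should add nothing at [error] severity or above.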

  7. Repeat the benchmark after each tuning change while keeping the URL and wrk options identical.
    $ wrk --threads 2 --connections 50 --duration 30s --timeout 10s --latency http://example.com/ | tee wrk-after-change.txt

    Change only one variable per comparison run; otherwise the cause of a difference is ambiguous.

  8. Increase --connections in small steps to find the point where latency spikes or socket timeouts appear.
    $ wrk --threads 2 --connections 200 --duration 30s --timeout 10s --latency http://example.com/ | tee wrk-c200.txt

    Driving the server into sustained timeouts can trigger autoscaling, upstream circuit breakers, or cascading failures in shared environments.
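The ramp can be scripted as a loop. This sketch prints the commands instead of running them (remove the echo to execute, and pause between steps so the server drains); the step sizes are illustrative:

```shell
# Connection ramp (dry run): print one wrk command per step. Remove the
# echo to actually run them; the step sizes here are illustrative.
for c in 50 100 200 400; do
  echo "wrk --threads 2 --connections $c --duration 30s --timeout 10s --latency http://example.com/ | tee wrk-c$c.txt"
done
```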

  9. Compare Requests/sec and percentile latency across saved results to confirm the desired direction of change.
    $ grep -E "Requests/sec:|50%|90%|99%" -n wrk-*.txt
    wrk-after-change.txt:9:     50%   9.88ms
    wrk-after-change.txt:11:     90%  20.41ms
    wrk-after-change.txt:12:     99%  88.02ms
    wrk-after-change.txt:16:Requests/sec:   4410.12
    ##### snipped #####
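Turning the comparison into a percentage makes the direction and size of a change unambiguous. A sketch computing it from two Requests/sec lines; the literals stand in for lines grepped out of wrk-baseline.txt and wrk-after-change.txt:

```shell
# Percent change in throughput between two saved runs (sketch). The two
# literals stand in for lines taken from the saved wrk output files.
baseline='Requests/sec:   4234.64'
after='Requests/sec:   4410.12'

rps() { printf '%s\n' "$1" | awk '/Requests\/sec:/ {print $2}'; }

change=$(awk -v a="$(rps "$baseline")" -v b="$(rps "$after")" \
  'BEGIN { printf "%.1f", (b - a) / a * 100 }')
echo "throughput change: ${change}%"
```

The same extraction works for the percentile lines, which matter more than the mean when tail latency is the tuning target.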