Optimizing Nginx performance keeps latency stable and increases throughput by cutting per-request overhead, reusing connections efficiently, and avoiding repeated work during traffic spikes.
Nginx uses an event-driven worker model where each worker can serve many concurrent connections, so throughput is shaped by worker concurrency, keepalive reuse, upstream behavior, and protocol choices like HTTP/2 multiplexing or HTTP/3 over QUIC. When HTTPS is involved, TLS session reuse reduces handshake cost, while compression and caching reduce bytes-on-the-wire and upstream work.
Over-tuning can move bottlenecks instead of removing them: higher concurrency increases memory and file descriptor pressure, unsafe caching can leak personalized responses, and aggressive compression can trade bandwidth savings for CPU contention. Changes are safest when applied one at a time with repeatable benchmarks and metrics visible during the test window.
Related: How to benchmark Nginx with wrk
Related: How to enable the Nginx stub_status page
Steps to optimize Nginx web server performance:
- Capture a throughput and latency baseline with wrk using fixed options.
$ wrk -t2 -c50 -d30s http://127.0.0.1/
Running 30s test @ http://127.0.0.1/
  2 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   140.54us  319.28us  18.38ms   98.18%
    Req/Sec   174.37k     9.22k  193.56k    78.70%
  10423925 requests in 30.10s, 8.60GB read
Requests/sec: 346313.31
Transfer/sec:    292.61MB
Keep URL, concurrency, and duration unchanged across comparison runs.
- Capture stub_status or equivalent connection metrics during each test run.
$ curl -s http://127.0.0.1/nginx_status
Active connections: 1
server accepts handled requests
 10453 10453 10423957
Reading: 0 Writing: 1 Waiting: 0
Expose status endpoints to trusted networks only.
- Enable gzip or Brotli compression for text assets.
Compression helps most for HTML, CSS, JSON, and JavaScript responses.
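A minimal gzip sketch for the `http {}` block; the compression level, size threshold, and MIME list are illustrative starting points, not required values (`text/html` is always compressed by default and should not be listed):

```nginx
# http {} context — illustrative gzip settings
gzip on;
gzip_comp_level 5;        # levels above ~6 cost noticeably more CPU for small gains
gzip_min_length 256;      # skip tiny responses where headers dominate
gzip_vary on;             # emit Vary: Accept-Encoding for caches
gzip_types text/css text/plain application/json application/javascript application/xml image/svg+xml;
```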
- Verify Content-Encoding negotiation on a representative text endpoint.
$ curl -I --silent -H 'Accept-Encoding: gzip' http://127.0.0.1/ | grep -i '^content-encoding:'
Content-Encoding: gzip
No Content-Encoding header usually means no compression was applied or the content was already compressed.
- Enable HTTP/2 on HTTPS listeners where supported.
HTTP/2 reduces socket churn by multiplexing many requests over a single TCP connection.
Related: How to enable HTTP/2 in Nginx
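A sketch of an HTTP/2-enabled server block, assuming placeholder certificate paths; the `listen ... http2` form matches the Nginx 1.24 build shown in the sample output, while 1.25.1 and later prefer a separate `http2 on;` directive:

```nginx
server {
    listen 443 ssl http2;                        # Nginx 1.25.1+: use "listen 443 ssl;" plus "http2 on;"
    server_name example.com;                     # placeholder
    ssl_certificate     /etc/ssl/example.crt;    # placeholder paths
    ssl_certificate_key /etc/ssl/example.key;
}
```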
- Confirm HTTP/2 negotiation with curl.
$ curl -I --http2 -k --silent https://127.0.0.1/
HTTP/2 200
server: nginx/1.24.0 (Ubuntu)
date: Mon, 29 Dec 2025 22:13:02 GMT
content-type: text/html
content-length: 10671
last-modified: Sun, 28 Dec 2025 06:15:52 GMT
etag: "6950cb18-29af"
accept-ranges: bytes
- Enable HTTP/3 only when QUIC support is available end-to-end.
HTTP/3 depends on the Nginx build, TLS library support, and UDP reachability.
Related: How to enable HTTP/3 in Nginx
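A hedged sketch of an HTTP/3 server block, assuming a QUIC-capable Nginx build and placeholder certificate paths; the TCP listener is kept so clients can fall back to HTTP/2 or HTTP/1.1:

```nginx
server {
    listen 443 quic reuseport;                  # UDP listener for HTTP/3 (QUIC-enabled build only)
    listen 443 ssl http2;                       # keep TCP listener for fallback
    ssl_certificate     /etc/ssl/example.crt;   # placeholder paths
    ssl_certificate_key /etc/ssl/example.key;
    add_header Alt-Svc 'h3=":443"; ma=86400';   # advertise HTTP/3 to returning clients
}
```

Remember to allow UDP 443 through any firewalls between clients and the server.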
- Confirm HTTP/3 negotiation with a QUIC-capable curl build.
$ curl -I --http3 https://example.com/
HTTP/3 200
server: nginx
##### snipped #####
If testing fails with a missing-feature error, use a curl build compiled with HTTP/3 support.
- Enable TLS session caching for HTTPS traffic.
Session reuse reduces CPU spent on repeated handshakes from short-lived clients.
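A minimal session-cache sketch for the `http {}` or `server {}` context; sizes and timeouts are illustrative (1 MB of shared cache holds roughly 4,000 sessions):

```nginx
ssl_session_cache shared:SSL:10m;   # shared across workers, ~40k sessions at 10 MB
ssl_session_timeout 1h;             # how long cached sessions stay reusable
```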
- Tune keepalive settings to reduce connection churn under load.
Overly large keepalive pools can consume worker_connections and memory with idle sockets.
Related: How to tune keepalive in Nginx
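A keepalive sketch for the `http {}` block; both values are illustrative and should be sized against measured connection churn and idle-socket memory:

```nginx
keepalive_timeout 30s;       # close idle client connections after 30 seconds
keepalive_requests 1000;     # requests served per connection before forcing a close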
- Tune reverse-proxy behavior for proxied applications.
Upstream latency, buffering, and connection reuse often dominate end-to-end response time.
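A reverse-proxy sketch showing upstream connection reuse; the upstream name and backend address are hypothetical. `proxy_http_version 1.1` and a cleared `Connection` header are required for upstream keepalive to take effect:

```nginx
upstream app_backend {                        # hypothetical upstream name
    server 127.0.0.1:8080;                    # placeholder backend
    keepalive 32;                             # idle keepalive connections kept per worker
}

server {
    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";       # clear "Connection: close" so reuse works
        proxy_buffering on;                   # free the upstream quickly for slow clients
    }
}
```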
- Enable open_file_cache for high-volume static file serving.
Open file cache is most effective when serving many small files from local disk.
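An `open_file_cache` sketch for the `http {}` block; the cache sizes and timers are illustrative and should reflect how often the file set changes on disk:

```nginx
open_file_cache max=10000 inactive=30s;   # cache descriptors for up to 10k files
open_file_cache_valid 60s;                # revalidate cached entries every 60 seconds
open_file_cache_min_uses 2;               # only cache files requested at least twice
open_file_cache_errors on;                # also cache lookup errors (e.g. 404s)
```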
- Enable caching only for responses that are safe to share, with clear invalidation.
Caching personalized or authentication-dependent content can leak data between users.
Related: How to enable caching in Nginx
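A proxy-cache sketch limited to a shareable path; the zone name, cache path, backend address, and bypass conditions are all assumptions to adapt. The `X-Cache-Status` header helps confirm hits during validation:

```nginx
# http {} context — path and sizes are illustrative
proxy_cache_path /var/cache/nginx/app levels=1:2 keys_zone=app_cache:10m max_size=1g inactive=10m;

server {
    location /public/ {                                          # hypothetical shareable path
        proxy_pass http://127.0.0.1:8080;
        proxy_cache app_cache;
        proxy_cache_valid 200 301 5m;
        proxy_cache_bypass $http_authorization $cookie_session;  # skip cache for authenticated requests
        add_header X-Cache-Status $upstream_cache_status;        # HIT/MISS/BYPASS for verification
    }
}
```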
- Enable microcaching only for upstream responses that tolerate short staleness.
Even sub-second microcache TTLs can break per-user pages, CSRF token flows, and real-time dashboards.
Related: How to enable microcaching in Nginx
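A microcaching sketch with a one-second TTL; the zone name and backend are hypothetical. `proxy_cache_lock` collapses a burst of identical misses into a single upstream request, which is where microcaching earns its keep during spikes:

```nginx
# http {} context — illustrative microcache zone
proxy_cache_path /var/cache/nginx/micro keys_zone=microcache:10m max_size=100m;

server {
    location / {
        proxy_pass http://127.0.0.1:8080;                # placeholder backend
        proxy_cache microcache;
        proxy_cache_valid 200 1s;                        # very short TTL; bounded staleness
        proxy_cache_lock on;                             # one upstream fetch per burst of misses
        proxy_cache_use_stale updating error timeout;    # serve stale while a refresh is in flight
    }
}
```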
- Tune worker_processes to match available CPU resources.
Too many workers can increase context switching without improving throughput.
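In the main context, `auto` sizes workers to the available CPU cores, which is a sensible default before any manual pinning:

```nginx
worker_processes auto;   # one worker per CPU core
```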
- Tune worker_connections for peak concurrent sockets.
Each connection consumes a file descriptor; raising worker_connections requires higher file descriptor limits.
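A sketch of the related directives; the numbers are illustrative, and `worker_rlimit_nofile` (main context) should exceed `worker_connections` since each proxied request can hold two descriptors:

```nginx
worker_rlimit_nofile 16384;    # per-worker descriptor limit, main context

events {
    worker_connections 8192;   # maximum simultaneous connections per worker
}
```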
- Increase file descriptor limits before pushing high concurrency.
$ ulimit -n 1048576
Interactive shell limits can differ from the nginx service limit.
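On systemd-managed hosts, the service limit can be raised with a drop-in file (the path below is the conventional drop-in location, shown as an assumption); run `systemctl daemon-reload` and restart nginx afterward:

```ini
# /etc/systemd/system/nginx.service.d/limits.conf
[Service]
LimitNOFILE=1048576
```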
- Reduce access log overhead on known high-traffic endpoints when measured as a bottleneck.
Disabling logs can hide incidents; reduce logging only where the impact is proven.
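Two hedged options: disable logging only on a specific hot endpoint (the path below is hypothetical), or keep full logging but buffer writes to cut per-request syscall overhead:

```nginx
location /healthz {              # hypothetical high-frequency endpoint
    access_log off;
    return 200 "ok";
}

# Alternative: keep logs everywhere, but batch disk writes
access_log /var/log/nginx/access.log combined buffer=64k flush=5s;
```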
- Re-run the same wrk benchmark per change to validate improvement.
$ wrk -t2 -c50 -d30s http://127.0.0.1/
##### snipped #####
Compare Requests/sec plus tail latency, not only averages.
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
