Running an Nginx failover test validates that the cluster can move the web VIP and service cleanly during maintenance or an unexpected node outage.
In a Pacemaker cluster managed by pcs, a web stack is typically modeled as a resource group containing an IPaddr2 virtual IP and a systemd-managed nginx service. Placing a node into standby triggers the scheduler to relocate the group to an eligible peer while keeping the cluster online.
Failover can interrupt active connections when the VIP changes owner, so scheduling during a maintenance window reduces impact. Cluster policies such as location constraints and resource stickiness can influence where the group lands and whether it fails back automatically after standby is cleared.
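Before the test, it can help to review the policies that decide placement and failback. The sketch below wraps a few pcs calls in shell functions. It assumes the group name `web-stack` from the sample output later in this article; the function names and the stickiness value are illustrative only. Subcommand spelling differs between pcs releases (newer ones prefer `config` forms such as `pcs constraint config`), so check `pcs --help` on the cluster before relying on it.

```shell
#!/bin/sh
# Assumption: the group is named web-stack, as in this article's sample output.
GROUP="${GROUP:-web-stack}"

# List the constraints and resource defaults that influence where the group
# lands and whether it moves back after standby is cleared.
show_placement_policy() {
  sudo pcs constraint          # location, order, and colocation constraints
  sudo pcs resource defaults   # cluster-wide defaults such as resource-stickiness
}

# Set resource-stickiness on the group as a meta attribute.
# A positive value makes the group prefer to stay where it is running.
set_group_stickiness() {
  sudo pcs resource meta "$GROUP" "resource-stickiness=$1"
}
```

For example, `set_group_stickiness 100` would bias the group toward staying on the peer after the original node returns. Whether that is desirable depends on the cluster's own policy.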
Steps to run an Nginx failover test in Pacemaker:
- Confirm the cluster is online and has quorum.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Thu Jan 1 00:48:52 2026 on node-01
##### snipped #####
- Identify the node currently hosting the Nginx resource group.
$ sudo pcs status resources
  * Clone Set: dummy-check-clone [dummy-check]:
    * Started: [ node-01 node-02 node-03 ]
##### snipped #####
  * Resource Group: web-stack:
    * web_ip (ocf:heartbeat:IPaddr2): Started node-01
    * web_service (systemd:nginx): Started node-01
- Put the current web node into standby.
$ sudo pcs node standby node-01
Connections to the VIP can reset when the address moves to another node.
- Confirm the resource group is started on a different node.
$ sudo pcs status resources
  * Clone Set: dummy-check-clone [dummy-check]:
    * Started: [ node-02 node-03 ]
    * Stopped: [ node-01 ]
##### snipped #####
  * Resource Group: web-stack:
    * web_ip (ocf:heartbeat:IPaddr2): Started node-02
    * web_service (systemd:nginx): Started node-02
Allow time for relocation when the group includes health checks or a start timeout.
- Test the VIP with an HTTP request.
$ curl -sI http://192.0.2.50/
HTTP/1.1 200 OK
Server: nginx/1.24.0 (Ubuntu)
Date: Thu, 01 Jan 2026 00:49:03 GMT
Content-Type: text/html
Content-Length: 10671
Last-Modified: Sun, 28 Dec 2025 06:15:52 GMT
Connection: keep-alive
ETag: "6950cb18-29af"
Accept-Ranges: bytes
- Return the original node to active service.
$ sudo pcs node unstandby node-01
Clearing standby restores eligibility but does not force an immediate failback unless cluster policy or constraints require it.
- Confirm the cluster is healthy after the test.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Thu Jan 1 00:49:04 2026 on node-01
##### snipped #####
Full List of Resources:
  * Resource Group: web-stack:
    * web_ip (ocf:heartbeat:IPaddr2): Started node-02
    * web_service (systemd:nginx): Started node-02
##### snipped #####
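The steps above can be sketched as a small script for repeated test runs. This is a minimal sketch, not a definitive implementation. It assumes the group name `web-stack` and the VIP `192.0.2.50` from the sample output, and it parses the human-readable `pcs status resources` text, whose layout may differ between Pacemaker versions. The function names, the 2-second poll interval, and the 60-second timeout are illustrative choices.

```shell
#!/bin/sh
# Assumptions: GROUP and VIP match the sample output in this article;
# TIMEOUT is an illustrative value, not a recommendation.
GROUP="${GROUP:-web-stack}"
VIP="${VIP:-192.0.2.50}"
TIMEOUT="${TIMEOUT:-60}"

# Print the node hosting the group, but only when every member resource
# reports "Started" on the same node. Prints nothing otherwise.
group_node() {
  sudo pcs status resources | awk -v grp="$GROUP" '
    index($0, "Resource Group: " grp ":") { in_group = 1; next }
    in_group && !/\): / { exit }   # first non-member line ends the group
    in_group {
      members++
      if ($(NF - 1) == "Started") { started++; node = $NF; seen[$NF] = 1 }
    }
    END {
      n = 0
      for (k in seen) n++
      if (members > 0 && started == members && n == 1) print node
    }'
}

# Poll every 2 seconds until the group runs on a node other than $1.
# Prints the new node and returns 0, or returns 1 after about TIMEOUT seconds.
wait_for_move() {
  old_node="$1"
  waited=0
  while [ "$waited" -lt "$TIMEOUT" ]; do
    new_node="$(group_node)"
    if [ -n "$new_node" ] && [ "$new_node" != "$old_node" ]; then
      echo "$new_node"
      return 0
    fi
    sleep 2
    waited=$((waited + 2))
  done
  return 1
}

# Standby the current host, wait for relocation, check the VIP over HTTP,
# then clear standby again.
run_failover_test() {
  origin="$(group_node)"
  if [ -z "$origin" ]; then
    echo "Group $GROUP is not fully started on one node" >&2
    return 1
  fi
  echo "Group $GROUP is on $origin; putting $origin into standby"
  sudo pcs node standby "$origin" || return 1

  rc=0
  if target="$(wait_for_move "$origin")"; then
    echo "Group moved to $target"
    # -f makes curl fail on HTTP errors (4xx/5xx) as well as connection errors.
    curl -fsS -o /dev/null --max-time 5 "http://$VIP/" || rc=1
  else
    echo "Group did not relocate within ${TIMEOUT}s" >&2
    rc=1
  fi

  # Clear standby even when the checks failed, so the node is not left out.
  sudo pcs node unstandby "$origin" || rc=1
  return "$rc"
}
```

Sourcing the file on a cluster node and calling `run_failover_test` runs the sequence once; a non-zero return status means the group did not relocate in time, the HTTP check failed, or a pcs command returned an error. Clearing standby at the end mirrors the manual steps and, as noted above, does not by itself force a failback.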
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
