Running an Nginx failover test validates that the cluster can move the web VIP and service cleanly during maintenance or an unexpected node outage.

In a Pacemaker cluster managed by pcs, a web stack is typically modeled as a resource group containing an IPaddr2 virtual IP and a systemd-managed nginx service. Placing a node into standby triggers the scheduler to relocate the group to an eligible peer while keeping the cluster online.
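As a sketch, such a group could be created with commands like the following. The resource names, VIP address, netmask, and monitor intervals here are illustrative assumptions, not taken from any particular cluster:

```shell
# Hypothetical web-stack group (names/addresses assumed): a virtual IP
# managed by IPaddr2 plus nginx under systemd, grouped so they start,
# stop, and relocate together.
sudo pcs resource create web_ip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.50 cidr_netmask=24 \
    op monitor interval=10s --group web-stack
sudo pcs resource create web_service systemd:nginx \
    op monitor interval=30s --group web-stack
```

Because both resources are in one group, Pacemaker starts them in order on the same node, so the service always runs where the VIP lives.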

Failover can interrupt active connections when the VIP changes owner, so scheduling during a maintenance window reduces impact. Cluster policies such as location constraints and resource stickiness can influence where the group lands and whether it fails back automatically after standby is cleared.
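For instance, a location preference and stickiness could be combined like this; the scores are illustrative assumptions, and the key point is that a stickiness value higher than the location score keeps the group where it landed instead of failing back:

```shell
# Illustrative policy (scores assumed): prefer node-01, but let
# stickiness outweigh that preference so the group does not migrate
# back automatically when node-01 leaves standby.
sudo pcs resource meta web-stack resource-stickiness=200
sudo pcs constraint location web-stack prefers node-01=100
```

With these scores, the group only returns to node-01 if it is stopped and restarted, or if the stickiness is lowered below the location score.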

Steps to run an Nginx failover test in Pacemaker:

  1. Confirm the cluster is online and has quorum.
    $ sudo pcs status
    Cluster name: clustername
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
      * Last updated: Thu Jan  1 00:48:52 2026 on node-01
    ##### snipped #####
  2. Identify the node currently hosting the Nginx resource group.
    $ sudo pcs status resources
      * Clone Set: dummy-check-clone [dummy-check]:
        * Started: [ node-01 node-02 node-03 ]
    ##### snipped #####
      * Resource Group: web-stack:
        * web_ip (ocf:heartbeat:IPaddr2): Started node-01
        * web_service (systemd:nginx): Started node-01
  3. Put the current web node into standby.
    $ sudo pcs node standby node-01

    Connections to the VIP can reset when the address moves to another node.

  4. Confirm the resource group is started on a different node.
    $ sudo pcs status resources
      * Clone Set: dummy-check-clone [dummy-check]:
        * Started: [ node-02 node-03 ]
        * Stopped: [ node-01 ]
    ##### snipped #####
      * Resource Group: web-stack:
        * web_ip (ocf:heartbeat:IPaddr2): Started node-02
        * web_service (systemd:nginx): Started node-02

    Allow time for relocation; the move can take several seconds when the group defines monitor operations or a long start timeout.

  5. Test the VIP with an HTTP request.
    $ curl -sI http://192.0.2.50/
    HTTP/1.1 200 OK
    Server: nginx/1.24.0 (Ubuntu)
    Date: Thu, 01 Jan 2026 00:49:03 GMT
    Content-Type: text/html
    Content-Length: 10671
    Last-Modified: Sun, 28 Dec 2025 06:15:52 GMT
    Connection: keep-alive
    ETag: "6950cb18-29af"
    Accept-Ranges: bytes
  6. Return the original node to active service.
    $ sudo pcs node unstandby node-01

    Clearing standby restores eligibility but does not force an immediate failback unless cluster policy or constraints require it.

  7. Confirm the cluster is healthy after the test.
    $ sudo pcs status
    Cluster name: clustername
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
      * Last updated: Thu Jan  1 00:49:04 2026 on node-01
    ##### snipped #####
    Full List of Resources:
      * Resource Group: web-stack:
        * web_ip (ocf:heartbeat:IPaddr2): Started node-02
        * web_service (systemd:nginx): Started node-02
    ##### snipped #####
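
The owner checks in steps 2 and 4 can be scripted. A minimal sketch, assuming the resource names used above and the `pcs status resources` output format shown in this article:

```shell
# owner_of: print the node hosting a resource by parsing the
# "<resource> (<agent>): Started <node>" lines of `pcs status resources`.
# Lines for stopped resources contain no " Started " and are skipped.
owner_of() {
    awk -v res="$1" '$0 ~ res && / Started / { print $NF }'
}

# On a live cluster node (requires pcs; shown as comments):
#   before=$(sudo pcs status resources | owner_of web_ip)
#   sudo pcs node standby "$before"
#   sleep 10
#   sudo pcs status resources | owner_of web_ip   # expect a different node
#   sudo pcs node unstandby "$before"
```

Comparing the owner before and after standby gives a quick pass/fail signal for the whole test.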