Failover testing for HAProxy in a Pacemaker cluster validates that a node drain relocates the load balancer and its virtual IP without manual intervention, reducing the risk of outages during maintenance and unplanned node loss.
In a pcs-managed Pacemaker and Corosync stack, the load balancer commonly runs as a resource group that couples an IPaddr2 virtual IP (VIP) resource with a systemd unit such as systemd:haproxy. Setting a node to standby marks it ineligible for resource placement, prompting the cluster scheduler to relocate the group to another eligible node.
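For reference, a minimal sketch of how such a group might be created with pcs is shown below; the resource names, group name, and VIP address match the examples in this guide, while the netmask is an assumption to adapt to your environment.

$ sudo pcs resource create lb_ip ocf:heartbeat:IPaddr2 \
    ip=192.0.2.40 cidr_netmask=24 --group lb-stack
$ sudo pcs resource create lb_service systemd:haproxy --group lb-stack

Grouping the VIP before the haproxy resource makes Pacemaker start the address first and keep both resources on the same node, which is the behavior the failover test below relies on.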
Moving a VIP can reset in-flight TCP sessions and trigger brief health-check failures while ARP and routing converge. Quorum loss, fencing failures, or placement constraints can prevent relocation, so confirming cluster health before and after the test avoids ambiguous results.
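To make the interruption window visible, a polling loop against the VIP can run in a separate terminal for the duration of the test. This is a minimal sketch assuming the VIP and plain HTTP frontend used in the examples below; the one-second interval and timeout are arbitrary.

$ while true; do
    printf '%s ' "$(date +%T)"
    curl -s -o /dev/null -w '%{http_code}\n' --connect-timeout 1 http://192.0.2.40/
    sleep 1
  done

A poll that cannot reach the VIP prints a status of 000; during a clean relocation only a few consecutive polls should fail.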
Steps to run an HAProxy failover test in Pacemaker:
- Confirm the cluster is online with quorum.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Thu Jan 1 00:50:00 2026 on node-01
##### snipped #####
- Identify the node currently hosting the HAProxy resource group.
$ sudo pcs status resources
  * Clone Set: dummy-check-clone [dummy-check]:
    * Started: [ node-01 node-02 node-03 ]
##### snipped #####
  * Resource Group: lb-stack:
    * lb_ip    (ocf:heartbeat:IPaddr2):    Started node-01
    * lb_service    (systemd:haproxy):    Started node-01
- Test the VIP before starting the failover.
$ curl -sI http://192.0.2.40/
HTTP/1.1 200 OK
content-length: 2
content-type: text/plain
- Put the node hosting the resource group into standby.
$ sudo pcs node standby node-01
Existing connections through the VIP can drop during relocation.
- Confirm the node is in standby state.
$ sudo pcs status nodes
Pacemaker Nodes:
 Online: node-02 node-03
 Standby: node-01
 Standby with resource(s) running:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Standby with resource(s) running:
 Maintenance:
 Offline:
- Confirm the resource group relocated to another node.
$ sudo pcs status resources
  * Clone Set: dummy-check-clone [dummy-check]:
    * Started: [ node-02 node-03 ]
    * Stopped: [ node-01 ]
##### snipped #####
  * Resource Group: lb-stack:
    * lb_ip    (ocf:heartbeat:IPaddr2):    Started node-02
    * lb_service    (systemd:haproxy):    Started node-02

Relocation is complete when all group members show Started on the same node.
- Verify the VIP address is present on the new active node.
$ ip -4 addr show | grep -F '192.0.2.40'
    inet 192.0.2.40/24 brd 192.0.2.255 scope global secondary eth0
- Test the VIP after the move.
$ curl -sI http://192.0.2.40/
HTTP/1.1 200 OK
content-length: 2
content-type: text/plain
- Return the standby node to active service.
$ sudo pcs node unstandby node-01
Automatic failback is not guaranteed when resource-stickiness or location constraints prefer the current node; a sketch for inspecting those settings follows these steps.
- Confirm the cluster is healthy after the test.
$ sudo pcs status
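As noted in the unstandby step, failback behavior depends on stickiness and constraints. The commands below are one way to inspect those settings with pcs; output format and some subcommand names differ between pcs releases, so treat this as a sketch rather than exact syntax for every version.

$ sudo pcs resource defaults          # cluster-wide defaults such as resource-stickiness
$ sudo pcs constraint                 # location, colocation, and ordering constraints
$ sudo pcs resource config lb-stack   # meta attributes set on the group or its members

If the group was expected to return to node-01 automatically, a non-zero resource-stickiness or a location constraint preferring another node explains why it stays put.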
