A PostgreSQL failover test confirms that the clustered database service and its floating client endpoint can move cleanly when a node is placed into maintenance or drops out unexpectedly.
In a Pacemaker cluster, resources such as the virtual IP (for example, IPaddr2) and the PostgreSQL service (for example, a systemd unit) are often grouped so they start and stop together. Putting the active node into standby with the pcs CLI forces the scheduler to relocate the entire resource group to another eligible node and bring it online there.
Failover tests can terminate active sessions and briefly interrupt connectivity while the VIP and service restart on the new node. Loss of quorum or a misconfigured fencing policy can trigger stops or fencing actions, so the test belongs in a maintenance window with at least one healthy secondary node capable of hosting the database workload and data.
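The two preconditions above (quorum and a spare healthy node) can be checked mechanically before starting. The following is a minimal sketch, not part of the original procedure: `ready_for_failover` is a hypothetical helper name, and it assumes the `pcs status` text contains the "partition with quorum" and "* Online: [ ... ]" lines in the form shown in the sample output in this article.

```shell
# Pre-check sketch (hypothetical helper, not a pcs feature).
# Reads `pcs status` text on stdin and succeeds only if the partition
# has quorum and at least <min_online> nodes (default 2) are online.
ready_for_failover() {
    awk -v min="${1:-2}" '
        /partition with quorum/ { quorum = 1 }
        /\* Online: \[/ {
            # Keep only the node names between "[" and "]", then count them.
            line = $0
            sub(/.*\[ */, "", line)
            sub(/ *\].*/, "", line)
            online = split(line, names, " ")
        }
        END { exit !(quorum && online >= min) }
    '
}

# Example use on a cluster node (requires pcs):
#   sudo pcs status | ready_for_failover || echo "not safe to test failover"
```

The function only inspects text, so it can be tried against saved `pcs status` output before being used on a live cluster.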
Steps to run a PostgreSQL failover test in Pacemaker:
- Confirm the cluster is online with quorum.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Thu Jan 1 05:40:18 2026 on node-01
  * Last change: Thu Jan 1 05:40:07 2026 by root via cibadmin on node-01
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node-01 node-02 node-03 ]

Full List of Resources:
  * Resource Group: db-stack:
    * db_ip (ocf:heartbeat:IPaddr2): Started node-01
    * db_service (systemd:postgresql@16-main): Started node-01

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
- Identify the PostgreSQL resource group and its current host node.
$ sudo pcs status resources
  * Resource Group: db-stack:
    * db_ip (ocf:heartbeat:IPaddr2): Started node-01
    * db_service (systemd:postgresql@16-main): Started node-01
- Put the current database node into standby.
$ sudo pcs node standby node-01
Active client sessions through the floating IP can drop during the move.
- Confirm the resource group moved to another node.
$ sudo pcs status resources
  * Resource Group: db-stack:
    * db_ip (ocf:heartbeat:IPaddr2): Started node-02
    * db_service (systemd:postgresql@16-main): Starting node-02
Resource startup order is enforced by the group, so the VIP and service should appear started on the same node. A resource shown as Starting is still coming up; repeat the command until both members report Started.
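Rather than rerunning the status command by hand, the check can be polled. This is a rough sketch under stated assumptions: `group_host` and `wait_for_move` are hypothetical helper names, the parser assumes the `pcs status resources` layout shown above (group members indented below a "Resource Group:" header, with state and node as the last two fields), and the 2-second interval and 30 tries are arbitrary.

```shell
# Prints the node hosting a resource group, given `pcs status resources`
# text on stdin. Fails (and prints nothing) unless every member of the
# group is "Started" and all members are on the same node.
group_host() {
    awk -v group="$1" '
        /Resource Group:/ {
            name = $NF; sub(/:$/, "", name)
            in_group = (name == group)
            group_indent = match($0, /[^ ]/)   # column of the header bullet
            next
        }
        # A line indented no deeper than the header ends the group.
        in_group && match($0, /[^ ]/) <= group_indent { in_group = 0 }
        in_group && /\(.*\):/ {
            members++
            if ($(NF - 1) == "Started") { started++; nodes[$NF] = 1 }
        }
        END {
            n = 0
            for (node in nodes) { n++; host = node }
            if (members > 0 && started == members && n == 1) { print host; exit 0 }
            exit 1
        }
    '
}

# Polls until the group is fully started on a node other than <old_node>.
# Assumes pcs is installed and sudo is permitted for the caller.
wait_for_move() {
    group=$1 old_node=$2 tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        host=$(sudo pcs status resources | group_host "$group") &&
            [ "$host" != "$old_node" ] && { echo "$host"; return 0; }
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Example: wait_for_move db-stack node-01
```

Treating "Starting" as not yet moved avoids declaring success while the PostgreSQL unit is still coming up on the new node.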
- Test database connectivity via the floating IP.
$ psql "host=192.0.2.30 user=appuser dbname=appdb" -c "SELECT 1;"
 ?column?
----------
        1
(1 row)
A client-host connectivity check validates the same path used by applications.
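Because the VIP and service may take a few seconds to come up on the new node, a single connection attempt can fail even when the failover is working. A small retry wrapper is one way to handle that. This is a sketch only: `retry_until` is a hypothetical helper, and the `connect_timeout=3` setting in the usage comment is an added libpq connection parameter that does not appear in the original command.

```shell
# Runs a command until it succeeds or the attempts run out.
# usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
    attempts=$1 delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 && return 0
        attempts=$((attempts - 1))
        # Sleep only if another attempt will follow.
        [ "$attempts" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Example from a client host (host, user, and database as used above):
#   retry_until 30 2 psql \
#     "host=192.0.2.30 user=appuser dbname=appdb connect_timeout=3" \
#     -c "SELECT 1;" && echo "database reachable via floating IP"
```

Counting how many attempts were needed gives a rough idea of the interruption seen by clients, though it is no substitute for application-level monitoring.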
- Return the original node to active service.
$ sudo pcs node unstandby node-01
Resources may remain on the failover node until a separate move or policy causes relocation.
- Confirm the cluster is healthy after the test.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * Last updated: Thu Jan 1 05:40:25 2026 on node-01
  * Last change: Thu Jan 1 05:40:25 2026 by root via cibadmin on node-01
  * 3 nodes configured
  * 2 resource instances configured

Node List:
  * Online: [ node-01 node-02 node-03 ]

Full List of Resources:
  * Resource Group: db-stack:
    * db_ip (ocf:heartbeat:IPaddr2): Started node-02
    * db_service (systemd:postgresql@16-main): Started node-02

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
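One thing worth confirming in that final output is that the node taken out of service has rejoined, which shows up as the "Online" list matching the configured node count. A possible check is sketched below; `all_nodes_online` is a hypothetical helper, and it assumes the "N nodes configured" and "* Online: [ ... ]" lines appear as in the output above.

```shell
# Post-test sketch (hypothetical helper, not a pcs feature).
# Reads `pcs status` text on stdin and succeeds when the number of
# nodes on the "Online" line equals the "N nodes configured" count.
all_nodes_online() {
    awk '
        /\* [0-9]+ nodes configured/ { configured = $2 }
        /\* Online: \[/ {
            line = $0
            sub(/.*\[ */, "", line)
            sub(/ *\].*/, "", line)
            online = split(line, names, " ")
        }
        END { exit !(configured > 0 && online == configured) }
    '
}

# Example use on a cluster node (requires pcs):
#   sudo pcs status | all_nodes_online && echo "all nodes back online"
```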
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.
