A RabbitMQ active-active deployment has multiple broker nodes accepting client connections at the same time, spreading load and keeping publishing and consuming available when a single node fails.

In a Pacemaker and Corosync cluster, the pcs CLI can register RabbitMQ as a systemd resource and clone it so that one broker instance runs on every cluster node while Pacemaker monitors health and restarts failed instances.

Active-active at the service layer still depends on a working RabbitMQ cluster with a consistent Erlang cookie, correct cluster membership, and appropriate queue policies, plus a reliable traffic distribution layer such as a load balancer or DNS that steers clients to healthy nodes. Quorum and fencing remain critical to avoid split-brain behavior that can corrupt state or lose messages.
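
As a pre-flight check, the commands below compare the Erlang cookie across nodes and apply a queue mirroring policy so queue contents survive a node failure. This is a minimal sketch: the cookie path shown is the common default, and the ha-all policy applies only to classic mirrored queues; clusters that use quorum queues replicate per queue and do not need it.

    # Assumed default cookie path; the hash should match on every node.
    $ sudo md5sum /var/lib/rabbitmq/.erlang.cookie

    # Example policy only: mirror all queues across all nodes (classic mirrored queues).
    # Not needed for quorum queues, which replicate by design.
    $ sudo rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}' --apply-to queues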

Steps to set up RabbitMQ active-active with pcs:

  1. Confirm the Pacemaker cluster is online with quorum.
    $ sudo pcs status
    Cluster name: clustername
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
      * 3 nodes configured
      * 0 resource instances configured
    ##### snipped #####
  2. Confirm the RabbitMQ cluster reports the expected running nodes.
    $ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@node-01 ...
    Basics
    
    Cluster name: rabbit@node-01
    Total CPU cores available cluster-wide: 30
    
    Disk Nodes
    
    rabbit@node-01
    rabbit@node-02
    rabbit@node-03
    
    Running Nodes
    
    rabbit@node-01
    rabbit@node-02
    rabbit@node-03
    ##### snipped #####

    Running Nodes should include every node intended to run the cloned rabbitmq-server service.

  3. Identify the RabbitMQ service unit name.
    $ systemctl list-unit-files --type=service | grep -E '^rabbitmq-server\.service'
    rabbitmq-server.service                      enabled         enabled
  4. Disable the RabbitMQ systemd unit so it does not start outside Pacemaker's control at boot.
    $ sudo systemctl disable rabbitmq-server.service
    Synchronizing state of rabbitmq-server.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
    Executing: /usr/lib/systemd/systemd-sysv-install disable rabbitmq-server
    Removed "/etc/systemd/system/multi-user.target.wants/rabbitmq-server.service".
    ##### snipped #####

    Disabling the unit means systemd no longer starts RabbitMQ on reboot; after a node or service failure, only a healthy Pacemaker cluster will bring RabbitMQ back by starting the cloned resource.

  5. Create the RabbitMQ service resource.
    $ sudo pcs resource create rabbitmq_service systemd:rabbitmq-server op start timeout=180s op stop timeout=120s op monitor interval=30s
  6. Clone the RabbitMQ service resource across nodes.
    $ sudo pcs resource clone rabbitmq_service
  7. Verify the cloned resource is started on every node.
    $ sudo pcs status resources
      * Clone Set: rabbitmq_service-clone [rabbitmq_service]:
        * rabbitmq_service	(systemd:rabbitmq-server):	 Starting node-02
        * rabbitmq_service	(systemd:rabbitmq-server):	 Starting node-03
        * Started: [ node-01 ]

    Wait for every node to show Started before routing production traffic.

  8. Update client routing to distribute traffic across active nodes, for example with a TCP load balancer (see the HAProxy sketch after this list).
  9. Run a failover test to validate routing and recovery (see the standby sketch after this list).
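
For step 8, one common approach is a TCP load balancer in front of the brokers. The excerpt below is a minimal sketch that assumes HAProxy on a dedicated load balancer host, the default AMQP port 5672, and the node hostnames used above; adjust the names, port, and health checks for your environment, and use a different front-end port if HAProxy shares a host with a broker.

    # Append a listener for AMQP traffic to /etc/haproxy/haproxy.cfg:
    listen rabbitmq
        bind *:5672
        mode tcp
        balance roundrobin
        server node-01 node-01:5672 check
        server node-02 node-02:5672 check
        server node-03 node-03:5672 check

    # Then reload HAProxy to pick up the change.
    $ sudo systemctl reload haproxy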
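
For step 9, a simple failover exercise is to put one node into standby, confirm clients keep publishing and consuming through the remaining brokers, and then bring the node back. The node name below is an example; run the rabbitmqctl check on a surviving node.

    # Take one broker out of service; Pacemaker stops rabbitmq-server on it.
    $ sudo pcs node standby node-02
    $ sudo pcs status resources
    $ sudo rabbitmqctl cluster_status   # node-02 should drop out of Running Nodes

    # Restore the node and confirm the clone starts again everywhere.
    $ sudo pcs node unstandby node-02
    $ sudo pcs status resources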