DRBD fencing controls what happens when a primary resource loses contact with a peer and continuing writes could leave the peer with stale data. A fencing policy tells DRBD to call a handler that constrains or fences the peer, so the cluster cannot promote a node that holds stale data.

Pacemaker-managed DRBD resources normally use the handlers shipped with drbd-utils. crm-fence-peer.9.sh asks Pacemaker to stop promoting the fenced peer, and crm-unfence-peer.9.sh removes that restriction after the resource reconnects and synchronizes.

Use resource-only when the cluster manager can keep the peer from being promoted, and reserve resource-and-stonith for designs where the handler can confirm or power-fence the peer while local I/O waits. The resource file must be consistent on every node, and a controlled maintenance-window test is the only proof that the cluster fencing path works end to end.

Steps to configure DRBD fencing:

  1. Confirm the Pacemaker fencing handler scripts are installed.
    $ ls /usr/lib/drbd/crm-fence-peer.9.sh /usr/lib/drbd/crm-unfence-peer.9.sh
    /usr/lib/drbd/crm-fence-peer.9.sh
    /usr/lib/drbd/crm-unfence-peer.9.sh

    These handlers are normally installed with drbd-utils on Pacemaker-capable DRBD 9 systems.

  2. Back up the current resource file.
    $ sudo cp -a /etc/drbd.d/webdata.res /etc/drbd.d/webdata.res.bak

    Replace webdata with the resource file name used by the cluster.
    Related: How to back up DRBD metadata before a change

  3. Open the resource configuration file.
    $ sudoedit /etc/drbd.d/webdata.res

  4. Add the fencing policy and Pacemaker handlers to the resource.
    resource webdata {
        net {
            protocol C;
            fencing resource-only;
        }
        handlers {
            fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
            unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
        }
    
        # existing on <node> sections stay here
    }

    Use resource-and-stonith only when Pacemaker STONITH is already configured and tested. Without a working fencing device, a disconnected primary can wait with suspended I/O until the peer state is resolved.

  5. Install the same resource file on every DRBD node.

    The net and handlers sections must match across peers. A node with an older resource file can parse and apply different fencing behavior.
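
One way to confirm that the copies really match is to compare checksums. The sketch below assumes the resource file from each peer has already been fetched to a local path (for example with scp); the function name res_files_match and the /tmp path in the usage note are hypothetical and not part of drbd-utils.

```shell
# Hypothetical helper: succeed only when every given file has the same
# SHA-256 digest. Returns 2 on bad usage or an unreadable file.
res_files_match() {
    # At least two files are needed for a comparison.
    [ "$#" -ge 2 ] || return 2
    for f in "$@"; do
        [ -r "$f" ] || return 2
    done
    # sha256sum prints "<digest>  <path>"; keep the digest column and
    # count how many distinct digests appear.
    [ "$(sha256sum -- "$@" | awk '{print $1}' | sort -u | wc -l)" -eq 1 ]
}
```

Example use, after copying the peer's file to /tmp/webdata.res.node-b: res_files_match /etc/drbd.d/webdata.res /tmp/webdata.res.node-b && echo "files match"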

  6. Parse the resource configuration as drbdadm reads it from the files on disk.
    $ sudo drbdadm dump webdata
    # resource webdata on node-a: not ignored, not stacked
    # defined at /etc/drbd.d/webdata.res:1
    resource webdata {
        on node-a {
            node-id 0;
            volume 0 {
                device       minor 0;
                disk         /dev/vg0/webdata;
                meta-disk    internal;
            }
            address          ipv4 192.0.2.10:7788;
        }
    ##### snipped #####
        net {
            protocol         C;
            fencing          resource-only;
        }
        handlers {
            fence-peer       /usr/lib/drbd/crm-fence-peer.9.sh;
            unfence-peer     /usr/lib/drbd/crm-unfence-peer.9.sh;
        }
    }

    Run the parser check on each node that has an on section for the resource.
    Related: How to validate DRBD configuration

  7. Preview the runtime change without applying it.
    $ sudo drbdadm --dry-run adjust webdata
    drbdsetup new-resource webdata 0
    drbdsetup new-minor webdata 0 0
    drbdsetup new-peer webdata 1 --_name=node-b --fencing=resource-only --protocol=C
    drbdsetup new-path webdata 1 ipv4:192.0.2.10:7788 ipv4:192.0.2.11:7788
    drbdmeta 0 v09 /dev/vg0/webdata internal apply-al
    drbdsetup attach 0 /dev/vg0/webdata /dev/vg0/webdata internal
    drbdsetup connect webdata 1

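    The dry run should show --fencing=resource-only, or --fencing=resource-and-stonith when that policy was intentionally configured. The sample above is from a resource that is not yet up; on a running resource, expect a shorter list that contains only the commands needed to apply the change.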
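
The check in this step can also be scripted. The following is a minimal sketch that pulls the fencing value out of dry-run output read from standard input; the function name fencing_policies is hypothetical, and the sketch assumes the dry-run commands are printed to standard output in the --fencing=<policy> form shown above.

```shell
# Hypothetical helper: read `drbdadm --dry-run adjust` output on stdin and
# print each distinct --fencing=<policy> value, one per line.
# Prints nothing when no line carries a --fencing option.
fencing_policies() {
    sed -n 's/.*--fencing=\([a-z-]*\).*/\1/p' | sort -u
}
```

Example use: sudo drbdadm --dry-run adjust webdata | fencing_policies. A single line reading resource-only (or resource-and-stonith) is the expected result; empty output suggests the dry run contains no fencing change.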

  8. Apply the updated resource configuration.
    $ sudo drbdadm adjust webdata

    Apply the same validated file on all nodes before testing a disconnection. Mixed fencing settings can leave Pacemaker and DRBD making different promotion decisions.

  9. Check the resource state after the adjustment.
    $ sudo drbdadm status webdata
    webdata role:Primary
      volume:0 disk:UpToDate
      node-b role:Secondary
        volume:0 peer-disk:UpToDate

    Start the failure test only when the intended primary and secondary are connected and UpToDate.
    Related: How to check DRBD resource status
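
The readiness condition above can be turned into a guard for a test script. This is a sketch only: the function name all_disks_uptodate is hypothetical, and it assumes the status layout shown in this step, where a peer that is not connected is reported with a connection: field instead of a peer-disk: field.

```shell
# Hypothetical helper: read `drbdadm status <resource>` output on stdin and
# succeed only when at least one disk state is reported, every disk and
# peer-disk state is UpToDate, and no peer shows a non-Connected state.
all_disks_uptodate() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                split($i, kv, ":")
                # Match both local "disk:" and remote "peer-disk:" fields.
                if ($i ~ /^(peer-)?disk:/) {
                    seen++
                    if (kv[2] != "UpToDate") bad++
                }
                # Assumption: a connection: field appears only for peers
                # that are not fully connected.
                if ($i ~ /^connection:/ && kv[2] != "Connected") bad++
            }
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}
```

Example use: sudo drbdadm status webdata | all_disks_uptodate || echo "not ready for the fencing test"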

  10. Trigger a controlled fencing test in a lab or approved maintenance window.
    $ sudo drbdadm disconnect webdata

    This interrupts the replication connection for the resource. Do not run it on a production service without an approved failover and recovery plan.

  11. Check the Pacemaker promotion constraint created by the handler.
    $ sudo pcs constraint location
    Location Constraints:
      Resource: ms_drbd_webdata
        Disabled on: node-b (score:-INFINITY) (role:Promoted)

    Replace ms_drbd_webdata with the Pacemaker promotable clone name for the DRBD resource. No promotion constraint after a controlled disconnect means the handler path, Pacemaker resource name, or cluster communication needs correction.

  12. Reconnect the resource after the test.
    $ sudo drbdadm connect webdata

  13. Verify the resource returns to a connected, synchronized state.
    $ sudo drbdadm status webdata
    webdata role:Primary
      volume:0 disk:UpToDate
      node-b role:Secondary
        volume:0 peer-disk:UpToDate

    The crm-unfence-peer.9.sh handler should remove the Pacemaker promotion restriction after DRBD reconnects and synchronizes.
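
Resynchronization after the reconnect can take some time, so a single status check may run too early. A minimal polling sketch follows; the function name wait_until is hypothetical, and the one-second interval is an arbitrary choice rather than a DRBD recommendation.

```shell
# Hypothetical helper: run a command once per second until it succeeds or
# the given number of attempts is used up.
# Usage: wait_until <attempts> <command> [args...]
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # Stop as soon as the command reports success.
        "$@" && return 0
        attempts=$((attempts - 1))
        # Sleep only when another attempt will follow.
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}
```

Example use: wait_until 60 sh -c 'drbdadm status webdata | grep -q peer-disk:UpToDate' waits up to about a minute for the peer disk to report UpToDate. Run the Pacemaker constraint check from step 11 again afterwards to confirm the restriction is gone.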