High availability for an NFS server keeps shared exports reachable during node reboots, patch cycles, and unexpected failures. Pairing the NFS workload with a floating IP (virtual IP) keeps the client mount target stable while the active node changes.
In a pcs-managed Pacemaker cluster, the NFS stack is modeled as a resource group: an OCF Filesystem resource mounts the shared export volume, a systemd resource starts the NFS daemon, and an OCF IPaddr2 resource assigns the floating IP. Grouping enforces start/stop ordering and colocation so the mount, daemon, and address move together as a unit.
NFS HA is active/passive and depends on shared storage with single-writer semantics for the export directory, such as a SAN LUN or a replicated block device presented to only one node at a time. The same /etc/exports entries must exist on every node, and the export filesystem must not be auto-mounted via /etc/fstab outside cluster control, to avoid concurrent mounts and data corruption. Failover briefly interrupts client I/O and may require NFS lock recovery, so expect short pauses whenever the group moves between nodes.
Steps to set up NFS server high availability with PCS:
- Confirm the cluster is online and has quorum.
$ sudo pcs status
Cluster name: clustername
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: node-01 (version 2.1.6-6fdc9deea29) - partition with quorum
  * 3 nodes configured
  * 0 resource instances configured
##### snipped #####
- Identify the NFS server service unit name.
$ systemctl list-unit-files --type=service | grep -E '^(nfs-server|nfs-kernel-server)\.service'
nfs-kernel-server.service                  alias    -
nfs-server.service                         disabled enabled
- Stop and disable the NFS server service on every cluster node.
$ sudo systemctl disable --now nfs-kernel-server.service
Synchronizing state of nfs-kernel-server.service with SysV service script with /usr/lib/systemd/systemd-sysv-install.
Executing: /usr/lib/systemd/systemd-sysv-install disable nfs-kernel-server
##### snipped #####
Leaving the NFS service enabled outside cluster control can result in exports running on multiple nodes, which risks filesystem corruption on single-writer storage.
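To double-check a node, the following commands should not report enabled or active; repeat the check on every cluster node, substituting nfs-server.service where that is the real unit name:
$ systemctl is-enabled nfs-kernel-server.service
$ systemctl is-active nfs-kernel-server.service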
- Create the export mount point directory on every cluster node.
$ sudo mkdir -p /srv/nfs
The mount point must exist even when the export filesystem is not mounted.
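Before creating the Filesystem resource, confirm the shared block device is visible on every node and not mounted anywhere. A minimal check, assuming the /dev/loop10 device used in the next step (substitute the actual shared device):
$ lsblk /dev/loop10
The device should be listed on every node with an empty mount point column; if it appears mounted on any node, unmount it before handing control to the cluster.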
- Create a filesystem resource for the shared export path.
$ sudo pcs resource create nfs_fs ocf:heartbeat:Filesystem device=/dev/loop10 directory=/srv/nfs fstype=xfs op monitor interval=20s
Replace /dev/loop10 and /srv/nfs with the shared block device and export path used by the cluster.
Do not mount /srv/nfs via /etc/fstab on boot when using a non-cluster filesystem such as xfs or ext4.
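A quick check on each node that no such fstab entry exists (no output means no entry); /dev/loop10 is the example device from the resource above, so substitute the actual shared device:
$ grep -E '/srv/nfs|/dev/loop10' /etc/fstab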
- Confirm the same /etc/exports entries exist on every node for the shared export path.
$ sudo awk '!/^\s*($|#)/ {print}' /etc/exports
/srv/nfs 192.0.2.0/24(rw,sync,no_subtree_check,root_squash)
The cluster moves the mount, service, and IP, but does not synchronize export definitions.
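One way to keep the entries identical is to copy /etc/exports from the node where it was edited to the remaining nodes; node-02 and node-03 are example host names, and the copy assumes root SSH access between nodes (otherwise distribute the file by other means):
$ sudo scp /etc/exports node-02:/etc/exports
$ sudo scp /etc/exports node-03:/etc/exports
After editing /etc/exports on the node currently hosting the group, sudo exportfs -ra re-reads the file without restarting the NFS resource.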
- Create a floating IP resource for the NFS endpoint.
$ sudo pcs resource create nfs_ip ocf:heartbeat:IPaddr2 ip=192.0.2.71 cidr_netmask=24 op monitor interval=30s
- Create the NFS server service resource.
$ sudo pcs resource create nfs_service systemd:nfs-kernel-server op monitor interval=30s
Use systemd:nfs-kernel-server when that unit is present; on distributions where the unit is named nfs-server.service, use systemd:nfs-server instead.
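For example, on a system where the unit is nfs-server.service (such as RHEL-family distributions), the same resource would be created with only the unit name changed:
$ sudo pcs resource create nfs_service systemd:nfs-server op monitor interval=30s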
- Group the filesystem, service, and IP resources.
$ sudo pcs resource group add nfs-stack nfs_fs nfs_service nfs_ip
- Verify the resource group placement.
$ sudo pcs status resources
  * Resource Group: nfs-stack:
    * nfs_fs      (ocf:heartbeat:Filesystem):      Started node-01
    * nfs_service (systemd:nfs-kernel-server):     Started node-01
    * nfs_ip      (ocf:heartbeat:IPaddr2):         Started node-01
- Confirm the export filesystem is mounted on the node hosting the resource group.
$ df -h /srv/nfs
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop10     336M   27M  310M   8% /srv/nfs
- Confirm the exports are reachable through the floating IP.
$ showmount --exports 192.0.2.71
Export list for 192.0.2.71:
/srv/nfs 192.0.2.0/24
A full client mount test proves end-to-end access, especially when firewall rules filter rpc.mountd and showmount cannot query the server.
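A minimal mount test from an NFS client, assuming the client has NFS client utilities installed and /mnt/nfs-test is used as a temporary mount point:
$ sudo mkdir -p /mnt/nfs-test
$ sudo mount -t nfs 192.0.2.71:/srv/nfs /mnt/nfs-test
$ df -h /mnt/nfs-test
$ sudo umount /mnt/nfs-test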
- Run a failover test after the group is running.
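A minimal failover test, assuming node-01 currently hosts the group: put the active node into standby, confirm the group starts on another node, then return the node to service.
$ sudo pcs node standby node-01
$ sudo pcs status resources
$ sudo pcs node unstandby node-01
Clients using the floating IP should see a short pause in I/O while the group moves, then resume without remounting.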
