A GlusterFS cluster can lose redundancy before clients see hard failures. Checking peer connectivity, brick process state, heal backlog, and replication status gives operators a fast view of whether a volume is still safe to serve writes.
Run health checks from a node that belongs to the trusted storage pool. The gluster CLI reports pool membership, volume process state, heal queues, rebalance progress, and geo-replication workers through the management daemon, so a disconnected local daemon can hide cluster state even when brick data still exists on disk.
Health monitoring should inspect state without changing the cluster. Treat disconnected peers, bricks with Online set to N, nonzero split-brain entries, rebalance failures, faulty geo-replication workers, and recent warning or error logs as follow-up signals for the matching recovery guide or incident runbook.
Steps to monitor GlusterFS health:
- Confirm the local GlusterFS management service is running.
$ systemctl is-active glusterd
active
Some distributions use glusterfs-server.service instead of glusterd.service. Use the installed unit name when checking service state.
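Because the unit name differs between distributions, a small helper can try both names. This is a minimal sketch: the function name gluster_service_active is hypothetical, and the two unit names are the ones mentioned in the note above.

```shell
# Sketch: print the first candidate unit that systemd reports as active.
# Returns non-zero when neither unit is active (or neither is installed).
gluster_service_active() {
    local unit
    for unit in glusterd.service glusterfs-server.service; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit"
            return 0
        fi
    done
    return 1
}
```

Assumed usage: gluster_service_active || echo "GlusterFS management service is not active".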
Related: How to manage the GlusterFS service with systemctl
- Check peer connectivity across the trusted storage pool.
$ sudo gluster peer status
Number of Peers: 2

Hostname: node2
Uuid: 6770f88c-9ec5-4cf8-b9f5-658fa17b6bdc
State: Peer in Cluster (Connected)

Hostname: node3
Uuid: 5a3c65f3-1b4d-4d6e-93d4-4c24f0b6b5bf
State: Peer in Cluster (Connected)
Peer in Cluster (Connected) means the node is reachable and participating in the trusted pool.
A disconnected peer can reduce redundancy, block management operations, or leave a replica set unable to heal.
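For scripted checks, the peer list can be filtered down to peers that are not in the healthy state. The sketch below assumes the Hostname/State layout shown in the sample output; disconnected_peers is a hypothetical helper name.

```shell
# Sketch: read `gluster peer status` output on stdin and print every peer
# whose State is anything other than "Peer in Cluster (Connected)".
disconnected_peers() {
    awk '
        /^Hostname:/ { host = $2 }
        /^State:/ {
            state = $0
            sub(/^State:[ \t]*/, "", state)   # drop the "State:" label
            sub(/[ \t\r]+$/, "", state)       # drop trailing whitespace
            if (state != "Peer in Cluster (Connected)") print host ": " state
        }
    '
}
```

Assumed usage: sudo gluster peer status | disconnected_peers. Empty output suggests every listed peer is connected.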
- Check brick and daemon state for the volume.
$ sudo gluster volume status volume1
Status of volume: volume1
Gluster process                  TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/srv/gluster/brick1  49152     0          Y       2143
Brick node2:/srv/gluster/brick1  49153     0          Y       2311
Self-heal Daemon on node1        N/A       N/A        Y       2202
Self-heal Daemon on node2        N/A       N/A        Y       2370
Replace volume1 with the target volume name. Online should be Y for each expected brick and support daemon.
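The Online column can be checked in a script. The following sketch assumes each process sits on one line with TCP Port, RDMA Port, Online, and Pid as the last four columns, as in the sample output; long brick paths that wrap onto a second line would need extra handling. offline_processes is a hypothetical helper name.

```shell
# Sketch: read `gluster volume status VOLNAME` output on stdin and print the
# name of every brick or daemon whose Online column is N.
offline_processes() {
    awk 'NF >= 5 && $(NF - 1) == "N" {
        # Everything before the last four columns is the process name.
        name = $1
        for (i = 2; i <= NF - 4; i++) name = name " " $i
        print name
    }'
}
```

Assumed usage: sudo gluster volume status volume1 | offline_processes.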
Related: How to check GlusterFS volume status
- Check free space on each brick filesystem.
$ df -h /srv/gluster/brick1
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       1.8T  1.1T  640G  64% /srv/gluster/brick1
- Check inode usage on each brick filesystem.
$ df -i /srv/gluster/brick1
Filesystem        Inodes  IUsed     IFree IUse% Mounted on
/dev/sdb1      122093568 512340 121581228    1% /srv/gluster/brick1
A brick filesystem that reaches 100% space or inode use can block client writes and prevent heal or rebalance work from completing.
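To catch a filling brick before it reaches 100%, the percentage column from df can be compared against a threshold. This is a sketch only: over_usage_limit is a hypothetical helper, the default of 90 is an arbitrary example threshold, and mount points containing spaces are not handled.

```shell
# Sketch: read df output on stdin (space or inode form) and print each
# mount point whose fifth column (Use% or IUse%) is at or above the limit.
over_usage_limit() {
    local limit="${1:-90}"
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)                    # "64%" becomes "64"
        if (pct + 0 >= limit + 0) print $6 " " $5
    }'
}
```

Assumed usage: df -P /srv/gluster/brick1 | over_usage_limit 85 for space, and df -i -P /srv/gluster/brick1 | over_usage_limit 85 for inodes. The -P option keeps each filesystem on a single line.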
- Check the self-heal backlog for the volume.
$ sudo gluster volume heal volume1 info summary
Brick node1:/srv/gluster/brick1
Status: Connected
Number of entries: 0

Brick node2:/srv/gluster/brick1
Status: Connected
Number of entries: 0
Replica and disperse volumes should trend back toward 0 entries after the cluster catches up. A count that stays nonzero needs a heal review.
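Tracking the backlog over time is easier with a single number. The sketch below sums every line that ends its label with "Number of entries:", which covers the sample output above and is assumed to also match a "Total Number of entries:" line if the installed version prints one. heal_backlog_total is a hypothetical helper name.

```shell
# Sketch: read heal info output on stdin and print the summed entry count.
# A non-numeric count (for example "-" for an unreachable brick) adds 0.
heal_backlog_total() {
    awk '/Number of entries:/ { total += $NF } END { print total + 0 }'
}
```

Assumed usage: sudo gluster volume heal volume1 info summary | heal_backlog_total. Recording the value at intervals shows whether the backlog is shrinking.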
Related: How to heal a GlusterFS volume
- Check split-brain entries on replica or disperse volumes.
$ sudo gluster volume heal volume1 info split-brain
Brick node1:/srv/gluster/brick1
Number of entries: 0

Brick node2:/srv/gluster/brick1
Number of entries: 0
Any split-brain entry means file versions diverged across bricks and should not be treated as a routine backlog.
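Since any nonzero count matters here, a script can report just the affected bricks. This sketch assumes the per-brick count line reads either "Number of entries:" as in the sample output or "Number of entries in split-brain:"; split_brain_bricks is a hypothetical helper name.

```shell
# Sketch: read `heal VOLNAME info split-brain` output on stdin and print
# each brick that reports one or more split-brain entries, with its count.
split_brain_bricks() {
    awk '
        /^Brick / { brick = $2 }
        /^Number of entries( in split-brain)?:/ {
            if ($NF + 0 > 0) print brick " " $NF
        }
    '
}
```

Assumed usage: sudo gluster volume heal volume1 info split-brain | split_brain_bricks. Any output line is a reason to open the split-brain recovery guide.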
Related: How to check for split-brain in GlusterFS
- Check rebalance status after adding, removing, or replacing bricks.
$ sudo gluster volume rebalance volume1 status
Node       Rebalanced-files  size      scanned     failures   status
---------  ----------------  --------  ----------  ---------  ---------
node1      182               12.3GB    182         0          completed
node2      181               12.2GB    181         0          completed
failures should stay at 0 and status should reach completed before the brick-change work is considered finished.
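The two conditions on the rebalance status table (failures at 0, status completed) can be checked per node. The sketch below finds the failures column from the header row instead of assuming a fixed position, because other versions may print additional columns; rebalance_problems is a hypothetical helper name. If no header containing "failures" is found, it prints nothing, so treat empty output as healthy only when the table format matches.

```shell
# Sketch: read `gluster volume rebalance VOLNAME status` output on stdin and
# print each node with failures above 0 or a status other than completed.
rebalance_problems() {
    awk '
        !col {
            # Locate the failures column in the header row.
            for (i = 1; i <= NF; i++) if ($i == "failures") col = i
            next
        }
        /^[ \t]*-/ { next }                   # skip the dashed separator
        NF > col && $col ~ /^[0-9]+$/ {       # data rows have a numeric count
            if ($col + 0 > 0 || $0 !~ /completed/) print $1
        }
    '
}
```

Assumed usage: sudo gluster volume rebalance volume1 status | rebalance_problems.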
Related: How to rebalance a GlusterFS volume
- Check geo-replication workers when secondary replication is enabled.
$ sudo gluster volume geo-replication status
PRIMARY NODE  PRIMARY VOL  PRIMARY BRICK        SECONDARY USER  SECONDARY                      SECONDARY NODE  STATUS  CRAWL STATUS     LAST_SYNCED
node1         volume1      /srv/gluster/brick1  geoaccount      node4.example.net::volume1-dr  node4           Active  Changelog Crawl  2026-06-16 08:41:22
node2         volume1      /srv/gluster/brick1  geoaccount      node4.example.net::volume1-dr  node4           Active  Changelog Crawl  2026-06-16 08:41:20
Active workers with recent LAST_SYNCED values indicate the secondary is receiving changes.
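Faulty workers can be picked out of the status table with a simple word match. This is a rough sketch that assumes the status value is printed as the standalone word Faulty and that the node, volume, and brick are the first three whitespace-separated columns, as in the sample output; georep_faulty_workers is a hypothetical helper name. It does not judge whether LAST_SYNCED is recent.

```shell
# Sketch: read `gluster volume geo-replication status` output on stdin and
# print the primary node and brick of each row containing the word Faulty.
georep_faulty_workers() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "Faulty") { print $1 " " $3; break }
        }
    }'
}
```

Assumed usage: sudo gluster volume geo-replication status | georep_faulty_workers.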
Related: How to check GlusterFS geo-replication status
- Review recent glusterd warning and error logs.
$ sudo journalctl --unit=glusterd.service --priority=warning..alert --since "1 hour ago" --no-pager
-- No entries --
GlusterFS file logs are commonly under /var/log/glusterfs, including /var/log/glusterfs/glusterd.log, /var/log/glusterfs/glustershd.log, and brick logs under /var/log/glusterfs/bricks.
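The file logs can be filtered for the same severities. The sketch below assumes the usual GlusterFS log line layout, a bracketed timestamp followed by a single-letter level (W for warning, E for error, C for critical, A for alert, M for emergency); verify this against the installed version before relying on it. gluster_log_problems is a hypothetical helper name.

```shell
# Sketch: read a GlusterFS log on stdin and print lines whose level letter,
# directly after the bracketed timestamp, is W, E, C, A, or M.
gluster_log_problems() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E '^\[[^]]+\] [MACEW] ' || true
}
```

Assumed usage: sudo tail -n 500 /var/log/glusterfs/glusterd.log | gluster_log_problems.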
Related: How to check GlusterFS logs
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.