The O2CB heartbeat and services stack

What distinguishes OCFS2 is O2CB, its stack of five file system services.

O2CB manages shared file access within a cluster of servers. After setup, these services are transparent to users, and an OCFS2 file looks like any other file. However, the O2CB stack must be loaded and brought online each time the system restarts.

O2CB has these five services:
Node manager (NM)
Keeps track of all nodes in the cluster.
Heartbeat (HB)
Issues node-up and node-down notifications when nodes join or leave the cluster.
TCP
Provides communication between the nodes.
Distributed Lock Manager (DLM) and Distributed Lock Manager File System (DLMFS)
These two services ensure the consistency of the clustered file system:
DLM
Keeps track of all locks, lock holders, and lock status.
DLMFS
Provides the user space interface to DLM in the Linux® kernel.
CONFIGFS
Performs file-system-based management of kernel objects, known as config_items. The mount point is the /config directory.
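
Because the stack depends on CONFIGFS being mounted, one quick check is to look for a configfs entry in the kernel's mount table. The following is a minimal sketch in Python, assuming the standard /proc/mounts field layout (device, mount point, file system type, and so on). The function name and the sample text are illustrative and are not part of O2CB.

```python
from typing import List


def configfs_mount_points(mounts_text: str) -> List[str]:
    """Return the mount point of every configfs entry in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "configfs":
            points.append(fields[1])
    return points


# Sample text standing in for the contents of /proc/mounts (hypothetical values).
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
configfs /config configfs rw,relatime 0 0
"""

print(configfs_mount_points(SAMPLE_MOUNTS))
```

On a live system, the same function could be given the text read from /proc/mounts; an empty result would suggest that CONFIGFS is not mounted.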

Each cluster member creates the O2CB heartbeat by writing to a shared control area at regular intervals, using the designated interconnect, as proof that the node is alive. The default rate is one heartbeat every two seconds. The heartbeat is maintained even in a clustered file system with only one node.
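
The heartbeat idea can be modeled in a few lines. The following Python sketch is a toy model only, not the O2CB implementation: the class name, the in-memory dictionary that stands in for the shared control area, and the tolerance of three missed beats are all assumptions made for illustration. Only the two-second interval comes from the description above.

```python
from typing import Dict

HEARTBEAT_INTERVAL_S = 2.0  # default rate: one heartbeat every two seconds
MISSED_BEATS_ALLOWED = 3    # hypothetical tolerance, chosen only for this sketch


class SharedControlArea:
    """Toy stand-in for the shared control area: one timestamp slot per node."""

    def __init__(self) -> None:
        self._slots: Dict[int, float] = {}

    def write_heartbeat(self, node: int, now: float) -> None:
        # A node proves it is alive by refreshing its own slot.
        self._slots[node] = now

    def is_alive(self, node: int, now: float) -> bool:
        # A node counts as alive if its last write is recent enough.
        last = self._slots.get(node)
        if last is None:
            return False
        return (now - last) <= HEARTBEAT_INTERVAL_S * MISSED_BEATS_ALLOWED


area = SharedControlArea()
area.write_heartbeat(1, now=0.0)      # node 1 writes once, then goes silent
for t in (0.0, 2.0, 4.0, 6.0, 8.0):   # node 0 keeps writing every two seconds
    area.write_heartbeat(0, now=t)

print(area.is_alive(0, now=8.0))
print(area.is_alive(1, now=8.0))
```

In this model, a node that stops writing is reported as not alive once its last timestamp is older than the allowed window, which is the condition that would lead to a node-down notification.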

The Node Manager coordinates the heartbeat processes and responds when nodes in the cluster join, drop off, lose communication, or change status, whether intentionally or not. The Node Manager can trigger recovery. A key function of the Node Manager is to make it possible to free resources when a node drops off.
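
The reaction to membership changes can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: the function names, the node names, and the resource-to-owner table are hypothetical and do not reflect O2CB's internal data structures.

```python
from typing import Dict, List, Set, Tuple


def membership_events(previous: Set[str], current: Set[str]) -> List[Tuple[str, str]]:
    """Compare two membership snapshots and list node-up and node-down events."""
    events = [("node_up", node) for node in sorted(current - previous)]
    events += [("node_down", node) for node in sorted(previous - current)]
    return events


def free_resources_of(node: str, owners: Dict[str, str]) -> List[str]:
    """Remove and return every resource that was owned by the departed node."""
    freed = sorted(res for res, owner in owners.items() if owner == node)
    for res in freed:
        del owners[res]
    return freed


# Hypothetical resource-to-owner table and membership snapshots.
owners = {"inode:17": "node2", "inode:42": "node1", "journal:2": "node2"}
events = membership_events({"node1", "node2"}, {"node1", "node3"})

for kind, node in events:
    if kind == "node_down":
        print(node, "down, freed:", free_resources_of(node, owners))
    else:
        print(node, "up")
```

The point of the sketch is the ordering of responsibilities: membership changes are detected first, and the cleanup of a departed node's resources follows from the node-down event.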

The Distributed Lock System (DLS) is the key to how OCFS2 coordinates shared access to storage by multiple nodes. The DLS is a hierarchical locking system that allows requests to be enqueued for resources at levels ranging from high-level to granular. Locking is administered by the DLM.
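
To make the bookkeeping concrete, the following Python sketch tracks lock holders and queued requests for named resources. It is a deliberately reduced model: it supports only exclusive locks, omits the hierarchy and lock modes, and runs in a single process. The class name and resource names are assumptions for illustration and are not the DLM interface.

```python
from collections import deque
from typing import Deque, Dict, Optional


class ToyLockTable:
    """Tracks the holder of each resource and a FIFO queue of waiting nodes."""

    def __init__(self) -> None:
        self.holder: Dict[str, str] = {}
        self.waiters: Dict[str, Deque[str]] = {}

    def request(self, resource: str, node: str) -> str:
        # Grant immediately if the resource is free; otherwise enqueue the request.
        if resource not in self.holder:
            self.holder[resource] = node
            return "granted"
        self.waiters.setdefault(resource, deque()).append(node)
        return "queued"

    def release(self, resource: str, node: str) -> Optional[str]:
        # Only the current holder may release; the next waiter, if any, is granted.
        if self.holder.get(resource) != node:
            raise ValueError(f"{node} does not hold {resource}")
        queue = self.waiters.get(resource)
        if queue:
            next_node = queue.popleft()
            self.holder[resource] = next_node
            return next_node
        del self.holder[resource]
        return None


table = ToyLockTable()
first = table.request("file:/shared/a", "node1")
second = table.request("file:/shared/a", "node2")
next_holder = table.release("file:/shared/a", "node1")
print(first, second, next_holder)
```

A real distributed lock manager must also handle multiple lock modes, network messaging between nodes, and recovery of locks held by failed nodes, none of which this sketch attempts.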