Planning for the Highly Available Write Cache feature (HAWC)
Learn about the Highly Available Write Cache feature (HAWC).
Components that HAWC interacts with
HAWC interacts with several fundamental components of IBM Storage Scale. You might want to review these components before you read about HAWC.
- System storage pool
- The system storage pool, or system pool, is a required storage pool that contains information that IBM Storage Scale uses to manage a file system. Each file system has only one system storage pool, which is created automatically when the file system is created. The system storage pool contains the following types of information:
  - Control information (such as file system control structures, reserved files, directories, symbolic links, and special devices)
  - The metadata associated with regular files, including indirect blocks and extended attributes
  - Regular file data, if the usage=dataAndMetadata option is set in the NSD stanza for a system storage pool NSD
  - The file system recovery logs (default location)
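As a sketch, an NSD stanza that places a device's data and metadata in the system pool might look like the following (the device, NSD, and server names are hypothetical):

```
%nsd: device=/dev/sdb
  nsd=nsd1
  servers=node01,node02
  usage=dataAndMetadata
  pool=system
```

To place a nonvolatile device's recovery logs in a dedicated pool instead, the same stanza form can specify pool=system.log.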
- System.log storage pool
- The system.log storage pool is an optional dedicated storage pool that contains only the file system recovery logs. If you define this pool, then IBM Storage Scale uses it for all the file system recovery logs of the file system. Otherwise, the file system recovery logs are kept in the system storage pool. It is a good practice for the system.log pool to consist of storage media that is as fast as or faster than the storage media of the system storage pool. If the storage is nonvolatile, this pool can be used for the high-availability write cache (HAWC).
- File system recovery logs
- A file system recovery log is a write-ahead log or journal of I/O metadata that describes pending write operations for a node of a file system. In IBM Storage Scale, it is also sometimes referred to as the recovery log, the GPFS log, or the IBM Storage Scale log. IBM Storage Scale creates and maintains a separate recovery log for every node that mounts a file system. Recovery logs are stored in the system storage pool by default, or in the system.log storage pool if one is defined. The recovery logs can be read by any node that mounts the file system. If a node is shut down unexpectedly while write operations are pending for one of its hard disks, IBM Storage Scale can read the recovery log for the failed node and restore the file system to a consistent state. The recovery can occur immediately, without having to wait for the failed node to return. The recovery logs are also used by HAWC to temporarily store HAWC write data and metadata.
- Page pool
- The page pool is an area of pinned memory (memory that is never paged to disk) that contains file data and metadata associated with in-progress I/O operations. When IBM Storage Scale processes a file write operation, the first step is putting the write data and metadata for the write operation into the page pool. At an appropriate time, another thread writes the data to a hard disk and removes it from the page pool.
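The page pool size is a cluster configuration attribute. As a sketch (the value shown is illustrative, not a sizing recommendation), it can be displayed and adjusted with the mmlsconfig and mmchconfig commands on a live cluster:

```
# Display the current page pool size
mmlsconfig pagepool

# Set the page pool to 4 GiB on all nodes; takes effect at the next GPFS restart
mmchconfig pagepool=4G
```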
HAWC operation
The high-availability write cache is a disk-caching component that includes caching software and nonvolatile storage. HAWC also uses the file system recovery logs, in which the file system records metadata about its pending write operations. For HAWC purposes, the recovery logs must be located in nonvolatile storage.
When HAWC is not active, the write data is copied from the page pool entry and written to the file on hard disk. If the write operation is synchronous, the system then notifies the write operation that the write is successful, and the operation returns to its caller. When HAWC is active, the system processes a write operation in one of the following two ways:
- If the write operation is synchronous and the size of the write data is less than or equal to the write cache threshold, HAWC copies the file data from the page pool entry into the recovery log, along with any I/O metadata that is required for recovery. The write cache threshold is set by the mmcrfs command or the mmchfs command. HAWC then notifies the original write operation that the file data is successfully written. In fact, the file data is not written to hard disk yet, although it is preserved in the recovery log as a backup. HAWC then starts a write-behind thread that eventually writes the file data to the hard disk. When the data is safely written, HAWC purges the file data and I/O metadata from the recovery log, because they are no longer needed.
- If the write operation is not synchronous, or if the size of the write data is greater than the write cache threshold, the write data follows the same path that is followed when HAWC is not active: the system copies the write data from the page pool entry and writes it to hard disk. If the original write operation is synchronous, the system notifies it that the file data is safely written to the hard disk.
HAWC improves the performance of small synchronous write operations in two ways. First, it allows synchronous write operations to return to the calling application as soon as the write data is written into the recovery log. The calling application does not have to wait for the much lengthier process of writing the data to hard disk. Second, the HAWC caching software can consolidate small sequential writes into one larger write. This consolidation eliminates all but one of the initial disk seeks that would be required if the data were written as multiple separate writes.
The write cache threshold can be adjusted by specifying a value for the --write-cache-threshold parameter of the mmchfs command. The valid range is 0 - 64 K in multiples of 4 K. You can also set this variable when you create the file system by specifying the same parameter in the mmcrfs command. Setting the write cache threshold to zero disables HAWC. You can update the write cache threshold at any time; the file system does not have to be mounted on the node.
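As a sketch (the file system name fs1 is hypothetical), HAWC can be enabled or disabled with the mmchfs command on a live cluster:

```
# Enable HAWC with a 32 KiB write cache threshold
mmchfs fs1 --write-cache-threshold 32K

# Disable HAWC
mmchfs fs1 --write-cache-threshold 0
```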
HAWC storage scenarios

In this scenario, when a synchronous write operation arrives at a node, the file data and metadata are copied to a page pool entry in the usual way. If the size of the file data is less than the write cache threshold, HAWC copies the file data into the recovery log along with any I/O metadata that is required for recovery. Next, HAWC returns an acknowledgment to the write operation that indicates that the file data is successfully written to hard disk. HAWC then starts a write-behind thread that eventually writes the file data to the hard disk. When the data is safely written, HAWC purges the file data and I/O metadata for the write operation from the recovery log.

HAWC software configuration
- Stop the GPFS daemon on all the nodes of the cluster.
- Create NSD stanzas for the nonvolatile storage devices. In the stanza, specify one storage pool for all the nonvolatile storage devices, which must be either the system pool or the system.log pool.
- Run mmcrnsd to create the NSDs.
- Run mmadddisk to add the NSDs to the file system and to create the system.log pool if necessary.
- Start the GPFS daemons on all nodes.
- Optionally, run the mmchfs command with the -L parameter to set the size of the recovery logs to a non-default value.
- Optionally, run the mmchfs command with the --log-replicas parameter to set the number of replicas of the recovery log to a non-default value. This option is applicable only if the recovery logs are stored in the system.log pool.
- To activate HAWC, run the mmchfs command with the --write-cache-threshold parameter set to a nonzero value.
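The steps above can be sketched as follows (the file system name fs1, the stanza file nsd.stanza, and the parameter values are hypothetical; the stanza file is assumed to contain NSD definitions for the nonvolatile devices):

```
# Stop the GPFS daemon on all nodes of the cluster
mmshutdown -a

# Create the NSDs from the stanza file, then add them to the file system;
# mmadddisk creates the system.log pool if the stanzas reference it
mmcrnsd -F nsd.stanza
mmadddisk fs1 -F nsd.stanza

# Start the GPFS daemon on all nodes
mmstartup -a

# Optional: set the recovery log size and the number of log replicas
# (--log-replicas applies only when the logs are in the system.log pool)
mmchfs fs1 -L 128M
mmchfs fs1 --log-replicas 2

# Activate HAWC by setting a nonzero write cache threshold
mmchfs fs1 --write-cache-threshold 32K
```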
HAWC is now active.