Using HAWC

Learn how to enable HAWC, set up storage for the recovery log, and do administrative tasks.

Enabling HAWC

To enable HAWC, set the write cache threshold for the file system to a multiple of 4 KB in the range 4 KB to 64 KB. The following example shows how to set the threshold for an existing file system:
mmchfs gpfsA --write-cache-threshold 32K
The following example shows how to specify the threshold when you create a new file system:
mmcrfs /dev/gpfsB -F ./diskdef2.txt -B1M --write-cache-threshold 32K -T /gpfs/gpfsB

After HAWC is enabled, all synchronous write requests less than or equal to the write cache threshold are put into the recovery log. The file system sends a response to the application after it puts the write request in the log. If the size of the synchronous write request is greater than the threshold, the data is written directly to the primary storage system in the usual way.
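To confirm the setting, you can query the attribute with the mmlsfs command. The following is a minimal check, assuming the example file system gpfsA from above (support for querying individual attributes can vary by release; omit the flag to list all attributes):

```shell
# Display the current write cache threshold for file system gpfsA
# (a value of 0 means HAWC is disabled)
mmlsfs gpfsA --write-cache-threshold
```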

Setting up the recovery log in fast storage

Proper storage for the recovery log is important to improve the performance of small synchronous writes and to ensure that written data survives node or disk failures. Two methods are available:
Method 1: Centralized fast storage
In this method, the recovery log is stored on a centralized fast storage device such as a storage controller with SSDs, a flash system, or an IBM Elastic Storage™ Server (ESS) with SSDs.

You can use this configuration on any storage that contains the system pool or the system.log pool. The faster the metadata pool is relative to the data storage, the more HAWC can help.

Method 2: Distributed fast storage in client nodes
In this method, the recovery log is stored on IBM Spectrum Scale™ client nodes on local fast storage devices, such as SSDs, NVRAM, or other flash devices.

The local device NSDs must be in the system.log storage pool. The system.log storage pool contains only the recovery logs.
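The following sketch shows how such NSDs might be defined in a stanza file; all device paths, NSD names, and node names are placeholder assumptions, and the usage setting reflects that the pool holds only log metadata:

```shell
# locallog.stanza -- example NSD stanzas for local SSDs in two client nodes
# (device paths, NSD names, and server names are placeholders)
%nsd: device=/dev/ssd0
  nsd=client01log
  servers=client01
  usage=metadataOnly
  pool=system.log

%nsd: device=/dev/ssd0
  nsd=client02log
  servers=client02
  usage=metadataOnly
  pool=system.log
```

After you create the NSDs with mmcrnsd -F locallog.stanza, you can add them to the file system with the mmadddisk command.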

Because local storage in an IBM Spectrum Scale node is not highly available (a node failure makes its storage devices inaccessible), it is a good idea to enable at least two replicas of the system.log pool.

Use the mmchfs command with the --log-replicas parameter to specify a replication factor for the system.log pool. Together with the system.log pool, this parameter lets you place the recovery logs in a separate pool whose replication setting differs from that of the other metadata in the system pool.

You can change log replication dynamically by running the mmchfs command followed by the mmrestripefs command. However, you can enable log replication only if the file system was created with a maximum number of metadata replicas of 2 or 3. (See the -M option of the mmcrfs command.)
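For example, assuming a file system gpfsA that was created with -M 2 or higher:

```shell
# Keep two copies of the recovery log in the system.log pool
mmchfs gpfsA --log-replicas 2

# Apply the new replication setting to existing log data
mmrestripefs gpfsA -R
```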

Administrative tasks

Learn how to do the following administrative tasks with HAWC:
Restriping after you add or remove a disk
As with any other pool, after you add or remove a disk from the system.log pool, run the mmrestripefs -b command to rebalance the pool.
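For example, assuming the example file system gpfsA:

```shell
# Rebalance data across all disks, including those in the system.log pool
mmrestripefs gpfsA -b
```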

Preparing for a node or disk failure in the system.log pool
  • If the system.log is replicated, you can run the following command to ensure that data is replicated automatically after a node or disk fails:
    mmchconfig restripeOnDiskFailure=yes -i
  • You can run the following command to set how long the file system waits to start a restripe after a node or disk failure:
    mmchconfig metadataDiskWaitTimeForRecovery=Seconds
    where Seconds is the number of seconds to wait. This setting helps avoid a restripe after a temporary outage, such as a node reboot. The default is 300 seconds.
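For example, to wait one minute instead of the default five minutes (the value 60 is an illustrative choice, not a recommendation):

```shell
# Start the restripe 60 seconds after a node or disk failure is detected
mmchconfig metadataDiskWaitTimeForRecovery=60 -i
```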

Adding HAWC to an existing file system
Follow these steps:
  1. If the metadata pool is not on a fast storage device, migrate the pool to a fast storage device. For more information, see Managing storage pools.
  2. Increase the size of the recovery log to at least 128 MB. Enter the following command:
    mmchfs Device -L LogFileSize
    where LogFileSize is the size of the recovery log. For more information, see the topic mmchfs command.
  3. Enable HAWC by setting the write cache threshold, as described earlier in this topic.
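Assuming the example file system gpfsA, steps 2 and 3 can be summarized as follows (depending on your release, the log size change might not take effect until the file system is remounted):

```shell
mmchfs gpfsA -L 128M                      # step 2: increase the recovery log size
mmchfs gpfsA --write-cache-threshold 32K  # step 3: enable HAWC
```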