Using HAWC
Learn how to enable HAWC, set up storage for the recovery log, and do administrative tasks.
Enabling HAWC
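To enable HAWC, set the write cache threshold of the file system. The following example sets the threshold for an existing file system:
mmchfs gpfsA --write-cache-threshold 32K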
The following example shows how to specify the threshold when you create a new file system:
mmcrfs /dev/gpfsB -F ./diskdef2.txt -B1M --write-cache-threshold 32K -T /gpfs/gpfsB
After HAWC is enabled, all synchronous write requests less than or equal to the write cache threshold are put into the recovery log. The file system sends a response to the application after it puts the write request in the log. If the size of the synchronous write request is greater than the threshold, the data is written directly to the primary storage system in the usual way.
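The routing rule described above can be sketched as a small shell function. This is only an illustration of the size comparison: hawc_route is a hypothetical helper, not an IBM Spectrum Scale command, and it assumes that sizes are given in bytes (32768 for a 32K threshold).

```shell
# Illustrative only: hawc_route is a hypothetical helper, not an IBM Spectrum Scale command.
# $1 = size of the synchronous write in bytes, $2 = write cache threshold in bytes
hawc_route() {
  write_size=$1
  threshold=$2
  if [ "$write_size" -le "$threshold" ]; then
    echo "recovery-log"      # at or below the threshold: logged, then acknowledged
  else
    echo "primary-storage"   # above the threshold: written directly to primary storage
  fi
}

hawc_route 4096 32768    # a 4096-byte write
hawc_route 32768 32768   # a write exactly at the threshold
hawc_route 65536 32768   # a 65536-byte write
```

Run as written, the three calls print recovery-log, recovery-log, and primary-storage.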
Setting up the recovery log in fast storage
- Method 1: Centralized fast storage
- In this method, the recovery log is stored on a centralized fast storage device such as a storage controller with SSDs, a flash system, or an IBM Elastic Storage™ Server (ESS) with SSDs.
You can use this configuration on any storage that contains the system pool or the system.log pool. The faster the metadata pool is relative to the data storage, the more HAWC helps.
- Method 2: Distributed fast storage in client nodes
- In this method, the recovery log is stored on IBM Spectrum Scale™ client nodes on local fast storage devices, such as SSDs, NVRAM, or other flash devices.
The local device NSDs must be in the system.log storage pool. The system.log storage pool contains only the recovery logs.
It is a good idea to enable at least two replicas of the system.log pool. Local storage in an IBM Spectrum Scale node is not highly available, because a node failure makes the storage device inaccessible.
Use the mmchfs command with the --log-replicas parameter to specify a replication factor for the system.log pool. This parameter, with the system.log capability, is intended to place log files in a separate pool with replication different from other metadata in the system pool.
You can change log replication dynamically by running the mmchfs command followed by the mmrestripefs command. However, you can enable log replication only if the file system was created with a maximum number of metadata replicas of 2 or 3. (See the -M option of the mmcrfs command.)
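As a rough sketch of the precondition above, the following shell function checks the maximum metadata replica count before it prints an mmchfs command. The function log_replicas_cmd is a hypothetical helper and the file system name gpfsA is reused from the earlier example. The command is printed rather than run, and the caller is assumed to already know the -M value that was used when the file system was created.

```shell
# Illustrative only: log_replicas_cmd is a hypothetical helper. It prints the
# mmchfs command instead of running it.
# $1 = file system name
# $2 = desired number of log replicas
# $3 = maximum metadata replicas (the -M value used at file system creation)
log_replicas_cmd() {
  case "$3" in
    2|3)
      echo "mmchfs $1 --log-replicas $2"
      ;;
    *)
      echo "log replication requires a maximum metadata replicas value of 2 or 3" >&2
      return 1
      ;;
  esac
}

log_replicas_cmd gpfsA 2 2
```

The mmrestripefs step is left out of the sketch because the text does not say which options to pass to it for this case.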
Administrative tasks
- Restriping after you add or remove a disk
- As with any other pool, after you add or remove a disk from the system.log pool, run the mmrestripefs -b command to rebalance the pool.
- Preparing for a node or disk failure in the system.log pool
- If the system.log pool is replicated, you can run the following command to ensure that data is replicated automatically after a node or disk fails:
mmchconfig restripeOnDiskFailure=yes -i
- You can run the following command to set how long the file system waits to start a restripe after a node or disk failure:
mmchconfig metadataDiskWaitTimeForRecovery=Seconds
where Seconds is the number of seconds to wait. This setting helps to avoid doing a restripe after a temporary outage such as a node rebooting. The default time is 300 seconds.
- Adding HAWC to an existing file system
- Follow these steps:
- If the metadata pool is not on a fast storage device, migrate the pool to a fast storage device. For more information, see Managing storage pools.
- Increase the size of the recovery log to at least 128 MB. Enter the following command:
mmchfs Device -L LogFileSize
where LogFileSize is the size of the recovery log. For more information, see the topic mmchfs command.
- Enable HAWC by setting the write cache threshold, as described earlier in this topic.
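The last two steps can be sketched as a dry run that only prints the commands, so that they can be reviewed before anything changes. The function add_hawc_plan is a hypothetical helper, gpfsA and 32K are reused from the earlier examples, and writing the 128 MB minimum as 128M is an assumption about the size notation. Migrating the metadata pool to fast storage is not covered.

```shell
# Illustrative only: add_hawc_plan is a hypothetical helper that prints the
# commands for the last two steps without running them.
# $1 = file system name
# $2 = recovery log size (assumed notation, for example 128M)
# $3 = write cache threshold (for example 32K)
add_hawc_plan() {
  echo "mmchfs $1 -L $2"                        # increase the recovery log size
  echo "mmchfs $1 --write-cache-threshold $3"   # enable HAWC
}

add_hawc_plan gpfsA 128M 32K
```

Printing the commands first keeps the sketch safe to run on a node where no file system should be modified.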