IBM Support

IBM Spectrum Scale 5.1.1 levels Alert: possible undetected data corruption when an application writes to pre-allocated file blocks using direct I/O mode.

Flashes (Alerts)


Abstract

IBM has identified a problem with the IBM Spectrum Scale 5.1.1 code level (5.1.1.0 to 5.1.1.3; ESS 6.1.1.0 to 6.1.1.2), which can result in undetected data corruption when an application uses the direct I/O mode to write to a pre-allocated file block.

Content

Problem Summary:
The fallocate system call (Linux-specific) and the gpfs_prealloc API allow the caller to allocate disk space for a file. To process these requests, IBM Spectrum Scale allocates data blocks for the file and sets "phantom bits" in the inode or indirect block for these data blocks to mark them as uninitialised.
A similar situation can occur on an FPO file system. When a file is written in an FPO file system, data blocks are allocated in units of blockGroupFactor blocks. Any blocks in such a unit that are not covered by the write are marked as uninitialised by setting their "phantom bits".
In this alert, blocks with such a phantom bit are referred to as pre-allocated file blocks.
If an application uses direct I/O mode (O_DIRECT) to write to a file with pre-allocated file blocks, and all of the conditions listed under Users Affected are met, the "phantom bits" of the pre-allocated file blocks are not cleared properly. When the application later reads the data, a sequence of zeroes is returned instead of the data that was just written. This results in undetected data corruption.

Users Affected:

Users may be affected if all of the conditions below are true:

1) The file is replicated (the number of data replicas of the file is greater than 1)
2) The application writes to the file using direct I/O mode.
3) The direct I/O write size is large enough to cover at least three blocks, of which the beginning and ending blocks are normal blocks (not holes or pre-allocated blocks).
Example:
Assume that the file system block size is 1 MB.
// Pre-allocate 10 full blocks for the file (fd is an already open descriptor for the file)
fallocate(fd, 0, 0, 10485760);
close(fd);
// Reopen the file in direct I/O mode
fd = open(file, O_RDWR|O_DIRECT);
// Write 512 bytes to the beginning of the first block.
// This write is fine: it clears the phantom bit of the first pre-allocated block and turns it into a normal block.
write(fd, buf, 512);
// Write 512 bytes to the beginning of the third block.
// This write is fine: it clears the phantom bit of the third pre-allocated block and turns it into a normal block.
lseek(fd, 1048576 * 2, SEEK_SET);
write(fd, buf, 512);
// Write 2097152 bytes starting at offset 1024.
// This write covers the 1st, 2nd and 3rd blocks. The 1st and 3rd blocks are normal blocks and the 2nd block is a pre-allocated block.
// After the write, the phantom bit of the 2nd block is not properly cleared, which causes data loss.
lseek(fd, 1024, SEEK_SET);
write(fd, buf, 1048576 * 2);
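
The block arithmetic behind condition 3 can be sketched in C as follows. This is only an illustration, not part of IBM Spectrum Scale: the function names (first_block, last_block, blocks_spanned, meets_size_condition) are hypothetical, and the sketch assumes that "cover" means the write touches a block at least partially, as the final write in the example does. It checks only the size part of the condition. It cannot tell whether the first and last blocks are normal blocks or whether the file is replicated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero-based index of the first file system block touched by a write. */
uint64_t first_block(uint64_t offset, uint64_t block_size)
{
    assert(block_size > 0);
    return offset / block_size;
}

/* Zero-based index of the last block touched by a write of `length` bytes. */
uint64_t last_block(uint64_t offset, uint64_t length, uint64_t block_size)
{
    assert(block_size > 0 && length > 0);
    return (offset + length - 1) / block_size;
}

/* Number of blocks that the write touches, fully or partially. */
uint64_t blocks_spanned(uint64_t offset, uint64_t length, uint64_t block_size)
{
    return last_block(offset, length, block_size)
         - first_block(offset, block_size) + 1;
}

/*
 * Size part of condition 3 only (assumed reading: "cover" = "touch").
 * Returns true when the write touches at least three blocks.
 */
bool meets_size_condition(uint64_t offset, uint64_t length, uint64_t block_size)
{
    return blocks_spanned(offset, length, block_size) >= 3;
}
```

With a 1 MB block size, the final write in the example (offset 1024, length 2097152) touches blocks 0 through 2 and therefore meets the size condition, whereas each of the two 512-byte writes touches a single block and does not.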


Document Information

Modified date:
20 October 2021

UID

ibm16495047