gpfs_prealloc() subroutine

Preallocates storage for a regular file or preallocates directory entries for a directory.

Library

GPFS Library (libgpfs.a for AIX®, libgpfs.so for Linux®)

Synopsis

#include <gpfs.h>
int gpfs_prealloc(gpfs_file_t fileDesc,
                  gpfs_off64_t startOffset,
                  gpfs_off64_t bytesToPrealloc);

Description

The gpfs_prealloc() subroutine preallocates disk storage for a file or a directory.

In the case of a regular file, preallocation can improve I/O performance by creating a block of available disk storage in the file immediately instead of increasing the file size incrementally as data is written. The preallocated disk storage begins at the specified offset in the file and extends for the specified number of bytes, rounded up to a GPFS block boundary. Existing data is not modified. Reading any of the preallocated blocks returns zeroes. To determine how much space was actually preallocated, call the stat() subroutine and compare the reported file size and number of blocks used with their values before the preallocation.
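
The before-and-after comparison that is described above can be sketched as a small helper. This is a minimal illustration, not part of the GPFS library: the prealloc_delta structure and the prealloc_growth() function are hypothetical names, and the sketch assumes that the caller takes one fstat() snapshot before calling gpfs_prealloc() and one after it.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper type: growth observed between two fstat() snapshots. */
struct prealloc_delta {
    long long size_bytes;  /* change in st_size, in bytes */
    long long blocks;      /* change in st_blocks (typically 512-byte units) */
};

/* Compare the snapshot taken before the preallocation call with the
 * snapshot taken after it. */
struct prealloc_delta prealloc_growth(const struct stat *before,
                                      const struct stat *after)
{
    struct prealloc_delta d;
    d.size_bytes = (long long)after->st_size - (long long)before->st_size;
    d.blocks = (long long)after->st_blocks - (long long)before->st_blocks;
    return d;
}
```

A caller would typically run fstat(fd, &before), then gpfs_prealloc(), then fstat(fd, &after), and pass both snapshots to the helper.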

In the case of a directory, preallocation can improve metadata performance by setting the minimum compaction size of the directory. Preallocation is most effective in systems in which many files are added to and removed from a directory in a short time. The minimum compaction size is the number of directory slots, including both full and empty slots, that a directory is allowed to retain when it is compacted. In IBM Storage Scale 4.1 or later versions, by default, a directory is automatically compacted as far as possible.

The gpfs_prealloc() subroutine sets the minimum compaction size of a directory to the specified number of slots and adds directory slots if needed to reach the minimum size. For example, if a directory contains 5,000 files and you set the minimum compaction size to 50,000, then the file system adds 45,000 directory slots. The directory can grow beyond 50,000 entries, but the file system does not allow the directory to be compacted below 50,000 slots.

You must specify the minimum compaction size as a number of bytes. Determine the number of bytes to allocate with the following formula:
bytesToPrealloc = n * ceiling((namelen + 13) / 32) * 32
where:
n
Specifies the number of directory entries that you want.
ceiling()
Is a function that rounds a fractional number up to the next highest integer. For example, ceiling(1.03125) returns 2.
namelen
Specifies the expected average length of file names.
For example, if you want 20,000 entries with an average file name length of 48, then bytesToPrealloc = 20000 * 2 * 32 = 1,280,000.
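
The formula can also be evaluated in integer arithmetic. The following sketch assumes that n and namelen are non-negative; dir_prealloc_bytes() is a hypothetical helper name and is not part of the GPFS library.

```c
#include <stdint.h>

/* Hypothetical helper: bytes to pass as bytesToPrealloc for a directory
 * that should retain room for n entries with an average name length of
 * namelen. Implements n * ceiling((namelen + 13) / 32) * 32. */
int64_t dir_prealloc_bytes(int64_t n, int64_t namelen)
{
    /* Integer form of ceiling((namelen + 13) / 32). */
    int64_t slots_per_entry = (namelen + 13 + 31) / 32;
    return n * slots_per_entry * 32;
}
```

With n = 20000 and namelen = 48, the helper returns 1280000, which matches the worked example above.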

To restore the default behavior of the file system for a directory, in which a directory is compacted as far as possible, call the subroutine with bytesToPrealloc set to 0.

The number of bytes that are allocated for a directory is stored in the ia_dirminsize field in the gpfs_iattr64_t structure that is returned by the gpfs_fstat_x() subroutine. To convert the number of bytes to the number of directory slots, apply the following rule:
numSlots = numBytesReturned / 32
To convert the number of bytes to the number of files in the directory, apply a version of the formula that is described above:
numFiles = numBytesReturned / (ceiling((namelen + 13) / 32) * 32)
If the file is not a directory or is a directory with no preallocation, the ia_dirminsize field is set to 0. This attribute is reported for information only and is meaningful only in the active file system. It is not backed up and restored and it is ignored in snapshots. It usually differs from the file size that is recorded in ia_size or returned by stat().
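
The two conversions can be sketched in integer arithmetic as follows. The helper names dirmin_to_slots() and dirmin_to_files() are hypothetical and are not part of the GPFS library. The sketch assumes that each entry occupies ceiling((namelen + 13) / 32) * 32 bytes, as in the formula for bytesToPrealloc, and that the input is the value read from the ia_dirminsize field.

```c
#include <stdint.h>

/* Hypothetical helper: number of 32-byte directory slots that a given
 * ia_dirminsize value represents. */
int64_t dirmin_to_slots(int64_t dirminsize)
{
    return dirminsize / 32;
}

/* Hypothetical helper: approximate number of files, given the expected
 * average file name length. */
int64_t dirmin_to_files(int64_t dirminsize, int64_t namelen)
{
    /* Bytes per entry: ceiling((namelen + 13) / 32) * 32. */
    int64_t bytes_per_entry = ((namelen + 13 + 31) / 32) * 32;
    return dirminsize / bytes_per_entry;
}
```

For a value of 1280000 and an average name length of 48, the helpers return 40000 slots and 20000 files.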

To set the minimum compaction size of a directory from the command line, set the compact parameter in the mmchattr command.

Note: Compile any program that uses this subroutine with the -lgpfs flag to link against the following library:
  • libgpfs.a for AIX
  • libgpfs.so for Linux

Parameters

fileDesc
The file descriptor returned by open(). Note the following points:
  • A file must be opened for writing.
  • A directory can be opened for reading, but the caller must have write permission to the directory.
startOffset
For a file, the byte offset into the file at which to begin preallocation. For a directory, set this parameter to 0.
bytesToPrealloc
The number of bytes to be preallocated. For a file, this value is rounded up to a GPFS block boundary. For a directory, calculate the number of bytes with the formula that is described above.

Exit status

If the gpfs_prealloc() subroutine is successful, it returns a value of 0.

If the gpfs_prealloc() subroutine is unsuccessful, it returns a value of -1 and sets the global error variable errno to indicate the nature of the error. If errno is set to one of the following, some storage may have been preallocated:
  • EDQUOT
  • ENOSPC
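
A caller that needs to distinguish these two cases from other failures might test errno as in the following sketch. The function name prealloc_may_be_partial() is hypothetical and is not part of the GPFS library. The value that is passed in is assumed to be the errno value saved immediately after gpfs_prealloc() returned -1.

```c
#include <errno.h>

/* Hypothetical helper: returns nonzero when the saved errno value means
 * that some storage may already have been preallocated. */
int prealloc_may_be_partial(int err)
{
    return err == EDQUOT || err == ENOSPC;
}
```

When the helper returns nonzero, the caller can use stat() to find out how much space was actually allocated before it decides whether to retry.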

The preallocation size, or minimum compaction setting, of a directory can be obtained with the gpfs_fstat_x() subroutine. The gpfs_iattr64_t structure that it returns, which is defined in gpfs.h, includes the ia_dirminsize field, which gives the preallocation size of a directory. For non-directories and for directories without preallocation, the value is 0. The size is expressed in bytes and is interpreted in the same way as for the gpfs_prealloc() subroutine. The value is 32 times the value that is reported by the mmlsattr command and generally differs from the file size that is reported in ia_size and by stat().

Exceptions

None.

Error status

Error codes include but are not limited to the following:

EACCES
The file or directory is not opened for writing.
EBADF
The file descriptor is not valid.
EDQUOT
A disk quota has been exceeded.
EINVAL
The file descriptor does not refer to a file or directory, or a negative value was specified for startOffset or bytesToPrealloc.
ENOSPC
The file system has run out of disk space.
ENOSYS
The gpfs_prealloc() subroutine is not supported under the current file system format.

Examples

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <gpfs.h>

int main(void)
{
  int rc;
  int fileHandle = -1;
  const char *fileNameP = "datafile";
  gpfs_off64_t startOffset = 0;
  gpfs_off64_t bytesToAllocate = 20*1024*1024;  /* 20 MiB */

  /* A regular file must be opened for writing. */
  fileHandle = open(fileNameP, O_RDWR|O_CREAT, 0644);
  if (fileHandle < 0)
  {
    perror(fileNameP);
    exit(1);
  }

  rc = gpfs_prealloc(fileHandle, startOffset, bytesToAllocate);
  if (rc < 0)
  {
    /* With EDQUOT or ENOSPC, some storage may have been preallocated. */
    fprintf(stderr, "Error %d preallocating at %lld for %lld in %s\n",
            errno, (long long)startOffset, (long long)bytesToAllocate,
            fileNameP);
    exit(1);
  }

  close(fileHandle);
  return 0;
}

Location

/usr/lpp/mmfs/lib/libgpfs.a for AIX

/usr/lpp/mmfs/lib/libgpfs.so for Linux