lvdd Special File

Purpose

Provides access to the logical volume device driver.

Description

The logical volume device driver provides character (raw) access to logical volumes. The Logical Volume Manager associates a major number with each volume group (VG) and a minor number with each logical volume in a volume group.

Logical volume special file names can be assigned by the administrator of the system. However, /dev/lv1, /dev/lv2 and /dev/rlv1, /dev/rlv2 are the names conventionally chosen.

When performing character I/O, each request must start on a logical block boundary of the logical volume. The logical block size of the logical volume is the block size of the physical volume within the volume group. This means that for character I/O to a logical volume device, the offset supplied to the lseek subroutine must be a multiple of the logical block size. In addition, the number of bytes to be read or written, supplied to the read or write subroutine, must be a multiple of the logical block size.
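
The alignment rule above can be checked before a request is issued. The following is a minimal sketch, not part of the driver interface: the helper names lv_request_is_aligned and lv_round_up are hypothetical, and the block size is assumed to have been obtained elsewhere (for example, from the bytpsec value returned by IOCINFO).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when both the starting offset and the
 * transfer length are multiples of the logical block size, which is what
 * character I/O to a logical volume requires. */
bool lv_request_is_aligned(uint64_t offset, uint64_t nbytes, uint32_t block_size)
{
    assert(block_size > 0);
    return (offset % block_size == 0) && (nbytes % block_size == 0);
}

/* Hypothetical helper: rounds a byte count up to the next multiple of the
 * logical block size, for sizing a buffer that can hold a whole request. */
uint64_t lv_round_up(uint64_t nbytes, uint32_t block_size)
{
    assert(block_size > 0);
    return ((nbytes + block_size - 1) / block_size) * block_size;
}
```

With a 512-byte block size, an offset of 1024 and a length of 4096 pass the check, while an offset of 100 or a length of 700 does not.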

Usage Considerations

Note: Data corruption, loss of data, or loss of system integrity (system crashes) will occur if devices supporting paging, logical volumes, or mounted file systems are accessed using block special files. Block special files are provided for logical volumes and disk devices on the operating system and are solely for system use in managing file systems, paging devices and logical volumes. They should not be used for other purposes. Additional information concerning the use of special files may be obtained in "Understanding I/O Access through Special Files" in Kernel Extensions and Device Support Programming Concepts.

open and close Subroutines

No special considerations.

Extension Word Specification for the readx and writex Subroutines

The ext parameter for the readx and writex extended I/O subroutines indicates specific physical or logical operations, or both. The upper 4 bits of the ext parameter are reserved for internal LVDD use. The value of the ext parameter is defined by logically ORing values from the following list, as defined in the /usr/include/sys/lvdd.h file:

Item Description
WRITEV Perform physical write verification on this request. This operation can be used only with the writex subroutine.
RORELOC For this request, perform relocation on existing relocated defects only. Newly detected defects should not be relocated.
MWC_RCV_OP Mirror-write-consistency recovery operation. This option is used by the recovery software to make consistent all mirrors with writes outstanding at the time of the crash.
NOMWC Inhibit mirror-write-consistency recovery for this request only. This operation can only be used with the writex subroutine.
AVOID_C1, AVOID_C2, AVOID_C3 For this request, avoid the specified mirror. This operation can only be used with the readx subroutine.
RESYNC_OP For this request, synchronize the specified logical track group (LTG). This operation can only be used with the readx subroutine and must be the only operation. When synchronizing a striped logical volume, the data returned is not usable by the application because the logical track group is not read on a striped basis.
LV_READ_BACKUP Read only the mirror copy that is designated as the backup mirror copy.
LV_WRITE_BACKUP Write only the mirror copy that is designated as the backup mirror copy.
LV_READ_ONLY_C1 Read only copy one of the data.
LV_READ_ONLY_C2 Read only copy two of the data.
LV_READ_ONLY_C3 Read only copy three of the data.
LV_READ_STALE_C1 Read only copy one of the data even if it is stale.
LV_READ_STALE_C2 Read only copy two of the data even if it is stale.
LV_READ_STALE_C3 Read only copy three of the data even if it is stale.

There are some restrictions when using the RESYNC_OP operation. To synchronize a whole logical partition (LP), a series of readx subroutines using the RESYNC_OP operation must be done. The series must start with the first logical track group (LTG) in the partition and proceed sequentially to the last LTG. Any deviation from this sequence results in an error. The length provided to each readx operation must be exactly 128 KB (the LTG size).

Normal I/O can be done concurrently anywhere in the logical partition while the RESYNC_OP is in progress. If an error is returned, the series must be restarted from the first LTG. An error is returned only if resynchronization fails for every stale physical partition copy of any logical partition. Therefore, stale physical partitions are still possible at the end of synchronizing an LP.
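
The sequencing and restart rules for RESYNC_OP can be sketched as a loop over the LTGs of one logical partition. This is an illustrative sketch only. The function resync_partition, the callback type resync_ltg_fn, the stand-in callback count_ltg, and the max_attempts limit are all hypothetical. On AIX, the callback would position the file at the given offset and issue a readx request of exactly 128 KB with RESYNC_OP as the only ext value.

```c
#include <stddef.h>
#include <stdint.h>

#define LTG_SIZE (128u * 1024u)  /* LTG size stated in the text: 128 KB */

/* Hypothetical callback: resynchronizes the LTG that starts at byte `offset`.
 * Returns 0 on success and -1 on error. */
typedef int (*resync_ltg_fn)(uint64_t offset, void *ctx);

/* Walks every LTG of one logical partition in order, first to last.
 * If any LTG fails, the series is restarted from the first LTG, up to
 * max_attempts times. Returns 0 on success, -1 otherwise. */
int resync_partition(uint64_t lp_start, uint64_t lp_size,
                     resync_ltg_fn resync_ltg, void *ctx, int max_attempts)
{
    if (resync_ltg == NULL || lp_size == 0 || lp_size % LTG_SIZE != 0)
        return -1;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        uint64_t done = 0;
        while (done < lp_size) {
            if (resync_ltg(lp_start + done, ctx) != 0)
                break;               /* error: restart from the first LTG */
            done += LTG_SIZE;
        }
        if (done == lp_size)
            return 0;                /* every LTG was processed in order */
    }
    return -1;
}

/* Stand-in callback for demonstration: counts calls and never fails. */
int count_ltg(uint64_t offset, void *ctx)
{
    (void)offset;
    ++*(int *)ctx;
    return 0;
}
```

For a 4 MB logical partition, the loop issues 32 requests. Because an error is reported only when resynchronization fails for every stale copy, a successful return does not guarantee that no stale physical partitions remain.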

Normal I/O operations do not need to supply the ext parameter and can use the read and write subroutines.
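
The per-subroutine restrictions on the ext values can be expressed as a small validity check. The sketch below is an assumption-laden illustration: the EX_ constants use made-up bit values and are not the definitions from /usr/include/sys/lvdd.h, and the function names are hypothetical. Real code should include the header and use its constants.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only. They are NOT the values
 * defined in /usr/include/sys/lvdd.h. */
enum {
    EX_WRITEV    = 1 << 0,
    EX_RORELOC   = 1 << 1,
    EX_NOMWC     = 1 << 2,
    EX_AVOID_C1  = 1 << 3,
    EX_AVOID_C2  = 1 << 4,
    EX_AVOID_C3  = 1 << 5,
    EX_RESYNC_OP = 1 << 6
};

/* readx: WRITEV and NOMWC are write-only options, and RESYNC_OP must be
 * the only operation when it is present. */
bool ext_valid_for_readx(int ext)
{
    if (ext & (EX_WRITEV | EX_NOMWC))
        return false;
    if ((ext & EX_RESYNC_OP) && ext != EX_RESYNC_OP)
        return false;
    return true;
}

/* writex: the AVOID_Cn options and RESYNC_OP are read-only options. */
bool ext_valid_for_writex(int ext)
{
    return (ext & (EX_AVOID_C1 | EX_AVOID_C2 | EX_AVOID_C3 | EX_RESYNC_OP)) == 0;
}
```

Under these assumptions, combining an avoid-mirror option with RORELOC is accepted for readx, while combining RESYNC_OP with any other option is rejected.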

IOCINFO ioctl Operation

The IOCINFO ioctl operation returns the devinfo structure, which is defined in the /usr/include/sys/devinfo.h file. The logical block size of the logical volume is the block size of the physical volume within the volume group. The values returned in this structure are defined as follows for requests to the logical volume device driver:

Item Description
devtype Equal to DD_DISK (as defined in the devinfo.h file)
flags Equal to DF_RAND
devsubtype Equal to DS_LV or DS_LVZ. The DS_LVZ devsubtype indicates that the logical volume control block does not occupy the first block of the logical volume, so that space is available for application data. For oldvg format volume groups, the devsubtype of a logical volume is always DS_LV. For bigvg format volume groups, the devsubtype of a logical volume is DS_LVZ if mklv -T 0 was used to create the logical volume. For scalable format volume groups, the devsubtype of a logical volume is always DS_LVZ (regardless of whether the mklv -T 0 flag was used to create the logical volume).
bytpsec Bytes per block for the logical volume
secptrk Number of blocks per logical track group
trkpcyl Number of logical track groups per partition
numblks Number of logical blocks in the logical volume
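
The geometry values in the table combine into byte sizes by simple multiplication. The sketch below assumes a stand-in structure, lv_geometry, that only mirrors the four field names listed above. It is hypothetical and is not the devinfo structure from /usr/include/sys/devinfo.h, whose layout and field types should be taken from that header.

```c
#include <stdint.h>

/* Hypothetical stand-in for the four disk geometry values listed above. */
struct lv_geometry {
    uint32_t bytpsec;  /* bytes per block */
    uint32_t secptrk;  /* blocks per logical track group */
    uint32_t trkpcyl;  /* logical track groups per partition */
    uint64_t numblks;  /* logical blocks in the logical volume */
};

/* Size of one logical track group in bytes. */
uint64_t lv_ltg_bytes(const struct lv_geometry *g)
{
    return (uint64_t)g->bytpsec * g->secptrk;
}

/* Size of one partition in bytes. */
uint64_t lv_partition_bytes(const struct lv_geometry *g)
{
    return lv_ltg_bytes(g) * g->trkpcyl;
}

/* Total size of the logical volume in bytes. */
uint64_t lv_total_bytes(const struct lv_geometry *g)
{
    return g->numblks * g->bytpsec;
}
```

For example, with 512-byte blocks, 256 blocks per LTG, and 32 LTGs per partition, an LTG is 128 KB and a partition is 4 MB.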

XLATE ioctl Operation

The XLATE ioctl operation translates a logical address (logical block number and mirror number) to a physical address (physical device and physical block number on that device). The caller supplies the logical block number and mirror number in the xlate_arg structure, which is defined in the /usr/include/sys/lvdd.h file. The logical block size of the logical volume is the block size of the physical volume within the volume group. This structure contains the following fields:

Item Description
lbn Logical block number to translate
mirror The number of the copy for which to return a pbn (physical block number on disk). Possible values are:
1 Copy 1 (primary)
2 Copy 2 (secondary)
3 Copy 3 (tertiary)
p_devt Physical dev_t (major/minor number of the disk)
pbn Physical block number on disk
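
Before issuing a translation request, a caller can derive the logical block number from a byte offset and reject inputs that the driver would refuse. This is a sketch under stated assumptions: xlate_request is a hypothetical stand-in for the caller-supplied fields of xlate_arg (the real field types are defined in /usr/include/sys/lvdd.h), and xlate_prepare is a hypothetical helper that reuses ENXIO for the conditions the driver reports with that error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the caller-supplied fields of xlate_arg. */
struct xlate_request {
    uint64_t lbn;     /* logical block number to translate */
    int      mirror;  /* copy number: 1, 2, or 3 */
};

/* Converts a byte offset to a logical block number and validates the
 * inputs. Returns 0 on success, EINVAL for unusable arguments, and ENXIO
 * when the block lies outside the logical volume or the copy number is
 * not between 1 and num_copies. */
int xlate_prepare(uint64_t byte_offset, uint32_t block_size,
                  uint64_t lv_blocks, int mirror, int num_copies,
                  struct xlate_request *out)
{
    if (out == NULL || block_size == 0)
        return EINVAL;

    uint64_t lbn = byte_offset / block_size;
    if (lbn >= lv_blocks)
        return ENXIO;
    if (mirror < 1 || mirror > num_copies)
        return ENXIO;

    out->lbn = lbn;
    out->mirror = mirror;
    return 0;
}
```

The values for block_size, lv_blocks, and num_copies are assumed to come from earlier IOCINFO and LV_INFO calls.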

XLATE64 ioctl Operation

The XLATE64 ioctl operation functions the same as the XLATE operation except that it uses the xlate_arg64 structure, in which the logical and physical block numbers and the device (major/minor) number fields are 64 bits wide.

PBUFCNT ioctl Operation

The PBUFCNT ioctl operation increases the size of the pool of physical buffer headers (pbufs) that LVM uses for logical-to-physical request translation. The size of this pool is determined by the number of active disks in the system, although the pool is shared for requests to all disks.

The PBUFCNT ioctl operation can be issued to any active volume group special file, for example /dev/VolGrpName. The parameter passed to this ioctl is a pointer to an unsigned integer that contains the pbufs-per-disk value. The valid range is 16 - 128. The default value is 16. This value can only be increased and is reset to the default at IPL. The size of the pbuf pool is not reduced when the number of active disks in the system is decreased.

The PBUFCNT ioctl operation returns the following:

Item Description
EINVAL Indicates an invalid parameter value. The value is larger than the maximum allowed, or smaller than or equal to the current value.
EFAULT Indicates that the copy-in of the parameter failed.
LVDD_ERROR An error occurred in allocating space for additional buffer headers.
LVDD_SUCCESS Indicates a successful ioctl operation.
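
The EINVAL conditions for PBUFCNT can be checked in advance by the caller. The following is a minimal sketch of that check. The function pbufcnt_check and the PBUF_MIN and PBUF_MAX macros are hypothetical names, and the limits are taken from the range and default stated above.

```c
#include <errno.h>

#define PBUF_MIN 16   /* default value and lower bound stated in the text */
#define PBUF_MAX 128  /* upper bound stated in the text */

/* Returns 0 if `requested` would be accepted given the `current`
 * pbufs-per-disk value, or EINVAL if it is larger than the maximum or
 * not strictly greater than the current value. */
int pbufcnt_check(unsigned int current, unsigned int requested)
{
    if (requested > PBUF_MAX || requested <= current)
        return EINVAL;
    return 0;
}
```

Because the value can only be increased, a request equal to the current value is rejected in the same way as a smaller one.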

LV_INFO ioctl Operation

The LV_INFO ioctl operation returns information about the logical volume.

The caller supplies the logical volume special file in the system open call and the information is returned via the lv_info structure, as defined in the /usr/include/sys/lvdd.h file. This structure contains the following fields:

Item Description
vg_id Volume group ID of which the logical volume is a member
major_num Major number of logical volume
minor_num Minor number of the logical volume
max_lps Maximum number of logical partitions allowed for this logical volume
current_lps Current size of the logical volume in terms of logical partitions
mirror_policy Specifies the type of mirroring, if the logical volume is mirrored. Valid values are parallel, sequential, striped, and striped_parallel.
permissions Specifies whether the logical volume is read only or read-write
bb_relocation Specifies whether bad block relocation is activated for the logical volume
write_verify Specifies whether the write verify command for writes to the logical volume is enforced
num_blocks Number of logical blocks that form the logical volume. The logical block size of the logical volume is the block size of the physical volume within the volume group. This value does not include mirrored logical volumes.
mwcc Specifies which mirrored write consistency check algorithm is set, if it is active.
MWCC_NON_ACTIVE mwcc disabled for this logical volume
MWCC_ACTIVE_MODE ACTIVE mwcc algorithm set for this logical volume
MWCC_PASSIVE_MODE PASSIVE mwcc algorithm set for this logical volume
MWCC_PASSIVE_RECOVERY Logical mirrors undergoing PASSIVE mwcc recovery after system interruption
mirr_able Specifies whether the logical volume is capable of being mirrored
num_mirrors Number of mirror copies for this logical volume
striping_width Number of drives across which this logical volume is striped
stripe_exp Stripe block exponent value
backup_mirror Backup mirror mask. The value is zero if no backup copy is active; otherwise, it is one of the following values:
AVOID_C1 For the first copy
AVOID_C2 For the second copy
AVOID_C3 For the third copy

The LV_INFO ioctl operation returns the following:

Item Description
EFAULT Indicates that the copy of the parameter failed.

LVM ioctl Operations Used to Modify Single Logical Volumes

Item Description
LV_QRYBKPCOPY Query for designated backup mirror copy.
LV_SETBKPCOPY Designate backup mirror copy.
LV_FSETBKPCOPY Force new designation for backup mirror copy. Used when there are stale partitions on either the active mirror or backup mirror.
SET_SYNC_ON_RD Causes the logical volume to go into MWCC_PASSIVE_RECOVERY mode. All reads from one mirror copy will cause non-read mirror copies to undergo a sync write.
CLR_SYNC_ON_RD Clears the MWCC_PASSIVE_RECOVERY mode of the logical volume, if it exists. This clear should not be exercised if mirror consistency is not guaranteed.

LV_INFO64 ioctl Operation

The LV_INFO64 ioctl operation functions the same as the LV_INFO operation except that it uses the lv_info64 structure, in which the major_num and minor_num fields are each 32 bits wide and the num_blocks field is 64 bits wide.

LVM_CFG_ASSIST ioctl Operation

The LVM_CFG_ASSIST ioctl operation returns the performance statistics of a logical volume. It returns the cfg_assist structure, which is defined in the /usr/include/sys/lvdd.h file. This structure contains the following fields:

Item Description
throughput Average throughput of the disks in the logical volume in KB/sec. For supported storage devices, throughput is obtained from the device; otherwise, runtime throughput of the logical volume is returned.
latency Average latency of the disks in the logical volume in milliseconds. For supported storage devices, latency is obtained from the device; otherwise, runtime latency of the logical volume is returned.
flags Flags to be used. For a list of valid flags, see the /usr/include/sys/lvdd.h file.
vg_max_transfer The maximum transfer size of the volume group (VG), in KB. The vg_max_transfer field value is the maximum amount of data that can be transferred in one I/O request to the disks of the volume group.
write_atomicity Write atomicity in Bytes. The write_atomicity field value is the largest number of bytes that are not broken up when they are written on aligned boundaries.

The LVM_CFG_ASSIST ioctl operation returns the following parameters only for supported storage devices; otherwise, it returns a null value.

Item Description
atomicWriteAlignment Required alignment for write atomicity in KB.
ideal_sequential_read_size Ideal, sequential, read size of the disks under the file system in KB.
ideal_sequential_write_size Ideal, sequential, write size of the disks under the file system in KB.
ideal_random_read_size Ideal, random, read size of the disks under the file system in KB.
ideal_random_write_size Ideal, random, write size of the disks under the file system in KB.
stripsize Strip size of the disks under the file system in KB. This is the amount of data that is contiguous on a single spindle in the RAID array.
stripesize Stripe size in KB (stripesize = stripsize x number of spindles in the RAID array, excluding parity).
parallelism Number of spindles that comprise the RAID device that can be concurrently read from, and written to, in parallel.

Return Values

If the operation completes successfully, a value of 0 is returned. If the operation fails, a value of -1 is returned and the errno global variable is set to one of the following values:

Item Description
EFAULT Indicates that the copy of the parameter failed.
ENOMEM Indicates that the allocation of the memory failed.
EAGAIN Indicates that the runtime statistics are unavailable for any of the physical volumes in the logical volume. Try again after more I/O has been issued to the logical volume.

FORCEOFF_VG ioctl Operation

You can force a volume group offline by the FORCEOFF_VG ioctl operation. You can issue this operation to any active volume group special file, for example the /dev/VolGrpName file. The parameter passed to this ioctl is a pointer to an integer that contains the value FORCE_VG_OFF as defined in the /usr/include/sys/lvdd.h file.

After you force the volume group offline, subsequent logical volume opens, I/O requests, and volume group configuration changes fail. You must vary off the volume group and vary it on again to clear the forced offline condition.

If the operation completes successfully, a value of 0 is returned. Otherwise, a value of -1 is returned and the errno global variable is set to one of the following values:

Item Description
EFAULT Indicates that the copy-in of the parameter failed.
EINVAL Indicates the parameter value is not valid.

Error Codes

In addition to the possible general errors returned by the ioctl subroutine, the following errors can also be returned from specific ioctl operation types.

Item Description
ENXIO The logical volume does not exist. (This error type is relevant to the IOCINFO, XLATE, and XLATE64 ioctl operations.)
ENXIO The logical block number is larger than the logical volume size. (This error type is relevant only to the XLATE ioctl and XLATE64 ioctl operations.)
ENXIO The copy (mirror) number is less than 1 or greater than the number of actual copies. (This error type is relevant only to the XLATE ioctl and XLATE64 ioctl operations.)
ENXIO No physical partition has been allocated to this copy (mirror). (This error type is relevant only to the XLATE ioctl and XLATE64 ioctl operations.)