Glossary

This glossary provides terms and definitions for IBM Storage Scale.

The following cross-references are used in this glossary:
  • See refers you from a nonpreferred term to the preferred term or from an abbreviation to the spelled-out form.
  • See also refers you to a related or contrasting term.

For other terms and definitions, see the IBM® Terminology website.

B

block utilization
The percentage of used subblocks within the blocks that are allocated.
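For example, the calculation can be sketched as follows (a minimal illustration with hypothetical numbers, using the 32-subblocks-per-block figure given under subblock in this glossary):

    # Block utilization: used subblocks as a percentage of the subblocks
    # contained in all allocated blocks. All values are hypothetical.
    subblocks_per_block = 32
    allocated_blocks = 100
    used_subblocks = 2400
    utilization = used_subblocks / (allocated_blocks * subblocks_per_block) * 100
    print(f"block utilization: {utilization:.1f}%")  # 75.0%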

C

cluster
A loosely coupled collection of independent systems (nodes) organized into a network for the purpose of sharing resources and communicating with each other. See also GPFS cluster.
cluster configuration data
The configuration data that is stored on the cluster configuration servers.
Cluster Export Services (CES) nodes
A subset of nodes configured within a cluster to provide a solution for exporting GPFS file systems by using the Network File System (NFS), Server Message Block (SMB), and S3 protocols.
cluster manager
The node that monitors node status using disk leases, detects failures, drives recovery, and selects file system managers. The cluster manager must be a quorum node. The selection of the cluster manager node favors the quorum-manager node with the lowest node number among the nodes that are operating at that particular time.
Note: The cluster manager role is not moved to another node when a node with a lower node number becomes active.
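The selection rule can be sketched as follows (a conceptual illustration only, not the actual GPFS election protocol; the node attributes are hypothetical):

    def elect_cluster_manager(current_manager, active_quorum_managers):
        # The role is sticky: no re-election occurs when a node with a
        # lower node number later becomes active (see the note above).
        if current_manager is not None and current_manager.is_active:
            return current_manager
        # Otherwise, favor the operating quorum-manager node with the
        # lowest node number.
        return min(active_quorum_managers, key=lambda node: node.node_number)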
clustered watch folder
Provides a scalable and fault-tolerant method for watching file system activity within an IBM Storage Scale file system. A clustered watch folder can watch file system activity on a fileset, inode space, or an entire file system. Events are streamed to an external Kafka sink cluster in an easy-to-parse JSON format. For more information, see the mmwatch command.
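Because events arrive as JSON on a Kafka topic, they can be consumed with ordinary Kafka tooling. A minimal sketch using the third-party kafka-python package (the topic name, broker address, and event fields shown here are hypothetical; the mmwatch documentation describes the actual schema):

    import json
    from kafka import KafkaConsumer  # third-party package: kafka-python

    # Hypothetical topic and broker; use the external sink cluster
    # that was configured for the clustered watch.
    consumer = KafkaConsumer(
        "scale-watch-events",
        bootstrap_servers="kafka.example.com:9092",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )
    for message in consumer:
        event = message.value
        # Field names below are illustrative only.
        print(event.get("event"), event.get("path"))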
control data structures
Data structures needed to manage file data and metadata cached in memory. Control data structures include hash tables and link pointers for finding cached data; lock states and tokens to implement distributed locking; and various flags and sequence numbers to keep track of updates to the cached data.

D

Data Management Application Program Interface (DMAPI)
The interface defined by the Open Group's XDSM standard as described in the publication System Management: Data Storage Management (XDSM) API Common Application Environment (CAE) Specification C429, The Open Group ISBN 1-85912-190-X.
deadman switch timer
A kernel timer that runs on a node that has lost its disk lease and has outstanding I/O requests. By causing a panic in the kernel, this timer ensures that the node cannot complete the outstanding I/O requests, which would otherwise risk file system corruption.
dependent fileset
A fileset that shares the inode space of an existing independent fileset.
disk descriptor
A definition of the type of data that the disk contains and the failure group to which this disk belongs. See also failure group.
disk leasing
A method for controlling access to storage devices from multiple host systems. Any host that wants to access a storage device configured to use disk leasing registers for a lease; in the event of a perceived failure, a host system can deny access, preventing I/O operations with the storage device until the preempted system has reregistered.
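The basic lease check can be sketched as follows (a conceptual toy, not the GPFS implementation; the lease duration is a hypothetical value):

    import time

    LEASE_DURATION = 35.0  # seconds; hypothetical value

    class DiskLease:
        def __init__(self):
            self.renewed_at = time.monotonic()

        def renew(self):
            self.renewed_at = time.monotonic()

        def is_valid(self):
            # A host may issue I/O only while its lease is unexpired; a
            # preempted host must reregister before resuming I/O.
            return time.monotonic() - self.renewed_at < LEASE_DURATION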
disposition
The session to which a data management event is delivered. An individual disposition is set for each type of event from each file system.
domain
A logical grouping of resources in a network for the purpose of common management and administration.

E

ECKD
See extended count key data (ECKD).
ECKD device
See extended count key data device (ECKD device).
encryption key
A mathematical value that allows components to verify that they are in communication with the expected server. Encryption keys are based on a public or private key pair that is created during the installation process. See also file encryption key, master encryption key.
extended count key data (ECKD)
An extension of the count-key-data (CKD) architecture. It includes additional commands that can be used to improve performance.
extended count key data device (ECKD device)
A disk storage device that has a data transfer rate faster than some processors can utilize and that is connected to the processor through use of a speed matching buffer. A specialized channel program is needed to communicate with such a device. See also fixed-block architecture disk device.

F

failback
Cluster recovery from failover following repair. See also failover.
failover
(1) The assumption of file system duties by another node when a node fails. (2) The process of transferring all control of the ESS to a single cluster in the ESS when the other clusters in the ESS fail. See also cluster. (3) The routing of all transactions to a second controller when the first controller fails. See also cluster.
failure group
A collection of disks that share common access paths or adapter connections, and could all become unavailable through a single hardware failure.
FEK
See file encryption key.
fileset
A hierarchical grouping of files managed as a unit for balancing workload across a cluster. See also dependent fileset, independent fileset.
fileset snapshot
A snapshot of an independent fileset plus all dependent filesets.
file audit logging
Provides the ability to monitor user activity of IBM Storage Scale file systems and store events related to the user activity in a security-enhanced fileset. Events are stored in an easy-to-parse JSON format. For more information, see the mmaudit command.
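Because each event is stored as a JSON record, the audit log can be processed with ordinary JSON tooling. A minimal sketch (the log file path and field names are hypothetical; the mmaudit documentation describes the actual layout and schema):

    import json

    # Hypothetical path to one audit log file in the audit fileset.
    with open("/gpfs/fs1/.audit_log/auditLogFile") as log:
        for line in log:
            event = json.loads(line)
            # Field names below are illustrative only.
            print(event.get("event"), event.get("path"), event.get("user"))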
file clone
A writable snapshot of an individual file.
file encryption key (FEK)
A key used to encrypt sectors of an individual file. See also encryption key.
file-management policy
A set of rules defined in a policy file that GPFS uses to manage file migration and file deletion. See also policy.
file-placement policy
A set of rules defined in a policy file that GPFS uses to manage the initial placement of a newly created file. See also policy.
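As an illustration, both kinds of policies are written as SQL-like rules in a policy file; a minimal sketch (the pool names and conditions are hypothetical):

    /* Placement rules: evaluated when a file is created. */
    RULE 'logs' SET POOL 'data' WHERE LOWER(NAME) LIKE '%.log'
    RULE 'default' SET POOL 'system'

    /* Management rule: migrate files not accessed for 30 days. */
    RULE 'cooldown' MIGRATE FROM POOL 'system' TO POOL 'data'
        WHERE DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 30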
file system descriptor
A data structure containing key information about a file system. This information includes the disks assigned to the file system (stripe group), the current state of the file system, and pointers to key files such as quota files and log files.
file system descriptor quorum
The number of disks needed in order to write the file system descriptor correctly.
file system manager
The provider of services for all the nodes using a single file system. A file system manager processes changes to the state or description of the file system, controls the regions of disks that are allocated to each node, and controls token management and quota management.
fixed-block architecture disk device (FBA disk device)
A disk device that stores data in blocks of fixed size. These blocks are addressed by block number relative to the beginning of the file. See also extended count key data device.
fragment
The space allocated for an amount of data too small to require a full block. A fragment consists of one or more subblocks.
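A worked example of fragment sizing (a minimal sketch; the block size is hypothetical, and the 32-subblocks-per-block figure follows the subblock definition in this glossary):

    import math

    block_size = 4 * 1024 * 1024      # 4 MiB; hypothetical
    subblock_size = block_size // 32  # 128 KiB, one thirty-second of a block

    def fragment_size(tail_bytes):
        # Data too small for a full block occupies whole subblocks.
        return math.ceil(tail_bytes / subblock_size) * subblock_size

    print(fragment_size(200 * 1024) // 1024, "KiB")  # 256 KiB = 2 subblocks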

G

GPUDirect Storage
IBM Storage Scale's support for NVIDIA's GPUDirect Storage (GDS) enables a direct path between GPU memory and storage. File system storage is directly connected to the GPU buffers to reduce latency and CPU load. Data is read directly from an NSD server's pagepool and sent to the GPU buffers of IBM Storage Scale clients by using RDMA.
global snapshot
A snapshot of an entire GPFS file system.
GPFS cluster
A cluster of nodes defined as being available for use by GPFS file systems.
GPFS portability layer
The interface module that each installation must build for its specific hardware platform and Linux® distribution.
GPFS recovery log
A file that contains a record of metadata activity and exists for each node of a cluster. In the event of a node failure, the recovery log for the failed node is replayed, restoring the file system to a consistent state and allowing other nodes to continue working.

I

ill-placed file
A file assigned to one storage pool but having some or all of its data in a different storage pool.
ill-replicated file
A file with contents that are not correctly replicated according to the desired setting for that file. This situation occurs in the interval between a change in the file's replication settings or suspending one of its disks, and the restripe of the file.
independent fileset
A fileset that has its own inode space.
indirect block
A block containing pointers to other blocks.
inode
The internal structure that describes the individual files in the file system. There is one inode for each file.
inode space
A collection of inode number ranges reserved for an independent fileset, which enables more efficient per-fileset functions.
ISKLM
IBM Security Key Lifecycle Manager. For GPFS encryption, the ISKLM is used as an RKM server to store MEKs.

J

journaled file system (JFS)
A technology designed for high-throughput server environments, which are important for running intranet and other high-performance e-business file servers.
junction
A special directory entry that connects a name in a directory of one fileset to the root directory of another fileset.

K

kernel
The part of an operating system that contains programs for such tasks as input/output, management and control of hardware, and the scheduling of user tasks.

M

master encryption key (MEK)
A key used to encrypt other keys. See also encryption key.
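The relationship between the two kinds of keys can be illustrated with a generic envelope-encryption sketch (using the third-party cryptography package; this illustrates the concept only and is not GPFS's actual key-management mechanism):

    from cryptography.fernet import Fernet  # third-party package: cryptography

    mek = Fernet.generate_key()  # master encryption key
    fek = Fernet.generate_key()  # file encryption key

    wrapped_fek = Fernet(mek).encrypt(fek)          # the MEK encrypts the FEK
    ciphertext = Fernet(fek).encrypt(b"file data")  # the FEK encrypts file data

    # Unwrapping reverses the chain: MEK -> FEK -> data.
    recovered_fek = Fernet(mek).decrypt(wrapped_fek)
    assert Fernet(recovered_fek).decrypt(ciphertext) == b"file data"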
MEK
See master encryption key.
metadata
Data structures that contain information that is needed to access file data. Metadata includes inodes, indirect blocks, and directories. Metadata is not accessible to user applications.
metanode
The one node per open file that is responsible for maintaining file metadata integrity. In most cases, the node that has had the file open for the longest period of continuous time is the metanode.
mirroring
The process of writing the same data to multiple disks at the same time. The mirroring of data protects it against data loss within the database or within the recovery log.
Microsoft Management Console (MMC)
A Windows tool that can be used to do basic configuration tasks on an SMB server. These tasks include administrative tasks such as listing or closing the connected users and open files, and creating and manipulating SMB shares.
multi-tailed
A disk connected to multiple nodes.

N

namespace
Space reserved by a file system to contain the names of its objects.
Network File System (NFS)
A protocol, developed by Sun Microsystems, Incorporated, that allows any host in a network to gain access to another host or netgroup and their file directories.
Network Shared Disk (NSD)
A component for cluster-wide disk naming and access.
NSD volume ID
A unique 16-digit hex number that is used to identify and access all NSDs.
node
An individual operating-system image within a cluster. Depending on the way in which the computer system is partitioned, it may contain one or more nodes.
node descriptor
A definition that indicates how GPFS uses a node. Possible functions include: manager node, client node, quorum node, and nonquorum node.
node number
A number that is generated and maintained by GPFS as the cluster is created, and as nodes are added to or deleted from the cluster.
node quorum
The minimum number of nodes that must be running in order for the daemon to start.
node quorum with tiebreaker disks
A form of quorum that allows GPFS to run with as little as one quorum node available, as long as there is access to a majority of the quorum disks.
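The condition can be sketched as follows (a conceptual illustration only):

    def has_quorum_with_tiebreakers(active_quorum_nodes, accessible_disks,
                                    total_tiebreaker_disks):
        # At least one quorum node must be operating, and it must have
        # access to a majority of the tiebreaker disks.
        return (active_quorum_nodes >= 1
                and accessible_disks > total_tiebreaker_disks // 2)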
non-quorum node
A node in a cluster that is not counted for the purposes of quorum determination.
Non-Volatile Memory Express (NVMe)
An interface specification that allows host software to communicate with non-volatile memory storage media.

P

policy
A list of file-placement, service-class, and encryption rules that define characteristics and placement of files. Several policies can be defined within the configuration, but only one policy set is active at one time.
policy rule
A programming statement within a policy that defines a specific action to be performed.
pool
A group of resources with similar characteristics and attributes.
portability
The ability of a program to compile successfully on different operating systems without requiring changes to the source code.
primary GPFS cluster configuration server
In a GPFS cluster, the node chosen to maintain the GPFS cluster configuration data.
private IP address
An IP address used to communicate on a private network.
public IP address
An IP address used to communicate on a public network.

Q

quorum node
A node in the cluster that is counted to determine whether a quorum exists.
quota
The amount of disk space and number of inodes assigned as upper limits for a specified user, group of users, or fileset.
quota management
The allocation of disk blocks to the other nodes writing to the file system, and comparison of the allocated space to quota limits at regular intervals.

R

Redundant Array of Independent Disks (RAID)
A collection of two or more physical disk drives that present to the host an image of one or more logical disk drives. In the event of a single physical device failure, the data can be read or regenerated from the other disk drives in the array due to data redundancy.
recovery
The process of restoring access to file system data when a failure has occurred. Recovery can involve reconstructing data or providing alternative routing through a different server.
remote key management server (RKM server)
A server that is used to store master encryption keys.
replication
The process of maintaining a defined set of data in more than one location. Replication consists of copying designated changes from one location (a source) to another (a target) and synchronizing the data in both locations.
RKM server
See remote key management server.
rule
A list of conditions and actions that are triggered when certain conditions are met. Conditions include attributes about an object (file name, type or extension, dates, owner, and groups), the requesting client, and the container name associated with the object.

S

SAN-attached
Disks that are physically attached to all nodes in the cluster using Serial Storage Architecture (SSA) connections or using Fibre Channel switches.
Scale Out Backup and Restore (SOBAR)
A specialized mechanism for disaster protection that applies only to GPFS file systems that are managed by IBM Storage Protect for Space Management.
secondary GPFS cluster configuration server
In a GPFS cluster, the node chosen to maintain the GPFS cluster configuration data in the event that the primary GPFS cluster configuration server fails or becomes unavailable.
Secure Hash Algorithm digest (SHA digest)
A character string used to identify a GPFS security key.
session failure
The loss of all resources of a data management session due to the failure of the daemon on the session node.
session node
The node on which a data management session was created.
Small Computer System Interface (SCSI)
An ANSI-standard electronic interface that allows personal computers to communicate with peripheral hardware, such as disk drives, tape drives, CD-ROM drives, printers, and scanners faster and more flexibly than previous interfaces.
snapshot
An exact copy of changed data in the active files and directories of a file system or fileset at a single point in time. See also fileset snapshot, global snapshot.
source node
The node on which a data management event is generated.
stand-alone client
The node in a one-node cluster.
storage area network (SAN)
A dedicated storage network tailored to a specific environment, combining servers, storage products, networking products, software, and services.
storage pool
A grouping of storage space consisting of volumes, logical unit numbers (LUNs), or addresses that share a common set of administrative characteristics.
stripe group
The set of disks comprising the storage assigned to a file system.
striping
A storage process in which information is split into blocks (a fixed amount of data) and the blocks are written to (or read from) a series of disks in parallel.
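One simple striping scheme can be sketched as follows (a conceptual illustration; GPFS's actual block placement is more sophisticated):

    def disk_for_block(block_index, disks):
        # Round-robin striping: consecutive blocks land on consecutive
        # disks, so large sequential I/O is spread across all disks.
        return disks[block_index % len(disks)]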
subblock
The smallest unit of data accessible in an I/O operation, equal to one thirty-second of a data block.
system storage pool
A storage pool containing file system control structures, reserved files, directories, symbolic links, special devices, as well as the metadata associated with regular files, including indirect blocks and extended attributes. The system storage pool can also contain user data.

T

token management
A system for controlling file access in which each application performing a read or write operation is granted some form of access to a specific block of file data. Token management provides data consistency and controls conflicts. Token management has two components: the token management server, and the token management function.
token management function
A component of token management that requests tokens from the token management server. The token management function is located on each cluster node.
token management server
A component of token management that controls tokens relating to the operation of the file system. The token management server is located at the file system manager node.
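The division of labor can be sketched as a toy (a conceptual illustration only, far simpler than the actual GPFS token protocol):

    class TokenServer:
        """Runs at the file system manager node; tracks token holders."""
        def __init__(self):
            self.holders = {}  # (file_id, block) -> node

        def request(self, node, file_id, block):
            holder = self.holders.get((file_id, block))
            if holder is None or holder == node:
                self.holders[(file_id, block)] = node
                return True   # token granted
            return False      # conflict: current holder must release first

    class TokenClient:
        """The token management function, present on each cluster node."""
        def __init__(self, node, server):
            self.node, self.server = node, server

        def access(self, file_id, block):
            # Every read or write first obtains a token for the block.
            return self.server.request(self.node, file_id, block)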
transparent cloud tiering (TCT)
A separately installable add-on feature of IBM Storage Scale that provides a native cloud storage tier. It allows data center administrators to free up on-premises storage capacity by moving cooler data out to cloud storage, thereby reducing capital and operational expenditures.
twin-tailed
A disk connected to two nodes.

U

user storage pool
A storage pool containing the blocks of data that make up user files.

V

VFS
See virtual file system.
virtual file system (VFS)
A remote file system that has been mounted so that it is accessible to the local user.
virtual node (vnode)
The structure that contains information about a file system object in a virtual file system (VFS).

W

watch folder API
Provides a programming interface for writing custom C programs that monitor inode spaces, filesets, or directories for specific user activity-related events within IBM Storage Scale file systems. For more information, see the sample program tswf in the /usr/lpp/mmfs/samples/util directory on IBM Storage Scale nodes; it can be modified according to the user's needs.