Configuration parameters

This section lists and describes the configuration parameters for HDFS Transparency and gpfs-site.xml.

Configuration options for HDFS Transparency

The following configuration options are provided for HDFS Transparency:
  1. delete option:
    Native HDFS deletes the metadata in memory using a single-threaded mechanism, while HDFS Transparency deletes it under the IBM Storage® Scale distributed metadata using the same single-threaded mechanism. From HDFS Transparency 3.1.0-5 and 3.1.1-2, HDFS Transparency deletes the metadata using a multi-threaded mechanism based on the sub-directories and files. The following parameters can be used to tune the delete operation threads in gpfs-site.xml:
    Table 1. Parameters to tune delete operation threads in gpfs-site.xml
    Configuration options Description
    gpfs.parallel.deletion.max-thread-count Specifies the number of threads used for parallel deletion. Default is 512.
    gpfs.parallel.deletion.per-dir-threshold Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 10000.
    gpfs.parallel.deletion.sub-dir-threshold Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 1000.
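For example, the delete-tuning parameters above could be set in gpfs-site.xml as follows (the values shown are the documented defaults, for illustration only):

```xml
<!-- gpfs-site.xml fragment: delete-operation tuning (illustrative; values are the documented defaults) -->
<property>
  <name>gpfs.parallel.deletion.max-thread-count</name>
  <value>512</value>
</property>
<property>
  <name>gpfs.parallel.deletion.per-dir-threshold</name>
  <value>10000</value>
</property>
<property>
  <name>gpfs.parallel.deletion.sub-dir-threshold</name>
  <value>1000</value>
</property>
```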
  2. du option:
    Native HDFS collects the metadata statistics (for example, disk usage statistics with hdfs dfs -du, or file and directory counts with hdfs dfs -count) in memory using a single-threaded mechanism, while HDFS Transparency collects the metadata statistics under the IBM Storage Scale distributed metadata using the same single-threaded mechanism. From HDFS Transparency 3.1.0-6 and 3.1.1-2, HDFS Transparency collects the metadata statistics using a multi-threaded mechanism based on the sub-directories and files. The following parameters can be used to tune the operation threads in gpfs-site.xml:
    Table 2. Parameters to tune operation threads in gpfs-site.xml
    Configuration options Description
    gpfs.parallel.summary.max-thread-count Specifies the number of threads used for parallel directory summary. Default is 512.
    gpfs.parallel.summary.per-dir-threshold Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 10000.
    gpfs.parallel.summary.sub-dir-threshold Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 1000.
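Similarly, the metadata-statistics threads can be tuned in gpfs-site.xml, for example (the values shown are the documented defaults, for illustration only):

```xml
<!-- gpfs-site.xml fragment: directory-summary (du/count) tuning (illustrative; values are the documented defaults) -->
<property>
  <name>gpfs.parallel.summary.max-thread-count</name>
  <value>512</value>
</property>
<property>
  <name>gpfs.parallel.summary.per-dir-threshold</name>
  <value>10000</value>
</property>
<property>
  <name>gpfs.parallel.summary.sub-dir-threshold</name>
  <value>1000</value>
</property>
```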
  3. list option:
    From HDFS Transparency 3.1.0-6 and 3.1.1-3, the following configuration options for using multiple threads to list a directory and load the metadata of its children are provided:
    Table 3. Configuration options for multi-threaded directory listing in HDFS Transparency
    Configuration options Description
    gpfs.inode.update-thread-count
    HDFS Transparency 3.2.2-11 and later: Specifies the total number of threads that are used for running statistics on directory entries for the list operation. The default value is 8, so by default the NameNode creates a thread pool with 8 threads. You can increase this value based on the number of available CPU cores; however, because this is a CPU-bound operation, setting it to more than half of the available cores is not recommended. The parameter gpfs.inode.stat-thread-count configures the thread pool for operations other than list.
    HDFS Transparency 3.2.2-10 and earlier: Specifies the total number of threads that are used for running statistics on directory entries in all operations. The default value is 100, so by default the NameNode creates a thread pool with 100 threads and uses it to run the statistics on directory entries.
    gpfs.inode.max-update-thread-count-per-dir Specifies the maximum number of threads that are used to list a single directory. The default value is 8, so by default, no matter how big the directory is, at most 8 threads are used to list the directory and load its children.
    gpfs.inode.update-thread-count-factor-per-dir Specifies the number of children of a directory that are handled by a single directory-listing thread. The default value is 5000, so by default, if a directory has fewer than 5000 children, only one thread is used to list the directory and load its children. If the directory has at least 5000 but fewer than 10000 children, two threads are used, and so on. The total number of threads for a directory cannot exceed the gpfs.inode.max-update-thread-count-per-dir value.
    gpfs.scandir-due-to-lookup-threshold Specifies the threshold that is used to identify a large directory. If the number of children of a directory is greater than this threshold, it is identified as a large directory. While listing such a directory, the NameNode tries to prefetch the metadata of its children to speed up the listing. The default value is 10000.
    gpfs.parallel.ls.max.invocation.max-thread-count (starting with 3.2.2-7) Specifies the total number of threads that are used for listing directory contents. The default value is 512, so by default the NameNode creates a thread pool with 512 threads and uses it to list directory contents.
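The per-directory thread rule described above for gpfs.inode.update-thread-count-factor-per-dir and gpfs.inode.max-update-thread-count-per-dir can be sketched as follows. This is an illustrative model of the documented rule, not HDFS Transparency code:

```python
def listing_threads(children: int, factor: int = 5000, max_threads: int = 8) -> int:
    """Illustrative model of the documented rule: one listing thread, plus one
    more for each full multiple of `factor` children, capped at `max_threads`
    (gpfs.inode.max-update-thread-count-per-dir)."""
    return min(max_threads, children // factor + 1)

# e.g. 4999 children -> 1 thread; 5000 children -> 2 threads;
# a very large directory is capped at 8 threads.
```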
  4. DataNode option:
    From HDFS Transparency 3.1.0-9 and 3.1.1-6, the following configuration options for the DataNode locking mechanism are provided:
    • dfs.datanode.lock.read.write.enabled: If this parameter is set to true, the FsDataset lock is a read/write lock. If it is set to false, all locks are write locks. The default value is true.
    • dfs.datanode.lock.fair: If this parameter is set to true, the DataNode FsDataset lock is used in fair mode. This helps to prevent the writer threads from being starved, but can lower lock throughput. The default value is true.
    • dfs.datanode.lock-reporting-threshold-ms: If a thread waits to obtain a lock, or holds a lock, for longer than this threshold (in milliseconds), a log message is written. The default value is 300.
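These dfs.* keys are standard HDFS DataNode properties; assuming they are configured in hdfs-site.xml (their placement is an assumption, not stated above), a fragment could look like this, with the documented defaults as illustrative values:

```xml
<!-- hdfs-site.xml fragment (assumed location for these dfs.* keys): DataNode lock tuning -->
<property>
  <name>dfs.datanode.lock.read.write.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.lock.fair</name>
  <value>true</value>
</property>
<property>
  <name>dfs.datanode.lock-reporting-threshold-ms</name>
  <value>300</value>
</property>
```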

Configuration parameters for gpfs-site.xml

Table 4. Configuration parameters for gpfs-site.xml
Key Default value Description Supported versions
gpfs.data.dir (no default) Setting this to a subdirectory of the GPFS mount point makes that subdirectory the root directory from a Hadoop client's point of view. Leave it empty to make the whole GPFS file system visible to the Hadoop client. When specifying the subdirectory, do not include the GPFS mount point in the string. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.edit.log.retain.not-modified-for-seconds 2592000 Edit log files that have not been modified for the specified number of seconds are automatically deleted by the NameNode. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.encryption.enabled false Enables or disables encryption. It is important to understand the difference between HDFS-level encryption and the built-in encryption of IBM Storage Scale: HDFS-level encryption is per-user based, whereas built-in encryption is per-node based. Therefore, if the use case demands more fine-grained control at the user level, use HDFS-level encryption. However, if you enable HDFS-level encryption, you lose in-place analytics benefits such as accessing the same data through both HDFS and POSIX/NFS. HDFS-level encryption requires Ranger and Ranger KMS. If you plan to enable it, enable it on native HDFS first and confirm that it works before you switch native HDFS to HDFS Transparency. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.fileset.use-global-snapshot false Specifies whether to use a global snapshot or a fileset snapshot. For more information, see the mmcrsnapshot command in IBM Storage Scale: Command and Programming Reference Guide. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.file-per-thread 1000 Specifies how many files a single thread handles when multiple threads are used to load the metadata of the files under a directory. Supported versions: 3.1.0-x
gpfs.inode.getdirent-max-buffer-size 64 * 1024 * 1024 Specifies the size of the buffer that is used when reading directory entries (readdir syscall). Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.max-update-thread-count-per-dir 8 Specifies the maximum number of threads that are used to list a single directory. Therefore, by default, no matter how big the directory is, at most 8 threads are used to list the directory and load its children. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.perf-log-duration-threshold 10000 Specifies the threshold used to decide whether log messages should be printed for time-consuming operations. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.update-thread-count 100
HDFS Transparency 3.2.2-11 and later: Specifies the total number of threads that are used for running statistics on directory entries for the list operation. The default value is 8, so by default the NameNode creates a thread pool with 8 threads. You can increase this value based on the number of available CPU cores; however, because this is a CPU-bound operation, setting it to more than half of the available cores is not recommended. The parameter gpfs.inode.stat-thread-count configures the thread pool for operations other than list.
HDFS Transparency 3.2.2-10 and earlier: Specifies the total number of threads that are used for running statistics on directory entries in all operations. The default value is 100, so by default the NameNode creates a thread pool with 100 threads and uses it to run the statistics on directory entries.
Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.update-thread-count-factor-per-dir 5000 Specifies the number of children of a directory that are handled by a single directory-listing thread. Therefore, by default, if a directory has fewer than 5000 children, only one thread is used to list the directory and load its children; if it has at least 5000 but fewer than 10000 children, two threads are used, and so on. The total number of threads for a directory cannot exceed the gpfs.inode.max-update-thread-count-per-dir value. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.inode.stat-thread-count 100 Specifies the total number of threads that are used for running statistics on directory entries for all operations other than list. Therefore, by default, the NameNode creates a thread pool with 100 threads and uses it to run the statistics on directory entries. Supported versions: 3.2.2-11, 3.3.6-0

gpfs.dirlisting.cache.inode.count 500000 Specifies the total number of inodes to be cached for the list operation. The cache uses a Least Recently Used (LRU) policy. Supported versions: 3.2.2-11, 3.3.6-0

gpfs.path.inode.resolve.cache.size 500 Specifies the total number of inodes to be cached for path resolution and permission checking. The cache uses a Least Recently Used (LRU) policy. Supported versions: 3.2.2-11, 3.3.6-0

gpfs.mnt.dir (no default) Specifies the GPFS mount point. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.deletion.sub-dir-threshold 1000 Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.deletion.max-thread-count 512 Specifies the number of threads used for parallel deletion. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.deletion.per-dir-threshold 10000 Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.summary.max-thread-count 512 Specifies the number of threads that are used for parallel directory summary. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.summary.per-dir-threshold 10000 Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.parallel.summary.sub-dir-threshold 1000 Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.ranger.enabled scale Enables or disables Ranger support. Set it to true to enable or false to disable. From HDFS Transparency 3.1.0-6 and 3.1.1-3, Ranger is always supported and this value should be set to scale. This parameter is removed in HDFS Transparency 3.3.x-x. Supported versions: 3.1.0-x, 3.1.1-x

gpfs.remotecluster.autorefresh true Enables automatic refresh of the GPFS configuration details in HDFS Transparency when the NameNode starts. When set to false, the user must run the initmap.sh script manually. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.replica.enforced gpfs Set this parameter to dfs to use dfs.replication, or to gpfs to use the default replication of GPFS. If you set this to gpfs, the setReplication API no longer takes effect, which breaks the -setrep command of the fs shell. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.scandir-due-to-lookup-threshold 10000 Specifies the threshold that is used to identify a large directory. If the number of children of a directory is greater than this threshold, it is identified as a large directory. While listing such a directory, the NameNode tries to prefetch the metadata of its children to speed up the listing. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.slowop.count.threshold 50000 If the object count of an operation exceeds this threshold, a log entry is recorded to track the occurrence. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.slowop.duration.threshold 30 If an operation exceeds this threshold (in seconds), a log entry is recorded to track the slow operation. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.slowop.message.interval 30 Specifies the interval (in seconds) at which a long-running operation is logged. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.ssh.user root Specifies the user that the NameNode uses to connect to the nodes in the remote mount mode. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.storage.type shared Specifies the type of storage. Set this to shared if you are using a shared storage cluster. Supported versions: 3.1.0-x, 3.1.1-x, 3.2.2-x, 3.3.0-0

gpfs.limit.max.os-uid-gid.lookup false When enabled, limits the OS UID/GID lookups. Supported versions: 3.2.2-7 and later
Note: The -x in the Supported versions entries indicates the latest version.
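As a reference, a minimal gpfs-site.xml combining several of the parameters above might look like the following. The property names and the values shared and gpfs are documented above; the mount point and subdirectory are hypothetical examples, not recommendations:

```xml
<configuration>
  <property>
    <name>gpfs.mnt.dir</name>
    <!-- hypothetical mount point; adjust for your cluster -->
    <value>/gpfs/fs1</value>
  </property>
  <property>
    <name>gpfs.data.dir</name>
    <!-- hypothetical subdirectory under the mount point; leave empty to expose the whole file system -->
    <value>hadoop-data</value>
  </property>
  <property>
    <name>gpfs.storage.type</name>
    <value>shared</value>
  </property>
  <property>
    <name>gpfs.replica.enforced</name>
    <value>gpfs</value>
  </property>
</configuration>
```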