Configuration parameters
This section lists and describes the configuration parameters for HDFS Transparency and gpfs-site.xml.
Configuration options for HDFS Transparency
- delete option:
Native HDFS deletes the metadata in memory using a single-threaded mechanism, while HDFS Transparency deletes it from the IBM Storage® Scale distributed metadata using the same single-threaded mechanism. From HDFS Transparency 3.1.0-5 and 3.1.1-2, HDFS Transparency deletes the metadata using a multi-threaded mechanism based on the sub-directories and files. The following parameters can be used to tune the delete operation threads in gpfs-site.xml (for an example gpfs-site.xml fragment covering these and the related du and list parameters, see the sketch after this list):
Table 1. Parameters to tune delete operation threads in gpfs-site.xml

| Configuration option | Description |
|---|---|
| gpfs.parallel.deletion.max-thread-count | Specifies the number of threads used for parallel deletion. Default is 512. |
| gpfs.parallel.deletion.per-dir-threshold | Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 10000. |
| gpfs.parallel.deletion.sub-dir-threshold | Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 1000. |
- du option:
Native HDFS collects metadata statistics (for example, disk usage statistics with hdfs dfs -du, or file and directory counts with hdfs dfs -count) in memory using a single-threaded mechanism, while HDFS Transparency collects them from the IBM Storage Scale distributed metadata using the same single-threaded mechanism. From HDFS Transparency 3.1.0-6 and 3.1.1-2, HDFS Transparency collects the metadata statistics using a multi-threaded mechanism based on the sub-directories and files. The following parameters can be used to tune the operation threads in gpfs-site.xml:
Table 2. Parameters to tune operation threads in gpfs-site.xml

| Configuration option | Description |
|---|---|
| gpfs.parallel.summary.max-thread-count | Specifies the number of threads used for parallel directory summary. Default is 512. |
| gpfs.parallel.summary.per-dir-threshold | Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 10000. |
| gpfs.parallel.summary.sub-dir-threshold | Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. Default is 1000. |
- list option:
From HDFS Transparency 3.1.0-6 and 3.1.1-3, the following configuration options are provided for using multiple threads to list a directory and load the metadata of its children:
Table 3. Configuration options for multi-threaded directory listing in HDFS Transparency

| Configuration option | Description |
|---|---|
| gpfs.inode.update-thread-count | HDFS Transparency 3.2.2-11 and later: Specifies the total count of the threads that are used for running statistics on directory entries for the list operation. The default value is 8, so by default the NameNode creates a thread pool with 8 threads. You can increase this value based on the number of available CPU cores; however, because this is a CPU-bound operation, setting it to more than half of the available cores is not recommended. A new parameter, gpfs.inode.stat-thread-count, configures the thread pool for operations other than list. HDFS Transparency 3.2.2-10 and earlier: Specifies the total count of the threads that are used for running statistics on directory entries in all operations. The default value is 100, so by default the NameNode creates a thread pool with 100 threads and uses it to execute the statistics on directory entries. |
| gpfs.inode.max-update-thread-count-per-dir | Specifies the maximum count of the threads that are used to list a single directory. The default value is 8, so no matter how big the directory is, at most 8 threads are used to list it and load its children. |
| gpfs.inode.update-thread-count-factor-per-dir | Specifies the count of the children of a directory that are handled by a single directory-listing thread. The default value is 5000. Therefore, if a directory has fewer than 5000 children, only one thread is used to list the directory and load its children; if it has at least 5000 but fewer than 10000 children, two threads are used, and so on. The total number of threads for a directory cannot exceed the gpfs.inode.max-update-thread-count-per-dir value. |
| gpfs.scandir-due-to-lookup-threshold | Specifies the threshold that is used to identify a large directory. If the number of children of a directory is greater than this threshold, the directory is identified as a large directory, and while listing it the NameNode tries to prefetch the metadata of its children to speed up the listing. The default value is 10000. |
| gpfs.parallel.ls.max.invocation.max-thread-count (starting with 3.2.2-7) | Specifies the total count of the threads that are used for listing directory contents. The default value is 512, so by default the NameNode creates a thread pool with 512 threads and uses it to list directory contents. |
- DataNode option:
From HDFS Transparency 3.1.0-9 and 3.1.1-6, the following configuration options for the DataNode locking mechanism are provided (for an example hdfs-site.xml fragment, see the sketch after this list):
- dfs.datanode.lock.read.write.enabled: If this parameter is set to true, the FsDataset lock will be a read/write lock. If it is set to false, all locks will be write locks. The default value is true.
- dfs.datanode.lock.fair: If this parameter is set to true, the DataNode FsDataset lock will be used in fair mode. This helps to prevent the writer threads from being starved, but can lower the lock throughput. The default value is true.
- dfs.datanode.lock-reporting-threshold-ms: When a thread waits to obtain a lock, or holds a lock, for longer than this threshold, a log message is written. The default value is 300 milliseconds.
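The delete, du, and list thread pools above are all tuned through gpfs-site.xml. The fragment below is a minimal sketch, assuming the standard Hadoop XML configuration format, that spells out a few of these parameters with their documented defaults; treat the values as starting points to adjust for your workload rather than as recommendations.

```xml
<!-- Sketch of a gpfs-site.xml fragment for the delete, du, and list
     thread pools. The values shown are the documented defaults. -->
<property>
  <name>gpfs.parallel.deletion.max-thread-count</name>
  <value>512</value>
</property>
<property>
  <!-- A new deletion thread is started for every 10000 entries in a
       single directory. -->
  <name>gpfs.parallel.deletion.per-dir-threshold</name>
  <value>10000</value>
</property>
<property>
  <name>gpfs.parallel.summary.max-thread-count</name>
  <value>512</value>
</property>
<property>
  <!-- At most 8 threads list any single directory... -->
  <name>gpfs.inode.max-update-thread-count-per-dir</name>
  <value>8</value>
</property>
<property>
  <!-- ...with one listing thread per 5000 children. -->
  <name>gpfs.inode.update-thread-count-factor-per-dir</name>
  <value>5000</value>
</property>
```

Raising the max-thread-count values mainly helps on deep, wide directory trees; the per-dir and sub-dir thresholds decide when additional threads are actually spawned.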
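The DataNode locking options use standard dfs.* names, so the following sketch assumes they are set in hdfs-site.xml, as is usual for dfs.datanode.* parameters, with the documented defaults shown:

```xml
<!-- Sketch: DataNode FsDataset locking options with their documented
     defaults (assumed to live in hdfs-site.xml). -->
<property>
  <!-- true: use a read/write lock; false: all locks are write locks. -->
  <name>dfs.datanode.lock.read.write.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Fair mode prevents writer starvation at some throughput cost. -->
  <name>dfs.datanode.lock.fair</name>
  <value>true</value>
</property>
<property>
  <!-- Log when a lock is waited on or held for more than this many
       milliseconds. -->
  <name>dfs.datanode.lock-reporting-threshold-ms</name>
  <value>300</value>
</property>
```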
Configuration parameters for gpfs-site.xml
| Key | Default value | Description | Supported versions |
|---|---|---|---|
| gpfs.data.dir | | Setting this to a subdirectory of the gpfs mount point makes that subdirectory the root directory from the Hadoop client's point of view. Leave it empty to make the whole gpfs file system visible to the Hadoop client. When specifying the subdirectory, do not include the gpfs mount point in the string. | 3.1.0-x |
| gpfs.edit.log.retain.not-modified-for-seconds | 2592000 | Edit log files that have not been accessed for this many seconds are automatically deleted by the NameNode. | 3.1.0-x |
| gpfs.encryption.enabled | false | Enables or disables encryption. It is important to understand the difference between HDFS-level encryption and the in-built encryption of IBM Storage Scale: HDFS-level encryption is per user, whereas in-built encryption is per node. Therefore, if the use case demands fine-grained control at the user level, use HDFS-level encryption. However, if you enable HDFS-level encryption, you lose in-place analytics benefits such as accessing the same data through HDFS and POSIX/NFS. HDFS-level encryption requires Ranger and Ranger KMS. If you plan to enable it, enable it on native HDFS first and confirm that it works before you switch from native HDFS to HDFS Transparency. | 3.1.0-x |
| gpfs.fileset.use-global-snapshot | false | Use global snapshot or fileset snapshot. For more information, see the mmcrsnapshot command in IBM Storage Scale: Command and Programming Reference Guide. | 3.1.0-x |
| gpfs.inode.file-per-thread | 1000 | Specifies how many files a single thread handles when multiple threads are used to load the metadata of the files under a directory. | 3.1.0-x |
| gpfs.inode.getdirent-max-buffer-size | 64 * 1024 * 1024 | Specifies the size of the buffer that is used when reading directory entries (readdir syscall). | 3.1.0-x |
| gpfs.inode.max-update-thread-count-per-dir | 8 | Specifies the maximum count of the threads that are used to list a single directory. No matter how big the directory is, at most 8 threads are used to list it and load its children. | 3.1.0-x |
| gpfs.inode.perf-log-duration-threshold | 10000 | Specifies the threshold used to decide whether log messages should be printed for time-consuming operations. | 3.1.0-x |
| gpfs.inode.update-thread-count | 100 | Specifies the total count of the threads that are used for running statistics on directory entries in all operations. By default, the NameNode creates a thread pool with 100 threads and uses it to execute the statistics on directory entries. From HDFS Transparency 3.2.2-11, this parameter applies to the list operation only and its default value is 8; operations other than list use gpfs.inode.stat-thread-count. | 3.1.0-x |
| gpfs.inode.update-thread-count-factor-per-dir | 5000 | Specifies the count of the children of a directory that are handled by a single directory-listing thread. If a directory has fewer than 5000 children, only one thread is used to list the directory and load its children; if it has at least 5000 but fewer than 10000 children, two threads are used, and so on. The total number of threads for a directory cannot exceed the gpfs.inode.max-update-thread-count-per-dir value. | 3.1.0-x |
| gpfs.inode.stat-thread-count | 100 | Specifies the total count of the threads that are used for running statistics on directory entries for all operations other than list. By default, the NameNode creates a thread pool with 100 threads and uses it to execute the statistics on directory entries. | 3.2.2-11 |
| gpfs.dirlisting.cache.inode.count | 500000 | Specifies the total number of inodes to be cached for the list operation. The cache is based on the Least Recently Used (LRU) policy. | 3.2.2-11 |
| gpfs.path.inode.resolve.cache.size | 500 | Specifies the total number of inodes to be cached for path resolution and permission checking. The cache is based on the Least Recently Used (LRU) policy. | 3.2.2-11 |
| gpfs.mnt.dir | | Specifies the gpfs mount point. | 3.1.0-x |
| gpfs.parallel.deletion.sub-dir-threshold | 1000 | Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. | 3.1.0-x |
| gpfs.parallel.deletion.max-thread-count | 512 | Specifies the number of threads used for parallel deletion. | 3.1.0-x |
| gpfs.parallel.deletion.per-dir-threshold | 10000 | Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. | 3.1.0-x |
| gpfs.parallel.summary.max-thread-count | 512 | Specifies the number of threads that are used for parallel directory summary. | 3.1.0-x |
| gpfs.parallel.summary.per-dir-threshold | 10000 | Specifies the number of entries in a single directory that are handled by a single thread. If this threshold is reached, a new thread is started. | 3.1.0-x |
| gpfs.parallel.summary.sub-dir-threshold | 1000 | Specifies the number of sub-directories (the number of all children, sub-children, sub-sub-children, and so on) that are handled by a single thread. If this threshold is reached, a new thread is started. | 3.1.0-x |
| gpfs.ranger.enabled | scale | Specifies whether Ranger support is enabled. Set it to true to enable and false to disable. From HDFS Transparency 3.1.0-6 and 3.1.1-3, Ranger is always supported and this value should be set to scale. This parameter is removed in HDFS Transparency 3.3.x-x. | 3.1.0-x |
| gpfs.remotecluster.autorefresh | true | Enables auto refresh of the GPFS configuration details in HDFS Transparency when the NameNode is started. When set to false, the user must manually run the initmap.sh script. | 3.1.0-x |
| gpfs.replica.enforced | gpfs | Set this parameter to dfs if you want to use dfs.replication. Set it to gpfs if you want to use the default replication of gpfs. If you set this to gpfs, the setReplication API no longer takes effect, which breaks the -setrep command of the fs shell. | 3.1.0-x |
| gpfs.scandir-due-to-lookup-threshold | 10000 | Specifies the threshold that is used to identify a large directory. If the number of children of a directory is greater than this threshold, the directory is identified as a large directory, and while listing it the NameNode tries to prefetch the metadata of its children to speed up the listing. | 3.1.0-x |
| gpfs.slowop.count.threshold | 50000 | If an operation object count exceeds this threshold, a log entry is recorded to track the occurrence. | 3.1.0-x |
| gpfs.slowop.duration.threshold | 30 | If an operation exceeds this threshold in seconds, a log entry is recorded to track the slow operation. | 3.1.0-x |
| gpfs.slowop.message.interval | 30 | Specifies the rate, in seconds, at which a long-running operation is logged. | 3.1.0-x |
| gpfs.ssh.user | root | Specifies the user for the NameNode to connect to the nodes in the remote mount mode. | 3.1.0-x |
| gpfs.storage.type | shared | Specifies the type of storage. Set this to shared if you are using a shared-storage cluster. | 3.1.0-x |
| gpfs.limit.max.os-uid-gid.lookup | false | When enabled, limits the OS uid/gid lookups. | 3.2.2-7+ |
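To illustrate how the core parameters in this table fit together, here is a minimal gpfs-site.xml sketch. The mount point /gpfs/fs1 and the subdirectory hadoop-data are hypothetical example values; with this configuration, a Hadoop client sees /gpfs/fs1/hadoop-data as its root directory.

```xml
<!-- Minimal gpfs-site.xml sketch. /gpfs/fs1 and hadoop-data are
     hypothetical example values. -->
<property>
  <name>gpfs.mnt.dir</name>
  <value>/gpfs/fs1</value>
</property>
<property>
  <!-- Relative to gpfs.mnt.dir; do not include the mount point itself.
       Leave empty to expose the whole file system. -->
  <name>gpfs.data.dir</name>
  <value>hadoop-data</value>
</property>
<property>
  <!-- Use IBM Storage Scale replication instead of dfs.replication. -->
  <name>gpfs.replica.enforced</name>
  <value>gpfs</value>
</property>
<property>
  <name>gpfs.storage.type</name>
  <value>shared</value>
</property>
```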