Summary of changes

This topic summarizes changes to the IBM Storage Scale Big Data and Analytics (BDA) support documentation.

For information about IBM Storage Scale changes, see the IBM Storage Scale Summary of changes.

For information about BDA feature support, see the List of stabilized, deprecated, and discontinued features section under the Summary of changes.

For information about the resolved IBM Storage Scale APARs, see IBM Storage Scale APARs Resolved.

For information about supported HDFS Transparency versions with IBM Storage Scale, see HDFS Transparency support matrix.

For information about supported Cloudera Data Platform (CDP) versions with IBM Storage Scale, see Support matrix.

Summary of changes as updated, April 2024

Changes in IBM Storage Scale 5.2.0-0
  • Includes HDFS Transparency 3.1.1-17 and HDFS Transparency 3.2.2-8.
Changes in HDFS Transparency 3.2.2-8 in IBM Storage Scale 5.2.0-0
  • Updated several JavaScript files related to the NameNode and DataNode GUI.
  • Fixed multiple issues that occurred when HDFS Transparency roles were stopped after the IBM Storage Scale file system was unmounted, or started after it was remounted.
  • Redesigned the HDFS Transparency NameNode to use IBM Storage Scale directly for metadata handling without internal inode caching. This redesign avoids cache synchronization and locking, and reduces NameNode heap usage.
  • Improved NameNode logging to reduce the number of log messages.
  • Introduced CPU capping for getListing calls.

Summary of changes as updated, February 2024

Changes in IBM Storage Scale 5.1.9-2
  • Includes HDFS Transparency 3.1.1-17 and HDFS Transparency 3.2.2-7.
Changes in HDFS Transparency 3.1.1-17 in IBM Storage Scale 5.1.9-2
  • Updated several JavaScript files related to the NameNode and DataNode GUI.
  • Fixed multiple issues that occurred when HDFS Transparency roles were stopped after the IBM Storage Scale file system was unmounted, or started after it was remounted.
Note: HDFS Transparency 3.2.2-7 supports an upgrade only from HDFS Transparency 3.2.2-5.

Summary of changes as updated, December 2023

Changes in IBM Storage Scale 5.1.9-1
  • Includes HDFS Transparency 3.1.1-16 and HDFS Transparency 3.2.2-7.
Changes in HDFS Transparency 3.1.1-16 in IBM Storage Scale 5.1.9-1
  • Fixed an issue where the reinstallation of the same HDFS Transparency RPM version failed and could not be recovered.
  • Included runLog4jV1Patcher.sh in /usr/lpp/mmfs/hadoop/scripts/ to patch a user-provided log4j JAR.
Changes in HDFS Transparency 3.2.2-7 in IBM Storage Scale 5.1.9-1
  • Fixed an issue where the reinstallation of the same HDFS Transparency RPM version failed and could not be recovered.
  • Included runLog4jV1Patcher.sh in /usr/lpp/mmfs/hadoop/scripts/ to patch a user-provided log4j JAR; a hedged example invocation follows this list.
  • Fixed an issue where too many lookups and log entries for missing UIDs and GIDs would impact HDFS Transparency performance.
  • Improved hdfs dfs -ls to use the IBM Storage Scale file system listing as input, instead of caching the metadata and synchronizing it with IBM Storage Scale regularly.
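    The following is a hedged example of running the patcher script; the argument form (a path to the log4j JAR to be patched) is an assumption, so check the script usage before running it.
      # Hypothetical invocation; verify the expected arguments with the script itself
      /usr/lpp/mmfs/hadoop/scripts/runLog4jV1Patcher.sh /path/to/log4j-1.2.17.jar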
Changes in the documentation
  • Restructured the "IBM Storage Scale support for Hadoop" chapter.
  • Moved "Hadoop IBM Storage Scale Architecture" to "IBM Storage Scale support for Hadoop" > "Overview".
Note: HDFS Transparency 3.2.2-7 supports an upgrade only from HDFS Transparency 3.2.2-5.

Summary of changes as updated, November 2023

Changes in Cloudera Data Platform Private Cloud Base

  • From IBM Storage Scale 5.1.8.0, CDP Private Cloud Base 7.1.9-CHF1 is certified with IBM Storage Scale on x86 and Power LE. For more information, see Support matrix.
Changes in IBM Storage Scale 5.1.9-0
  • Includes HDFS Transparency 3.1.1-15 and HDFS Transparency 3.2.2-6.
Changes in HDFS Transparency 3.1.1-15 in IBM Storage Scale 5.1.9-0
  • Added the mmhdfs config dump subcommand; a hedged example follows this list. For more information, see mmhdfs command.
  • Increased the performance for recursive deletions of snapshot-enabled directories by avoiding the mmlssnapshot dependency.
  • Improved internal data structures to avoid directory lock contentions.
  • Fixed an issue where the log includes many messages like aclutil.cc get_file failed [No such file or directory].
  • Fixed an issue where the getContentSummary returns inconsistent results if multiple files in the same directory are removed at the same time.
  • Added buffered logging and log filtering, which increases HDFS Transparency I/O throughput. For more information, see Buffered logging and filtering.
  • Changed the installation process to use self-provided JAR files. For more information, see Installation prerequisites.
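    The following is a minimal, hedged example of the new subcommand; it assumes that mmhdfs is in the command path and that config dump needs no additional arguments.
      # Dump the current HDFS Transparency configuration (hedged example)
      mmhdfs config dump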
Changes in HDFS Transparency 3.2.2-6 in IBM Storage Scale 5.1.9-0
  • Rebranded scripts in HDFS Transparency 3.2.2-6 from "IBM Spectrum Scale" to "IBM Storage Scale".
  • Added the mmhdfs config dump subcommand. For more information, see mmhdfs command.
  • Increased the performance for recursive deletions of snapshot-enabled directories by avoiding the mmlssnapshot dependency.
  • Improved internal data structures to avoid directory lock contentions.
  • Fixed an issue where the log includes many messages like aclutil.cc get_file failed [No such file or directory].
  • Fixed an issue where the getContentSummary returns inconsistent results if multiple files in the same directory are removed at the same time.
  • Added buffered logging and log filtering, which increases HDFS Transparency I/O throughput. For more information, see Buffered logging and filtering.
  • Changed the installation process to use self-provided JAR files. For more information, see Installation prerequisites.
Note: In upcoming IBM Storage Scale versions, HDFS Transparency 3.2.2-x will be replaced by an HDFS Transparency 3.3.5-x version based on the corresponding Apache Hadoop 3.3.5 version. The Apache Hadoop version used as the basis for the HDFS Transparency version will be supported by Apache Bigtop.

Summary of changes as updated, July 2023

Changes in IBM Storage Scale 5.1.8-1
  • Includes HDFS Transparency 3.1.1-14, HDFS Transparency 3.2.2-5, and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.1.1-14 in IBM Storage Scale 5.1.8-1
  • Rebranded the documentation and scripts in HDFS Transparency 3.1.1-14.
  • Added the "validity" argument to gpfs_tls_configuration.py to define how long TLS certificates are valid; a hedged example follows this list.
  • The previous default TLS certificate validity of 90 days was changed to 1826 days when no "validity" argument is passed to gpfs_tls_configuration.py.
  • Improved error handling for misconfigurations in gpfs_tls_configuration.py.
  • Improved the documentation for TLS certificate setup. See TLS.
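    The following sketch shows how the new validity setting might be passed; the option syntax shown (--validity) and the assumption that the script is run with python3 are illustrative only, so check the script usage first.
      # Hypothetical syntax; confirm the actual argument name with the script usage text
      python3 gpfs_tls_configuration.py --validity 1826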

Summary of changes as updated, April 2023

Changes in IBM Storage Scale 5.1.7-1
  • Includes HDFS Transparency 3.1.1-13, HDFS Transparency 3.2.2-5, and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.1.1-13 in IBM Storage Scale 5.1.7-1
  • Fixed an issue where appending to an existing file in an encryption zone failed (APAR IJ45843).
  • Improved parallel data access by reducing the locking scope at the directory level to avoid parent directory locking.
  • Fixed an issue where the rm and du commands would fail with NoSuchFileException.
  • Downgraded an exception to a warning when a file lease cannot be found while creating a file, to prevent application-side failures.
  • Improved overall performance by changing the update process for NameNode metadata and reducing the syncChildren calls.
  • Fixed an issue where the NameNode crashed after failing to finalize the shared edit log on NameNode failover.
  • Improved the listing performance by changing the way stat is called and avoiding stat oscillation behavior.
  • Changed the RSA key strength in the TLS enablement script from 1024 bits to 2048 bits.
  • Added an AccessControlException in the put command when it is used for deleted users.
  • Fixed an issue where the TLS script fails with the enable-tls option if the dfs.namenode.http-address parameter is missing in the configuration.
  • Realigned the usage text of hdfs getconf.
  • Fixed an issue where the NameNode does not start because of a missing dependent JAR. For the resolution in the affected HDFS Transparency versions 3.1.1-11, 3.1.1-12, 3.2.2-2, and 3.2.2-3, see NameNode fails to start in HDFS Transparency 3.1.1-11, 3.1.1-12, 3.2.2-2 or 3.2.2-3.
Changes in HDFS Transparency 3.2.2-5 in IBM Storage Scale 5.1.7-1
  • Fixed an issue where parallel move or rename and listing operations on the same directory can lead to a deadlock situation.

Summary of changes as updated, March 2023

Changes in IBM Storage Scale 5.1.7-0
  • Includes HDFS Transparency 3.1.1-12, HDFS Transparency 3.2.2-4, and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.2.2-4 in IBM Storage Scale 5.1.7-0
  • Fixed an issue where gpfs_kerberos_configuration.py fails to run.
  • Improved parallel data access by reducing the locking scope at the directory level to avoid parent directory locking.
  • Fixed an issue where the rm and du commands fail with NoSuchFileException.
  • Downgraded an exception to a warning when a file lease cannot be found while creating a file, to prevent application-side failures.
  • Improved overall performance by changing the update process for NameNode metadata and reducing syncChildren calls.
  • Fixed an issue where the NameNode crashed after failing to finalize the shared edit log on NameNode failover.
  • Improved the listing performance by changing the way stat is called and avoiding stat oscillation behavior.
  • Fixed an issue where the TLS script fails with the enable-tls option if the dfs.namenode.http-address parameter is missing in the configuration.
  • Changed the RSA key strength in the TLS enablement script to 2048 bits.
  • Fixed an issue where the NameNode does not start because of a missing dependent JAR. For the resolution in the affected HDFS Transparency versions 3.1.1-11, 3.1.1-12, 3.2.2-2, and 3.2.2-3, see NameNode fails to start in HDFS Transparency 3.1.1-11, 3.1.1-12, 3.2.2-2 or 3.2.2-3.

Summary of changes as updated, January 2023

Changes in IBM Storage Scale 5.1.6-1
  • Includes HDFS Transparency 3.1.1-12, HDFS Transparency 3.2.2-3 and HDFS Transparency 3.3.0-2.
Changes in IBM Storage Scale 5.1.2-9
  • Includes HDFS Transparency 3.1.1-12 and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.1.1-12 in IBM Storage Scale 5.1.2-9 and IBM Storage Scale 5.1.6-1
  • Added security fix for CVE-2022-25168.

Summary of changes as updated, December 2022

Changes in IBM Storage Scale 5.1.6-0
  • Includes HDFS Transparency 3.1.1-11, HDFS Transparency 3.2.2-3, and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.1.1-11 in IBM Storage Scale 5.1.6-0
  • Fixed the issue where a ticket expiration in an AD Kerberos environment can lead to two active NameNodes.
  • Included fine-grained read/write locking of file lease manager to improve the performance.
  • Fixed the issue where mmhdfs config import ignored ranger-hdfs-policymgr-ssl.xml.
  • Added general security fixes.
Changes in HDFS Transparency 3.2.2-3 in IBM Storage Scale 5.1.6-0
  • Added general security fixes.
  • Added a fix for the scripts in /usr/lpp/mmfs/hadoop/scripts/ to run with Python 3.8.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.8-0 in IBM Storage Scale 5.1.6-0
  • Fixed an issue that caused the IBM Storage Scale Install Toolkit to fail if the name of the file system configured for HDFS Transparency included an underscore.
Changes in Cloudera Data Platform Private Cloud Base
  • From IBM Storage Scale 5.1.4.0, CDP Private Cloud Base 7.1.8 is certified with IBM Storage Scale on Power. For more information, see Support matrix.

    For Hue to work properly, Cloudera Manager 7.7.1+ requires Python 3.8 on the Hue nodes.

Summary of changes as updated, October 2022

Changes in IBM Storage Scale 5.1.5-1
  • Includes HDFS Transparency 3.1.1-10, HDFS Transparency 3.2.2-2 and HDFS Transparency 3.3.0-2.
Changes in HDFS Transparency 3.2.2-2 in IBM Storage Scale 5.1.5-1
  • Fixed the issue where a ticket expiration in an AD Kerberos environment can lead to two active NameNodes.
  • Included fine-grained read/write locking of file lease manager to improve the performance.
  • Added general security fixes.
  • Added security fix for CVE-2022-25168.

Summary of changes as updated, September 2022

Changes in IBM Storage Scale 5.1.5
  • Includes HDFS Transparency 3.1.1-10, HDFS Transparency 3.2.2-1 and HDFS Transparency 3.3.0-2.
Changes in Cloudera Data Platform Private Cloud Base
  • From IBM Storage Scale 5.1.4.0, CDP Private Cloud Base 7.1.8 is certified with IBM Storage Scale on x86. For more information, see Support matrix.

Summary of changes as updated, August 2022

Changes in IBM Storage Scale 5.1.2.6
  • Includes HDFS Transparency 3.1.1-10 and HDFS Transparency 3.3.0-2.
    Note: IBM Storage Scale 5.1.3.0, IBM Storage Scale 5.1.3.1, and IBM Storage Scale 5.1.4.0 include earlier versions of HDFS Transparency, and an upgrade to IBM Storage Scale 5.1.4.1 or later must be considered.

    Added support for Red Hat IPA Kerberos for HDFS Transparency.

Summary of changes as updated, July 2022

Changes in HDFS Transparency 3.1.1-10 in IBM Storage Scale 5.1.4.1
  • Fixed the issue where a fast repetitive usage of mmces service stop hdfs and mmces service start hdfs can lead to two standby NameNodes.
  • Added security fix for CVE-2022-23305, CVE-2022-23307, CVE-2022-23302 and CVE-2020-9488.
Changes in HDFS Transparency 3.3.0-2 in IBM Storage Scale 5.1.4.1
  • Added security fix for CVE-2022-23305, CVE-2022-23307, CVE-2022-23302, CVE-2020-9488.

Summary of changes as updated, June 2022

Changes in HDFS Transparency 3.2.2-1 in IBM Storage Scale 5.1.4.0
  • Supports CES HDFS Transparency 3.2.2-1 for the Open Source Apache Hadoop 3.2.2 distribution on RHEL 7.9 on x86_64.
Changes in HDFS Transparency 3.1.1-9 in IBM Storage Scale 5.1.4.0
  • Optimized the internal metadata data structures for the NameNode for improved memory efficiency. For more information, see Recommended hardware resource configuration.
  • Fixed the parsing problem of hadoop-env.sh that used to skip the last line and therefore might miss configuration key-value pairs on the last line of the file.

Summary of changes as updated, May 2022

Changes in HDFS Transparency 3.2.2-0 in IBM Storage Scale 5.1.3.2
  • IBM Storage Scale 5.1.3 PTF2 is a technology preview version specifically for Hadoop users who want to try out HDFS Transparency 3.2.2 for open source Apache Hadoop 3.2.2 during a limited download period on Fix Central. This technology preview is available only for Data Management Edition on RHEL 7.9 on x86_64, for a limited time period and for nonproduction usage. IBM Storage Scale 5.1.3 PTF2 contains the additional HDFS Transparency 3.2.2 with the IBM Storage Scale 5.1.3 PTF1 content. Therefore, this technology preview cannot be installed if IBM Storage Scale 5.1.3 PTF1 is already installed.

Summary of changes as updated, April 2022

Changes in Cloudera Data Platform Private Cloud Base
  • CDP Private Cloud Base 7.1.7 SP1 is certified with IBM Storage Scale starting from version 5.1.2.2. For more information, see Support matrix.

Summary of changes as updated, March 2022

Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.5.0 in IBM Storage Scale 5.1.3
  • Supports the parallel offline upgrade.

    The parallel offline upgrade support changes the offline upgrade process from sequential to parallel, which significantly reduces the upgrade time in offline mode.

Changes in IBM Storage Scale file system core configuration in IBM Storage Scale 5.1.3
  • For updates to the tscCmdAllowRemoteConnections parameter, see the File system core improvements section under the IBM Storage Scale Summary of changes documentation.
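    As a hedged illustration, the parameter can be inspected and changed with the standard IBM Storage Scale configuration commands; the value shown is only an example, so follow the linked documentation for the recommended setting.
      # Display the current value, then change it cluster-wide (example value only)
      mmlsconfig tscCmdAllowRemoteConnections
      mmchconfig tscCmdAllowRemoteConnections=no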

Summary of changes as updated, January 2022

Changes in HDFS Transparency 3.1.1-8 in IBM Storage Scale 5.0.5.12

Changes in HDFS Transparency 3.1.1-8 and 3.3.0-1 in IBM Storage Scale 5.1.2.2
  • Added security fix for CVE-2021-4104 and CVE-2019-17571.
Changes in HDFS Transparency 3.1.0-10 in IBM Fix Central
  • Added security fix for CVE-2021-4104 and CVE-2019-17571.
  • Fixed intermittent timing-related rename failures.

    Note that HDFS Transparency 3.1.0-10 is the last PTF in the 3.1.0.x stream.

For more information, see IBM Security Bulletin.

Summary of changes as updated, December 2021

Changes in HDFS Transparency 3.1.0-9
  • Optimized the handling of NameNode metadata for improved memory efficiency.
    To ensure that the data on IBM Storage Scale that is to be processed with HDFS Transparency is up to date, the IBM Storage Scale exact mtime option (-E yes, the default value) must be set so that accurate file modification times are always returned; a hedged example follows this list.
  • Optimized parallelism for DataNode request processing for improved performance. This includes the ports of HDFS-15150 and HDFS-15160, which introduce three DataNode configuration parameters. For more information, see Configuration options for HDFS Transparency.
  • The IBM Storage Scale file system is now explicitly checked in mount and unmount callbacks during HDFS Transparency startup and shutdown. Unrelated IBM Storage Scale file systems no longer affect HDFS Transparency. This means that HDFS Transparency will start only if the relevant mount point is properly mounted and will stop if the relevant mount point is unmounted based on the HDFS Transparency status checking in the IBM Storage Scale event callback process.
  • Fixed intermittent issues in date and size output when listing files.
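    The following is a minimal sketch of checking and, if needed, enabling exact mtime on a file system; the file system name gpfs0 is only an example.
      # Show the "Exact mtime mount option" setting for an example file system
      mmlsfs gpfs0 -E
      # Enable exact mtime reporting if it is not already set
      mmchfs gpfs0 -E yes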

Summary of changes as updated, November 2021

Changes in HDFS Transparency 3.1.1-7 in IBM Storage Scale 5.1.2.1
  • Added support for Java 11.

Summary of changes as updated, October 2021

Changes in Mpack version 2.7.0.10
  • The IBM Storage Scale service can now be deployed or upgraded in a single or multiple HDFS namespace configuration. This includes adding DataNode using Ambari in multiple HDFS namespaces.
  • Decommissioning DataNodes using the Ambari HDFS service is now supported.
  • Fixed the NamenodeHAState init arguments failure that occurred after one retry during an HDP upgrade with Ambari 2.7.5.17-6 and Mpack 2.7.0.9 at the HDFS service upgrade step.
  • The IBM Storage Scale service can now be deployed in Ambari in remote cluster mount configuration for non-root Ambari and IBM Storage Scale environment.
  • The MoveNameNodeTransparency.py script now supports moving the HDFS Transparency NameNode when Kerberos is enabled.
Changes in Cloudera Data Platform Private Cloud Base
  • CDP Private Cloud Base 7.1.7 is certified with IBM Storage Scale from version 5.1.1.2 on Power LE platform. For more information, see Support matrix.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.4-0 in IBM Storage Scale 5.1.2
  • The cleanup -n option of the installation toolkit now clears only the configuration of a single HDFS cluster from the toolkit's metadata, instead of clearing the configurations of all the HDFS clusters in a multi-HDFS cluster environment.
Changes in HDFS Transparency 3.1.1-6
  • Optimized the handling of NameNode metadata for improved performance.
  • Optimized parallelism for DataNode request processing for improved performance. This includes ports of HDFS-15150 and HDFS-15160, which introduce three DataNode configuration parameters. For more information, see Configuration options for HDFS Transparency.
  • Fixed the getListing RPC to handle the remaining files correctly when block locations are requested; previously, higher-level services could get an incomplete directory listing.
  • Enabled support for decommissioning DataNodes. For more information, see Decommissioning DataNodes.
  • Fixed metadata handling so that listings show the correct creation time.

Summary of changes as updated, August 2021

Changes in Cloudera Data Platform Private Cloud Base
  • CDP Private Cloud Base 7.1.7 is certified with IBM Storage Scale 5.1.1.2 on x86_64 platform.
  • CDP 7.1.7 supports the upgrade path from CDP 7.1.6 with CSD 1.1.0-0 on IBM Storage Scale 5.1.1.1 to CDP 7.1.7 with CSD 1.2.0-0 on IBM Storage Scale 5.1.1.2. For more information, see Upgrading CDP.

Summary of changes as updated, July 2021

Changes in HDFS Transparency 3.3.0-0 in IBM Storage Scale 5.1.1.2
  • Supports CES HDFS Transparency 3.3 for the Open Source Apache Hadoop 3.3 distribution on RHEL 7.9 on x86_64.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.3-2 in IBM Storage Scale 5.1.1.2
  • Supports new installation of CES HDFS Transparency 3.3 through the IBM Storage Scale installation toolkit on RHEL 7.9 on x86_64 when the environment variable SCALE_HDFS_TRANSPARENCY_VERSION_33_ENABLE=True is exported, as shown in the example after this list. For more information, see Steps for install toolkit.
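    A minimal sketch of exporting the variable in the shell that runs the installation toolkit; the toolkit invocation itself is omitted here.
      # Export the variable in the shell session that runs the installation toolkit
      export SCALE_HDFS_TRANSPARENCY_VERSION_33_ENABLE=True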
Changes in HDFS Transparency 3.1.0-8
  • Optimized the handling of NameNode metadata for improved performance.
  • Fixed the getListing RPC to handle the remaining files correctly when block locations are requested; previously, higher-level services could get an incomplete directory listing.
  • Backported the fix for a race condition that caused a java.io.BufferedInputStream parsing error in the org.apache.hadoop.conf.Configuration class (HADOOP-15331).
  • Fixed the handling of file listing so that the java.nio.file.NoSuchFileException warning messages do not occur.
  • Fixed the handling of the getBlockLocation RPC on files that do not exist. This issue prevented the YARN ResourceManager from starting after the node labels directory was configured.
  • Enabled support for decommissioning DataNodes. For more information, see Decommissioning DataNodes.
  • Added general security fixes and a fix for CVE-2020-9492. For more information, see IBM Support.
Changes in Cloudera HDP
  • The --sync-hdp option used for upgrading HDP is now deprecated.

Summary of changes as updated, June 2021

Changes in Cloudera Data Platform Private Cloud Base
  • CDP Private Cloud Base 7.1.6 is now certified on ppc64le.
Changes in HDFS Transparency 3.1.1-5 in IBM Storage Scale 5.1.1.1
  • Fixed the handling of file listing so that the java.nio.file.NoSuchFileException warning messages no longer occur.
  • Fixed the handling of the getBlockLocation RPC on files that do not exist. This issue prevented the YARN ResourceManager from starting after the node labels directory was configured.
  • From HDFS Transparency 3.1.1-5, the gpfs_tls_configuration.py script automates the configuration of Transport Layer Security (TLS) on the CES HDFS Transparency cluster.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.3.1 in IBM Storage Scale 5.1.1.1
  • From Toolkit version 1.0.3.1, creating multiple CES HDFS clusters using the IBM Storage Scale installation toolkit during the same deployment run is supported.

Summary of changes as updated, May 2021

Changes in Cloudera Data Platform Private Cloud Base
  • From CDP Private Cloud Base 7.1.6, Impala is certified on IBM Storage Scale 5.1.1 on x86_64.

Summary of changes as updated, April 2021

Changes in Cloudera Data Platform Private Cloud Base
  • CDP Private Cloud Base 7.1.6 is certified with IBM Storage Scale 5.1.1.0. This CDP Private Cloud Base version supports Transport Layer Security (TLS) and HDFS encryption.
Changes in HDFS Transparency 3.1.1-4
  • Fixed the mmhdfs command to recognize short host name configurations for NameNodes and DataNodes, so the The node is not a namenode or datanode error message no longer occurs.
  • The IBM Storage Scale file systems are now explicitly checked in mount and unmount callbacks during HDFS Transparency startup and shutdown process. Unrelated IBM Storage Scale file systems no longer affect HDFS Transparency. This means that HDFS Transparency will start only if the relevant mount point is properly mounted and will stop if the relevant mount point is unmounted based on the HDFS Transparency status checking in the IBM Storage Scale event callback process.
  • HDFS Transparency NameNode log now contains the HDFS Transparency full version information and the gpfs.encryption.enable value.
  • Added general security fixes and a fix for CVE-2020-4851. For more information, see IBM® Support.
  • Added a new custom json file method for the Kerberos script. For more information, see Configuring Kerberos using the Kerberos script provided with IBM Storage Scale.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.3.0

Summary of changes as updated, March 2021

Changes in IBM Storage Scale CES HDFS Transparency
  • IBM Storage Scale CES HDFS Transparency now supports both the NameNode HA and non-HA options. Also, DataNode can now have Hadoop services colocated within the same node. For more information, see Alternative architectures.
Changes in Mpack version 2.7.0.9
  • The Ambari maintenance mode for clusters is now supported by the IBM Storage Scale service in shared or remote gpfs.storage.type environments. Earlier, when the user performed a Start all or Stop all operation from the Ambari GUI, the IBM Storage Scale service or its components would start or stop even when they were set to maintenance mode.
  • The Mpack upgrade process does not reinitialize the following HDFS parameters to the Mpack’s recommended settings:
    • dfs.client.read.shortcircuit
    • dfs.datanode.hdfs-blocks-metadata.enabled
    • dfs.ls.limit
    • dfs.datanode.handler.count
    • dfs.namenode.handler.count
    • dfs.datanode.max.transfer.threads
    • dfs.replication
    • dfs.namenode.shared.edits.dir

    Earlier, any updates to these parameters by the end user were overwritten. With this fix, any customized hdfs-site.xml configuration is not changed during the upgrade process.

  • In addition to the Check Integration Status option in the Ambari service, you can now view the Mpack version and build information in version.txt in the Mpack tar.gz package.
  • The hover message for the GPFS Quorum Nodes text field within the IBM Storage Scale service GUI has been updated. The hostnames to be entered for the Quorum Nodes should be from the IBM Storage Scale Admin network hostnames.
  • The Mpack uninstaller script cleans up the IBM Storage Scale Ambari stale link that is no longer required. Therefore, the Ambari server restart will not fail because of the Mpack dependencies.
  • The Mpack installation, upgrade, and uninstall scripts now support the sudo root permission.
  • The anonymous UID verification is checked only if hadoop.security.authentication is not set to Kerberos.
  • The IBM Storage Scale service can now monitor the status of the configured file system mount point (gpfs.mnt.dir).
    In earlier releases of the Mpack, the IBM Storage Scale service could monitor only the status of the IBM Storage Scale runtime daemon.
    If any of the configured file systems is not mounted on an IBM Storage Scale node, the status of the GPFS_NODE component for that node now appears as down in the Ambari GUI.

Summary of changes as updated, January 2021

Changes in Cloudera Data Platform Private Cloud Base

Cloudera Data Platform Private Cloud Base with IBM Storage Scale is supported on Power®. For more information, see Support matrix.

Changes in HDFS Transparency 3.1.0-7
  • Fixed the NullPointerException error message that appeared in the NameNode logs.
  • Fixed the JMX output to correctly report "open" operations when the gpfs.ranger.enabled parameter is set to scale.
  • Addressed a vulnerability in IBM Storage Scale that allows injecting malicious content into the log files. For the security fix information, see IBM Support.

Documentation update

Configuration options for using multiple threads to list a directory and load the metadata of its children are provided for HDFS Transparency 3.1.1-3 and 3.1.0-6. For more information, see the list option.

Summary of changes as updated, December 2020

Changes in HDFS Transparency 3.1.1-3
  • HDFS Transparency implements a performance enhancement by using a fine-grained file system locking mechanism. After HDFS Transparency 3.1.1-3 is installed, ensure that the gpfs.ranger.enabled field is set to scale in /var/mmfs/hadoop/etc/hadoop/gpfs-site.xml. For more information, see Setting configuration options in CES HDFS.
  • The create Hadoop users and groups script and the create Kerberos principals and keytabs script in IBM Storage Scale now reside in the /usr/lpp/mmfs/hadoop/scripts directory.
  • Requires Python 3.6 or later.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.2-1
  • Fixed a toolkit installation failure caused by nodes that are not part of the CES HDFS cluster and that do not have Java installed or JAVA_HOME set.
  • The following proxyuser configurations were added into core-site.xml by the installation toolkit to configure a CES HDFS cluster (see the XML example after this list):
    hadoop.proxyuser.livy.hosts=*
    hadoop.proxyuser.livy.groups=*
    hadoop.proxyuser.hive.hosts=*
    hadoop.proxyuser.hive.groups=*
    hadoop.proxyuser.oozie.hosts=*
    hadoop.proxyuser.oozie.groups=*
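    For illustration, this is how the first of these pairs appears as a property element in core-site.xml; the remaining pairs follow the same pattern.
      <property>
        <name>hadoop.proxyuser.livy.hosts</name>
        <value>*</value>
      </property>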
Changes in IBM Storage Scale Cloudera Custom Service Descriptor (CDP CSD) 1.0.0-0
  • Integrates IBM Storage Scale service into CDP Private Cloud Base Cloudera Manager.

Summary of changes as updated, November 2020

Changes in HDFS Transparency 3.1.1-2
  • Supports CDP Private Cloud Base. For more information, see Support matrix.
  • Includes Hadoop sample scripts to create users and groups in IBM Storage Scale and set up the Kerberos principals and keytabs. Requires Python 3.6 or later.
  • Summary operations (for example, du, count, and so on) in HDFS Transparency can now be done multi-threaded based on the number of files and subdirectories. This improves the performance when performing the operation on a path that contains numerous files and subdirectories. The performance improvement depends on the system environment. For more information, see Functional limitations.
Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.2-0
  • Added support to deploy CES HDFS in SLES 15 and Ubuntu 20.04 on x86_64 platforms.
  • The package was renamed from bda_integration-<version>.noarch.rpm to gpfs.bda-integration-<version>.noarch.rpm.
  • Requires Python 3.6 or later.
Changes in IBM Storage Scale Cloudera Custom Service Descriptor (CDP CSD) 1.0.0-0 EA
  • Integrates IBM Storage Scale service into CDP Private Cloud Base Cloudera Manager.

Summary of changes as updated, October 2020

Changes in HDFS Transparency 3.1.0-6
  • HDFS Transparency now implements performance enhancement by using the fine-grained file system locking mechanism instead of using the Apache Hadoop global file system locking mechanism. From HDFS Transparency 3.1.0-6, set gpfs.ranger.enabled to scale from the HDP Ambari GUI under the IBM Storage Scale service configuration page. If you are not using Ambari, set gpfs.ranger.enabled in /var/mmfs/hadoop/etc/hadoop/gpfs-site.xml as follows:
    <property>
      <name>gpfs.ranger.enabled</name>
      <value>scale</value>
      <final>false</final>
    </property>
    Note: The scale option replaces the original true/false values.
  • Summary operations (for example, du, count, and so on) in HDFS Transparency can now be done multi-threaded based on the number of files and subdirectories. This improves the performance when performing the operation on a path that contains numerous files and subdirectories. The performance improvement depends on the system environment. For more information, see Functional limitations.

Summary of changes as updated, August 2020

Changes in Mpack version 2.7.0.8

For Mpack 2.7.0.7 and earlier, a restart of the IBM Storage Scale service would overwrite the IBM Storage Scale customized configuration if the gpfs.storage.type parameter was set to shared.

From Mpack 2.7.0.8, if the gpfs.storage.type parameter is set to shared or shared,shared, the IBM Storage Scale service does not set the IBM Storage Scale tunables that are shown under the IBM Storage Scale service back to the IBM Storage Scale cluster or file system.

Summary of changes as updated, July 2020

Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.1.1
  • Supports rolling upgrade of HDFS Transparency through installation toolkit.
    Note: If the SMB protocol is enabled, all protocols are required to be offline for some time because the SMB does not support the rolling upgrade.
  • Requires IBM Storage Scale 5.0.5.1 and HDFS Transparency 3.1.1-1. For more information, see CES HDFS HDFS Transparency support matrix.
  • From IBM Storage Scale 5.0.5.1, only one CES-IP is needed for one HDFS cluster during installation toolkit deployment.
Changes in HDFS Transparency 3.1.0-5
  • When gpfs.replica.enforced is set to gpfs, the client replica setting is not honored. The WARN namenode.GPFSFs (GPFSFs.java:setReplication(123)) - Set replication operation invalid when gpfs.replica.enforced is set to gpfs message was changed to Debug, because this message can occur many times in the NameNode log.
  • Fixed NameNode hangs that occurred when running MapReduce jobs, caused by a lock synchronization issue.
  • From IBM Storage Scale 5.0.5, the gpfs.snap --hadoop command can access the HDFS Transparency logs from the user-configured directories.
  • From HDFS Transparency 3.1.0-5, the default value for dfs.replication is 3 and gpfs.replica.enforced is gpfs. Therefore, HDFS Transparency uses the IBM Storage Scale file system replication and not the Hadoop HDFS replication. Also, increasing the dfs.replication value to 3 helps the HDFS client tolerate DataNode failures. A sketch of these defaults follows this list.
    Note: You need to have at least three DataNodes when you set dfs.replication to 3.
  • Changed the permission mode for editlog files to 640.
  • For two file systems, HDFS Transparency ensures that the NameNodes and DataNodes are stopped before unmounting the second file system mount point.
    Note: The local directory path for the second file system mount usage is not removed. Ensure that this local directory path is empty before starting the NameNode.
  • HDFS Transparency does not manage the storage, so the Apache Hadoop block function calls used for native HDFS return false metric information. Therefore, HDFS Transparency does not run the Apache Hadoop block function calls.
  • Delete operations in HDFS Transparency can now be done multi-threaded based on the number of files and subdirectories. This improves performance when deleting a path that contains numerous files and subdirectories. The performance improvement depends on the system environment. For more information, see Functional limitations.
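    As a hedged sketch, the new defaults correspond to the following entries, with dfs.replication in hdfs-site.xml and gpfs.replica.enforced in gpfs-site.xml (under /var/mmfs/hadoop/etc/hadoop/ in this release).
      <!-- hdfs-site.xml -->
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <!-- gpfs-site.xml -->
      <property>
        <name>gpfs.replica.enforced</name>
        <value>gpfs</value>
      </property>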
Changes in Mpack version 2.7.0.7
  • Supports HDP upgrade with Mpack 2.7.0.7 without unintegrating HDFS Transparency.
  • The Mpack 2.7.0.7 supports Ambari version 2.7.4 or later.
  • The installation and upgrade scripts now support complex KDC password when Kerberos is enabled.
  • You can now upgrade from older Mpacks (versions 2.7.0.x) to Mpack 2.7.0.7 with Kerberos enabled without using the workaround.
  • The upgrade postEU process is now simplified and can automatically accept the user license agreement.
  • The upgrade postEU option now requests user input only once during the upgrade process.
  • During the Mpack installation or upgrade process, the backup directory that is created by the Mpack installer now includes a date timestamp added to the directory name.
  • The Check Integration Status UI action in IBM Storage Scale service now shows the unique Mpack build ID.
  • If you are enabling Kerberos after integrating IBM Storage Scale service, ZKFC initialization used to fail because the hdfs_jaas.conf file was missing. A workaround is no longer required.
  • Ambari now supports rolling restart for NameNodes and DataNodes.
  • Configuration changes take effect after you restart the NameNodes and DataNodes; restarting all the HDFS Transparency nodes is not required.
  • If SSL is enabled, the upgrade script asks for the hostname instead of the IP address.
  • The true/false inputs requested by the upgrade script are no longer case-sensitive.
  • When the deployment type is set to gpfs.storage.type=shared, a local GPFS cluster would be created even if bidirectional passwordless SSH was not set up properly between the GPFS Master and the ESS contact node. This issue is now fixed: the deployment fails in such scenarios and an error message is displayed.
  • If you are using IBM Storage Scale 4.2.3.2, the Ambari service hangs because mmchconfig prompts for ENTER confirmation for the LogFileSize parameter. From Mpack 2.7.0.7, the LogFileSize configuration cannot be modified from Ambari; the LogFileSize parameter can be configured only through the command line by using the mmchconfig command.

Summary of changes as updated, May 2020

Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.1.0
  • Supports offline upgrade of HDFS Transparency.
  • Requires IBM Storage Scale 5.0.5 and HDFS Transparency 3.1.1-1. For more information, see CES HDFS HDFS Transparency support matrix.

Changes in HDFS Transparency 3.1.1-1

  • A check is performed while you are running the mmhdfs config upload command to ensure that the ces_group_name is consistent with the HDFS Transparency dfs.nameservices values.
  • From IBM Storage Scale 5.0.5, the gpfs.snap --hadoop command can now access the HDFS Transparency logs from the user-configured directories.
  • From HDFS Transparency 3.1.1-1, the default value for dfs.replication is 3 and gpfs.replica.enforced is gpfs. Therefore, HDFS Transparency uses the IBM Storage Scale file system replication and not the Hadoop HDFS replication. Also, increasing the dfs.replication value to 3 helps the HDFS client tolerate DataNode failures.
    Note: You need to have at least three DataNodes when you set dfs.replication to 3.
  • Fixed NameNode hangs that occurred when running MapReduce jobs, caused by a lock synchronization issue.
CES HDFS changes
  • From IBM Storage Scale 5.0.5, HDFS Transparency version 3.1.1-1 and Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) version 1.0.1.0, HDFS Transparency and Toolkit for HDFS packages are signed with a GPG (GNU Privacy Guard) key and can be deployed by the IBM Storage Scale installation toolkit.
    For more information, go to IBM Storage Scale documentation and see the following topics:
    • Installation toolkit changes subsection under the Summary of changes topic.
    • Limitations of the installation toolkit topic under the Installing > Installing IBM Spectrum Scale on Linux nodes and deploying protocols > Installing IBM Spectrum Scale on Linux nodes with the installation toolkit.

Summary of changes as updated, March 2020

Changes in IBM Storage Scale Big Data Analytics Integration Toolkit for HDFS Transparency (Toolkit for HDFS) 1.0.0.1
  • Supports deployment on ESS.
  • Supports remote mount file system only for CES HDFS protocol.
  • Requires IBM Storage Scale 5.0.4.3 and HDFS Transparency 3.1.1-0. For more information, see CES HDFS HDFS Transparency support matrix.

Summary of changes as updated, January 2020

Changes in HDFS Transparency 3.1.1-0
  • Integrates with CES protocol and IBM Storage Scale installation toolkit.
  • Supports Open Source Apache Hadoop distribution and Red Hat Enterprise Linux® operating systems.
Changes in HDFS Transparency 3.1.0-4
  • Added a commented export NODE_HDFS_MAP_GPFS line to the hadoop-env.sh file for mmhadoopctl multi-network usage.
  • Fixed data replication with AFM DR disk usage due to shrinkfit.
  • Fixed an issue so that a job does not fail if one DataNode fails when using gpfs.replica.enforced=gpfs with gpfs.storage.type in shared mode and dfs.replication > 1.
  • Changed to log warning messages for outdated clusterinfo and diskinfo files.
  • Fixed a file deletion issue on the second file system when trash is enabled in a two file system configuration.
  • Uses the community-defined default port numbers for dfs.datanode.address, dfs.datanode.ipc.address, and dfs.datanode.http.address to reduce port conflicts with ephemeral ports (reference values follow this list).
  • Fixed the hadoop df output, which was previously not consistent with the POSIX df output when two file systems are configured.
  • Fixed dfs -du, which previously displayed a wrong free space value.
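    For reference, the community-defined defaults for these parameters in Apache Hadoop 3.x are listed below; these values come from upstream Hadoop rather than from the HDFS Transparency documentation itself.
      dfs.datanode.address       0.0.0.0:9866
      dfs.datanode.ipc.address   0.0.0.0:9867
      dfs.datanode.http.address  0.0.0.0:9864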
Changes in Mpack version 2.7.0.6
  • Supports HDP 3.1.5.

Summary of changes as updated, November 2019

Changes in Mpack version 2.7.0.5
  • The Mpack installation script SpectrumScaleMPackInstaller.py no longer asks for the KDC credentials, even when the HDP Hadoop cluster is Kerberos enabled. The KDC credentials are required to be set up only before executing the IBM Storage Scale service action "Unintegrate Transparency".
  • If you are deploying the IBM Storage Scale service in a shared storage configuration (gpfs.storage.type=shared), the Mpack will check for consistency of UID, GID of the anonymous user only on the local GPFS nodes. The Mpack will not perform this check on the ESS nodes.
  • If you are deploying the IBM Storage Scale service with two file system support with gpfs.storage.type=shared,shared or gpfs.storage.type=remote,remote, then the HDFS Block Replication parameter (dfs.replication) defaults to 1.
  • From Mpack 2.7.0.5, the issue where all the nodes managed by Ambari had to be set as GPFS nodes during deployment is fixed. For example, if you set some nodes as Hadoop client nodes and some nodes as GPFS nodes for the HDFS Transparency NameNode and DataNodes, the deployment succeeds.
  • In Mpack 2.7.0.4, if gpfs.storage.type was set to shared, stopping the Scale service from Ambari would report a failure in the UI even if the operation had succeeded internally. This issue has been fixed in Mpack 2.7.0.5.
  • IBM Storage Scale Ambari deployment can now support gpfs.storage.type=shared,shared mode.

Summary of changes as updated, October 2019

IBM Erasure Code Edition (ECE) is supported as shared storage mode for Hadoop with HDFS Transparency 3.1.0-3 and IBM Storage Scale 5.0.3.

Summary of changes as updated, September 2019

Changes in HDFS Transparency 3.1.0-3
  • Validates the open file limit when starting HDFS Transparency.
  • mmhadoopctl supports dual network configuration when NODE_HDFS_MAP_GPFS is set in /var/mmfs/hadoop/etc/hadoop/hadoop-env.sh. For more details, see the mmhadoopctl supports dual network section.
Changes in Mpack version 2.7.0.4
  • For FPO clusters, the restripeOnDiskFailure value is set to no during the stopping of the GPFS main components, regardless of the originally set value. After the GPFS main stop completes, the restripeOnDiskFailure value is set back to its original value.
  • The IBM Storage Scale service does a graceful shutdown and no longer forces an unmount of the GPFS file system via mmumount -f.
  • Fixed an intermittent failure of one of the HDFS Transparency NameNodes at startup due to a timing issue when both NameNode HA and Kerberos are enabled.
  • The HDFS parameter dfs.replication is set to the mmlsfs -r value (Default number of data replicas) of the GPFS file system for gpfs.storage.type=shared, instead of the Hadoop replication value of 3.
  • The Mpack installer (*.bin) file can now accept the license silently when the --accept-licence option is specified.

Summary of changes as updated, May 2019

Changes in HDFS Transparency 3.1.0-2
  • Fixed an issue where a MapReduce task fails after running for one hour when Ranger is enabled.
  • Fixed an issue where Hadoop permission settings do not work properly in a Kerberized environment.
Documentation updates
  • Updated the Migrating IOP to HDP for BI 4.2.5 and HDP 2.6 information.

Summary of changes as updated, March 2019

Changes in Mpack version 2.7.0.3
  • Supports dual network configuration.
  • Fixed an issue so that only the first line in the shared_gpfs_node.cfg file is read to get the host name for shared storage, so that the deployment of a shared file system does not hang.
  • Removed the gpfs_base_version and gpfs_transparency_version fields from the IBM Storage Scale service configuration GUI. This removes the restart all that was required after IBM Storage Scale is deployed.
  • Mpack can now find the correct installed HDP version when multiple HDP versions are seen.
  • IBM Storage Scale service is now able to handle hyphenated file system names so that the service will be able to start properly during file system mount.
  • The IBM Storage Scale entry in system_action_definitions.xml is fixed so that the IBM Storage Scale </actionDefinition> ending tag is not on the same line as the </actionDefinitions> tag. Previously, this could cause an installation issue when a new service was added after the IBM Storage Scale service, because the new service was added in between the IBM Storage Scale entry and the </actionDefinition></actionDefinitions> line.
HDFS Transparency 3.1.0-1
  • Fixed Hadoop du to calculate all files under all subdirectories for the user even when the files have not been accessed.
  • Supports ViewFS in HDP 3.1 with Mpack 2.7.0.3.

Summary of changes as updated, February 2019

Changes in Mpack version 2.7.0.2
  • Supports HDP 3.1.
  • SLES 12 SP3 support for new installations on x86_64 only.
  • Upgrades HDFS Transparency on all nodes in the IBM Storage Scale cluster instead of only on the NameNode and DataNodes.

Summary of changes as updated, December 2018

Changes in Mpack version 2.7.0.1
  • Supports HDP 3.0.1.
  • Supports preserving Kerberos token delegation during NameNode failover.
  • IBM Storage Scale service Stop All/Start All service actions now support the best practices for IBM Storage Scale stop/start as per Restarting a large IBM Storage Scale cluster topic in the IBM Storage Scale: Administration Guide.
  • The HDFS Block Replication parameter, dfs.replication, is automatically set to match the actual value of the IBM Storage Scale Default number of data replicas parameter, defaultDataReplicas, when adding the IBM Storage Scale service for remote mount storage deployment model.
HDFS Transparency 3.1.0-0
  • Supports preserving Kerberos token delegation during NameNode failover.
  • Fixed CWE/SANS security exposures in HDFS Transparency.
  • Supports Hadoop 3.1.1.

Summary of changes as updated, October 2018

Changes in Mpack version 2.4.2.7
  • Supports preserving Kerberos token delegation during NameNode failover.
  • IBM Storage Scale service Stop All/Start All service actions now support the best practices for IBM Storage Scale stop/start as per Restarting a large IBM Storage Scale cluster topic in the IBM Storage Scale: Administration Guide.
HDFS Transparency 2.7.3-4
  • Supports preserving Kerberos token delegation during NameNode failover.
  • Supports native HDFS encryption.
  • Fixed CWE/SANS security exposures in HDFS Transparency.

Summary of changes as updated, August 2018

Changes in Mpack version 2.7.0.0
  • Supports HDP 3.0.
Changes in HDFS Transparency version 3.0.0-0
  • Supports HDP 3.0 and Mpack 2.7.0.0.
  • Supports Apache Hadoop 3.0.x.
  • Supports native HDFS encryption.
  • Changed IBM Storage Scale configuration location from /usr/lpp/mmfs/hadoop/etc/ to /var/mmfs/hadoop/etc/ and default log location for open source Apache from /usr/lpp/mmfs/hadoop/logs to /var/log/transparency.
New documentation sections
  • Hadoop Scale Storage Architecture
  • Hadoop Performance tuning guide
  • Hortonworks Data Platform 3.X for HDP 3.0
  • Open Source Apache Hadoop

Summary of changes as updated, July 2018

Changes in Mpack version 2.4.2.6
  • HDP 2.6.5 is supported.
  • Mpack installation resumes from the point of failure when the installation is re-run.
  • The Collect Snap Data action in the IBM Storage Scale service in the Ambari GUI can capture the Ambari agents' logs in to a tar package under the /var/log/ambari.gpfs.snap* directory.
  • Use cases where the Ambari server and the GPFS main are colocated on the same host but are configured with multiple IP addresses are handled within the IBM Storage Scale service installation.
  • On starting IBM Storage Scale from Ambari, if a new kernel version is detected on the IBM Storage Scale node, the GPFS portability layer is automatically rebuilt on that node.
  • On deploying the IBM Storage Scale service, the Ambari server restart is not required. However, the Ambari server restart is still required when running the Service Action > Integrate Transparency or Unintegrate Transparency from the Ambari UI.

Summary of changes as updated, May 2018

Changes in HDFS Transparency 2.7.3-3
  • Non-root password-less login of contact nodes for remote mount is supported.
  • When Ranger is enabled, UIDs greater than 8388607 are supported.
  • Hadoop storage tiering is supported.
Changes in Mpack version 2.4.2.5
  • HDP 2.6.5 is supported.

Summary of changes as updated, February 2018

Changes in HDFS Transparency 2.7.3-2
  • Snapshot from a remote-mounted file system is supported.
  • IBM Storage Scale fileset-based snapshot is supported.
  • HDFS Transparency and IBM Storage Scale Protocol SMB can coexist without the SMB ACL controlling the ACL for files or directories.
  • HDFS Transparency rolling upgrade is supported.
  • Zero shuffle for IBM ESS is supported.
  • Manual update of file system configurations when root password-less access is not available for remote cluster is supported.
Changes in Mpack version 2.4.2.4
  • HDP 2.6.4 is supported.
  • IBM Storage Scale admin mode central is supported.
  • The /etc/redhat-release file workaround for CentOS deployment is removed.

Summary of changes as updated, January 2018

Changes in Mpack version 2.4.2.3
  • HDP 2.6.3 is supported.

Summary of changes as updated, December 2017

Changes in Mpack version 2.4.2.2
  • The Mpack version 2.4.2.2 does not support migration from IOP to HDP 2.6.2. For migration, use the Mpack version 2.4.2.1.
  • From IBM Storage Scale Mpack version 2.4.2.2, new configuration parameters have been added to the Ambari management GUI. These configuration parameters are as follows:

    gpfs.workerThreads defaults to 512.

    NSD threads per disk defaults to 8.

    For IBM Storage Scale version 4.2.0.3 and later, gpfs.workerThreads field takes effect and gpfs.worker1Threads field is ignored. For versions lower than 4.2.0.3, gpfs.worker1Threads field takes effect and gpfs.workerThreads field is ignored.

    Verify if the disks are already formatted as NSDs - defaults to yes

  • The default values of the following parameters have changed. The new values are as follows:

    gpfs.supergroup defaults to hdfs,root now instead of hadoop,root.

    gpfs.syncBuffsPerIteration defaults to 100. Earlier it was 1.

    Percentage of Pagepool for Prefetch defaults to 60 now. Earlier it was 20.

    gpfs.maxStatCache defaults to 512 now. Earlier it was 100000.

  • The default maximum log file size for IBM Storage Scale has been increased to 16 MB from 4 MB.

Summary of changes as updated, October 2017

Changes in Mpack version 2.4.2.1 and HDFS Transparency 2.7.3-1
  • The GPFS Ambari integration package is now called the IBM Storage Scale Ambari management pack (in short, management pack or MPack).
  • Mpack 2.4.2.1 is the last supported version for BI 4.2.5.
  • IBM Storage Scale Ambari management pack version 2.4.2.1 with HDFS Transparency version 2.7.3.1 supports BI 4.2/BI 4.2.5 IOP migration to HDP 2.6.2.
  • The remote mount configuration in Ambari is supported. (For HDP only)
  • Support for two IBM Storage Scale file systems/deployment models under one Hadoop cluster/Ambari management. (For HDP only)

    This allows you to have a combination of IBM Storage Scale deployment models under one Hadoop cluster. For example, one file system with shared-nothing storage (FPO) deployment model along with one file system with shared storage (ESS) deployment model under single Hadoop cluster.

  • Metadata operation performance improvements for Ranger enabled configuration.
  • Introduced short-circuit write support for improved performance where the HDFS client and Hadoop DataNodes are running on the same node.