Manually updating the configuration
HDFS has both server and client configurations. Unlike Ambari in Cloudera Hortonworks Data Platform (HDP), which manages the HDFS server and client configurations with a common set of .xml files, Cloudera Manager (CM) configures the HDFS server and client configurations separately.
For more information, see the Cloudera Manager server and client configuration section in the CDP Private Cloud Base documentation.
After IBM Storage Scale is integrated, Cloudera Manager no longer manages the HDFS server-side configurations. Cloudera Manager manages only the HDFS client-side configurations. This aligns with the separation of compute and storage architecture between Cloudera CDP Private Cloud Base and IBM Storage Scale.
Configuration files | Configuration management | Comments |
---|---|---|
HDFS server configuration | The IBM Storage Scale CCR repository syncs the .xml files into the /var/mmfs/hadoop/etc/hadoop directory. | |
 | The values in Cloudera Manager are synced to a private per-process directory, under /var/run/cloudera-scm-agent/process/process-name. For more information, see Cloudera Manager server and client configuration. | These configurations are presented in the Cloudera Manager GUI but are not used by IBM Storage Scale HDFS Transparency. |
HDFS client configurations | The values in Cloudera Manager are synced to the /etc/hadoop/conf directory on the nodes that have the IBM Storage Scale Gateway roles so that they can be consumed by other Cloudera services and HDFS clients. | This is the HDFS client shipped with CDP Private Cloud Base that is leveraged by other CDP services. |
 | The HDFS .xml files are located in the /var/mmfs/hadoop/etc/hadoop directory. | This is the HDFS client shipped with HDFS Transparency. This HDFS client is not commonly used in a CDP integrated environment. |
- HDFS server components:
- HDFS Transparency NameNodes
- HDFS Transparency DataNodes
- HDFS client components:
- The hdfs and webhdfs commands and the Java APIs from the HDFS client shipped with CDP Private Cloud Base.
- The hdfs and webhdfs commands and the Java APIs from IBM Storage Scale under the /usr/lpp/mmfs/hadoop/bin/ directory.
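Because the server and client configurations live in different directories, it can help to check the same key in both places. The following is a minimal sketch, not a supported tool: the function names get_prop and compare_prop are illustrative, and the parser assumes the usual layout in which <name> appears before <value> inside each <property> element.

```shell
#!/bin/sh
# Sketch: read one property value from a Hadoop-style *-site.xml file.
# Assumes <name> appears before <value> inside each <property> element;
# XML comments and multi-line values are not handled.
get_prop() {
  # $1 = path to the .xml file, $2 = property name
  awk -v key="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == key) { print v; exit } }
  ' "$1"
}

# Compare the same key in the hdfs-site.xml of two configuration directories.
compare_prop() {
  # $1 = key, $2 = first conf directory, $3 = second conf directory
  a=$(get_prop "$2/hdfs-site.xml" "$1")
  b=$(get_prop "$3/hdfs-site.xml" "$1")
  if [ "$a" = "$b" ]; then
    echo "$1: same ($a)"
  else
    echo "$1: $2=$a $3=$b"
  fi
}
```

For example, compare_prop dfs.namenode.handler.count /var/mmfs/hadoop/etc/hadoop /etc/hadoop/conf prints the value found in each directory. A difference is not necessarily an error, because the two sets of files are managed separately.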
Updating only the HDFS server configurations
- Stop the HDFS Transparency services from the Cloudera Manager GUI.
- Log in to one of the CES HDFS NameNodes, update the server-side configuration by running the mmhdfs config set command, and then upload the changed configuration to the IBM Storage Scale CCR repository by running the mmhdfs config upload command.
# mmhdfs config set hdfs-site.xml -k dfs.namenode.handler.count=800
# /usr/lpp/mmfs/hadoop/bin/mmhdfs config upload
For more information on the mmhdfs command, see IBM Storage Scale: Command and Programming Reference Guide.
- Start the HDFS Transparency services from the Cloudera Manager GUI.
Note: You must start the HDFS Transparency services from Cloudera Manager so that Cloudera can display the states of the NameNodes and DataNodes properly.
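The set-then-upload sequence in the middle step can be wrapped so that the commands are reviewed before they run. This is a sketch under stated assumptions: the MMHDFS and DRY_RUN variables and the function names are illustrative and not part of the product, and only the two mmhdfs invocations shown in the step above are used.

```shell
#!/bin/sh
# Sketch of the set-then-upload sequence, intended for a CES HDFS NameNode
# after the HDFS Transparency services are stopped in Cloudera Manager.
# MMHDFS and DRY_RUN are illustrative variables, not product settings.
MMHDFS="${MMHDFS:-/usr/lpp/mmfs/hadoop/bin/mmhdfs}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

update_server_conf() {
  # $1 = configuration file name, $2 = key=value pair
  case "$2" in
    ?*=*) ;;
    *) echo "expected key=value, got: $2" >&2; return 1 ;;
  esac
  # Upload only if the set command succeeded.
  run "$MMHDFS" config set "$1" -k "$2" &&
  run "$MMHDFS" config upload
}
```

With the default DRY_RUN=1, calling update_server_conf hdfs-site.xml dfs.namenode.handler.count=800 only prints the two commands. Setting DRY_RUN=0 runs them. Stopping and starting the services still happens in the Cloudera Manager GUI.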
Updating only the HDFS client (CDP Private Cloud Base) configurations
- On the Cloudera Manager GUI, update the HDFS client configuration values.
- On the Cloudera Manager GUI, propagate the updated client configuration from the Cloudera Manager database to /etc/hadoop/conf.
Note: You do not need to restart the HDFS Transparency service for the changes to take effect.
Updating Ranger configurations
Ranger is closely integrated with the HDFS server and client. The Ranger plug-in runs within the NameNode process space. When the Ranger plug-in is enabled for HDFS, Ranger-specific .xml files (ranger-hdfs-security.xml, ranger-hdfs-policymgr-ssl.xml, and ranger-hdfs-audit.xml) are generated within a private directory specific to the HDFS Transparency NameNode process, under the /var/run/cloudera-scm-agent/process/process-name directory. For more information, see Cloudera Manager server and client configuration.
When you restart the HDFS Transparency NameNodes from the Cloudera Manager GUI, the Ranger configuration files are synced to the /var/mmfs/hadoop/etc/hadoop HDFS Transparency configuration directory. However, these updates are not uploaded to the IBM Storage Scale CCR repository. Therefore, any Ranger-specific configuration change requires a workaround to get into the CCR.
To start the HDFS Transparency NameNodes correctly, update the IBM Storage Scale CCR by following the steps in NameNodes do not start after updating the Ranger configuration.
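Before applying the workaround, it can be useful to confirm that the three Ranger files were synced to the local configuration directory. The following sketch uses the file names from the text above. The function name missing_ranger_files is illustrative.

```shell
#!/bin/sh
# Sketch: print the Ranger plug-in files that are NOT present in a directory.
# The file names come from the text above; the function name is illustrative.
missing_ranger_files() {
  # $1 = configuration directory to inspect
  for f in ranger-hdfs-security.xml ranger-hdfs-policymgr-ssl.xml ranger-hdfs-audit.xml; do
    [ -f "$1/$f" ] || echo "$f"
  done
}
```

Running missing_ranger_files /var/mmfs/hadoop/etc/hadoop on a NameNode prints nothing when all three files are present locally. An empty result says nothing about the CCR, which still has to be updated separately.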
Updating Kerberos configurations
Cloudera Manager does not manage Kerberos for the IBM Storage Scale service because the CDP Private Cloud Base cluster and the CES HDFS cluster are different clusters and are loosely integrated. Therefore, Kerberos must first be enabled manually on the CES HDFS cluster.
In the Cloudera Manager GUI, the Enable Kerberos action has no effect on the HDFS server-side configuration. It enables Kerberos only for the HDFS client side. The HDFS client-side configuration files under /etc/hadoop/conf are updated to reflect the Cloudera Manager Kerberos enablement.
To make Kerberos-specific changes to HDFS, see Updating only the HDFS server configurations.
- Cloudera Manager requires the following Kerberos-specific information from HDFS Transparency during the initial deployment to create the configuration files and directories properly:
- NameNode keytab location (parameters: spectrumscale_keytab, service-wide)
- HDFS Principal Name (parameters: scale_hdfs_principal_name, service-wide)
- If the default NameNode principal name (nn) or NameNode keytab path (/etc/security/keytab/nn.service.keytab) on the HDFS server side is changed, the corresponding parameters must also be changed in Cloudera Manager.
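When the principal name or keytab path changes, a quick check that the keytab contains an entry for the expected principal can catch a mismatch before the Cloudera Manager parameters are updated. This is a minimal sketch that assumes the MIT Kerberos klist utility is installed. The KLIST variable and the function name are illustrative.

```shell
#!/bin/sh
# Sketch: check that a keytab lists an entry for a given principal short name.
# KLIST is an illustrative override so that the check can be tried without Kerberos.
KLIST="${KLIST:-klist}"

keytab_has_principal() {
  # $1 = keytab path, $2 = principal short name, for example nn
  # klist -kt lists the keytab entries; look for "<name>/" at the start of a principal.
  "$KLIST" -kt "$1" 2>/dev/null | grep -q "[[:space:]]$2/"
}
```

For example, keytab_has_principal /etc/security/keytab/nn.service.keytab nn returns success when the keytab lists a principal that starts with nn/. The check does not validate the key itself or contact the KDC.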