Manually updating the configuration

HDFS has both server and client configurations. Unlike Ambari in Cloudera Hortonworks Data Platform (HDP), which manages the HDFS server and client configurations with a common set of .xml files, Cloudera Manager (CM) manages the HDFS server and client configurations separately.

For more information, see the Cloudera Manager server and client configuration section in the CDP Private Cloud Base documentation.

After IBM Storage Scale is integrated, Cloudera Manager no longer manages the HDFS server-side configurations. Cloudera Manager manages only the HDFS client-side configurations. This aligns with the separation of compute and storage architecture between Cloudera CDP Private Cloud Base and IBM Storage Scale.

The following table lists an example of how the hdfs-site.xml parameters specific to the server and the client roles are managed:

Table 1. Example showing hdfs-site.xml parameters management
Configuration files	Configuration management	Comments
HDFS server configuration	The IBM Storage Scale CCR repository syncs the .xml files into the /var/mmfs/hadoop/etc/hadoop directory.
HDFS server configuration	The values in Cloudera Manager GUI > IBM Storage Scale service > Transparency NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml are synced to a private per-process directory, under /var/run/cloudera-scm-agent/process/process-name. For more information, see Cloudera Manager server and client configuration.	These configurations are shown in the Cloudera Manager GUI but are not used by IBM Storage Scale HDFS Transparency.
HDFS client configurations	The values in Cloudera Manager GUI > IBM Storage Scale service > Configuration > HDFS client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml are synced to the /etc/hadoop/conf directory on the nodes that have the IBM Storage Scale Gateway roles so that they can be consumed by other Cloudera services and HDFS clients.	This is the HDFS client shipped with CDP Private Cloud Base that is used by other CDP services.
HDFS client configurations	The HDFS .xml files are located in the /var/mmfs/hadoop/etc/hadoop directory.	This is the HDFS client shipped with HDFS Transparency. This HDFS client is not commonly used in a CDP integrated environment.

Important: The procedure to update the configuration varies depending on which HDFS component is being changed.

HDFS server components:
- HDFS Transparency NameNodes
- HDFS Transparency DataNodes
HDFS client components:
- The hdfs and webhdfs commands and the Java APIs from the HDFS client shipped with CDP Private Cloud Base.
- The hdfs and webhdfs commands and the Java APIs from IBM Storage Scale under the /usr/lpp/mmfs/hadoop/bin/ directory.

Follow the specific process to update the configuration based on the server or client configurations from CDP Private Cloud Base or IBM Storage Scale HDFS Transparency:

Updating only the HDFS server configurations
Updating only the HDFS client (CDP Private Cloud Base) configurations
Updating Ranger configurations
Updating Kerberos configurations

Updating only the HDFS server configurations

To update the server-side configuration (for example, the dfs.namenode.handler.count value), perform the following steps:

1. Stop the HDFS Transparency services from the Cloudera Manager GUI by clicking Cloudera Manager > IBM Storage Scale service > Stop.
2. Log in to one of the CES HDFS NameNodes, update the server-side configuration by running the mmhdfs config set command, and then upload the changed configuration to the IBM Storage Scale CCR repository by running the mmhdfs config upload command.
```
# mmhdfs config set hdfs-site.xml -k dfs.namenode.handler.count=800
# /usr/lpp/mmfs/hadoop/bin/mmhdfs config upload
```
For more information on the mmhdfs command, see IBM Storage® Scale: Command and Programming Reference Guide.
3. Start the HDFS Transparency services from the Cloudera Manager GUI by clicking Cloudera Manager > IBM Storage Scale service > Start.
Note: You must start the HDFS Transparency services from Cloudera Manager so that Cloudera Manager can display the states of the NameNodes and DataNodes properly.
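
After the upload, it can be useful to confirm that the new value is present in the synced server-side file. The following is a minimal sketch, not an IBM-provided tool: get_hadoop_property is a hypothetical helper name, and the sketch assumes that hdfs-site.xml is among the .xml files in the /var/mmfs/hadoop/etc/hadoop directory and uses the usual Hadoop property layout. It does not handle properties that are commented out in the XML.

```shell
#!/bin/sh
# Hypothetical helper (not part of IBM Storage Scale or Cloudera Manager):
# print the value of one property from a Hadoop-style XML configuration file.
# Usage: get_hadoop_property FILE NAME
# Exit status is 0 if the property was found, 1 otherwise.
get_hadoop_property() {
    # Join all lines so that a <property> block spanning several lines can be
    # handled as one string, then split on </property> inside awk.
    { tr '\n' ' ' < "$1"; echo; } | awk -v want="$2" '
    {
        n = split($0, props, "</property>")
        for (i = 1; i <= n; i++) {
            p = props[i]
            if (!match(p, /<name>[^<]*<\/name>/)) continue
            # Strip the 6-character <name> and 7-character </name> tags.
            key = substr(p, RSTART + 6, RLENGTH - 13)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (key != want) continue
            if (match(p, /<value>[^<]*<\/value>/)) {
                # Strip <value> and </value>; keep the last occurrence.
                val = substr(p, RSTART + 7, RLENGTH - 15)
                found = 1
            }
        }
    }
    END { if (found) print val; exit (found ? 0 : 1) }'
}

# Example call on a NameNode (path is an assumption, see above):
# get_hadoop_property /var/mmfs/hadoop/etc/hadoop/hdfs-site.xml dfs.namenode.handler.count
```

If the helper prints the value that was set with mmhdfs config set, the local copy reflects the change. It does not by itself prove that every node has received the update from the CCR.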

Updating only the HDFS client (CDP Private Cloud Base) configurations

To change the client-only configuration (for example, adding or updating the dfs.client.* values in hdfs-site.xml), perform the following steps:

1. On the Cloudera Manager GUI, go to Cloudera Manager > IBM Storage Scale service > Configuration > HDFS client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml, update the configuration, and click Save.
2. On the Cloudera Manager GUI, click Deploy the client configuration to propagate the updated client configuration from the Cloudera Manager database to /etc/hadoop/conf.
Note: You do not need to restart the HDFS Transparency service for the changes to take effect.
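
To see which client-side overrides actually reached a gateway node after the deployment, the deployed file can be inspected directly. The following sketch is illustrative only: list_hadoop_properties is a hypothetical helper name, and it assumes that the deployed file is /etc/hadoop/conf/hdfs-site.xml in the usual Hadoop property layout.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CDP or IBM Storage Scale):
# print name=value for every property whose name starts with PREFIX.
# Usage: list_hadoop_properties FILE PREFIX
list_hadoop_properties() {
    # Join lines so multi-line <property> blocks are handled as one string.
    { tr '\n' ' ' < "$1"; echo; } | awk -v prefix="$2" '
    {
        n = split($0, props, "</property>")
        for (i = 1; i <= n; i++) {
            p = props[i]
            if (!match(p, /<name>[^<]*<\/name>/)) continue
            key = substr(p, RSTART + 6, RLENGTH - 13)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            # index() returns 1 only when the name begins with the prefix.
            if (index(key, prefix) != 1) continue
            val = ""
            if (match(p, /<value>[^<]*<\/value>/))
                val = substr(p, RSTART + 7, RLENGTH - 15)
            print key "=" val
        }
    }'
}

# Example call on a node with the IBM Storage Scale Gateway role
# (file name is an assumption, see above):
# list_hadoop_properties /etc/hadoop/conf/hdfs-site.xml dfs.client.
```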

Updating Ranger configurations

Ranger is closely integrated with the HDFS server and client. The Ranger plug-in runs within the NameNode process space. When the Ranger plug-in is enabled for HDFS, Ranger-specific .xml files (ranger-hdfs-security.xml, ranger-hdfs-policymgr-ssl.xml, and ranger-hdfs-audit.xml) are generated within a private directory specific to the HDFS Transparency NameNode process, under the /var/run/cloudera-scm-agent/process/process-name directory. For more information, see Cloudera Manager server and client configuration.

When you restart the HDFS Transparency NameNodes from the Cloudera Manager GUI, the Ranger configuration files are synced to the /var/mmfs/hadoop/etc/hadoop HDFS Transparency configuration directory. However, these updates are not uploaded to the IBM Storage Scale CCR repository. Therefore, any Ranger-specific configuration changes require a workaround to get into the CCR.

To start the HDFS Transparency NameNodes correctly, update the IBM Storage Scale CCR by following the procedure in NameNodes do not start after updating the Ranger configuration.
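
Before applying the workaround, it can help to check whether the three Ranger files named above are present in the local HDFS Transparency configuration directory. The following is a minimal sketch under that assumption: check_ranger_files is a hypothetical helper name, and the check only looks at the local directory. It says nothing about what is stored in the CCR.

```shell
#!/bin/sh
# Hypothetical helper (not part of IBM Storage Scale): report which of the
# Ranger plug-in files named in this topic exist in a configuration directory.
# Usage: check_ranger_files [DIR]
# Returns 0 if all three files exist, 1 if any is missing.
check_ranger_files() {
    crf_dir="${1:-/var/mmfs/hadoop/etc/hadoop}"
    crf_missing=0
    for crf_file in ranger-hdfs-security.xml \
                    ranger-hdfs-policymgr-ssl.xml \
                    ranger-hdfs-audit.xml; do
        if [ -f "$crf_dir/$crf_file" ]; then
            echo "present: $crf_dir/$crf_file"
        else
            echo "missing: $crf_dir/$crf_file"
            crf_missing=1
        fi
    done
    return "$crf_missing"
}

# Example call on a NameNode after a restart from Cloudera Manager:
# check_ranger_files
```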

Updating Kerberos configurations

Cloudera Manager does not manage Kerberos for the IBM Storage Scale service because the CDP Private Cloud Base cluster and the CES HDFS cluster are different clusters and are loosely integrated. Therefore, Kerberos must first be enabled manually on the CES HDFS cluster.

In the Cloudera Manager GUI, the Enable Kerberos action under Cluster name > Action has no effect on the HDFS server-side configuration. It enables Kerberos only on the HDFS client side. The HDFS client-side configuration files under /etc/hadoop/conf are updated to reflect the changes from the Cloudera Manager Kerberos enablement.

To make Kerberos-specific changes to HDFS, see Updating only the HDFS server configurations.

Note:

Cloudera Manager requires the following Kerberos-specific information from HDFS Transparency during the initial deployment to create the configuration files and directories properly:
- NameNode keytab location (parameter: spectrumscale_keytab, service-wide)
- HDFS Principal Name (parameter: scale_hdfs_principal_name, service-wide)
If the default NameNode principal name (nn) or NameNode keytab path (/etc/security/keytab/nn.service.keytab) on the HDFS server side is changed, the corresponding parameters must also be changed in Cloudera Manager.
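
One way to see which keytab path and principal are actually in use on a NameNode, so that they can be compared with the values entered in Cloudera Manager, is to list the keytab contents. The following is a sketch only: check_keytab is a hypothetical helper name, the default path is the one quoted above, and the listing step assumes that the MIT Kerberos klist utility is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of IBM Storage Scale or Cloudera Manager):
# verify that a keytab is readable and, if klist is available, list its entries.
# Usage: check_keytab [PATH]
check_keytab() {
    ck_keytab="${1:-/etc/security/keytab/nn.service.keytab}"
    if [ ! -r "$ck_keytab" ]; then
        echo "keytab not readable: $ck_keytab" >&2
        return 1
    fi
    if command -v klist >/dev/null 2>&1; then
        # MIT Kerberos klist: -k lists keytab entries, -t adds timestamps.
        klist -k -t "$ck_keytab"
    else
        echo "klist not found; only checked that $ck_keytab is readable"
    fi
}

# Example call on a NameNode:
# check_keytab
```

The principals in the listing can then be compared with the HDFS Principal Name configured in Cloudera Manager.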