Integrating HDFS Transparency

Plan a cluster maintenance window and prepare for cluster downtime before you integrate HDFS Transparency with the native HDFS. Ensure that all the services are stopped before you start the integration. After each integration, you must run the ambari-server restart command on the Ambari server node.

To integrate the HDFS Transparency with the native HDFS:
  1. On the dashboard, click Services > Stop All to stop all services. Verify that all services are stopped. If any service is still running, stop it.

    Note: For an FPO cluster, do not run Stop All from the Ambari GUI. See the Limitations > General section for how to properly stop IBM Spectrum® Scale.

  2. Click Spectrum Scale > Actions > Integrate Transparency.
    Figure 1. IBM® SPECTRUM SCALE INTEGRATE TRANSPARENCY
  3. On the Ambari server node, run the ambari-server restart command to restart the Ambari server.
  4. Log back in to the Ambari GUI.
  5. Start all the services from the Ambari GUI. The Hadoop cluster starts using IBM Spectrum Scale and HDFS Transparency.
On the HDFS dashboard, check the status of the NameNode and DataNodes. The dashboard now displays the status of the HDFS Transparency NameNode and DataNodes.
Note: JournalNodes are not used when the IBM Spectrum Scale service is integrated.
NameNode and DataNode status
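Step 3 of the procedure can be scripted on the Ambari server node. The following is a minimal sketch, not an official procedure: it runs ambari-server restart and then ambari-server status. The AMBARI_CMD variable and the restart_ambari function are illustrative names introduced here so that the command can be overridden for a dry run.

```shell
#!/bin/sh
# Sketch: restart the Ambari server and then print its status.
# AMBARI_CMD is a hypothetical override variable, not part of Ambari.
AMBARI_CMD="${AMBARI_CMD:-ambari-server}"

restart_ambari() {
    # Stop here if the restart itself fails.
    "$AMBARI_CMD" restart || return 1
    # Print the server status after the restart.
    "$AMBARI_CMD" status
}

# Run only where the command is installed (the Ambari server node).
if command -v "$AMBARI_CMD" >/dev/null 2>&1; then
    restart_ambari
fi
```

After the restart completes, log back in to the Ambari GUI as described in step 4.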
Command verification
To verify that HDFS Transparency is available, use the following commands to check the GPFS state and the connector state:
# Ensure that the GPFS state of all nodes is active.

/usr/lpp/mmfs/bin/mmgetstate -a

# Ensure that the NameNode and all DataNodes are running.

/usr/lpp/mmfs/hadoop/sbin/mmhadoopctl connector getstate

For more information on how to verify the HDFS Transparency integration state, see Verifying Transparency integration state.
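The two verification commands can be combined into one check. The following is a minimal sketch under a stated assumption: each node row in the mmgetstate -a output starts with a node number and ends with the GPFS state. That layout is not confirmed by this topic, so adjust the parsing if your output differs. The all_nodes_active function is an illustrative helper, not an IBM Spectrum Scale command.

```shell
#!/bin/sh
# Sketch: confirm that every node reports an active GPFS state, then
# query the HDFS Transparency connector state.
MMGETSTATE=/usr/lpp/mmfs/bin/mmgetstate
MMHADOOPCTL=/usr/lpp/mmfs/hadoop/sbin/mmhadoopctl

# Hypothetical helper. Reads mmgetstate-style text on stdin.
# Assumption: a node row starts with a node number and ends with the state.
# Succeeds only if at least one node row exists and all rows end in "active".
all_nodes_active() {
    awk '
        BEGIN { rows = 0; bad = 0 }
        $1 ~ /^[0-9]+$/ { rows++; if ($NF != "active") bad++ }
        END { exit (rows > 0 && bad == 0) ? 0 : 1 }
    '
}

# Run only on a node where IBM Spectrum Scale is installed.
if [ -x "$MMGETSTATE" ]; then
    if "$MMGETSTATE" -a | all_nodes_active; then
        echo "GPFS is active on all nodes"
        "$MMHADOOPCTL" connector getstate
    else
        echo "At least one node is not in the active state" >&2
        exit 1
    fi
fi
```

The sketch stops before querying the connector when any node is not active, because the NameNode and DataNodes depend on GPFS being active.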

Cluster environment

After the IBM Spectrum Scale service is deployed, IBM Spectrum Scale HDFS Transparency is used instead of the native HDFS. HDFS Transparency inherits the native HDFS configuration and adds the changes that it requires to function correctly.

After IBM Spectrum Scale is deployed, a new HDFS configuration set, V2, is created. It is visible under HDFS Service Dashboard > CONFIG HISTORY.