How to change the configuration for tuning

This topic lists the steps to change the configuration for tuning.

Refer to the following table to find the correct configuration directory for your distribution and follow the corresponding steps to update the configuration for tuning. Command-line sketches for each case follow the table.
Table 1. Configuration directories and corresponding tuning actions

Configuration: HortonWorks HDP
Location: /etc/hadoop/conf
How to change the configuration:
  • Change the configuration from Ambari for HDFS, YARN, MapReduce2, Hive, and so on.
  • Restart the services to sync the configuration into /etc/hadoop/conf on each Hadoop node (see the Ambari REST sketch after this table).

Configuration: Community Apache Hadoop
Location: $HADOOP_HOME/etc/hadoop
How to change the configuration:
  • Modify the configuration XML files directly.
  • Use scp to copy the updated files to all other nodes (see the scp sketch after this table).

Configuration: Standalone Spark distribution
Location: Refer to the guide for your Spark distribution.
How to change the configuration:
  • Usually, if Spark accesses data through the HDFS scheme, the locations of hdfs-site.xml and core-site.xml are defined by HADOOP_CONF_DIR in $SPARK_HOME/conf/spark-env.sh. If so, modify hdfs-site.xml and core-site.xml accordingly for tuning and sync the changes to all other Spark nodes (see the spark-env.sh sketch after this table).

Configuration: HDFS Transparency
Location: /usr/lpp/mmfs/hadoop/etc/hadoop (HDFS Transparency 2.7.x) or /var/mmfs/hadoop/etc/hadoop (HDFS Transparency 3.0.x)
How to change the configuration:
  • hdfs-site.xml and core-site.xml must be the same as the configuration used by Hadoop or Spark.
  • If you use HortonWorks HDP, modify gpfs-site.xml from the Ambari/IBM Storage® Scale service and restart the HDFS service to sync the changes to all HDFS Transparency nodes.
  • If you use community Apache Hadoop, manually update gpfs-site.xml on one HDFS Transparency node, and then run mmhadoopctl connector syncconf /usr/lpp/mmfs/hadoop/etc/hadoop (HDFS Transparency 2.7.x) or mmhadoopctl connector syncconf /var/mmfs/hadoop/etc/hadoop (HDFS Transparency 3.0.x) to sync the changes to all HDFS Transparency nodes (see the syncconf sketch after this table).
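For HortonWorks HDP, the restart is normally done from the Ambari web UI, but it can also be scripted through the Ambari REST API. The following is a minimal sketch, assuming an Ambari server at ambari-host:8080, a cluster named mycluster, and default admin credentials; all three are placeholders, so substitute your own values:

    # Stop HDFS: in the Ambari REST API, state INSTALLED means "stopped".
    curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
         -d '{"RequestInfo":{"context":"Stop HDFS"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
         http://ambari-host:8080/api/v1/clusters/mycluster/services/HDFS

    # Start HDFS again; Ambari pushes the updated configuration to
    # /etc/hadoop/conf on each node as part of the start.
    curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
         -d '{"RequestInfo":{"context":"Start HDFS"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
         http://ambari-host:8080/api/v1/clusters/mycluster/services/HDFS

Repeat the stop/start pair for each service whose configuration you changed (YARN, MapReduce2, Hive, and so on).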
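For community Apache Hadoop, the edited XML files must be copied to every node by hand. A minimal scp sketch, assuming passwordless SSH, the same $HADOOP_HOME path on every node, and a hypothetical node list:

    # Hypothetical worker hostnames; replace with your own node list.
    NODES="node2 node3 node4"

    for node in $NODES; do
      # $HADOOP_HOME expands on the local side; this assumes the
      # configuration directory sits at the same path on every node.
      scp "$HADOOP_HOME"/etc/hadoop/{core-site.xml,hdfs-site.xml,yarn-site.xml,mapred-site.xml} \
          "${node}:$HADOOP_HOME/etc/hadoop/"
    done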
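For a standalone Spark distribution, the key setting is HADOOP_CONF_DIR, which tells Spark where to find hdfs-site.xml and core-site.xml. A sketch of the spark-env.sh entry and the sync step, assuming the conventional $SPARK_HOME/conf/spark-env.sh location and a hypothetical /opt/hadoop-conf directory:

    # In $SPARK_HOME/conf/spark-env.sh; /opt/hadoop-conf is an assumed path
    # that must contain hdfs-site.xml and core-site.xml.
    export HADOOP_CONF_DIR=/opt/hadoop-conf

    # After tuning the two files, copy them to the same path on every
    # other Spark node (hostnames are placeholders):
    for node in spark-node2 spark-node3; do
      scp /opt/hadoop-conf/{hdfs-site.xml,core-site.xml} "${node}:/opt/hadoop-conf/"
    done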
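For HDFS Transparency with community Apache Hadoop, the update-then-sync cycle from the table looks like this on an HDFS Transparency 2.7.x node; for 3.0.x, substitute /var/mmfs/hadoop/etc/hadoop in both commands:

    # Edit gpfs-site.xml on one HDFS Transparency node ...
    vi /usr/lpp/mmfs/hadoop/etc/hadoop/gpfs-site.xml

    # ... then push the configuration to all HDFS Transparency nodes.
    mmhadoopctl connector syncconf /usr/lpp/mmfs/hadoop/etc/hadoop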