Configuring the agent on Linux and AIX systems

To configure the agent on Linux and AIX® systems, run the configuration script and respond to prompts.

Procedure

  1. On the command line, run the following command:

     <install_dir>/bin/hadoop-agent.sh config
    

    where install_dir is the installation directory of the Hadoop agent. The default installation directory is /opt/ibm/apm/agent.

  2. When the command line displays the following message, type 1 to continue with the configuration steps and press Enter.

     Edit Monitoring Agent for Hadoop setting? [1= yes, 2= No]
    
  3. When the command line displays the following message, type 1 to specify values for monitoring the Hadoop cluster with the Kerberos SPNEGO-based authentication enabled, and press Enter.

    Otherwise, type 2 and press Enter; you can then leave the Realm name, KDC Hostname, SPNEGO principal name, and SPNEGO keytab file fields blank:

       Is Kerberos SPNEGO-based authentication for HTTP based Hadoop services in Hadoop cluster enabled: [ 1=Yes, 2=No ] (default is: 2)
    

    a. For the Realm name parameter, enter the name of the Kerberos realm that is used to create service principals. Usually, a realm name is the same as your domain name. For instance, if your computer is in the tivoli.ibm.com domain, the Kerberos realm name is TIVOLI.IBM.COM. This name is case sensitive.

    b. In the KDC Hostname field, enter the fully qualified domain name (FQDN) of the Key Distribution Center (KDC) host for the specified realm. You can also specify the IP address of the KDC host instead of the FQDN. For an Active Directory KDC, the domain controller is the KDC host.

    c. For the SPNEGO principal name parameter, enter the name of the Kerberos principal that is used to access SPNEGO-authenticated REST endpoints of HTTP-based services. The name is case sensitive, and the name format is HTTP/fully_qualified_host_name@kerberos_realm.

    d. For the SPNEGO keytab file parameter, enter the full path and name of the keytab file for the SPNEGO service. The keytab file contains the names of Kerberos service principals and their keys, and it provides direct access to Hadoop services without requiring a password for each service. The file is typically located in the /etc/security/keytabs/ directory. Ensure that the SPNEGO principal name and the keytab file belong to the same host. For instance, if the principal name is HTTP/abc.ibm.com@IBM.COM, the keytab file that is used must belong to the abc.ibm.com host. If the agent is installed on a remote computer, copy the keytab file of the principal to any path on the remote computer, and then specify that path for the SPNEGO keytab file parameter.
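Before you enter the SPNEGO principal name, you can check that it follows the expected HTTP/fully_qualified_host_name@kerberos_realm format and extract the host part to compare against the keytab's host. The following Python sketch is illustrative only; the helper name and regular expression are not part of the agent:

```python
import re

# Expected format: HTTP/fully_qualified_host_name@KERBEROS_REALM
# (illustrative helper; not part of the agent's configuration script)
PRINCIPAL_RE = re.compile(r"^HTTP/(?P<host>[^@\s]+)@(?P<realm>[^@\s]+)$")

def parse_spnego_principal(principal):
    """Return (host, realm) if the principal is well formed, else raise ValueError."""
    match = PRINCIPAL_RE.match(principal)
    if match is None:
        raise ValueError(f"Not a valid SPNEGO principal: {principal!r}")
    return match.group("host"), match.group("realm")

# The keytab file must belong to the same host as the principal.
host, realm = parse_spnego_principal("HTTP/abc.ibm.com@IBM.COM")
print(host, realm)  # abc.ibm.com IBM.COM
```

Comparing the returned host against the host that owns the keytab file catches the mismatch described above before the agent configuration fails at run time.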

  4. When the command line displays the following message, type 1 to specify values for monitoring the Hadoop daemons (HDFS, YARN, and MapReduce/MapReduce2) with SSL enabled, and press Enter. Otherwise, type 2 and press Enter; you can then leave the TrustStore file path and TrustStore Password fields blank:

    Are Hadoop daemons-HDFS, YARN and MapReduce/MapReduce2 SSL enabled [ 1=Yes, 2=No (default is: 2)
    

    a. In the TrustStore file path field, specify the path of the TrustStore file on your local computer. You can copy this file from the Hadoop cluster to your local computer and then use it for configuration.

    b. In the TrustStore Password field, specify the password that you created when you configured the TrustStore file.

  5. When you are prompted to enter the details of the Hadoop cluster, specify an appropriate value for each of the following parameters, and press Enter.

    a. In the Unique Hadoop Cluster Name field, specify a unique name for the Hadoop cluster that indicates the Hadoop version and flavor. The maximum length of this field is 12 characters.

    b. For the NameNode Hostname parameter, specify the host name of the node where the daemon process for NameNode runs, and press Enter.

    Attention: If you press Enter without specifying a host name, you are prompted to enter the host name.

    c. For the NameNode Port parameter, specify the port number that is associated with the daemon process for NameNode, and press Enter. The default port number is 50070.

    d. For the ResourceManager Hostname parameter, specify the host name of the node where the daemon process for ResourceManager runs, and press Enter.

    Attention: If you press Enter without specifying a host name, you are prompted to enter the host name.

    e. For the ResourceManager Port parameter, enter the port number that is associated with the daemon process for ResourceManager. The default port number is 8088.
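Before you supply the host names and ports from step 5, it can help to confirm where those daemons answer over HTTP. The sketch below only builds URLs for the standard Hadoop 2.x web endpoints; the host names are placeholders for your own nodes:

```python
# Placeholder host names; substitute the nodes in your own cluster.
NAMENODE_HOST, NAMENODE_PORT = "namenode.example.com", 50070
RM_HOST, RM_PORT = "rm.example.com", 8088

def namenode_jmx_url(host, port):
    # The NameNode publishes its metrics as JSON on the /jmx endpoint
    # of the same port that serves its web UI (50070 by default).
    return f"http://{host}:{port}/jmx"

def resourcemanager_url(host, port):
    # The ResourceManager web UI answers on port 8088 by default.
    return f"http://{host}:{port}/cluster"

print(namenode_jmx_url(NAMENODE_HOST, NAMENODE_PORT))
print(resourcemanager_url(RM_HOST, RM_PORT))
```

Opening the /jmx URL in a browser or with curl returns the NameNode metrics as JSON when the host name and port are correct, which confirms the values before you enter them at the prompts.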

  6. Optional: When you are prompted to add the details of the following parameters of the Hadoop cluster, accept the default value or specify an appropriate value for each of the following parameters, and press Enter:

    a. For the JobHistoryServer Hostname parameter, enter the host name of the node where the daemon process for JobHistoryServer runs.

    b. For the JobHistoryServer Port parameter, enter the port number that is associated with the daemon process for JobHistoryServer. The default port number is 19888.

    c. For the Additional NameNode Hostname parameter, enter the host name of the node where the daemon process for a Secondary or a Standby NameNode runs.

    d. For the Additional NameNode Port parameter, enter the port number that is associated with the daemon process for a Secondary or a Standby NameNode. The default port number for a Secondary NameNode is 50090. For a Standby NameNode, the default port number is 50070.

  7. Optional: When the command line displays the following message, type 1 to add details of Standby ResourceManagers for a high-availability cluster, and press Enter.

    Standby ResourceManager(s) in Hadoop Cluster [ 1=Yes, 2=No ] (default is: 2):
    
  8. When the command line displays the following message, type 1 and press Enter to monitor Hadoop services in a Hadoop cluster that is managed by Ambari. Otherwise, retain the default value of 2 and press Enter. If you enable the monitoring of Hadoop services, specify a value for each of the following parameters of the Ambari server, and press Enter:

    Monitoring of Hadoop services for Ambari based Hadoop installations [ 1=Yes, 2=No ] (default is: 2):
    

    a. When the command line displays the following message, type 1 to specify values for monitoring the Ambari Hadoop cluster with Kerberos authentication enabled for the REST endpoints, and press Enter.

    Otherwise, type 2 and press Enter; you can then leave the Realm name, KDC Hostname, Ambari principal name, Ambari keytab file, Ambari Server Hostname, and Ambari Server Port fields blank.

    i. In the Realm name field, enter the name of the Kerberos realm that is used to create service principals.

    Usually, a realm name is the same as your domain name. For instance, if your computer is in the tivoli.ibm.com domain, the Kerberos realm name is TIVOLI.IBM.COM. This name is case sensitive.

    ii. In the KDC Hostname field, enter the fully qualified domain name (FQDN) of the Key Distribution Center (KDC) host for the specified realm. You can also specify the IP address of the KDC host instead of the FQDN. For an Active Directory KDC, the domain controller is the KDC host.

    iii. In the Ambari principal name field, enter the name of the Ambari principal that is used to access Kerberos authenticated REST endpoints of Ambari Server. The name is case sensitive, and the name format is ambari-server-username@kerberos_realm.

    iv. In the Ambari keytab file field, enter the full path and name of the keytab file for the Ambari service. The keytab file contains the names of Ambari service principals and their keys, and it provides direct access to the REST endpoints of the Ambari server without requiring a password for each service. The file is typically located in the /etc/security/keytabs/ directory. If the agent is installed on a remote computer, copy the keytab file of the principal to the designated path on the remote computer, and then specify that path in the Ambari keytab file field.

    v. In the Ambari server Hostname field, enter the host name where the Ambari server runs.

    vi. In the Ambari server Port field, enter the port number that is associated with the Ambari server. The default port number is 8080.

    b. For the Ambari server Hostname parameter, enter the host name where the Ambari server runs.

    c. For the Ambari server Port parameter, enter the port number that is associated with the Ambari server. The default port number is 8080.

    d. For the Username of Ambari user parameter, enter the name of the Ambari user.

    e. For the Password of Ambari user parameter, enter the password of the Ambari user.

    f. When the command line displays the following message, type 1 to specify values for monitoring the Ambari services with SSL enabled, and press Enter. Otherwise, type 2 and press Enter; you can then leave the TrustStore file path and TrustStore Password fields blank:

    Are Ambari Services SSL enabled [ 1=Yes, 2=No (default is: 2)
    

    i. In the TrustStore file path field, specify the path of the TrustStore file on your local computer. You can copy this file from the Hadoop cluster to your local computer and then use it for configuration.

    ii. In the TrustStore Password field, specify the password that you created when you configured the TrustStore file.

    Note: If you provided the same values for the TrustStore file path and TrustStore Password fields in step 4, you can leave these fields blank.
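The Ambari user name and password from this step are used for HTTP basic authentication against the Ambari server's REST API. The following Python sketch shows how such a request is formed; the host name, port, and credentials are placeholders, not values from your environment:

```python
import base64
from urllib.request import Request

# Placeholder values; substitute your own Ambari server details.
AMBARI_HOST, AMBARI_PORT = "ambari.example.com", 8080
USER, PASSWORD = "admin", "admin"

def ambari_clusters_request(host, port, user, password):
    # Ambari's REST API expects HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(f"http://{host}:{port}/api/v1/clusters")
    request.add_header("Authorization", f"Basic {token}")
    return request

request = ambari_clusters_request(AMBARI_HOST, AMBARI_PORT, USER, PASSWORD)
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing such a request to urllib.request.urlopen returns the cluster list as JSON when the server is reachable and the credentials are valid, which is a quick way to verify the values before entering them at the prompts.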

  9. When the command line displays the following message, specify 1 and press Enter to monitor Cloudera Manager services in the Hadoop cluster:

    Monitoring of Cloudera Manager services for Cloudera Hadoop installations [ 1=Yes, 2=No ] (default is: 2):

    Otherwise, retain the default value of 2 and press Enter. If you enable the monitoring of Cloudera Manager services, specify a value for each of the following parameters of the Cloudera Manager server, and press Enter:

    a. For the Cloudera Manager server Hostname parameter, enter the host name where the Cloudera Manager server runs.

    b. For the Cloudera Manager server Port parameter, enter the port number that is associated with the Cloudera Manager server. The default port number for an HTTP-based Cloudera Manager server is 7180.

    c. For the Username of Cloudera Manager server user parameter, enter the user name for the Cloudera Manager server.

    d. For the Password of Cloudera Manager server user parameter, enter the password for the Cloudera Manager server user.

    e. When the command line displays the following message, type 1 to specify values for monitoring the Cloudera Manager services with SSL enabled, and press Enter. Otherwise, type 2 and press Enter; you can then leave the TrustStore file path and TrustStore Password fields blank:

    Are Cloudera Manager Services SSL enabled [ 1=Yes, 2=No (default is: 2)

    i. In the TrustStore file path field, specify the path of the TrustStore file on your local computer. You can copy this file from the Hadoop cluster to your local computer and then use it for configuration.

    ii. In the TrustStore Password field, specify the password that you created when you configured the TrustStore file.

    Note: If you provided the same values for the TrustStore file path and TrustStore Password fields in step 4, you can leave these fields blank.
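As with Ambari, the Cloudera Manager user name and password authenticate requests to the Cloudera Manager REST API, which is served under /api on the server port. A minimal sketch, assuming the HTTP default port 7180 and a placeholder host name:

```python
# Placeholder connection details for a Cloudera Manager server.
CM_HOST, CM_PORT = "cm.example.com", 7180  # 7180 is the HTTP default port

def cm_api_url(host, port, path="version"):
    # The Cloudera Manager REST API is served under /api; the /api/version
    # endpoint reports the highest API version that the server supports.
    return f"http://{host}:{port}/api/{path}"

print(cm_api_url(CM_HOST, CM_PORT))
```

Requesting the /api/version URL with the configured user name and password is a simple reachability and credentials check before you enter the values at the prompts.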

  10. When the command line displays the following message, select the appropriate Java™ trace level and press Enter:

    This parameter allows you to specify the trace level used by the Java providers
    Java trace level [ 1=Off, 2=Error, 3=Warning, 4=Information, 5=Minimum Debug, 6=Medium Debug, 7=Maximum Debug, 8=All ] (default is: 2)

  11. Optional: When the command line displays the following message, specify the arguments for the Java virtual machine, and press Enter. The list of arguments must be compatible with the version of Java that is installed along with the agent.

    This parameter allows you to specify an optional list of arguments to the java virtual machine
    JVM arguments (default is:)

  12. Optional: When the command line displays the following message, enter 1 to add the following details of Standby ResourceManagers, and press Enter:

    Edit Hadoop High Availability(HA) Cluster with Standby ResourceManagers settings, [1=Add, 2=Edit, 3=Del, 4=Next, 5=Exit] (default is: 5): 1

    a. For the Standby ResourceManager Hostname parameter, enter the host name of the node where the daemon process for the Standby ResourceManager runs.

    b. For the Standby ResourceManager Port parameter, enter the port number that is associated with the daemon process for the Standby ResourceManager. The default port number is 8088.

    c. When you are prompted, enter 1 to add more Standby ResourceManagers, and repeat steps a and b, or enter 5 to go to the next step.

    • To edit the configuration settings of a specific Standby ResourceManager, type 4 and press Enter until the host name of the required Standby ResourceManager is displayed, and then type 2 and press Enter.

    • To remove a Standby ResourceManager, type 3 and press Enter after the host name of the Standby ResourceManager that you want to remove is displayed.
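If you are unsure which ResourceManager hosts are standby nodes, each ResourceManager reports its own high-availability state through the cluster info REST endpoint on its web port. The sketch below builds those URLs for placeholder host names; the live query is shown commented out:

```python
import json                          # used only in the commented live query
from urllib.request import urlopen   # used only in the commented live query

def cluster_info_url(host, port=8088):
    # In the JSON response, clusterInfo.haState is ACTIVE or STANDBY
    # when high availability is enabled.
    return f"http://{host}:{port}/ws/v1/cluster/info"

for rm_host in ("rm1.example.com", "rm2.example.com"):  # placeholder hosts
    print(cluster_info_url(rm_host))
    # Against a live cluster:
    # info = json.load(urlopen(cluster_info_url(rm_host), timeout=5))
    # print(rm_host, info["clusterInfo"]["haState"])
```

Querying each candidate host this way tells you which nodes to register as Standby ResourceManagers in this step.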

  13. When you are prompted, enter the class path for the JAR files that the Java API data provider requires, and press Enter.

    The specified configuration values are saved, and a confirmation message is displayed.

  14. Run the following command to start the agent:

      install_dir/bin/hadoop-agent.sh start

What to do next

  1. Enable the subnode events to view eventing thresholds of the Hadoop agent. For information about enabling subnode events, see Configuring the dashboard for viewing Hadoop events.
  2. Log in to the IBM Cloud Pak console to view the data that is collected by the agent in the dashboards. For more information about using the console, see Starting the Cloud App Management UI.