Troubleshooting with the Execution Engine for Apache Hadoop diagnostics tool

Use the Execution Engine for Apache Hadoop diagnostics tool when you are troubleshooting problems with IBM Software Support.

This tool gathers all relevant diagnostic information about your environment.

Prerequisites

The Execution Engine for Apache Hadoop diagnostics tool is installed as part of the Hadoop Red Hat Package Manager (RPM), starting in Cloud Pak for Data 4.7.0. You must meet the following prerequisites:

  1. The tool requires the paramiko Python module to be installed on the edge node:

    Run pip list | grep paramiko to confirm whether the module is installed. If the module is not installed, run pip install paramiko. An optional Python check is shown in the example after this list.

  2. You must have your cluster administrator password readily available.
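If you prefer to check from Python directly, the following minimal snippet confirms that the paramiko module can be imported by the Python interpreter on the edge node. It is an optional alternative to the pip commands in step 1, and assumes that you run it with the same interpreter that the tool uses:

    # Confirm that paramiko is importable by the edge node's Python interpreter.
    try:
        import paramiko
        print("paramiko", paramiko.__version__, "is installed")
    except ImportError:
        print("paramiko is not installed; run: pip install paramiko")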

Information the tool collects from the system

  • Log files from the following directories on the edge node:
    • /var/log/dsxhi
    • /etc/hadoop/conf
    • /opt/ibm/dsxhi/conf
    • /opt/ibm/dsxhi/security
  • Cluster services configuration files from the cluster administrator console:
    • hive-site.xml
    • yarn-site.xml
    • hdfs-site.xml
  • Hadoop cluster information:
    • Basic cluster information (for example, name of the cluster and number of nodes)
    • CDH Version
    • RPM Version
    • JEG Version
    • LIVY Version
    • Python and Java versions from all the nodes of the cluster
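For reference, the following is a minimal sketch of how version information of this kind can be collected from cluster nodes over SSH with the paramiko module listed in the prerequisites. It is illustrative only and is not the implementation of hee_diag.py; the host names, user name, and commands are assumptions.

    import getpass
    import paramiko

    # Hypothetical list of cluster nodes; replace with your own host names.
    nodes = ["node1.example.com", "node2.example.com"]
    user = "clusteradmin"  # assumed administrator account
    password = getpass.getpass("Cluster administrator password: ")

    for host in nodes:
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=user, password=password)
        # Gather the Python and Java versions from each node.
        for cmd in ("python --version", "java -version"):
            stdin, stdout, stderr = client.exec_command(cmd)
            # Some commands, such as java -version, write to stderr.
            output = (stdout.read() + stderr.read()).decode().strip()
            print(f"{host}: {cmd}: {output}")
        client.close()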

Running the tool

Important: Do not run this tool unless you are directed to do so by IBM Software Support.

The diagnostics tool is found in the /opt/ibm/dsxhi/bin/util directory as hee_diag.py.

  1. Run the tool:

    cd /opt/ibm/dsxhi/bin/util
    ./hee_diag.py
    
  2. Enter your cluster administrator password when prompted.

  3. After the diagnostics are complete, the resulting ZIP file is generated as /opt/ibm/dsxhi/hee-diag-suite/hee-diag-<YYYY-MM-DD>.zip. You can verify the file as shown in the example after these steps.

  4. Send this ZIP file to IBM Software Support.
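Before sending the archive, you can optionally confirm that it was generated and preview its contents. The following minimal Python snippet assumes the output directory from step 3 and lists any generated archives with the number of entries in each:

    import glob
    import zipfile

    # Look for generated diagnostics archives in the output directory from step 3.
    for path in sorted(glob.glob("/opt/ibm/dsxhi/hee-diag-suite/hee-diag-*.zip")):
        with zipfile.ZipFile(path) as archive:
            print(path, "contains", len(archive.namelist()), "entries")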

Parent topic: Troubleshooting Hadoop environments