IBM Support

Fix Cacti Log error: "Error opening cluster: Bad configuration environment"

Troubleshooting


Problem

After adding an LSF cluster to the RTM server, you may notice that RTM has trouble monitoring the cluster. The cluster status is "Down" and the Cacti log contains errors complaining about a bad configuration. This article describes how to resolve this problem.

Symptom

There are two symptoms for this specific issue:

1. The LSF cluster is in "Down" status in the RTM web console.

2. RTM's Cacti log shows the following error:


FATAL: Error opening cluster: Bad configuration environment
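To confirm the second symptom from the command line, a minimal sketch such as the following counts the matching lines in a Cacti log. The function name is hypothetical, and this article does not give the log location, so pass the path to the cacti.log file of your own RTM installation.

```shell
# Count occurrences of the fatal error in a Cacti log file.
# Usage: count_bad_config_errors /path/to/cacti.log
# The log path is an assumption; it depends on where Cacti is installed.
count_bad_config_errors() {
    log_file="$1"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Error opening cluster: Bad configuration environment' "$log_file"
}
```

A result greater than 0 means that the error has been logged at least once.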

Cause

This issue is caused by a problem with the LSF cluster configuration. Typically, cluster parameters such as the LSF management hostname, the LIM port, or the LSF primary administrators have not been set correctly.

Resolving The Problem

First, double check that all the cluster configuration parameters are set correctly.

1. Log in to the RTM web console as an administrator.

2. Open the following page:

CONSOLE >> Grid Management >> Clusters >> your_cluster_name

where your_cluster_name is the name of the cluster with status "Down".

3. Click the Configuration tab and check that the following parameters are set correctly:

a) LSF Management LIM hostname

b) LSF Management LIM Port

c) EGO Enabled

4. Click Save to save the changes.

If any of the parameters were set incorrectly before, the cluster status should change to OK after 5 minutes.
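To find the values to compare against the Configuration tab, one option is to read them from lsf.conf on the LSF Management server. The sketch below assumes that LSF_MASTER_LIST, LSF_LIM_PORT, and LSF_ENABLE_EGO are the lsf.conf parameters that correspond to the three console fields. This mapping is an assumption rather than something this article states, and the function name is hypothetical.

```shell
# Print the lsf.conf settings that are assumed to correspond to the
# RTM Configuration tab fields (management hosts, LIM port, EGO enabled).
# Usage: show_lsf_settings /path/to/lsf.conf
show_lsf_settings() {
    conf_file="$1"
    # Match uncommented assignments only; commented-out lines are skipped.
    grep -E '^[[:space:]]*(LSF_MASTER_LIST|LSF_LIM_PORT|LSF_ENABLE_EGO)=' "$conf_file"
}
```

On the LSF Management server this might be called as show_lsf_settings "$LSF_ENVDIR/lsf.conf", assuming the LSF environment has been sourced so that LSF_ENVDIR is set.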

If the above steps did not help, try replacing RTM's LSF configuration files with the actual LSF configuration files from your LSF Management server. Do the following:

1. Back up the LSF configuration files on your RTM server:

# cp -p /opt/rtm/etc/<cluster_id>/*.conf /var/tmp

where cluster_id is the ID of your LSF cluster with status "Down". You can get the cluster ID from the Clusters page of the RTM web console, under CONSOLE >> Grid Management.

2. Copy the lsf.conf and ego.conf files from your LSF Management server to the RTM server. For example, put them in the /tmp directory temporarily.

3. Replace the lsf.conf and ego.conf files in the /opt/rtm/etc/<cluster_id> directory with the files you copied from the LSF Management server.
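Steps 1 and 3 can be sketched as a single shell function. This is only an illustration under stated assumptions: the function name and its three arguments are hypothetical, and step 2 (transferring the files from the LSF Management server) is left out because the transfer method is not specified here.

```shell
# Back up and replace the RTM copies of lsf.conf and ego.conf.
# Usage: replace_cluster_conf <rtm_conf_dir> <new_conf_dir> <backup_dir>
#   rtm_conf_dir: /opt/rtm/etc/<cluster_id> on the RTM server
#   new_conf_dir: directory holding the files copied from the LSF Management server
#   backup_dir:   where to keep the old files (this article uses /var/tmp)
replace_cluster_conf() {
    rtm_conf_dir="$1"
    new_conf_dir="$2"
    backup_dir="$3"

    # Refuse to proceed unless both replacement files are present.
    for f in lsf.conf ego.conf; do
        [ -f "$new_conf_dir/$f" ] || { echo "missing $new_conf_dir/$f" >&2; return 1; }
    done

    # Step 1: back up the existing configuration files, preserving attributes.
    cp -p "$rtm_conf_dir"/*.conf "$backup_dir"/ || return 1

    # Step 3: overwrite the RTM copies with the files from the LSF Management server.
    cp "$new_conf_dir/lsf.conf" "$new_conf_dir/ego.conf" "$rtm_conf_dir"/
}
```

For example, with the files from step 2 placed in /tmp, the call would be replace_cluster_conf /opt/rtm/etc/<cluster_id> /tmp /var/tmp, with <cluster_id> replaced by the ID of the affected cluster.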

With the actual lsf.conf and ego.conf files from the LSF Management server in place, you should no longer see the following error in the Cacti log:


FATAL: Error opening cluster: Bad configuration environment


Document Information

More support for:
Platform RTM

Software version:
Version Independent

Document number:
678925

Modified date:
16 June 2021

UID

isg3T1020823
