Troubleshooting the sensor

This topic describes common problems that occur with the IBM® Tivoli® Monitoring Scope sensor and presents solutions for those problems.

Computer systems that are outside of the defined scope are created

Problem
During a discovery, some computer systems that are outside of the defined scope are created.
Solution
If the discoverITMEndpoints attribute in the discovery profile for this sensor is set to true, the sensor creates a computer system for each Tivoli Monitoring endpoint that is known to the Tivoli Enterprise Portal Server. A computer system is created even if the endpoint is outside of the initial discovery scope that included the portal server.

Updates made to the generated Tivoli Monitoring scope using the Discovery Management Console are overwritten

Problem
Updates that have been made to the generated Tivoli Monitoring scope in the previous discovery using the Discovery Management Console are overwritten.
Solution
During a Level 1 discovery, a new scope is created based on the name of the Tivoli Enterprise Portal Server. This scope is overwritten the next time that the portal server is discovered during a Level 1 or Level 2 discovery.

To change the generated Tivoli Monitoring scope, create a scope with a different name that contains the elements of the generated scope.

In a large Tivoli Monitoring environment, the sensor fails with a timeout error

Problem
In a large Tivoli Monitoring environment, the Tivoli Monitoring Scope sensor fails with a timeout error.
Solution
In the etc/collation.properties file, edit the following property, where value is the number of milliseconds allowed for the sensor to run (for example, 60000 is 1 minute):
com.collation.discover.agent.ITMScopeSensor.timeout=value
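
As a worked example of the unit conversion, the following Python sketch builds the property line from a timeout expressed in minutes. The helper name timeout_property and the 30-minute figure are illustrative assumptions, not recommended values.

```python
PROPERTY = "com.collation.discover.agent.ITMScopeSensor.timeout"


def timeout_property(minutes):
    """Return the collation.properties line for a timeout given in minutes."""
    if minutes <= 0:
        raise ValueError("timeout must be positive")
    # The property expects milliseconds: minutes * 60 seconds * 1000 ms.
    milliseconds = int(minutes * 60 * 1000)
    return "%s=%d" % (PROPERTY, milliseconds)


# Hypothetical value: allow the sensor 30 minutes to run.
print(timeout_property(30))
```

A value of 1 minute yields 60000, which matches the example given for the property.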

The sensor fails with a timeout error when slow network links or many router hops exist between the target systems and the Tivoli Enterprise Portal Server or TADDM

Problem
The Tivoli Monitoring Scope sensor fails with a timeout error. Slow network links or many router hops exist between the target systems and the Tivoli Enterprise Portal Server or TADDM. The environment includes Windows, Linux®, and UNIX systems.
Solution
This problem is caused by TCP buffer settings. When the buffer sizes are too small, the TADDM sensors and the Tivoli Enterprise Portal Server perform poorly.

To solve this problem, complete the following steps, depending on the operating system:

On AIX® systems:
  1. Run the following commands:
    /usr/sbin/no -o tcp_sendspace=32768
    /usr/sbin/no -o tcp_recvspace=32768  
  2. Restart the TADDM server.
On Linux systems:
  1. Edit the /etc/sysctl.conf file with the following settings:
    # increase TCP maximum buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216

    # increase Linux autotuning TCP buffer limits
    # min, default, and maximum number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
  2. Run sysctl -p to read in and set the new values.
  3. Restart the TADDM server.
On Solaris systems:
  1. Run the following commands:
    /usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 32768
    /usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 32768
  2. Restart the TADDM server.
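
For the Linux steps, it can help to check which of the four settings are still missing or too small before editing the file. The following Python sketch does this for text in the /etc/sysctl.conf format. The function names are hypothetical, and the comparison of only the last (maximum) value on each line is a simplifying assumption.

```python
# Values taken from the Linux settings listed in this topic.
RECOMMENDED = {
    "net.core.rmem_max": [16777216],
    "net.core.wmem_max": [16777216],
    "net.ipv4.tcp_rmem": [4096, 87380, 16777216],
    "net.ipv4.tcp_wmem": [4096, 65536, 16777216],
}


def parse_sysctl(text):
    """Collect the recommended keys from 'key = v1 v2 ...' lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith(";"):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        # Other keys are ignored because their values might not be numeric.
        if sep and key in RECOMMENDED:
            settings[key] = [int(v) for v in value.split()]
    return settings


def missing_or_small(text):
    """Return keys that are absent or whose maximum is below the recommended one."""
    current = parse_sysctl(text)
    return sorted(
        key for key, wanted in RECOMMENDED.items()
        if not current.get(key) or current[key][-1] < wanted[-1]
    )


sample = "net.core.rmem_max = 16777216\nnet.ipv4.tcp_rmem = 4096 87380 4194304\n"
print(missing_or_small(sample))
```

To check a real system, pass the contents of /etc/sysctl.conf to missing_or_small. An empty list means that all four settings are at least as large as the values shown in the Linux steps.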

Error message results from running the tacmd getDeployStatus command after deploying the discovery target support bundle

Problem
One or more of the following messages result from running the tacmd getDeployStatus command after deploying the discovery target support bundle:
  • Error Message: KDY1024E: The command /opt/IBM/ITM/bin/CandleAgent
    -h /opt/IBM/ITM start d7 did not start or stop agent.
    The command returned a return code.
  • Error Message: KDY1008E: The agent action INSTALL failed with
    a return code of for product code d7. The command
    /opt/IBM/ITM/tmaitm6/aix526/bin/kdy_xa -setCMS d7 produced the
    following error text: <Variable formatSpec="{4}">stdErr
    Text</Variable>.  The specified return code was received from
    the two-way translator.
  • Error Message: KDY1024E: The agent failed to respond to the
    command C:\itmagent\installITM\Batch\kincli  -startagent -akd7 
    did not start or stop agent. The command returned a 
    failure return code.
Solution
These messages do not indicate actual errors, because the discovery target support bundle is not intended to respond to the agent start or stop command. The Tivoli Monitoring cinfo command also does not list the support bundle, because the support bundle is an addition to the existing OS agent.
Verify that the discovery target support bundle is correctly installed on the discovery target. From the Tivoli Monitoring directory on the target computer, run the directory command as shown in the following example:
C:\Documents and Settings\Administrator>cd %CANDLE_HOME%

C:\IBM\ITM>dir taddm
 Volume in drive C has no label.
 Volume Serial Number is B81D-9114

 Directory of C:\IBM\ITM\taddm

09/24/2010  06:38 PM    <DIR>          .
09/24/2010  06:38 PM    <DIR>          ..
09/24/2010  06:38 PM             6,656 Base64.exe
09/24/2010  06:38 PM             1,960 KD7WINNT.dsc
09/24/2010  06:38 PM             1,363 post.bat
09/24/2010  06:38 PM             4,280 pre.bat
09/24/2010  06:38 PM           249,856 TaddmTool.exe
09/24/2010  06:38 PM           474,624 TaddmTool.pdb
09/24/2010  06:38 PM           569,344 TaddmWmi.dll
09/24/2010  06:38 PM           106,496 TaddmWmi.exe
09/24/2010  06:38 PM             1,424 TaddmWmi.mof
09/24/2010  06:38 PM         2,968,576 TaddmWmi.pdb
              10 File(s)      4,384,579 bytes
               2 Dir(s)  10,931,712,000 bytes free
The discovery support bundle files must be present in the %CANDLE_HOME%\taddm directory.
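
The same check can be scripted. The following Python sketch reports which bundle files are missing from the taddm subdirectory of a Tivoli Monitoring installation directory. The file names are copied from the example listing in this topic, so other versions of the bundle might contain different files, and the function name is hypothetical.

```python
import os

# File names copied from the example directory listing; treat this list
# as an assumption for other versions of the support bundle.
EXPECTED = [
    "Base64.exe", "KD7WINNT.dsc", "post.bat", "pre.bat",
    "TaddmTool.exe", "TaddmTool.pdb", "TaddmWmi.dll",
    "TaddmWmi.exe", "TaddmWmi.mof", "TaddmWmi.pdb",
]


def missing_bundle_files(itm_home):
    """Return the expected files that are absent from <itm_home>/taddm."""
    taddm_dir = os.path.join(itm_home, "taddm")
    if not os.path.isdir(taddm_dir):
        return sorted(EXPECTED)
    # Compare case-insensitively because Windows file names ignore case.
    present = set(name.lower() for name in os.listdir(taddm_dir))
    return sorted(name for name in EXPECTED if name.lower() not in present)
```

For example, missing_bundle_files(r"C:\IBM\ITM") returns an empty list when every file from the listing is present.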

When running the sensor for a Level 2 discovery on Windows target systems, multiple command windows open on the computer where the Tivoli Enterprise Portal Server is running

Problem
When you run the IBM Tivoli Monitoring Scope sensor for a Level 2 discovery on Windows target systems, multiple command windows open on the computer where the Tivoli Enterprise Portal Server is running.
Solution
The IBM Tivoli Monitoring Windows OS Agent is probably configured to run as a system service, and the option Allow Service to Interact with Desktop is enabled. Complete the following steps to correct this problem:
  1. Right-click the agent in the Manage Tivoli Monitoring Services program.
  2. Click Change Startup.
  3. In the Log on As pane of the window that opens, clear the Allow Service to Interact with Desktop check box.
  4. Click OK.
  5. Again, right-click the agent in the Manage Tivoli Monitoring Services program.
  6. Click Recycle.

Temporary files are in the log directory of the target system

Problem
During a Level 2 discovery through IBM Tivoli Monitoring, some commands fail on endpoints and leave multiple KD7* files or session_script*.bat files in the log directory of the target system. These files might also be present for other reasons, such as a discovery that ended prematurely or a problem with the Tivoli Monitoring agent connection to the Tivoli Enterprise Monitoring Server.
Solution
The administrator can remove these files manually whenever a discovery is not running. Removing these files during a discovery can cause the discovery to fail.
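
The manual cleanup can be sketched in Python as follows. The function names and the dry-run default are assumptions made for illustration. The script does not detect whether a discovery is running, so that check remains the responsibility of the administrator.

```python
import fnmatch
import os

# File name patterns named in the problem description.
PATTERNS = ("KD7*", "session_script*.bat")


def leftover_files(log_dir):
    """Return the sorted paths of files in log_dir that match PATTERNS."""
    matches = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if os.path.isfile(path) and any(fnmatch.fnmatch(name, p) for p in PATTERNS):
            matches.append(path)
    return sorted(matches)


def remove_leftovers(log_dir, dry_run=True):
    """List the leftover files, and delete them only when dry_run is False."""
    paths = leftover_files(log_dir)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Calling remove_leftovers with the default dry_run=True only reports the files, which allows the list to be reviewed before anything is deleted.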

Trailing white spaces exist in the output from discovery targets

Problem
If you create custom server templates that run under the IBM Tivoli Monitoring Scope sensor, trailing white spaces (such as newline characters or carriage returns) might exist in the output from discovery targets.
Solution
To ensure that custom server templates provide the same output when used with the Tivoli Monitoring Scope sensor, remove white spaces in the server-side logic of the custom server template.
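
A minimal sketch of such a cleanup step, assuming that the server-side logic is a Python-compatible script and that the raw command output is available as a string. The function name normalize_output is hypothetical.

```python
def normalize_output(raw):
    """Remove trailing white space from every line and from the end of the output."""
    # Treat Windows (CR LF) and bare CR line endings like LF.
    lines = raw.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    cleaned = [line.rstrip() for line in lines]
    # Drop empty lines left at the end of the output.
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return "\n".join(cleaned)


print(repr(normalize_output("version 6.2.2 \r\n\r\n")))
```

Applying the same function to output collected through a direct session and through Tivoli Monitoring lets the template compare or parse both in the same way.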

After upgrading IBM Tivoli Monitoring, errors occur during discovery

Problem
After upgrading IBM Tivoli Monitoring, errors might occur during discovery as a result of either of the following changes:
  • Updates to the Tivoli Monitoring libraries or agent tables
  • Updates to the TADDM discovery logic
Solution
If the errors result from updates to the Tivoli Monitoring libraries or agent tables, redo the following tasks:

If none of the above solutions works, make sure that the com.ibm.cdb.discover.ITM.https.strictChecking property in the collation.properties file is set to false. This property is not included in the collation.properties file by default, and when it is absent, its value is false. The property applies only to SSL sessions. If you set it to true, the connection host name must match the certificate host name; otherwise, the discovery fails.

Errors occur during discovery of a Tivoli Monitoring 6.2.2 environment

Problem
During the discovery of a Tivoli Monitoring Version 6.2.2 environment, the Tivoli Enterprise Monitoring Server might shut down unexpectedly, resulting in the following TADDM error messages:
  • CTJTD0203E The Computer System agent cannot retrieve the host 
    and IP information for the following computer system
  • CTJTD3000E Starting - An error occurs and the sensor timed out
Solution
Verify that the Tivoli Enterprise Monitoring Server process on the Tivoli Monitoring server is running, and if necessary, restart the Tivoli Enterprise Monitoring Server. This process might shut down unexpectedly due to too many proxy requests, which is related to a known problem with Tivoli Monitoring 6.2.2. For more information, see Tivoli Monitoring APAR IZ52960.2.

Tivoli Monitoring scope does not include all endpoints defined on the Tivoli Enterprise Portal Server

Problem
The Tivoli Monitoring scope created during a discovery does not include all the endpoints that are defined on the Tivoli Enterprise Portal Server.
Solution
Inactive endpoints and endpoints for which MAC addresses cannot be resolved are not included in a created scope set.

Targets are discovered by IBM Tivoli Monitoring session but not by SSH or WMI during a Level 2 discovery

Problem
When an endpoint is discovered by the IBM Tivoli Monitoring Scope sensor, future Level 2 discoveries use Tivoli Monitoring for discovery by default. A direct connection (SSH or WMI) is not used. This method is used even if the IBM Tivoli Monitoring Scope sensor is not included in the discovery profile.
Solution
To discover the endpoint through SSH or WMI, define the following property in the collation.properties file, where endpoint_ip_address is the IP address of the endpoint: com.ibm.cdb.session.allow.ITM.endpoint_ip_address=false

See the TADDM Administrator's Guide for information about how to modify properties that affect how TADDM discovers Tivoli Monitoring endpoints.
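
When several endpoints must be discovered directly, the property lines can be generated instead of typed by hand. The following Python sketch assumes that endpoint_ip_address in the property name is a placeholder that is replaced with the literal IPv4 address of each endpoint. The function name and the sample addresses are hypothetical.

```python
import ipaddress

PREFIX = "com.ibm.cdb.session.allow.ITM."


def direct_session_properties(addresses):
    """Return one collation.properties line per endpoint IPv4 address."""
    lines = []
    for address in addresses:
        # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
        ip = ipaddress.IPv4Address(address.strip())
        lines.append("%s%s=false" % (PREFIX, ip))
    return lines


# Documentation-range addresses used purely as examples.
for line in direct_session_properties(["192.0.2.10", "192.0.2.11"]):
    print(line)
```

Validating each address before the line is written avoids adding a property with a mistyped address to the collation.properties file.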

Too many active report queries on the Tivoli Enterprise Portal Server

Problem
The following message is generated in the SessionSensor.log file:
KFWITM460E: Too many active report queries from client IPAddress;
 exceeding limit at number requests.
Solution
Increase the maximum number of pending requests by editing the configuration settings on the Tivoli Enterprise Portal Server. On Windows operating systems, edit the KFWENV file. On Linux or UNIX operating systems, edit the cq.ini file. Use the following settings:
KFW_REPORT_REQUEST_LIMIT_MAX=100 
KFW_REPORT_REQUEST_LIMIT=30 
KFW_REPORT_REQUEST_LIMIT_DURATION=300
The KFW_REPORT_REQUEST_LIMIT property specifies the normal limit of pending requests to the Tivoli Enterprise Portal Server from a single client. The KFW_REPORT_REQUEST_LIMIT_MAX property specifies a temporary maximum limit of pending requests that can exceed the KFW_REPORT_REQUEST_LIMIT value. This higher limit is allowed only during a burst whose length, in seconds, is defined by the KFW_REPORT_REQUEST_LIMIT_DURATION property.
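
Before restarting the portal server, the three settings can be sanity-checked. The following Python sketch assumes that the settings appear as KEY=VALUE lines in the file. The function names are hypothetical, and the only rules checked are those implied by the description: every value is a positive integer, and the temporary maximum is not lower than the normal limit.

```python
LIMIT = "KFW_REPORT_REQUEST_LIMIT"
LIMIT_MAX = "KFW_REPORT_REQUEST_LIMIT_MAX"
DURATION = "KFW_REPORT_REQUEST_LIMIT_DURATION"
KEYS = (LIMIT, LIMIT_MAX, DURATION)


def read_limits(text):
    """Extract the three request-limit settings from KEY=VALUE lines."""
    limits = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in KEYS:
            limits[key.strip()] = int(value.strip())
    return limits


def check_limits(limits):
    """Return a list of problems; an empty list means that the values are consistent."""
    problems = []
    for key in KEYS:
        if key not in limits:
            problems.append("missing " + key)
        elif limits[key] <= 0:
            problems.append(key + " must be positive")
    if not problems and limits[LIMIT_MAX] < limits[LIMIT]:
        problems.append(LIMIT_MAX + " is lower than " + LIMIT)
    return problems


settings = """KFW_REPORT_REQUEST_LIMIT_MAX=100
KFW_REPORT_REQUEST_LIMIT=30
KFW_REPORT_REQUEST_LIMIT_DURATION=300
"""
print(check_limits(read_limits(settings)))
```

With the values shown in this topic, the check reports no problems.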