Troubleshooting the sensor

This topic describes common problems that occur with the IBM® WebSphere® sensor and presents solutions for those problems.

Sensor does not start

Problem
The WebSphere Application Server sensor does not start.
Solution

To determine why the WebSphere Application Server sensor does not start, validate the following criteria on your WebSphere server:

  • The WebSphere process is running.
  • The command line is not truncated (the process that is running must match the template for the WebSphere Application Server).

    For Windows 2003/2008, Linux®, Solaris, AIX®, and Linux on System z® operating systems, the command line must contain the word WsServer.

  • The WebSphere Application Server was started as a service (on Windows 2000), or as a service or from the command line (Windows 2003 or Windows 2008).

If none of the preceding items appear to be the cause, check the system log and the WebSphere Application Server start logs for error messages.
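The command-line criterion can be checked with a short script. The following Python sketch is illustrative only and is not part of TADDM. The function name and the sample command lines are hypothetical, and the sketch assumes that you have already captured the full process command line (for example, from the process table).

```python
def matches_websphere_template(command_line: str) -> bool:
    """Return True if the command line contains the WsServer marker.

    A truncated command line usually loses its trailing arguments,
    so the marker is missing and this check returns False.
    """
    return "WsServer" in command_line


# Hypothetical sample command lines, for illustration only.
full = "java -classpath /opt/was/lib com.ibm.ws.runtime.WsServer /config cell01 node01 server1"
truncated = "java -classpath /opt/was/li"

print(matches_websphere_template(full))       # True
print(matches_websphere_template(truncated))  # False
```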

WebSphere servers or nodes cannot be discovered

Problem
Some WebSphere servers or nodes cannot be discovered.
Solution
If the WebSphere server or node to be discovered uses a fully qualified domain name (FQDN) instead of a plain IP address as its bootstrap address, the TADDM server must have access to a DNS server that can resolve that FQDN. Otherwise, information about that server or node cannot be discovered, even if the target scope is defined by IP address.
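One way to confirm name resolution is to run a small check on the TADDM server itself. The following Python sketch is an assumption-laden illustration, not a TADDM tool: the function name is hypothetical, and the optional resolver parameter exists only so that the check can be exercised without a live DNS server.

```python
import socket


def can_resolve(fqdn: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the name resolves on the machine that runs this script."""
    try:
        # getaddrinfo raises socket.gaierror when the name cannot be resolved.
        resolver(fqdn, None)
        return True
    except socket.gaierror:
        return False
```

For example, calling can_resolve with the bootstrap address of the WebSphere node (such as the hypothetical name was-node01.example.com) on the TADDM server returns False when the configured DNS servers cannot resolve that name.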

Discovery of WebSphere Application Server is not logged

Problem
The discovery of the WebSphere Application Server is not logged in the DiscoverManager.log file. Because a local anchor is used for the discovery, the log messages are written to separate log files.
Solution
The log messages are placed in the following log files, where hostname is the fully qualified domain name of the TADDM server:
  • local-anchor*.hostname.WebSphereAgent.log
  • local-anchor*.hostname.WebSphereNodeSensor.log
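The file name patterns above can be applied to a directory listing to pick out the relevant logs. The following Python sketch is illustrative: the function name and the sample file names are hypothetical, and the location of the log directory is not assumed, so the caller supplies the list of file names.

```python
from fnmatch import fnmatchcase


def websphere_log_names(file_names, hostname):
    """Select the WebSphere discovery logs for the given TADDM server host name."""
    patterns = [
        f"local-anchor*.{hostname}.WebSphereAgent.log",
        f"local-anchor*.{hostname}.WebSphereNodeSensor.log",
    ]
    return [
        name for name in file_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical directory listing, for illustration only.
names = [
    "DiscoverManager.log",
    "local-anchor.taddm01.example.com.WebSphereAgent.log",
    "local-anchor-2.taddm01.example.com.WebSphereNodeSensor.log",
    "local-anchor.other.example.com.WebSphereAgent.log",
]
for name in websphere_log_names(names, "taddm01.example.com"):
    print(name)
```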

Errors when security is enabled on WebSphere Application Server

Problem
The following types of error messages are displayed:
  • ERROR cdb.WebSphereAgentDelegate - [WebSphereAgentDelegate.E.1]
    discover() failed with exception : java.lang.Exception: 
    Unable to connect to the WebSphere server at 
    9.48.158.37:8,880 - ADMC0016E: 
    The system cannot create a SOAP connector 
    to connect to host 9.48.158.37 at port 8880...
  • ERROR cdb.WebSphereJMXUtils - An error occurred,
    unable to establish a repository connection 
    using the credentials raleigh-was60: 
    com.ibm.websphere.management.exception.AdminException: 
    javax.management.JMRuntimeException: ADMN0022E: 
    Access is denied for the getServerConfig operation on 
    FileTransferServer MBean because of insufficient
    or empty credentials.
These errors can occur for any of the following reasons:
  • No credentials exist in the access list for the WebSphere Application Server.
  • In the credentials for the WebSphere Application Server, the certificates are not correct or have not been entered through the access list.
  • In the credentials for the WebSphere Application Server, the password is incorrect.
Solution
Add the credentials in the access list for the WebSphere Application Server. Correct the certificates, enter the certificates through the access list, or provide the correct password.

Failure to make a JMX connection

Problem
The following type of error occurs:
Sensor failed in remote server:
Unable to connect to WebSphere server at 10.0.1.69:8880 - ADMC0016E:
Could not create SOAP Connector to connect to host 10.0.1.69 at port 8880
This type of error can indicate any of the following problems:
  • A missing or incorrect certificate or an incorrect user ID and password. The following example shows a sample root cause:
    [SOAPException: faultCode=SOAP-ENV:Client;
    msg=Error opening socket:
    javax.net.ssl.SSLHandshakeException: certificate expired;
    targetException=java.lang.IllegalArgumentException:
    Error opening socket:
    javax.net.ssl.SSLHandshakeException: certificate expired]
  • A firewall that is preventing a connection to the WebSphere Application Server through the SOAP port.
  • A WebSphere Application Server that is not in a healthy state, even though its process is shown in the process table or the Windows services list. To test the state of the WebSphere Application Server, try to connect to it by using the wsadmin WebSphere administrative utility. If the wsadmin utility cannot connect, the sensor is likely to fail as well.
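To help separate a firewall problem from a credential or certificate problem, you can test whether the SOAP port accepts TCP connections at all. The following Python sketch is an illustration under stated assumptions: the function name is hypothetical, and the default port 8880 is taken from the example messages in this topic and might differ in your environment. A successful connection shows only that the port is reachable. It does not validate certificates, user IDs, or passwords.

```python
import socket


def soap_port_reachable(host: str, port: int = 8880, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run the check from the TADDM server or anchor host, because a firewall might block that host while allowing others.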
Solution
Use either of the following solutions:
  • Run one of the following programs, which tests the JMX connection to verify credentials and connectivity:
    • For Linux, AIX, and Linux on System z operating systems: $COLLATION_HOME/bin/testwasconnection.sh. Instructions for running this program are in the testwasconnection.sh file.
    • For Windows systems: %COLLATION_HOME%\bin\testwasconnection.bat. Instructions for running this program are in the testwasconnection.bat file.
  • Make sure that your access list is defined correctly. If you discover WebSphere Application Server on z/OS and you want to use an access list entry with a scope restriction, you must include the IP address of your WebSphere server in the discovery scope, in addition to the IP address of the host where the WebSphere seed file is located.

Sensor fails on a JMX query

Problem
The sensor fails on a JMX query with the following message:
failed on JMX query--check server health and retry
This error indicates that the configuration setup might be corrupted.
Solution
Check the logs to see what is being queried and whether that value is readable in the WebSphere Application Server console. This error usually occurs when a discovery runs overnight while the WebSphere Application Servers are down for maintenance. In this case, restart the server, and run the discovery again.

Data store error - storage of collected data takes too long

Problem
Storage of data collected from a WebSphere discovery is taking too long.
Solution
This problem occurs when the database tuning script was not run before the TADDM schema was created. Before creating the TADDM schema, run the following database tuning script:
  • For non-Windows systems:
    $COLLATION_HOME/bin/gen_db_stats.jy
  • For Windows systems:
    %COLLATION_HOME%\bin\gen_db_stats.bat

WebSphere Application Server is down

Problem
The WebSphere Application Server is down for one of the following reasons:
  • TADDM runs when a WebSphere Application Server is in maintenance, and a discovery does not complete. The local-anchor*.hostname.WebSphereAgent.log file or local-anchor*.hostname.WebSphereNodeSensor.log file might display the following error message:
    INFO cdb.AnchorServer[main] - [AnchorServer.I.0] server no longer
    accepting new connections
  • An error message states that the query cannot be completed.
Solution
Verify that the WebSphere Application Server is functioning properly.

Sensor does not show as much data as it did in previous releases of TADDM

Problem
The Details window for WebSphere cells, nodes, and servers does not show as much detail as it did in previous TADDM releases, and many of the tabs in the window have no data.
Solution
TADDM implements the following discovery levels:
  • Shallow
  • Medium
  • Deep

The default discovery level for the WebSphere Application Server sensor is shallow.

To obtain more detail about the WebSphere Application Server, create a discovery sensor configuration for the WebSphereCellSensor sensor, and in the Sensor Configuration window, set the value of the mediumDiscoveryLevel property or the deepDiscoveryLevel property to true.

WebSphere sensor fails during WebSphere discovery on an AIX operating system due to problems with the AIX ps command

Problem
On some AIX operating systems, running the UNIX ps command returns truncated Java™ CLASSPATH strings. The strings are not recognized by the TADDM WebSphere sensor, resulting in a failed discovery.
Solution
Upgrade to at least AIX 5.3 FP5 (5.3.0.50). This version and later versions of AIX return the full Java CLASSPATH strings.
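The version requirement can be expressed as a simple comparison. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes that you already have the AIX level as a dotted string in the same form as 5.3.0.50. How you obtain that string on your systems is outside the scope of the sketch.

```python
def parse_level(level: str) -> tuple:
    """Convert a dotted level such as '5.3.0.50' into a tuple of integers."""
    return tuple(int(part) for part in level.split("."))


# Minimum level named in this topic.
MINIMUM_LEVEL = parse_level("5.3.0.50")


def meets_minimum_level(level: str) -> bool:
    """Return True if the level is at or above the minimum level."""
    return parse_level(level) >= MINIMUM_LEVEL


print(meets_minimum_level("5.3.0.40"))  # False
print(meets_minimum_level("5.3.0.50"))  # True
print(meets_minimum_level("6.1.0.0"))   # True
```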

Message CTJTD0736W is shown

Problem
Insufficient credentials exist in the access list for the Secure Shell (SSH) protocol or Windows Management Instrumentation (WMI) on the host system where the distributed node is running.
The computer system credentials for this host system are used to retrieve information to populate the host for the node and server configuration items on that system.
Solution
If you want this information to be populated, you must add the appropriate computer system credentials for the host system.

WebSphere sensor fails and the following message is displayed: CTJTD0692E

Problem
While attempting to discover a distributed WebSphere cell, the WebSphere sensor fails with the following message:
CTJTD0692E The distributed cell deployment manager bind address is not
found for the following cell:etabsap1TCell
Solution
Discoveries involving the sensors related to WebSphere Deployment Manager must have a working DNS. As a workaround, change com.collation.platform.os.disableRemoteHostDNSLookups to true, and ensure that the TADDM server always has the correct DNS search path.

WebSphere sensor fails and the following message is displayed: CTJTD3021E

Problem
The WebSphere sensor fails with the following message:
CTJTD3021E The sensor fails in a remote server :
 CTJTD2120E An error has occurred in the discovery process.:
CTJTD0775E A connection to the WebSphere server is not
 available: << ip address of IBM WebSphere application server  >>
 - ADMC0016E: The system cannot create a SOAP connector to connect to host 
<< ip address of IBM WebSphere application server >>
Solution
Verify that the problem is with the SSL support in the WebSphere client code. To verify, ensure that the WebSphere access list entry for this WebSphere Server is first in the access list (before any other WebSphere credentials). If the discovery is successful, import all the WebSphere certificates from the different servers into one truststore. Having multiple access list entries with different user IDs and passwords is acceptable. However, all the access list entries must specify the same truststore, which contains all the certificates.

For additional information, see Configuring the access list.

WebSphere JDBC driver sensor does not start

Problem
The WebSphere JDBC driver sensor does not start.
Solution
To establish why the WebSphere JDBC driver sensor does not start, ensure that the following conditions have been met:
  • A user profile for Level 3 discovery has been created and the WebSphere JDBC driver sensor is enabled.
  • Deep discovery is enabled for the WebSphere cell sensor.

WebSphere JDBC driver sensor cannot connect to the target host and the following message is displayed: CTJTD0796E

Problem
During the discovery, the WebSphere JDBC driver sensor cannot establish a connection with the target host, and the CTJTD0796E error message is displayed.
Solution
The following situations are possible reasons for this error:
  • An SSH connection could not be established with the host.
  • A connection with the host was established, but the user did not have the appropriate privileges to run the WebSphere setupCmdLine script.
  • A connection with the host was established, but the user did not have the appropriate privileges to run the Java command.
Check the sensor log files to determine which of these situations has occurred.

If the sensor fails and the warning CTJTD0798W is displayed in the log files, ensure that the user specified in the WebSphere SSH access list entry has the appropriate privileges to run the WebSphere setupCmdLine script.

If the sensor fails and the warning CTJTD0799W is displayed in the log files, ensure that the user specified in the WebSphere SSH access list entry has the appropriate privileges to run the Java command.
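The mapping from warning code to corrective action can be automated when you review many log files. The following Python sketch is illustrative: the function name and the sample log lines are hypothetical, and only the two warning codes named above are recognized.

```python
# Corrective action for each warning code described in this section.
REMEDIES = {
    "CTJTD0798W": "Give the SSH access list user the privileges to run the WebSphere setupCmdLine script.",
    "CTJTD0799W": "Give the SSH access list user the privileges to run the Java command.",
}


def diagnose(log_lines):
    """Return (code, remedy) pairs for each known warning found in the log lines."""
    lines = list(log_lines)
    return [
        (code, remedy)
        for code, remedy in REMEDIES.items()
        if any(code in line for line in lines)
    ]


# Hypothetical log excerpt, for illustration only.
sample = [
    "INFO sensor started",
    "WARN CTJTD0799W sample warning text",
]
for code, remedy in diagnose(sample):
    print(code, "-", remedy)
```

If neither warning is found, the function returns an empty list, which suggests that the SSH connection itself could not be established.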

Some JDBC dependencies are not created between a WebSphere server and database servers

Problem
TADDM discovers both the WebSphere server and a related database server but does not create a relation between them. Such a relation is based on the JDBC connection properties that are defined on the application server.
Solution
The problem might be a result of one of the following issues:
  • JDBC connectivity details are gathered by deep level discoveries only. Ensure that the discovery profile for the WebSphere sensor is configured for that level of discovery.
  • The dependencies are created by the JDBCDependencyAgent that runs in the Dependency topology agent group. Ensure that the agent is run after the discovery of the WebSphere servers.
  • The JDBCDependencyAgent processes only the recently discovered application servers. If some dependencies are still missing after the agent has run, rediscover the WebSphere servers, and wait for the topology agents to run again.
  • The database server might not support the creation of transactional dependencies with the WebSphere application server. The following databases are supported:
    • Oracle
    • IBM DB2®
    • Microsoft SQL Server
    • Sybase

WebSphere sensor fails when the TADDM server is running Red Hat Enterprise Linux 6

Problem
WebSphere sensor fails when the TADDM server is running Red Hat Enterprise Linux 6. The following errors might be displayed:
CTJTD3021E The sensor fails in a remote server
CTJTD2015E There is a local anchor sensor failure
Solution
In the /etc/security/limits.d/90-nproc.conf configuration file, comment out the following line:
*          soft    nproc     1024

After you have updated the configuration file, you must restart the TADDM server.
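The edit to the configuration file can be scripted. The following Python sketch is illustrative, not an IBM-provided tool: the function name is hypothetical, and the sketch operates on the file contents as a string so that reading and writing the file, and taking a backup first, are left to you.

```python
import re

# Matches the uncommented default line: "*  soft  nproc  1024".
NPROC_LINE = re.compile(r"^\s*\*\s+soft\s+nproc\s+1024\s*$")


def comment_out_nproc(text: str) -> str:
    """Return the file contents with the matching nproc line commented out."""
    lines = [
        "#" + line if NPROC_LINE.match(line) else line
        for line in text.splitlines()
    ]
    # Preserve the trailing newline if the original text had one.
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")


# Hypothetical file contents, for illustration only.
original = "# sample comment\n*          soft    nproc     1024\n"
print(comment_out_nproc(original), end="")
```

Lines that are already commented out do not match the pattern, so running the function twice produces the same result as running it once.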

Only placeholder objects are stored after a script-based discovery

Problem
After you run a successful WebSphereScriptSensor discovery, all stored objects are marked as placeholders and contain few details.
Solution
Placeholder objects are created by the WebSphereScriptSensor when a discovery target is a WebSphere Application Server node in a distributed cell other than the management cell (a DMGR's node). To obtain more detailed information about the placeholder model objects, run a discovery of the host with the DMGR.