Technical Blog Post
Abstract
ITM Agent Insights: VMware VI agent connectivity debugging
Body
This blog post provides generic debugging steps for troubleshooting connectivity issues between the ITM VMware VI monitoring agent and a defined data source.
The symptoms most often reported for a connection issue between the VM agent's data provider and a data source (vSphere, vCenter, ESX, ESXi, vCSA) are:
- No navigator items in Tivoli Enterprise Portal for ESX servers that should be subnodes of the main VM agent instance.
- VM agent subnodes for ESX servers are off-line / not managed.
- Off-line subnodes for VM agent in "tacmd listSystems".
- No data displayed in TEP workspaces for VM agent - blank workspaces.
- Error messages in data provider log indicating connection problems.
Example: SEVERE: DataSource.start: Failed to connect to: <data source IP or DNS name or FQDN>
The most common causes for this problem are:
1. Invalid or missing SSL certificate
2. With the 7.1 VMware VI agent, the connection to vCenter may be configured with a hostname. Try using the numeric IP address as a workaround.
3. Incorrect account ID and/or password specified to connect to the data source
4. For vCenter 6.0 and higher, the data source may require the TLS protocol, which the ITM VI monitoring agent does not support until 7.2 FP2 IF3
5. Network connectivity issue or firewall blocking communication
6. The VM agent's Java-based data provider crashes with OutOfMemoryError after exceeding the Java heap size in larger environments
Confirm whether the main VM agent instance can communicate with the Tivoli Monitoring infrastructure and is online to the Tivoli Enterprise Monitoring Server (TEMS).
If the main VM agent instance is on-line, this rules out a general communication issue between the system where the VM agent is running and the Tivoli Monitoring infrastructure.
Example from "tacmd listSystems -t VM":
TESTINST:<hostname>:VM VM 07.20.02.00 Y
VM:TESTINST-esxServer01:ESX VM 07.20.02.XX N
VM:TESTINST-esxServer02:ESX VM 07.20.02.XX N
VM:TESTINST-esxServer03:ESX VM 07.20.02.XX N
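The status column in output like the example above can be checked with a short script. The following is a minimal Python sketch, assuming the four-column layout shown in the example (managed system name, product code, version, status). The function name offline_subnodes and the host name myhost are illustrative only and are not part of tacmd.

```python
def offline_subnodes(listsystems_output, product_code="VM"):
    """Return managed system names whose status column is 'N' (off-line)."""
    offline = []
    for line in listsystems_output.splitlines():
        fields = line.split()
        # Expected columns: managed system name, product code, version, status.
        # Header lines or other text with a different column count are skipped.
        if len(fields) != 4:
            continue
        name, code, _version, status = fields
        if code == product_code and status == "N":
            offline.append(name)
    return offline


# Sample text modeled on the example output; "myhost" is a placeholder.
SAMPLE_OUTPUT = """\
TESTINST:myhost:VM VM 07.20.02.00 Y
VM:TESTINST-esxServer01:ESX VM 07.20.02.XX N
VM:TESTINST-esxServer02:ESX VM 07.20.02.XX N
VM:TESTINST-esxServer03:ESX VM 07.20.02.XX N
"""

for node in offline_subnodes(SAMPLE_OUTPUT):
    print("off-line:", node)
```

If the list contains only the ESX subnodes and not the main instance, the problem is more likely between the data provider and the data source than between the agent and the TEMS.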
If the main VM agent instance is on-line but the subnodes for ESX systems are off-line, review the Java data provider log for error messages.
At the default "INFO" log level, messages may report that an attribute group "does not exist" and that ESX systems are "not managed by this agent".
Example:
cps_socket.cpp,1602,"collectData") Requested attribute group 'Monitored_Servers' does not exist
cps_socket.cpp,1602,"collectData") Requested attribute group 'Datastores' does not exist
cps_socket.cpp,1602,"collectData") Requested attribute group 'vCenters' does not exist
genericagent.cpp,591,"GenericAgent::tsGetQueryList") Subnode VM:<instance>-<subnode system>:ESX is not managed by this agent.
The failure of ESX systems to be managed as subnodes is often caused by SSL communication issues, particularly SSL certificate validation.
As a diagnostic step, it is often useful to test whether there is a general SSL communication problem or only an SSL certificate validation problem.
To rule out invalid SSL certificates, configure the VM agent instance to NOT validate SSL certificates, then verify whether the subnodes come on-line and whether data is gathered and populates the workspaces in the TEP.
Gather output from the pdcollect utility and review the VM agent instance's config file:
<hostname>_vm_<instance>.cfg
Example:
INSTANCE=testVMinstance [
SECTION=DATA_PROVIDER
[ { KVM_LOG_FILE_MAX_COUNT=10 }
{ KVM_LOG_FILE_MAX_SIZE=5190 }
{ KVM_SSL_VALIDATE_CERTIFICATES= Yes }
{ KVM_LOG_LEVEL=INFO } ]
SECTION=DIRECTOR
[ { KVM_DIRECTOR_AUTHENTICATION=Yes }
{ KVM_DIRECTOR_PORT_NUMBER=8422 } ]
SECTION=STORAGE_AGENT [ ]
SECTION=DATASOURCE:vcenter1
[ { HOST_ADDRESS=123.123.123.123 }
{ PASSWORD=\{AES256:keyfile:a\}abidBLbpEewK1kIJFXDc5A\=\= }
{ USERNAME=asset }
{ USES_SSL=Yes } ]]
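The two settings that matter for this test can be pulled out of a file like the one above with a small parser. This is a rough Python sketch, assuming the "SECTION=name" and "{ KEY=VALUE }" layout shown in the example. The function names parse_instance_cfg and ssl_summary are made up for illustration, and the real file format may have cases this does not cover.

```python
import re

SECTION_RE = re.compile(r"SECTION=(\S+)")
# A setting looks like "{ KEY=VALUE }". Braces written as \{ or \} are
# treated as part of the value (as in the encrypted PASSWORD entry).
SETTING_RE = re.compile(r"(?<!\\)\{\s*(\w+)\s*=\s*(.*?)\s*(?<!\\)\}")


def parse_instance_cfg(text):
    """Return {section name: {key: value}} for an instance .cfg file."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.search(line)
        if match:
            current = match.group(1)
            sections.setdefault(current, {})
        if current is None:
            continue  # lines before the first SECTION, e.g. INSTANCE=...
        for key, value in SETTING_RE.findall(line):
            sections[current][key] = value
    return sections


def ssl_summary(sections):
    """Return (certificate validation flag, {data source: USES_SSL})."""
    validate = sections.get("DATA_PROVIDER", {}).get("KVM_SSL_VALIDATE_CERTIFICATES")
    uses_ssl = {
        name.split(":", 1)[1]: settings.get("USES_SSL")
        for name, settings in sections.items()
        if name.startswith("DATASOURCE:")
    }
    return validate, uses_ssl


# Shortened copy of the example configuration.
SAMPLE_CFG = r"""INSTANCE=testVMinstance [
SECTION=DATA_PROVIDER
[ { KVM_SSL_VALIDATE_CERTIFICATES= Yes }
{ KVM_LOG_LEVEL=INFO } ]
SECTION=DATASOURCE:vcenter1
[ { HOST_ADDRESS=123.123.123.123 }
{ PASSWORD=\{AES256:keyfile:a\}abidBLbpEewK1kIJFXDc5A\=\= }
{ USERNAME=asset }
{ USES_SSL=Yes } ]]
"""

validate, uses_ssl = ssl_summary(parse_instance_cfg(SAMPLE_CFG))
print("Validate SSL certificates:", validate)
print("USES_SSL per data source:", uses_ssl)
```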
To rule out problems with SSL certificate validation being the cause for offline subnodes or no data in TEP workspaces, set "KVM_SSL_VALIDATE_CERTIFICATES=No".
This can be done by reconfiguring the VM agent instance with "itmcmd config", manually updating the instance's .cfg file and recycling the agent, or through MTEMS GUI:
Change the "Validate SSL Certificates" value to No.
Setting "USES_SSL" to Yes under the "Data Source" tab allows SSL communication to be used between the data provider for the VI agent and the data source (vCenter / ESX server) without requiring SSL certificate validation.
Setting "USES_SSL" to No under the "Data Source" tab completely disables SSL communication.
Enabling SSL communication with VMware VI data sources
https://www.ibm.com/support/knowledgecenter/SS9U76_7.2.0/com.ibm.tivoli.itmvs.doc_7.2/vmware/enablingssl.html
Confirm whether the vCenter / vSphere / ESX servers are configured for SSL communication.
This may depend on the release of vCenter / vSphere / ESX servers used. Refer to VMware documentation for configuration of SSL communication.
Confirm the accessMode for the vCenter / vSphere / ESX environment, and see if it allows for both SSL and non-SSL communication, only SSL communication, or only non-SSL communication.
If it is set to "httpAndHttps" this should allow for both types of communication.
In response to POODLE and other security concerns with SSLv3, VMware has made changes to disable SSLv3 and default to TLS protocol.
This change was made at various release levels, most of them vCenter 6.0 levels.
VMware vCenter Server 6.0 Update 1 Release Notes
http://pubs.vmware.com/Release_Notes/en/vsphere/60/vsphere-vcenter-server-60u1-release-notes.html
Support for SSLv3: Support for SSLv3 has been disabled by default.
From VMware knowledge article:
Disabling SSLv3 on vCenter Single Sign-On port 7444 (2131310)
https://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=2131310&sliceId=1&docTypeID=DT_KB_1_1&dialogID=710756846&stateId=1%200%20710766174
Check the server.xml file for the vCenter / ESX data sources and confirm whether it is restricted to allowing only TLS connections for SSL.
- Windows default location: C:\ProgramData\VMware\vCenterServer\runtime\VMwareSTSService\conf\
- vCenter Server Appliance default location: /usr/lib/vmware-sso/vmware-sts/conf/
Find the following line:
'<Connector SSLEnabled="true"'
Confirm whether specific protocols are enabled on this line with the "sslEnabledProtocols" parameter.
Example:
'sslEnabledProtocols="TLSv1,TLSv1.1,TLSv1.2"'
Not only did VMware change from SSLv3 to the TLS protocol, VMware also changed the default protocols allowed.
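The check of the Connector line can also be scripted. The sketch below uses Python's standard xml.etree.ElementTree module to list the protocols enabled on each SSL-enabled Connector. The XML fragment is a simplified, hypothetical stand-in for a real server.xml (only the SSLEnabled and sslEnabledProtocols attributes and port 7444 come from the text above), and the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def tls_protocols_by_connector(server_xml_text):
    """Return [(port, [protocols] or None)] for each SSL-enabled Connector."""
    root = ET.fromstring(server_xml_text)
    result = []
    for connector in root.iter("Connector"):
        if connector.get("SSLEnabled", "").lower() != "true":
            continue  # not an SSL connector
        raw = connector.get("sslEnabledProtocols")
        # None means the attribute is absent, so the server defaults apply.
        protocols = [p.strip() for p in raw.split(",")] if raw else None
        result.append((connector.get("port"), protocols))
    return result


# Hypothetical, heavily simplified fragment for illustration only.
SAMPLE_SERVER_XML = """\
<Server>
  <Service name="Catalina">
    <Connector port="7444" SSLEnabled="true"
               sslEnabledProtocols="TLSv1,TLSv1.1,TLSv1.2"/>
    <Connector port="7080"/>
  </Service>
</Server>
"""

for port, protocols in tls_protocols_by_connector(SAMPLE_SERVER_XML):
    print(port, protocols)
```

A protocol list that contains only TLS entries suggests that an agent level without TLS support will not be able to complete the SSL handshake with that data source.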
If HTTPS is not allowed, configure the monitoring agent and set "USES_SSL=No".
If HTTPS is required, configure the monitoring agent and set "USES_SSL=Yes".
If the data source version requires use of TLS protocols for SSL communication, the minimum level of VM agent that allows for this protocol is:
IBM Tivoli Monitoring for Virtual Environments: VMware VI 7.2.0.2-TIV-ITM_VMWVI-IF0003
If the environment was working previously and additional ESX servers were then added, the environment may have crossed the threshold at which the Java heap size must be increased.
The VMware VI agent connects to the vCenter server to monitor the VMware virtual infrastructure, and the Java heap size it requires depends on the size of the environment being monitored.
Increasing the Java heap size
https://www.ibm.com/support/knowledgecenter/en/SS9U76_7.2.0.3/com.ibm.tivoli.itmvs.doc/vmware/javaheap.html
...
It is important to verify that connectivity is possible outside of the VM monitoring agent, using the vSphere client or the MOB URL in a browser. The VM agent data provider log only shows that the connection can't be established; it can't determine the root cause of a connection issue.
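Before trying the vSphere client or the MOB URL, a plain TCP check from the agent system can show whether the data source port is reachable at all. This is a minimal Python sketch, not part of the agent. The function name can_connect is illustrative, and port 443 is an assumption based on the standard HTTPS port; use whatever port the data source actually listens on.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (address taken from the sample configuration in this post):
#   can_connect("123.123.123.123", 443)
```

A False result points to a network, firewall, or name resolution problem, which has to be resolved before any SSL or credential checks are meaningful.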
Review the data provider log for errors - kvm_data_provider_<instance>_startup.log
User ID / Password / Certificate issue:
SEVERE: Connection.open: Bad username or password for user: XYZ
WARNING: Connection.open: Caught an exception while cleaning up a failed connection attempt.
SEVERE: DataSource$ConnectionOpener.call: Failed to connect to: 123.123.123.123
INFO: DataSource.start: Unable to connect to the data source :123.123.123.123
INFO: DataSource.start: Cleaned up all the connection resources
SEVERE: DataProvider.addDataSource: Initial attempt to connect to 123.123.123.123 com.ibm.tivoli.monitoring.agent.kvm.vmware.DataSource@8da49733 failed.
SSL protocol issue:
INFO: Connection.open: Using jar local WSDL files.
FINEST: NotificationCenter.postNotification: Received a notification to post: Connection.SSL_NEGOTIATION_FAILED com.ibm.tivoli.monitoring.agent.kvm.vmware.Connection@3a3a3a3a
FINEST: NotificationCenter.postNotification: Received a notification to post: DataProvider.EventReceived com.ibm.tivoli.monitoring.agent.kvm.itm.DataProvider@39013901
SEVERE: Connection.open: SSL handshake failed to: <data source>
SEVERE: DataSource.start: Failed to connect to: <data source>
WARNING: DataProvider.run: Failed to start datasource 123.123.123.123 com.ibm.tivoli.monitoring.agent.kvm.vmware.DataSource@3c893c89
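The two patterns above can be told apart automatically when the log is long. The following Python sketch counts the SEVERE messages quoted in this post by likely cause. The category labels and the function name classify_log are invented for illustration, and the patterns cover only the message texts shown here.

```python
import re

# Ordered from most specific to most general; the first match wins.
RULES = [
    ("credentials", re.compile(r"SEVERE: Connection\.open: Bad username or password")),
    ("ssl_handshake", re.compile(r"SEVERE: Connection\.open: SSL handshake failed")),
    ("connect_failed", re.compile(r"SEVERE: .*Failed to connect to")),
]


def classify_log(lines):
    """Count log lines per category, based on the messages quoted above."""
    counts = {}
    for line in lines:
        for label, pattern in RULES:
            if pattern.search(line):
                counts[label] = counts.get(label, 0) + 1
                break
    return counts


# Lines modeled on the examples above; "vcenter.example" is a placeholder.
SAMPLE_LOG = [
    "SEVERE: Connection.open: Bad username or password for user: XYZ",
    "SEVERE: DataSource$ConnectionOpener.call: Failed to connect to: 123.123.123.123",
    "SEVERE: Connection.open: SSL handshake failed to: vcenter.example",
    "SEVERE: DataSource.start: Failed to connect to: vcenter.example",
    "INFO: DataSource.start: Cleaned up all the connection resources",
]

print(classify_log(SAMPLE_LOG))
```

To run it against a real log, pass the open file object for kvm_data_provider_<instance>_startup.log instead of SAMPLE_LOG.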
Using vSphere client:
When confirming the connection with the "VMware vSphere client" running on the same system as the ITM monitoring agent, do NOT select the "use windows session credentials" check box; it must be unchecked.
In the VMware vSphere client, provide the same vCenterIPAddress, HostName, or FQDN of the vCenter that was specified when configuring the Datasource for the ITM VMware VI monitoring agent.
If the VMware vSphere Client fails the connection, review the type of failure indicated.
Examples showing "network" issues like firewalls blocking communication or a network "timeout":

A failure message that explicitly states "Cannot complete login due to an incorrect user name or password" shows that communication with the vCenter works, but authentication failed because a bad user name or password was specified.
If the connection succeeds to the vCenter through the VMware vSphere Client when running the VMware vSphere Client on the same system as the ITM VMware VI monitoring agent, use the same exact values for the ITM agent's data provider when configuring the Datasource.
If using a numeric IP address for the vCenterIPAddress, use that when configuring ITM VM agent instance.
If using a Hostname value for the vCenterIPAddress, use that when configuring ITM VM agent instance.
If using a FQDN value for the vCenterIPAddress, use that when configuring the ITM VM agent instance.
AND specify the same User name, and same Password values that allowed the connection in VMware vSphere Client.
Using MOB URL:
This is the preferred method of confirming connectivity is possible since the VM monitoring agent relies on the same API calls that the MOB URL utilizes to establish a connection.
The MOB is available on vCenters as well as ESX servers and can be reached by HostName, numeric IP, or FQDN. Whichever form works to access the MOB can also be used to configure the VM monitoring agent data source.
From a browser on the system where the VM agent is installed:
1. Launch a browser on the host where the VM monitoring agent is installed
2. Access the link https://vcenterIP/mob/ , where vcenterIP is the IP address, hostname, or FQDN of the vCenter / ESX host
3. When prompted for an account and password, enter the credentials (username/password) used to configure the VM monitoring agent
4. Confirm whether the provided credentials pass authentication
If using MOB URL to confirm connectivity with USES_SSL=No, confirm that the MOB URL works for "http:".
If using MOB URL to confirm connectivity with USES_SSL=Yes, confirm that the MOB URL works for "https:"
If the MOB URL fails to load and reports a 501 error, this indicates a network issue:
Being prompted with the authentication panel indicates successful network connectivity to the vCenter / ESX server.
Not being prompted indicates a lack of response from the vCenter / ESX server and could indicate firewalls blocking the return communication.
When asked to provide credentials to connect, enter the same User ID and password used when configuring the datasource for VM monitoring agent.
If valid datasource credentials are provided, the MOB URL will present a screen similar to:
If the credentials provided are invalid, the URL for MOB will NOT show you that the entered credentials are invalid.
It will simply not present the page above.
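The MOB check described above can be approximated from a script when no browser is available on the agent system. This is a hedged Python sketch using only the standard library. It assumes the MOB prompts with HTTP Basic authentication (so a 401 status means the server answered but did not accept the credentials), which should be confirmed for the VMware release in use. The function names are illustrative, and certificate validation is switched off, comparable to testing with KVM_SSL_VALIDATE_CERTIFICATES=No.

```python
import base64
import ssl
import urllib.error
import urllib.request


def mob_url(host, uses_ssl=True):
    """Build the MOB URL: https for USES_SSL=Yes, http for USES_SSL=No."""
    scheme = "https" if uses_ssl else "http"
    return f"{scheme}://{host}/mob/"


def interpret_status(status):
    """Translate an HTTP status (or None) into a short diagnosis."""
    if status is None:
        return "no response: check network path and firewalls"
    if status == 401:
        return "reachable: credentials missing or rejected"
    if 200 <= status < 300:
        return "reachable: credentials accepted"
    return f"reachable: unexpected HTTP status {status}"


def probe_mob(host, username, password, uses_ssl=True, timeout=10.0):
    """Request the MOB page and return the HTTP status, or None on failure."""
    request = urllib.request.Request(mob_url(host, uses_ssl))
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    # Assumption: skip certificate validation so that only connectivity
    # and authentication are being tested.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with an error status
    except (urllib.error.URLError, OSError):
        return None  # no usable response (network, DNS, or SSL failure)


# Example call (not executed here):
#   print(interpret_status(probe_mob("123.123.123.123", "asset", "secret")))
```

Note that a None result does not distinguish a blocked port from a failed SSL handshake, so the browser test with explicit protocol settings remains useful.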
When testing connectivity to a datasource using MOB URL, it is important to confirm the SSL protocols that the browser is allowing.
Disabling Browser Support for the SSL 3.0 Protocol
https://www.digicert.com/ssl-support/disabling-browser-support-ssl-v3.htm
For example, using Internet Explorer, to confirm the protocols the browser is allowing:
Internet Options - Advanced tab, scroll through the settings and check what is enabled under the "Security" section:


When testing using the MOB URL to confirm authentication for ITM VM agents lower than 7.2.0.2-TIV-ITM_VMWVI-IF0003, change the browser to ONLY allow SSL protocols:
Make sure that the TLS protocols are unchecked.
Then confirm the MOB URL allows authentication to the VM data source with HTTPS in the browser.
If the MOB URL fails with HTTPS when TLS is disabled, but works when TLS is enabled, the minimum level of VMware VI monitoring agent that supports TLS protocols must be applied to allow connection to the data source - 7.2.0.2-TIV-ITM_VMWVI-IF0003.
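The complementary question, which TLS versions the data source accepts, can be probed with Python's ssl module by pinning one protocol version per attempt. This is only a sketch under several assumptions: port 443 is assumed, the function names are illustrative, current Python and OpenSSL builds cannot offer SSLv3 at all, and some builds refuse TLSv1 or TLSv1.1 locally, so a False result may reflect the client system rather than the data source.

```python
import socket
import ssl

CANDIDATES = {
    "TLSv1": ssl.TLSVersion.TLSv1,
    "TLSv1.1": ssl.TLSVersion.TLSv1_1,
    "TLSv1.2": ssl.TLSVersion.TLSv1_2,
}


def pinned_context(version):
    """Client context that offers exactly one TLS version, no cert checks."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepted_versions(host, port=443, timeout=5.0):
    """Return {version name: True/False} for handshakes that completed."""
    results = {}
    for name, version in CANDIDATES.items():
        try:
            ctx = pinned_context(version)
            with socket.create_connection((host, port), timeout=timeout) as raw:
                with ctx.wrap_socket(raw, server_hostname=host):
                    results[name] = True
        except (ssl.SSLError, OSError, ValueError):
            # Handshake rejected, host unreachable, or version not
            # supported by the local OpenSSL build.
            results[name] = False
    return results


# Example call (not executed here):
#   print(accepted_versions("123.123.123.123"))
```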
If the datasource cannot be reached outside of the ITM VMware VI monitoring agent, work with the network / VMware administrators to resolve connectivity issues external to the Tivoli Monitoring Agent.
Submitter: drd401709
Compid: 5724L92VI
Reference DCF technotes:
DCF 1679257 - Tivoli Monitoring for VMware agents are offline.
DCF 1980856 - ITM Monitoring Agent for VMware VI agent connectivity to data source
DCF 1395250 - Configuration non-SSL communication
DCF 1974267 - Support of VMWare VI agent for TLS 1.2
DCF 1960107 - Configuration recommendations for VMware VI agent
Keywords: kvmagent vCenter Server Appliance KVM_CUST
UID
ibm11277962




