MDM SE Connector Debug Reference
This section describes connector debugging techniques and methods that can help resolve common connector problems.
- Validate user and password
- Often simple connector issues can be attributed to basic account permission errors. Confirm that you are using an account with the appropriate permissions to crawl your repository and that you are using the right password for it.
- Check the documentation
- Be sure that you have correctly configured all installed Watson™ Explorer components
and your connector, and that no missing steps or incorrect configuration settings
are causing a problem with the connector. Tip: Rights functions for user collections are common connector pitfalls.
- Eliminate resource-side errors
- It is a good tactical step to "assume" the issue is with a Watson Explorer Engine connector but, at the same time, to make the administrator of the resource that you are crawling aware of any problems crawling that resource. The administrator may be aware of the issue and have a patch available. It never hurts to check.
- Test multi-threaded versus single-threaded
- To determine whether a connector issue is related to multithreading, set the thread count to 1 and then test a new crawl. If the error still occurs with a single thread, multithreading is not the source of the problem. Setting the thread count to 1 also makes the log easier to read.
- Enable bootstrap logging
- If a connector is not starting at all, enable bootstrap logging to determine where the failure
occurs when the connector is initiated. Bootstrap logging can be enabled in the Watson Explorer Engine
administration tool's seed configuration screen.
To activate bootstrap logging, do the following:
- From the seed configuration page of your site collection, go to . The crawling configuration page displays.
- In the Seeds section, click edit and expand the Advanced - Logging collapsible menu.
- Check the enable connector bootstrap logging box. Additionally, enter Log4j settings in the Connector Logging Configuration text box.
- Click OK.
- Enable connector logging
- If Bootstrap Logging is not available, you can enable a logging
condition. To add a logging condition to the connector seed, do the following:
- In the Watson Explorer Engine
administration tool, select Add A New Condition from the section.
A pop-up window displays with a list of new conditions.
- Scroll down and select connector logging.
Your goal is to capture a stack trace, which can help pinpoint what might be causing your connector problems.
- Enable Log4j logging levels
- Log4j enables you to activate different levels of logging without modifying the application
binary, thus avoiding a heavy performance cost. Logging behavior can be controlled by editing a Log4j
configuration file. Key logging levels that can be applied using the Log4j utility are the following:
OFF level has the highest possible rank and is intended to turn off logging.
FATAL level designates very severe error events that will presumably lead the application to abort.
ERROR level designates error events that might still allow the application to continue running.
WARN level designates potentially harmful situations.
INFO level designates informational messages that highlight the progress of the application at a coarse-grained level.
DEBUG level designates fine-grained informational events that are most useful to debug an application.
TRACE level designates finer-grained informational events than the DEBUG level.
ALL level has the lowest possible rank and is intended to turn on all logging.
For more detailed information about Log4j and its configuration, see the online resources for Log4j.
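The ranking described above can be sketched in a few lines of Python. This is only an illustration of how a configured level filters events, assuming the integer ranks used by Log4j 1.x (org.apache.log4j.Level). The helper names is_logged and visible_levels are hypothetical and are not part of Log4j or Watson Explorer Engine.

```python
# Integer ranks as defined by Log4j 1.x (assumed here; other versions may differ).
LEVEL_RANK = {
    "OFF": 2**31 - 1,   # highest rank: nothing passes
    "FATAL": 50000,
    "ERROR": 40000,
    "WARN": 30000,
    "INFO": 20000,
    "DEBUG": 10000,
    "TRACE": 5000,
    "ALL": -(2**31),    # lowest rank: everything passes
}


def is_logged(configured_level, event_level):
    """Return True if an event at event_level passes a logger set to configured_level."""
    return LEVEL_RANK[event_level.upper()] >= LEVEL_RANK[configured_level.upper()]


def visible_levels(configured_level):
    """List the event levels that would be written at the given configured level."""
    event_levels = ["FATAL", "ERROR", "WARN", "INFO", "DEBUG", "TRACE"]
    return [lvl for lvl in event_levels if is_logged(configured_level, lvl)]
```

For example, visible_levels("WARN") yields only FATAL, ERROR, and WARN events, which is why raising the configured level to DEBUG or TRACE is usually needed before a detailed stack trace appears in the log.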
- Enable Apache HTTP wire logging
- Enabling logging for wire-level activity is useful for Watson Explorer Engine connectors
that use HTTP connections, because the wire log records all data transmitted to and from
your server(s) when executing HTTP requests. The wire log uses the
org.apache.http.wire logging category, which should be enabled only to debug problems. Be aware that wire logging produces a large amount of log data.
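As a rough sketch, the Log4j settings for wire logging can be generated as follows. It assumes Log4j 1.x properties syntax and the Apache HttpClient 4.x category names; the org.apache.http.headers category and the helper wire_logging_settings are not mentioned in this reference and are included only for illustration. Whether your connector accepts these lines unchanged is an assumption to verify.

```python
WIRE_CATEGORY = "org.apache.http.wire"        # full request and response data
HEADERS_CATEGORY = "org.apache.http.headers"  # headers only (assumed available, HttpClient 4.x)


def wire_logging_settings(mode):
    """Build Log4j 1.x property lines for a wire logging mode.

    mode is "full" (headers and bodies), "headers" (headers only), or "off".
    """
    levels = {
        "full": {WIRE_CATEGORY: "DEBUG", HEADERS_CATEGORY: "DEBUG"},
        "headers": {WIRE_CATEGORY: "ERROR", HEADERS_CATEGORY: "DEBUG"},
        "off": {WIRE_CATEGORY: "ERROR", HEADERS_CATEGORY: "ERROR"},
    }
    if mode not in levels:
        raise ValueError("mode must be 'full', 'headers', or 'off'")
    return "\n".join(
        "log4j.logger.{}={}".format(category, level)
        for category, level in levels[mode].items()
    )
```

Calling wire_logging_settings("headers") produces a smaller log than "full", which can be a reasonable first step when the complete wire log is too large to read.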
- Check for missing JAR files
- Be sure that you have all the JAR files needed. If the connector was installed correctly, the necessary JAR files should have been copied to the right location by default.
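One way to check for missing JAR files is to compare the contents of the connector's library directory against a list of expected file names. The following is a minimal sketch. The directory path and the JAR names that you pass in are assumptions that depend on your installation, and missing_jars is a hypothetical helper, not a Watson Explorer Engine tool.

```python
from pathlib import Path


def missing_jars(lib_dir, expected):
    """Return the expected JAR file names that are not present in lib_dir, sorted."""
    present = {path.name for path in Path(lib_dir).glob("*.jar")}
    return sorted(set(expected) - present)
```

An empty result means that every expected file was found. Any names returned are candidates for a reinstall or a manual copy.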
- Open JMX port to profile resources
- Java Management Extensions (JMX) supply tools for managing and monitoring applications, system
objects, devices, and service-oriented networks.
To enable the JMX agent and configure its operation, you must set certain system properties when you start the Java virtual machine (JVM). For detailed instructions, consult help resources for using JMX and other JMX-compliant tools.
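The system properties in question are the standard com.sun.management.jmxremote properties of the JVM. The sketch below only assembles them as command-line flags. The port number is an arbitrary example, jmx_jvm_flags is a hypothetical helper, and how the flags are passed to the connector's JVM depends on your installation and is not covered here.

```python
def jmx_jvm_flags(port, authenticate=True, ssl=True):
    """Build the JVM flags that enable the remote JMX agent on the given port."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return [
        "-Dcom.sun.management.jmxremote",
        "-Dcom.sun.management.jmxremote.port={}".format(port),
        "-Dcom.sun.management.jmxremote.authenticate={}".format(str(authenticate).lower()),
        "-Dcom.sun.management.jmxremote.ssl={}".format(str(ssl).lower()),
    ]
```

Disabling authentication and SSL, as in jmx_jvm_flags(9010, authenticate=False, ssl=False), is convenient in a development environment but leaves the port open to anyone who can reach it, so it is best avoided in production.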
- Packet trace with Wireshark
- If you are familiar with Wireshark and its advanced packet trace capabilities, it can be used instead of, or to augment, any packet tracing capabilities in the connector that you are using. Consult your Wireshark help resources for using the more powerful features of Wireshark tracing.
- Profile resources
- Use common performance testing methods to determine how fast the connector performs under a particular workload. Profiling the resources used under various workloads serves to pinpoint bugs relating to scalability, reliability, and resource usage.
- Replicate in development environment
- Replicate the production environment issue in your development environment and test for the same bug.
- Reproduce without connector
- Another simple test to determine whether the connector is the source of the error is to attempt to
probe the remote resource without it. If you are unable to contact the remote resource without the
connector, there may be a problem with your environment rather than with the connector. Common tools
used to help in this regard include the following:
- Curl is a command line tool for sending and receiving files using URL syntax. Since Curl is used by many Watson Explorer Engine connectors, it is a great tool to help pinpoint the source of problems when crawling associated resource sites.
- Check that your problems are not browser specific. To do so, attempt to display search results in modern browsers such as Firefox, Internet Explorer, Chrome, and Safari. Test in the browser versions that are relevant to your users.
- Ping and Traceroute can be used to send packets of information to the remote data resource for the purpose of retrieving information, which can be useful for testing your network connection. Consult your operating system documentation on how to locate and execute the ping and traceroute utilities that are available in your environment.
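If none of these tools is available on the crawling server, a similar probe can be scripted. The following is a minimal sketch that uses only the Python standard library and assumes the resource is reachable over HTTP with optional basic authentication. The function name probe and any URL or account that you pass to it are illustrative and not part of Watson Explorer Engine.

```python
import base64
import urllib.error
import urllib.request


def probe(url, user=None, password=None, timeout=10.0):
    """Request url without the connector and return (status, reason).

    status is the HTTP status code, or None if no HTTP response was received.
    """
    request = urllib.request.Request(url, method="GET")
    if user is not None:
        # Send basic authentication up front, as a crawler account would.
        raw = "{}:{}".format(user, password or "").encode("utf-8")
        request.add_header("Authorization", "Basic " + base64.b64encode(raw).decode("ascii"))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.reason
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 404.
        return err.code, err.reason
    except urllib.error.URLError as err:
        # No HTTP answer: DNS failure, refused connection, and similar.
        return None, str(err.reason)
    except OSError as err:
        # Lower-level failures such as a read timeout.
        return None, str(err)
```

A status of 401 or 403 points toward the account permissions discussed at the start of this section, while a status of None suggests a network or environment problem rather than a connector problem.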
- Adjust crawler delay
- Set the crawler's Delay value to 1. This increases the rate of requests on your server, which can help
identify potential problems. Note: We do not recommend setting the delay to 0. Doing so can cause excessive resource usage on your crawling server, repository server, or both.
- Validate web services
- Check that all web services are performing correctly and that all the needed web services are activated in the server(s) where the data you are crawling is hosted. You can use a Web test to test Web services. Check online resources for writing specific web tests based on your environment.