Other troubleshooting

Use the following information to troubleshoot a variety of service issues, such as proxy server buffering warnings.

Crawlers fail to run, or are 'stuck'

If a crawler fails for any reason, for example during or after a backup, you can remove its management artefact so that the crawler can be run again.

The following example shows an artefact in an ABORTED state; however, a crawler may also appear to be running and yet be stuck. A crawler can be said to be stuck when the number of items visited stops increasing.

{
  "keyIndexName": "crawler::RebroadcastCrawler",
  "_id": "dsG4CyQLQaWHk09TkDSZaA",
  "vertexType": "mgmtArtifact",
  "entityTypes": [
    "GRAPH_CRAWLER"
  ],
  "hasState": "ABORTED"
}
Workaround
  1. Check the topology logs for more information on why the crawler failed.
  2. To re-run the crawler, delete the management artefact using the DELETE /mgmt_artifacts/{id} API. The ID of the ABORTED artefact in this example is dsG4CyQLQaWHk09TkDSZaA.
Remember: The workaround applies to any crawler that is stuck, even if it appears to be running.
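
The delete step can be sketched as a small shell helper. This is a minimal sketch, not the product's own tooling: the ASM_API_URL and ASM_API_AUTH variables, the example base URL, and the use of basic authentication are all assumptions made for illustration. Only the /mgmt_artifacts/{id} path comes from the workaround above.

```shell
#!/bin/sh
# Hypothetical placeholders; neither variable is defined by Agile Service Manager.
# Point ASM_API_URL at the base URL of the API in your deployment.
ASM_API_URL="${ASM_API_URL:-https://asm.example.com/api}"
ASM_API_AUTH="${ASM_API_AUTH:-user:password}"

# Build the URL for one management artefact.
# $1 = artefact ID, for example dsG4CyQLQaWHk09TkDSZaA
artifact_url() {
    printf '%s/mgmt_artifacts/%s\n' "${ASM_API_URL%/}" "$1"
}

# Delete the artefact so that the crawler can be run again.
# Set DRY_RUN=1 to print the request instead of sending it.
delete_artifact() {
    url=$(artifact_url "$1")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "curl -X DELETE $url"
    else
        curl --fail --silent --show-error -X DELETE -u "$ASM_API_AUTH" "$url"
    fi
}
```

For example, DRY_RUN=1 delete_artifact dsG4CyQLQaWHk09TkDSZaA prints the request for the artefact shown earlier without sending it.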

Conflicting Docker network

A conflicting Docker network can cause issues with the internal Agile Service Manager communication.

Diagnose:
  1. Check current networks. The following command lists the networks currently in use:
    docker network ls
  2. Confirm the subnet of the asm_default network:
    docker network inspect asm_default | grep -i subnet
  3. Check that no other network uses the same subnet:
    docker network inspect <NETWORK_NAME> | grep -i subnet
Solution:
Warning: Before you remove any clashing networks or interfaces, ensure that they are not required.
  • If the clashing network is not required, remove it:
    docker network rm <NETWORK_NAME>
  • If problems still occur, check the ifconfig output for clashing IP addresses, and take down any unwanted network interfaces:
    ifconfig <network interface> down
  • If removing the Docker network or the clashing network interface isn't an option, you can configure the Agile Service Manager network to use a different subnet. Edit the <ASM_HOME>/docker-compose.yml file. Under the network section at the top of the file, modify the subnet value, then save the changes. Restart Agile Service Manager to use the new subnet.
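
The diagnosis steps can be combined into a short script. The following is a sketch under stated assumptions: the function names are hypothetical, and the check only reports networks that declare exactly the same subnet string, so it does not detect ranges that overlap without being identical. It relies on the --format option of docker network ls and docker network inspect.

```shell
#!/bin/sh
# Print one "<network> <subnet>" line per subnet of every Docker network.
list_network_subnets() {
    docker network ls --format '{{.Name}}' | while read -r name; do
        for subnet in $(docker network inspect \
                --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}' "$name"); do
            echo "$name $subnet"
        done
    done
}

# Read "<network> <subnet>" lines on stdin and print each subnet that is
# declared by more than one network, followed by the networks that use it.
find_subnet_clashes() {
    awk '{
        count[$2]++
        names[$2] = (names[$2] == "" ? $1 : names[$2] "," $1)
    }
    END {
        for (s in count) if (count[s] > 1) print s, names[s]
    }' | sort
}
```

Run list_network_subnets | find_subnet_clashes on the Docker host. No output means that no two networks declare the same subnet.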

Netcool/OMNIbus integration event dataset error

If scope-based grouping is not enabled in Web GUI, an event dataset error is displayed when using the Example_IBM_CloudAnalytics view in the Event Viewer.

Workaround: To resolve this issue, enable scope-based grouping. For more information, see the Installing scope-based event grouping topic in the Tivoli Netcool/OMNIbus documentation.

Scripts return HTTP error status 502 (bad gateway)

While running scripts in $ASM_HOME/bin, an HTTP error status 502 (bad gateway) is returned.

One possible cause is that the Nginx proxy is running, but the service is not running or has not yet started. You can verify that the service is running by using the $ASM_HOME/bin/asm_status.sh script, and check the log files under $ASM_HOME/logs/ for repeated startup errors.

Another possible cause in some host configurations can be a localhost resolution issue.

Workaround: Configure the scripts to use an explicit hostname by setting the SERVICE_HOST environment variable before running them, for example:
export SERVICE_HOST='hostname'

Scripts return HTTP error status 000 (no response)

While running scripts in $ASM_HOME/bin, an HTTP error status 000 (no response) is returned.

A possible cause is that the Nginx proxy or the Agile Service Manager processes are not running. You can verify that the service is running by using the $ASM_HOME/bin/asm_status.sh script.

Workaround: Run the Agile Service Manager start command ($ASM_HOME/asm_start.sh). If the error recurs, check the logs in $ASM_HOME/logs/nginx/ for more information.
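
As a quick reference, the two status codes and their likely causes can be captured in a small lookup, sketched below. The function names are hypothetical, and the URL passed to probe_status is whatever endpoint you want to test; none is assumed here. The probe relies on curl printing 000 for %{http_code} when it receives no HTTP response.

```shell
#!/bin/sh
# Print the HTTP status code returned by a URL, or 000 if there is no response.
# $1 = URL to probe (supply your own; no endpoint is assumed here)
probe_status() {
    curl --silent --output /dev/null --write-out '%{http_code}' "$1"
}

# Map a status code to the likely causes described in the two sections above.
explain_status() {
    case "$1" in
        502) echo "502: proxy is running, but the service is down or still starting, or localhost does not resolve" ;;
        000) echo "000: no response, the proxy or the Agile Service Manager processes are not running" ;;
        *)   echo "$1: not covered by this section" ;;
    esac
}
```

For example, explain_status "$(probe_status "$URL")" prints the matching hint for the endpoint stored in a URL variable of your choosing.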

Certificate failure: 'sslv3 alert handshake failure'

When a certificate is not returned, certain services may encounter connectivity problems, which are recorded in the logs as an 'sslv3 alert handshake failure' error.

You may also see a system warning, such as the following AWS Observer example:
    Warning: The AWS Observer cannot be reached. The observer might be stopped or might have been uninstalled. In this state it is not possible to view or submit jobs.
Workaround
  1. Restart your containers.
  2. If the error persists, restart Agile Service Manager.

Proxy service buffering warning

If large information payloads are sent to the Nginx proxy server service, the error log may record the following warning: [warn]...a client request body is buffered to a temporary file...

Such warnings indicate that Nginx is temporarily writing the payload to disk instead of holding it in memory. Although this has little effect on the performance of Agile Service Manager, the messages can flood the log file and make other debugging tasks more difficult.

Workaround
  1. To raise the size up to which Nginx keeps a request body in memory rather than on disk, open the $ASM_HOME/etc/nginx/conf.d/general.conf configuration file in a text editor and increase the value of the client_body_buffer_size parameter as required.
  2. Restart the proxy service using the following command:
    $ASM_HOME/bin/docker-compose restart proxy
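
To judge whether the change is needed, and to confirm it afterwards, you can count the warnings and display the current setting. The following is a minimal sketch: the function names are hypothetical, and the default log file name ($ASM_HOME/logs/nginx/error.log) is an assumption, so pass the actual path if yours differs.

```shell
#!/bin/sh
# Count the buffering warnings in an Nginx error log.
# $1 = path to the log file (the default file name is an assumption)
count_buffer_warnings() {
    grep -c 'a client request body is buffered to a temporary file' \
        "${1:-$ASM_HOME/logs/nginx/error.log}"
}

# Show the line number and current value of client_body_buffer_size.
# $1 = path to the proxy configuration file
show_buffer_size() {
    grep -n 'client_body_buffer_size' \
        "${1:-$ASM_HOME/etc/nginx/conf.d/general.conf}"
}
```

If the count keeps growing after the proxy restart, the new buffer size is still smaller than the payloads being sent.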

Logrotate error

Logrotate creates an error.log file, but Nginx running in Docker lacks write access to the file because of the file's ownership.

Workaround
Because Docker runs as UID 1000 and logrotate runs as a different user, add the su netcoolasm noiadmin directive to the configuration file so that Docker has write access to the new log file that logrotate generates.
  1. Create a logrotate.d configuration file at /etc/logrotate.d/asm_proxy_1:
    /opt/ibm/netcool/asm/logs/nginx/*.log {
        su netcoolasm noiadmin
        daily
        rotate 10
        missingok
        notifempty
        sharedscripts
        postrotate
            ps -ef | grep "nginx: master" | grep -v grep | awk '{print $2}' | xargs kill -USR1
        endscript
    }
  2. At the beginning of the /etc/logrotate.conf file, add include /etc/logrotate.d.
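
Before waiting for the next daily rotation, you can check the configuration and the resulting file ownership. This sketch assumes GNU stat (as found on Linux) and uses hypothetical function names. The -d option runs logrotate in debug mode, which reports what it would do without changing any files.

```shell
#!/bin/sh
# Print "user:group" for a file (GNU stat syntax).
# $1 = path to the log file
log_owner() {
    stat -c '%U:%G' "$1"
}

# Dry-run the logrotate configuration without rotating or modifying any file.
# $1 = path to the configuration file (defaults to the file created in step 1)
check_logrotate_config() {
    logrotate -d "${1:-/etc/logrotate.d/asm_proxy_1}"
}
```

After a rotation, log_owner /opt/ibm/netcool/asm/logs/nginx/error.log should report netcoolasm:noiadmin if the su directive took effect.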