Troubleshooting deployment scenarios

Control Center Director provides tools, such as log files, for troubleshooting and recovering from errors. When you encounter an issue with your deployment, check the log files first.

The following table describes:

The relationship between common deployment scenarios and their associated log files
FAQs related to common deployment scenarios

Table 1. FAQs and Log locations
Scenario/Question	Details
Gathering logging events	Use `runDataCollector` to extract and report the current state of your Control Center Director deployment to help IBM Support troubleshoot an issue. It collects logs, configurations, and system metrics, backs up the database without interrupting its operations, and stores the information in a .zip archive file. You can then send the archive file to IBM Support to help diagnose and fix problems.
Deployment-related log files	The log file is available at: ControlCenterDirector\log. Look for a file name with the format: Engine_YYYYMMDD_TIMESTAMP
Control Center Director Engine (Heartbeat)-related log files	The log file is available at: ControlCenterDirector\log. Look for `[NODENAME]INFO ServiceMonitor` for further details. Alternatively, check whether the `CC_SERVER_COMPONENT` table has been updated with the ServerID details and the status UP.
License Data-related log files	Look for the `LicenseDataCollector` string in the log file for more details.
Customizing logging levels to suit logging requirements	Set the log level to DEBUG to start recording logs, then run the `stopEngine/runEngine` utility for the changes to take effect. File paths: Control Center Director Engine logs: /Conf/enginelogger.xml; Connect:Direct Agent logs: CDInstallationDirectory/install/agent/bin/log4j2.xml
The Agent state is down, even though the initial registration occurred through the Agent.	The Agent polling interval is 5 minutes. Invoke `pollAgent.sh` to validate whether the Agent is running on the Connect:Direct Server. To troubleshoot, look for the `agent.log` file at the following log location: `\cdinstall_dir\install\logs`
How does the Agent communicate with Control Center Director?	At startup, the Agent posts an OSA to communicate that it is up. This helps Control Center Director with server auto-discovery. During the upgrade process, the Agent sends an OSA to notify Control Center Director of the upgrade status.
How does the Agent distinguish between the different upgrades it performs?	The Agent receives a unique key, the correlation-id, with each upgrade. It uses this key to distinguish between upgrades and prefixes it to the log files generated for each upgrade.
Auto Discovery-related failure	Inspect `agent.log` to diagnose and isolate the issue. One possible reason why Control Center Director could not discover the Connect:Direct Server in your deployment is that it was not configured correctly. A second possible reason is an update to the `osa.url` field, which ensures that Connect:Direct is auto-discovered by Control Center Director. The Agent provides a file polling mechanism that runs at a preconfigured polling interval (`cpiPollTime`) and detects any changes to the `osa.url` field. If the `osa.url` field is modified during the polling interval, the changes take effect only at the end of the scheduled run. A third possible reason is the presence of multiple Connect:Direct instances: you are likely to run into port conflicts unless you allocate a unique Agent listening port to each instance. After upgrading an instance, apply its unique port number before upgrading the next instance. This prevents errors caused by port conflicts during the upgrade process. A fourth possible reason is an incorrect certificate-based configuration, that is, either the Connect:Direct certificate is not trusted by Control Center Director or vice versa.
Does stopping a bulk server upgrade operation affect the upgrade of all servers included in the operation?	Control Center Director handles a bulk server upgrade process in batches. Batch upgrade groups servers together so that Control Center Director can execute upgrade operations in parallel. If an error occurs during the upgrade process, Control Center Director continues to process the remaining upgrade operations in the batch. The batch size is set to 25 servers by default and can be modified by the Administrator. Edit the `batch-size` property in `DeploymentService.xml`, available at conf/services/system, then run the `stopEngine/runEngine` utility for the changes to take effect.
Some servers under the Failed category in the Job details view show up as Suspended by System.	This occurs when a server was not upgraded because of multiple failures during a bulk upgrade process. When an error occurs during a bulk upgrade process, Control Center Director continues to process the remaining upgrade operations as long as the failure percentage threshold is not exceeded. For example, for a batch size of 30 servers, if 15 jobs fail (50%), the server status is displayed as `Suspended by System`. The failure percentage is set to 40% by default and can be modified by the Administrator. Edit the `failurePercentage` property in `DeploymentService.xml`, available at conf/services/system, then run the `stopEngine/runEngine` utility for the changes to take effect.
A scheduled server upgrade Job returns the following error: `Job marked failed as OSA from Agent not received.`	Control Center Director polls the Agent and waits up to 3600 seconds (default) for a response to verify that the upgrade request is being processed. When this threshold is exceeded, the Job is marked as failed with the error: `Job marked failed as OSA from Agent not received`. To modify the interval, edit the `lastOSARecivedTimeDifferenceInSecToMarkJobFailed` property in `DeploymentService.xml`, available at conf/services/system, then run the `stopEngine/runEngine` utility for the changes to take effect.
Is the interval at which the Control Center Director Engine polls the Agent configurable?	The Control Center Director Engine is set to make 3 attempts, one every 60 seconds, to poll the Agent and verify Agent activity. To modify the polling interval and the number of attempts, edit the following parameters in `DeploymentService.xml`, available at conf/services/system: `AgentRestCallMaxNumberOfTime` and `timeBeforeNextRestCallInSec`. Then run the `stopEngine/runEngine` utility for the changes to take effect. Note: Configuring the Agent polling interval applies only to servers that are configured to be auto-discovered by Control Center Director.
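
The batch and failure-percentage figures in the table can be checked with a little arithmetic. The following Python sketch is illustrative only and is not the product's implementation: the function names are hypothetical, and the strict "greater than" comparison against the threshold is an assumption, because the table does not say whether a failure rate exactly equal to the threshold triggers `Suspended by System`. The defaults (25 servers per batch, 40% failure percentage) are taken from the table.

```python
from typing import List, Sequence

DEFAULT_BATCH_SIZE = 25            # default `batch-size` stated in the table
DEFAULT_FAILURE_PERCENTAGE = 40.0  # default `failurePercentage` stated in the table


def split_into_batches(servers: Sequence[str],
                       batch_size: int = DEFAULT_BATCH_SIZE) -> List[List[str]]:
    """Group servers into consecutive batches of at most batch_size (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(servers[i:i + batch_size])
            for i in range(0, len(servers), batch_size)]


def failure_percentage(failed_jobs: int, batch_size: int) -> float:
    """Return the share of failed jobs in a batch, as a percentage."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return 100.0 * failed_jobs / batch_size


def exceeds_threshold(failed_jobs: int, batch_size: int,
                      threshold: float = DEFAULT_FAILURE_PERCENTAGE) -> bool:
    """Assumption: the threshold counts as exceeded only when the rate is strictly greater."""
    return failure_percentage(failed_jobs, batch_size) > threshold


# 60 servers with the default batch size give batches of 25, 25, and 10.
servers = [f"server-{n}" for n in range(1, 61)]
print([len(batch) for batch in split_into_batches(servers)])  # [25, 25, 10]

# The example from the table: 15 failures in a batch of 30 is 50%, above the 40% default.
print(failure_percentage(15, 30))  # 50.0
print(exceeds_threshold(15, 30))   # True
```

With the default threshold, 10 failures in a batch of 30 (about 33%) would stay below the limit under this assumption.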
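
Several rows in the table suggest searching the Engine log files for a marker string such as `ServiceMonitor` or `LicenseDataCollector`. A minimal Python sketch of that search follows. It assumes that the log files are plain text, that their names start with `Engine_` as described in the table, and that the log directory path shown in the usage lines matches your installation. The helper name `find_log_lines` is hypothetical and is not part of Control Center Director.

```python
from pathlib import Path
from typing import List, Tuple


def find_log_lines(log_dir: str, needle: str,
                   pattern: str = "Engine_*") -> List[Tuple[str, int, str]]:
    """Return (file name, line number, line text) for every line containing needle."""
    directory = Path(log_dir)
    matches: List[Tuple[str, int, str]] = []
    if not directory.is_dir():
        return matches  # nothing to search if the directory does not exist
    for log_file in sorted(directory.glob(pattern)):
        if not log_file.is_file():
            continue
        # errors="replace" keeps the scan going if a file has undecodable bytes
        with log_file.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    matches.append((log_file.name, line_number, line.rstrip("\n")))
    return matches


# Usage: the directory below is an assumed location; adjust it to your installation.
for name, number, text in find_log_lines("ControlCenterDirector/log", "LicenseDataCollector"):
    print(f"{name}:{number}: {text}")
```

The same helper can be pointed at the heartbeat marker by passing `"ServiceMonitor"` as the search string.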