Cloud Foundry deployment fails on a specific virtual machine job
Symptoms
When you deploy Cloud Foundry, the deployment fails on one virtual machine job and an error similar to the following message is displayed:
Started updating job debian_nfs_server > debian_nfs_server/0 (c9a7c3c2-1d3b-41d2-a14c-f3c4f1ec742d) (canary)..... Done (00:01:13)
Started updating job consul > consul/0 (7ade5ffb-a450-46e3-b94a-c1f8be310ef7) (canary)..... Done (00:01:08)
Started updating job nats > nats/0 (76fe6c8e-1d21-4422-a407-8f79a78e03b7) (canary)...... Done (00:01:15)
Started updating job ccdb_ng > ccdb_ng/0 (d4b559cb-b1e8-4b20-87de-37182e2b30ec) (canary)..... Done (00:01:10)
Started updating job uaadb > uaadb/0 (e68e8783-049b-4f10-95c4-e922d2eccff6) (canary)..... Done (00:01:11)
Started updating job router > router/0 (20d6e777-2ffe-42f7-8fbd-8b1675c5fa3c) (canary).... Done (00:00:47)
Started updating job dea_next > dea_next/0 (75dae58e-de27-4f61-b396-1618fcc1ac04) (canary)..... Done (00:01:05)
Started updating job cc_core > cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602) (canary)........................................ Failed: 'cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602)' is not running after update. Review logs for failed jobs: cloud_controller_ng, cloud_controller_worker_local_1, cloud_controller_worker_local_2, nginx_cc, cloud_controller_migration, cloud_controller_worker_1, cloud_controller_clock (00:08:48)
Error 400007: 'cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602)' is not running after update. Review logs for failed jobs: cloud_controller_ng, cloud_controller_worker_local_1, cloud_controller_worker_local_2, nginx_cc, cloud_controller_migration, cloud_controller_worker_1, cloud_controller_clock
Task 9 error
In this message, the virtual machine job that failed to update is cc_core, and the first failed process that it lists is cloud_controller_ng.
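To see the full failure context before you log in to the virtual machine, you can stream the debug log for the failed director task (task 9 in the sample output). This is a sketch that assumes the IBMCloudPrivate environment alias used in the resolution steps:
bosh -e IBMCloudPrivate task 9 --debug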
Resolving the problem
- Log in to the BOSH director with the administrative credentials.
- Ensure that your BOSH deployment is set. If the BOSH CLI is not installed, install it and set your environment and deployment first; a minimal sketch of these first two steps follows.
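The following sketch uses the BOSH v2 CLI. The <director-address> and <ca-cert-path> placeholders are assumptions; the IBMCloudPrivate alias and the Bluemix deployment name come from the commands later in this article:
bosh alias-env IBMCloudPrivate -e <director-address> --ca-cert <ca-cert-path>
bosh -e IBMCloudPrivate log-in
bosh -e IBMCloudPrivate deployments
The deployments output should list the Bluemix deployment before you continue.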
- Determine the ID of your cc_core virtual machine. Run the following command:
bosh -e IBMCloudPrivate -d Bluemix vms
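If the vms output is long, you can filter it for the cc_core row. This is a convenience sketch that assumes a Unix-like workstation with grep:
bosh -e IBMCloudPrivate -d Bluemix vms | grep cc_core
The UUID in the resulting cc_core/<uuid> instance name is the ID that you use in the next step.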
- To determine which service failed, run the following commands:
bosh -e IBMCloudPrivate -d Bluemix ssh cc_core/<uuid>
sudo su
monit summary
The output resembles the following text:
The Monit daemon 5.2.5 uptime: 27m
Process 'unbound'                          running
Process 'consul_agent'                     running
Process 'cloud_controller_ng'              Execution failed
Process 'cloud_controller_worker_local_1'  initializing
Process 'cloud_controller_worker_local_2'  initializing
Process 'nginx_cc'                         initializing
Process 'cloud_controller_migration'       Does not exist
Process 'cloud_controller_worker_1'        Does not exist
File 'nfs_mounter'                         accessible
Process 'cloud_controller_clock'           Does not exist
Process 'statsd-injector'                  running
Process 'route_registrar'                  running
Process 'loginserver'                      running
Process 'uaa'                              running
Process 'mod_vms'                          running
Process 'metron_agent'                     running
System 'system_localhost'                  running
In this example, the cloud_controller_ng process failed.
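If the summary alone does not explain the failure, monit's own log can show the failed start attempts. This is a sketch that assumes the usual BOSH location for the monit log; run it inside the same ssh session as root:
tail -n 50 /var/vcap/monit/monit.log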
- Locate the directory for the failed process:
cd /var/vcap/sys/log/<failed_process> ; ls -l
Where <failed_process> is the name of the process that failed. The output resembles the following code:
-rw-r--r-- 1 root root 20250 Aug 22 17:30 cloud_controller_ng_ctl.err.log
-rw-r--r-- 1 root root 24159 Aug 22 17:30 cloud_controller_ng_ctl.log
-rw-r--r-- 1 root root 33988 Aug 22 17:30 cloud_controller_worker_ctl.err.log
-rw-r--r-- 1 root root 21678 Aug 22 17:30 cloud_controller_worker_ctl.log
-rw-r----- 1 root root   506 Aug 22 17:00 pre-start.stderr.log
-rw-r----- 1 root root     0 Aug 22 17:00 pre-start.stdout.log
- Review each log in this directory until you find the error; a quick scan is sketched below.
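As a starting point, you can scan all logs for the failed process for recent error lines. This sketch assumes that cloud_controller_ng is the failed process; substitute your own process name:
grep -iE 'error|fatal|fail' /var/vcap/sys/log/cloud_controller_ng/*.log | tail -n 20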
- Correct the error, and then verify that the failed process recovers, as shown below.
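After you correct the error, you can ask monit to restart the failed process and confirm that it recovers before you rerun the deployment. This sketch again assumes cloud_controller_ng and the same root ssh session:
monit restart cloud_controller_ng
monit summary
Repeat monit summary until the process reports running, then exit the virtual machine and rerun the failed bosh deploy command.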