Cloud Foundry deployment fails on a specific virtual machine job

Symptoms

When you deploy Cloud Foundry, the deployment fails on a specific virtual machine job and an error similar to the following message is displayed:

Started updating job debian_nfs_server > debian_nfs_server/0 (c9a7c3c2-1d3b-41d2-a14c-f3c4f1ec742d) (canary)..... Done (00:01:13)
 Started updating job consul > consul/0 (7ade5ffb-a450-46e3-b94a-c1f8be310ef7) (canary)..... Done (00:01:08)
 Started updating job nats > nats/0 (76fe6c8e-1d21-4422-a407-8f79a78e03b7) (canary)...... Done (00:01:15)
 Started updating job ccdb_ng > ccdb_ng/0 (d4b559cb-b1e8-4b20-87de-37182e2b30ec) (canary)..... Done (00:01:10)
 Started updating job uaadb > uaadb/0 (e68e8783-049b-4f10-95c4-e922d2eccff6) (canary)..... Done (00:01:11)
 Started updating job router > router/0 (20d6e777-2ffe-42f7-8fbd-8b1675c5fa3c) (canary).... Done (00:00:47)
 Started updating job dea_next > dea_next/0 (75dae58e-de27-4f61-b396-1618fcc1ac04) (canary)..... Done (00:01:05)
 Started updating job cc_core > cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602) (canary)........................................ Failed: 'cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602)' is not running after update. Review logs for failed jobs: cloud_controller_ng, cloud_controller_worker_local_1, cloud_controller_worker_local_2, nginx_cc, cloud_controller_migration, cloud_controller_worker_1, cloud_controller_clock (00:08:48)
Error 400007: 'cc_core/0 (87de372a-ac40-471c-b621-b4a6ac3ca602)' is not running after update. Review logs for failed jobs: cloud_controller_ng, cloud_controller_worker_local_1, cloud_controller_worker_local_2, nginx_cc, cloud_controller_migration, cloud_controller_worker_1, cloud_controller_clock
Task 9 error

In this message, the cc_core/0 instance is not running after the update, and the first failed job in the list is cloud_controller_ng.

Resolving the problem

  1. Log in to the BOSH director with the administrative credentials.
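
    For example, with the BOSH CLI v2 you can log in by using the environment alias that the later steps use (IBMCloudPrivate in this document; your alias might differ):

    bosh -e IBMCloudPrivate log-in
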
  2. Ensure that your BOSH environment and deployment are set. If the BOSH CLI is not installed, install it first.
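
    You can verify both with the BOSH CLI v2; the environment alias and deployment name below match the later steps and might differ in your installation:

    bosh environments
    bosh -e IBMCloudPrivate deployments
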
  3. Determine the instance ID of your cc_core virtual machine. Run the following command:

    bosh -e IBMCloudPrivate -d Bluemix vms
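
    In the output, find the cc_core row, which shows the full instance ID and the process state. The following row is illustrative only; the ID, state, and IP address differ in your environment:

    cc_core/87de372a-ac40-471c-b621-b4a6ac3ca602  failing  10.0.16.19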
    
  4. To determine which process failed, run the following commands:

    bosh -e IBMCloudPrivate -d Bluemix ssh cc_core/<uuid>
    sudo su
    monit summary
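
    If the monit command is not found, use its full path, which on BOSH stemcells is typically the following location:

    /var/vcap/bosh/bin/monit summary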
    

    The output resembles the following text:

    The Monit daemon 5.2.5 uptime: 27m

    Process 'unbound'                          running
    Process 'consul_agent'                     running
    Process 'cloud_controller_ng'              Execution failed
    Process 'cloud_controller_worker_local_1'  initializing
    Process 'cloud_controller_worker_local_2'  initializing
    Process 'nginx_cc'                         initializing
    Process 'cloud_controller_migration'       Does not exist
    Process 'cloud_controller_worker_1'        Does not exist
    File 'nfs_mounter'                         accessible
    Process 'cloud_controller_clock'           Does not exist
    Process 'statsd-injector'                  running
    Process 'route_registrar'                  running
    Process 'loginserver'                      running
    Process 'uaa'                              running
    Process 'mod_vms'                          running
    Process 'metron_agent'                     running
    System 'system_localhost'                  running
    

    In this example, the cloud_controller_ng process failed.

  5. Locate the directory for the failed process:

    cd /var/vcap/sys/log/<failed_process> && ls -l
    

    Where <failed_process> is the name of the process that failed. The output resembles the following listing:

    -rw-r--r--  1 root root 20250 Aug 22 17:30 cloud_controller_ng_ctl.err.log
    -rw-r--r--  1 root root 24159 Aug 22 17:30 cloud_controller_ng_ctl.log
    -rw-r--r--  1 root root 33988 Aug 22 17:30 cloud_controller_worker_ctl.err.log
    -rw-r--r--  1 root root 21678 Aug 22 17:30 cloud_controller_worker_ctl.log
    -rw-r-----  1 root root   506 Aug 22 17:00 pre-start.stderr.log
    -rw-r-----  1 root root     0 Aug 22 17:00 pre-start.stdout.log
    
  6. Review each log in this directory until you find the error.
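
    The *.err.log files typically capture standard error from the process control scripts and are often the fastest place to find the failure reason. To search all logs in the directory at once, you can use a pattern such as the following one (adjust the pattern as needed):

    grep -inE 'error|fail|fatal' *.log | less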

  7. Correct the error.
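
    After you correct the underlying problem, restart the failed process and confirm that it reaches the running state. A typical sequence, run as root on the cc_core virtual machine with cloud_controller_ng as the failed process from this example, resembles the following commands:

    monit restart cloud_controller_ng
    monit summary

    Repeat the monit summary command until the process reports running.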