Recovering the Content Runtime
Learn how to recover your Managed services environment after losing the Content Runtime instance.
If the Content Runtime virtual machine is lost with no backup, there is a path to create a new Content Runtime and register any deployed instances to that Chef server again.
It is highly recommended that the Content Runtime virtual machine be backed up so that if it were ever destroyed, it could be restored without needing the following procedure.
Requirements
Get the following information:
- The Content Runtime host name
- The Content Runtime IP address
- The private and public keys from the original Content Runtime
- The network routing information from the original Content Runtime
- The deploy logs from any instances that had been deployed to the Content Runtime
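One way to keep the host name, IP address, and routing information on hand is to record them while the original Content Runtime is still healthy. The following is a minimal sketch and not part of the product: the function name record_runtime_facts and the output file are hypothetical, and it assumes a Linux host where the ip utility from iproute2 may or may not be installed. Note that uname -n may print a short host name rather than a fully qualified one.

```shell
#!/bin/sh
# Hypothetical helper: record host name, IPv4 addresses, and routes to a file.
record_runtime_facts() {
    out_file="$1"
    {
        # uname -n prints the node (host) name on any POSIX system.
        printf 'hostname=%s\n' "$(uname -n)"
        if command -v ip >/dev/null 2>&1; then
            printf '%s\n' '--- IPv4 addresses ---'
            ip -4 addr show
            printf '%s\n' '--- routes ---'
            ip route show
        else
            printf '%s\n' 'ip command not found; record addresses and routes manually'
        fi
    } > "$out_file"
}

# Example call (the file name is an arbitrary choice):
#   record_runtime_facts ./content-runtime-facts.txt
```

Store the resulting file somewhere other than the Content Runtime virtual machine itself, together with the keys and deploy logs.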
AWS and IBM Cloud
To perform a recovery on either of these cloud providers, a portable IP address must have been associated with the Content Runtime virtual machine when it was first deployed. If that is the case, the following additional information is required when you create the new Content Runtime:
AWS
- Private IP address
IBM Cloud
- Private subnet
- Private VLAN name
- Portable IP address
- Portable private netmask
- Portable private gateway
You must also back up the Pattern Manager's configuration file. This file is base64 encoded and can be found at /opt/ibm/docker/pattern-manager/config/config.json.
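Because the file is base64 encoded, a truncated or corrupted copy is easy to miss. The following sketch shows one possible sanity check for the backup. The function name check_config_backup is hypothetical, and the check assumes GNU coreutils base64, which exits with a nonzero status on invalid input; other implementations might be more lenient.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the backup exists, is non-empty,
# and decodes cleanly as base64.
check_config_backup() {
    backup_file="$1"
    if [ ! -s "$backup_file" ]; then
        echo "missing or empty: $backup_file" >&2
        return 1
    fi
    if ! base64 -d "$backup_file" >/dev/null 2>&1; then
        echo "not valid base64: $backup_file" >&2
        return 1
    fi
    echo "ok: $backup_file"
}

# Example call against a local copy of the backup:
#   check_config_backup ./config.json
```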
Recovery actions
The following steps are required to recover your Content Runtime and register your existing deployed instances to the new Chef server on the Content Runtime:
- Deploy new Content Runtime
The first step in recovering your Chef server is to deploy a new Content Runtime instance. The new Content Runtime virtual machine must have the same host name and IP address as the old Content Runtime.
Save the value used for the pattern manager access passphrase field. It is needed for making requests against the pattern manager.
For more information on provisioning your Content Runtime, see Provisioning and managing your Content Runtime infrastructure.
- Populate software repositories
Repopulate your software repository as described in:
- Optional: Restore your previous Chef server configuration
If a previous Chef server configuration was backed up from the previous Content Runtime virtual machine, it can be restored at this point by using the chef-server-ctl restore command. For more information, see Backup and Restore Chef.
- AWS and IBM only: Restore the Pattern Manager configuration file
Before you copy the Pattern Manager's configuration back to the newly created Content Runtime virtual machine, you must make some changes to it:
- Open a terminal window in the directory that contains the backed-up config.json file.
- Run the following command, replacing <previous_content_runtime_public_IP> with the previous Content Runtime's public IP address and <portable_or_private_IP> with the portable IP address (IBM) or the private IP address (AWS):
base64 -d ./config.json | sed -e 's/<previous_content_runtime_public_IP>/<portable_or_private_IP>/g' | base64 > config.json.new
This command decodes the config.json file, replaces the previous VM's IP address with the portable or private IP address (which matches the newly created Content Runtime), and encodes the result into a file named config.json.new.
- Establish an SSH connection with the new Content Runtime virtual machine.
- Replace the contents of the current Pattern Manager configuration file, located at /opt/ibm/docker/pattern-manager/config/config.json, with the contents of the generated config.json.new file.
- Restart the Pattern Manager docker image by navigating to the advanced-content-runtime folder and running podman-compose stop followed by podman-compose start.
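The decode, substitute, and re-encode pipeline from the second substep can be wrapped in a small function. This sketch is an illustration, not the documented procedure: the function name rewrite_config_ip is hypothetical, and the sample addresses come from the documentation-only ranges 203.0.113.0/24 and 192.0.2.0/24. It also escapes the dots in the old address, because an unescaped dot in a sed pattern matches any character.

```shell
#!/bin/sh
# Hypothetical helper: decode a base64 config, replace old_ip with new_ip,
# and write the re-encoded result to out_file.
rewrite_config_ip() {
    old_ip="$1"
    new_ip="$2"
    in_file="$3"
    out_file="$4"
    # Escape dots so that 203.0.113.10 does not also match 203x0y113z10.
    old_pattern="$(printf '%s' "$old_ip" | sed 's/\./\\./g')"
    base64 -d "$in_file" | sed -e "s/$old_pattern/$new_ip/g" | base64 > "$out_file"
}

# Example call with placeholder addresses:
#   rewrite_config_ip 203.0.113.10 192.0.2.20 ./config.json ./config.json.new
# Inspect the result before copying it to the new virtual machine:
#   base64 -d ./config.json.new
```

Decoding config.json.new and checking it before you replace the live configuration file helps catch a mistyped address while it is still easy to fix.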
- Re-register your virtual machines
This step requires the Pattern Manager access passphrase that was specified when the Content Runtime was deployed.
For each deployed instance:
- Log in to the Managed services web interface.
- Click Menu and click Deployed Instances > Terraform templates.
- Find your deployed instance in the list and click it.
- Click the Log File tab to bring up the log.
There are three REST requests made to the pattern manager in this log that you must reissue to re-register your virtual machine. The REST requests can be made from any environment that has network access to the Content Runtime. Some values have been removed from the log file and must be re-entered. The following examples use the curl command to make these requests, but they can also be made with any standard HTTP request tool.
- Create vault entry
The first request creates an empty vault entry for the stack ID of the previous deploy request. Search in the log file for a URL ending in /v1/vault_item/chef.
Figure. Vault entry
Copy the camc_endpoint and data properties and form a curl request.
Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).
curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"vault_content":{"item":"secrets","values":{},"vault":"<vault-id>"}}' https://<camc_endpoint_ip>:5443/v1/vault_item/chef
Note: If there is no VaultItem in the deployment's logs, the vault id and camc_endpoint_ip can be obtained from the bootstrap section in the logs, as:
camc_bootstrap.LAMPNode01_chef_bootstrap_comp: Creating...
access_token: "" => "******"
camc_endpoint: "" => "https://<camc_endpoint_ip>:5443/v1/bootstrap/chef"
data: "" => "{\"environment_name\":\"_default\",\"host_ip\":\"host_ip_addr\",\"node_attributes\":{\"ibm_internal\":{\"stack_id\":\"stack_id\",\"stack_name\":\"\",\"vault\":{\"item\":\"secrets\",\"name\":\"<vault-id>\"}}},\"node_name\":\"node_name\",\"os_admin_user\":\"root\",\"stack_id\":\"stack\"}"
The request should return a 200 response code, and the vault_item is added to the pattern manager.
- Bootstrap virtual machine
The next step is to bootstrap the virtual machine. A node on the Chef server for the virtual machine is created. Search in the log file for a URL ending in /v1/bootstrap/chef.
Figure. Bootstrap virtual machine
Copy the camc_endpoint and data properties and form a curl request.
Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).
curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"environment_name":"_default","host_ip":"9.5.38.18","node_attributes":{"ibm_internal":{"stack_id":"77757fd10ef26711542b34f6ae5c368e","stack_name":"nschambu-mysql-01","vault":{"item":"secrets","name":"77757fd10ef26711542b34f6ae5c368e"}}},"node_name":"wild-nschambu-vm01","os_admin_user":"root","stack_id":"77757fd10ef26711542b34f6ae5c368e"}' https://9.5.38.16:5443/v1/bootstrap/chef
This request should return a 200 response code and a message confirming that the bootstrap request was successful.
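In the deployment log, the camc_endpoint and data values appear in the form key: "" => "value", with the quotes inside the value escaped as \". Copying them by hand is error prone, so the following sketch shows one possible way to recover the plain value. The function name log_value is hypothetical, and the sketch assumes that each property sits on its own log line; escapes other than \" are left untouched.

```shell
#!/bin/sh
# Hypothetical helper: read one log line of the form
#   key: "" => "value"
# from standard input and print value with \" turned back into ".
log_value() {
    sed -e 's/^[[:space:]]*[A-Za-z_]*:[[:space:]]*"" => "//' \
        -e 's/"[[:space:]]*$//' \
        -e 's/\\"/"/g'
}

# Example: extract the JSON payload from a saved copy of the log
# (deploy.log is a placeholder file name):
#   grep 'data: "" =>' deploy.log | head -n 1 | log_value
```

The printed JSON can then be passed to curl with the -d option, as in the requests above, after you re-enter any values that were removed from the log.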
- Assign software
The final step reassigns software to the endpoint. Search in the log file for a URL ending in /v1/software_deployment/chef.
Figure. Software deployment
Copy the camc_endpoint and data properties and form a curl request, adding a new property called recovery_request to the JSON data.
Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).
curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"environment_name":"_default","host_ip":"9.5.38.18","node_attributes":{"ibm":{"sw_repo":"https://9.5.38.16:9999","sw_repo_user":"repouser"},"ibm_internal":{"roles":"[oracle_mysql_base]"},"mysql":{"config":{"data_dir":"/var/lib/mysql","databases":{"database_1":{"database_name":"default_database","users":{"user_1":{"name":"defaultUser"},"user_2":{"name":"defaultUser2"}}}},"log_file":"/var/log/mysqld.log","port":"3306"},"install_from_repo":"true","os_users":{"daemon":{"gid":"mysql","home":"/home/mysql","ldap_user":"false","name":"mysql","shell":"/bin/bash"}},"version":"5.7.17"}},"node_name":"wild-nschambu-vm01","os_admin_user":"root","runlist":"role[oracle_mysql_base]","stack_id":"77757fd10ef26711542b34f6ae5c368e","vault_content":{"item":"secrets","values":{"ibm":{"sw_repo_password":"r3P0?@5s"},"mysql":{"config":{"databases":{"database_1":{"users":{"user_1":{"password":"aew2#4trNB"},"user_2":{"password":"dwr3%%nGKMd"}}}}},"root_password":"fqt@9dbeft"}},"vault":"77757fd10ef26711542b34f6ae5c368e"}, "recovery_request" : "True"}' https://9.5.38.16:5443/v1/software_deployment/chef
This request should return a 200 response code and a message confirming that your Chef runlist is updated to the previous cookbooks and roles.
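Adding the recovery_request property by hand to a long JSON payload is easy to get wrong. The following sketch appends it mechanically. The function name add_recovery_request is hypothetical, and the sketch assumes that the payload is a single-line, non-empty JSON object that ends in a closing brace.

```shell
#!/bin/sh
# Hypothetical helper: append "recovery_request":"True" to a JSON object
# read from standard input. Only the final closing brace on the line is
# touched, so nested objects are left as they are.
add_recovery_request() {
    sed -e 's/}[[:space:]]*$/,"recovery_request":"True"}/'
}

# Example, where data holds the JSON copied from the log:
#   payload="$(printf '%s\n' "$data" | add_recovery_request)"
# The payload can then be sent with curl -d "$payload" to the
# /v1/software_deployment/chef URL, as in the request above.
```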
Your deployed instances are now registered with the new Chef server. You can now interact with them in the Managed services web interface.
Troubleshooting
When you run the curl commands to re-register your virtual machine, each request returns a JSON object. The following are examples of issues that might be reported in this JSON message:
HTTP/1.1 400 BAD REQUEST
Date: Mon, 11 Dec 2017 19:23:51 GMT
Server: Apache
x-pm-request-id: a5ba0df88f264cbaa940a0503a440b73
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Content-Length: 86
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: accept, authorization, content-type
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
X-Frame-Options: SAMEORIGIN
Connection: close
Content-Type: application/json
{
"message": "Vault e781c96d6a6b0f069b7e823877e045f3 item secrets already exists"
}
This error message indicates that the vault has already been created (step 1 of Re-register your virtual machines has already been performed).
HTTP/1.1 400 BAD REQUEST
Date: Mon, 11 Dec 2017 19:19:58 GMT
Server: Apache
x-pm-request-id: a66f497afa8f4fe883eae340518ee2f5
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Content-Length: 80
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: accept, authorization, content-type
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
X-Frame-Options: SAMEORIGIN
Connection: close
Content-Type: application/json
{
"message": "vault_content was specified for a vault that does not exist."
}
The provided vault-id is either misspelled or has not been created yet. Make sure that step 1 of Re-register your virtual machines has been executed.
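When you re-register many instances, it can help to turn the two messages above into a short hint automatically. The following sketch does only that. The function name diagnose_response and the hint wording are hypothetical, and the matching relies solely on the message texts shown in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: read a pattern manager response from standard input
# and print a hint for the two error messages shown above.
diagnose_response() {
    response="$(cat)"
    case "$response" in
        *"already exists"*)
            echo "vault entry already created: continue with the bootstrap request" ;;
        *"vault that does not exist"*)
            echo "vault entry missing: check the vault ID and run the create vault entry request" ;;
        *)
            echo "unrecognized response: inspect it manually" ;;
    esac
}

# Example: pipe the output of any of the curl requests into the helper:
#   curl -i ... | diagnose_response
```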