Recovering the Content Runtime

Learn how to recover your Managed services environment after losing the Content Runtime instance.

If the Content Runtime virtual machine is lost and no backup exists, you can create a new Content Runtime and re-register your deployed instances with the Chef server on the new Content Runtime.

It is highly recommended that you back up the Content Runtime virtual machine so that, if it is ever destroyed, you can restore it without following this procedure.

Requirements

Get the following information:

  • The Content Runtime host name
  • The Content Runtime IP address
  • The private and public keys from the original Content Runtime
  • The network routing information from the original Content Runtime
  • The deploy logs from any instances that had been deployed to the Content Runtime

AWS and IBM Cloud

To perform a recovery on either of these cloud providers, a portable IP address must have been associated with the original Content Runtime virtual machine. If that is the case, the following additional information is required when you create the new Content Runtime:

AWS

  • Private IP address

IBM Cloud

  • Private subnet
  • Private VLAN name
  • Portable IP address
  • Portable private netmask
  • Portable private gateway

You must also back up the Pattern Manager configuration file. This file is base64 encoded and is located at /opt/ibm/docker/pattern-manager/config/config.json.
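The backup of the configuration file could be taken with a small helper like the following. This is a minimal sketch, not a documented product command: the function name backup_pm_config and the destination path are hypothetical, and GNU base64 is assumed (the same tool that the recovery steps use later).

```shell
#!/bin/sh
# Hypothetical helper: copy the Pattern Manager configuration file to a
# backup location and confirm that the copy is valid base64.
# Usage: backup_pm_config <source_config> <backup_destination>
backup_pm_config() {
  src=$1
  dest=$2
  # Refuse to continue if the source file is missing or empty.
  [ -s "$src" ] || { echo "missing or empty: $src" >&2; return 1; }
  cp -p "$src" "$dest" || return 1
  # The file is base64 encoded, so a successful decode is a basic sanity check.
  base64 -d "$dest" >/dev/null 2>&1 || { echo "not valid base64: $dest" >&2; return 1; }
  echo "backed up $src to $dest"
}

# Example (source path from this document; the destination is an assumption):
# backup_pm_config /opt/ibm/docker/pattern-manager/config/config.json ./config.json
```

Store the copy somewhere other than the Content Runtime virtual machine, because the file is only useful if it survives the loss of that machine.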

Recovery actions

The following steps are required to recover your Content Runtime and register your existing deployed instances to the new Chef server on the Content Runtime:

  1. Deploy new Content Runtime

    The first step in recovering your Chef server is to deploy a new Content Runtime instance. The new Content Runtime virtual machine must have the same host name and IP address as the old Content Runtime.

    Save the value used for the pattern manager access passphrase field. It is needed for making requests against the pattern manager.

    For more information on provisioning your Content Runtime, see Provisioning and managing your Content Runtime infrastructure.

  2. Populate software repositories

    Repopulate your software repository as described in:

  3. Optional: Restore your previous Chef server configuration

    If a Chef server configuration was backed up from the previous Content Runtime virtual machine, you can restore it at this point by using the chef-server-ctl restore command. For more information, see Backup and Restore Chef.

  4. AWS and IBM Cloud only: Restore the Pattern Manager configuration file

    Before you copy the Pattern Manager configuration back to the newly created Content Runtime virtual machine, some changes are required.

    1. Open a terminal window in the directory that contains the backed-up config.json file.

    2. Run the following command, replacing <previous_content_runtime_public_IP> with the previous Content Runtime's public IP address and <portable_or_private_IP> with the portable IP address (IBM Cloud) or the private IP address (AWS):

      base64 -d ./config.json | sed -e 's/<previous_content_runtime_public_IP>/<portable_or_private_IP>/g' | base64 > config.json.new
      

      This command decodes the config.json file, replaces the previous virtual machine's IP address with the portable or private IP address (which matches the newly created Content Runtime), and encodes the result into a file named config.json.new.

    3. Establish an SSH connection with the new Content Runtime virtual machine.

    4. Replace the contents of the current Pattern Manager configuration file, located at /opt/ibm/docker/pattern-manager/config/config.json, with the contents of the generated config.json.new file.

    5. Restart the Pattern Manager container by navigating to the advanced-content-runtime folder and running podman-compose stop followed by podman-compose start.
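The sed expression in sub-step 2 treats the dots in the IP address as regular-expression wildcards. A slightly stricter variant escapes the dots first. This is a sketch under assumptions rather than the documented procedure: the function name rewrite_config_ip is hypothetical, and GNU base64 and sed are assumed.

```shell
#!/bin/sh
# Hypothetical helper: decode a base64 config file, replace one IP address
# with another, and re-encode the result.
# Usage: rewrite_config_ip <old_ip> <new_ip> <input_file> <output_file>
rewrite_config_ip() {
  old_ip=$1
  new_ip=$2
  in_file=$3
  out_file=$4
  # Escape the dots so that "1.2.3.4" does not also match "1x2y3z4".
  old_pattern=$(printf '%s' "$old_ip" | sed 's/\./\\./g')
  base64 -d "$in_file" | sed -e "s/$old_pattern/$new_ip/g" | base64 > "$out_file"
}

# Example with placeholder addresses:
# rewrite_config_ip 203.0.113.10 10.0.0.5 ./config.json ./config.json.new
```

Decoding config.json.new with base64 -d and checking that the new address appears is a quick way to confirm the result before copying the file to the new virtual machine.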

  5. Re-register your virtual machines

    This step requires the Pattern Manager access passphrase that was specified when the Content Runtime was deployed.

    For each deployed instance:

    1. Log in to the Managed services web interface.
    2. Click Menu and click Deployed Instances > Terraform templates.
    3. Find your deployed instance in the list and click it.
    4. Click the Log File tab to bring up the log.

    There are three REST requests made to the pattern manager in this log that you must reissue to re-register your virtual machine. The REST requests can be made from any environment that has network access to the Content Runtime. Some values have been removed from the log file and must be re-entered. The following examples use the curl command to make these requests, but they can also be made with any standard HTTP request tool.

    1. Create vault entry

      The first request creates an empty vault entry for the stack ID of the previous deploy request. Search in the log file for a URL ending in /v1/vault_item/chef.

      Figure. Vault entry

      Copy the camc_endpoint and data properties and form a curl request. Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).

      curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"vault_content":{"item":"secrets","values":{},"vault":"<vault-id>"}}' https://<camc_endpoint_ip>:5443/v1/vault_item/chef
      

      Note: If there is no VaultItem in the deployment's logs, you can obtain the vault ID and camc_endpoint_ip from the bootstrap section of the logs, as shown in the following example:

      camc_bootstrap.LAMPNode01_chef_bootstrap_comp: Creating...
      access_token:    "" => "******"
      camc_endpoint:   "" => "https://<camc_endpoint_ip>:5443/v1/bootstrap/chef"
      data:            "" => "{\"environment_name\":\"_default\",\"host_ip\":\"host_ip_addr\",\"node_attributes\":{\"ibm_internal\":{\"stack_id\":\"stack_id\",\"stack_name\":\"\",\"vault\":{\"item\":\"secrets\",\"name\":\"<vault-id>\"}}},\"node_name\":\"node_name\",\"os_admin_user\":\"root\",\"stack_id\":\"stack\"}"
      

      The request should return a 200 response code, and the vault item is added to the Pattern Manager.
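The vault entry request can be assembled from the log with two small helpers. This is a minimal sketch: the function names vault_entry_payload and extract_vault_id are hypothetical, and the extraction assumes that the log line has the same escaped layout as the bootstrap example shown in this step.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON body for the vault entry request.
# Usage: vault_entry_payload <vault_id>
vault_entry_payload() {
  printf '{"vault_content":{"item":"secrets","values":{},"vault":"%s"}}\n' "$1"
}

# Hypothetical helper: read deploy log text on stdin and print the vault ID
# found in an escaped bootstrap "data" line, for example
#   ...\"vault\":{\"item\":\"secrets\",\"name\":\"<vault-id>\"}...
extract_vault_id() {
  sed -n 's/.*\\"vault\\":{\\"item\\":\\"secrets\\",\\"name\\":\\"\([^\\"]*\)\\".*/\1/p'
}

# Example (placeholders as used in this document; deploy.log is an assumed file name):
# vault_id=$(extract_vault_id < deploy.log)
# curl -i -H "Authorization: Bearer <pattern_manager_access_token>" \
#   -d "$(vault_entry_payload "$vault_id")" \
#   https://<camc_endpoint_ip>:5443/v1/vault_item/chef
```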

    2. Bootstrap virtual machine

      The next step is to bootstrap the virtual machine, which creates a node for the virtual machine on the Chef server. Search in the log file for a URL ending in /v1/bootstrap/chef.

      Figure. Bootstrap virtual machine

      Copy the camc_endpoint and data properties and form a curl request. Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).

      curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"environment_name":"_default","host_ip":"9.5.38.18","node_attributes":{"ibm_internal":{"stack_id":"77757fd10ef26711542b34f6ae5c368e","stack_name":"nschambu-mysql-01","vault":{"item":"secrets","name":"77757fd10ef26711542b34f6ae5c368e"}}},"node_name":"wild-nschambu-vm01","os_admin_user":"root","stack_id":"77757fd10ef26711542b34f6ae5c368e"}' https://9.5.38.16:5443/v1/bootstrap/chef
      

      This request should return a 200 response code and a message confirming that the bootstrap request was successful.
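In the log, the data property appears as a quoted string with escaped quotation marks, and the camc_endpoint might contain an IP address that must be swapped on IBM Cloud or AWS. The following sketch shows one way to prepare both values. The function names unescape_log_data and swap_endpoint_host are hypothetical, and the unescaping assumes that the values contain no escaped backslashes or embedded quotation marks.

```shell
#!/bin/sh
# Hypothetical helper: read the quoted "data" value from the log on stdin
# and print plain JSON (outer quotes removed, \" turned into ").
unescape_log_data() {
  sed -e 's/^"//' -e 's/"$//' -e 's/\\"/"/g'
}

# Hypothetical helper: replace the host part of a camc_endpoint URL.
# Usage: swap_endpoint_host <camc_endpoint_url> <portable_or_private_ip>
swap_endpoint_host() {
  printf '%s\n' "$1" | sed "s#^https://[^:/]*#https://$2#"
}

# Example (placeholders as used in this document):
# url=$(swap_endpoint_host "https://<camc_endpoint_ip>:5443/v1/bootstrap/chef" "<portable_or_private_IP>")
# curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d "$payload" "$url"
```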

    3. Assign software

      The final step reassigns software to the endpoint. Search in the log file for a URL ending in /v1/software_deployment/chef.

      Figure. Software deployment

      Copy the camc_endpoint and data properties and form a curl request, adding a new property called recovery_request to the JSON data. Note: If using IBM Cloud or AWS, instead of using the IP address provided in the camc_endpoint, use the "portable IP address" (IBM Cloud) or "private IP address" (AWS).

      curl -i -H "Authorization: Bearer <pattern_manager_access_token>" -d '{"environment_name":"_default","host_ip":"9.5.38.18","node_attributes":{"ibm":{"sw_repo":"https://9.5.38.16:9999","sw_repo_user":"repouser"},"ibm_internal":{"roles":"[oracle_mysql_base]"},"mysql":{"config":{"data_dir":"/var/lib/mysql","databases":{"database_1":{"database_name":"default_database","users":{"user_1":{"name":"defaultUser"},"user_2":{"name":"defaultUser2"}}}},"log_file":"/var/log/mysqld.log","port":"3306"},"install_from_repo":"true","os_users":{"daemon":{"gid":"mysql","home":"/home/mysql","ldap_user":"false","name":"mysql","shell":"/bin/bash"}},"version":"5.7.17"}},"node_name":"wild-nschambu-vm01","os_admin_user":"root","runlist":"role[oracle_mysql_base]","stack_id":"77757fd10ef26711542b34f6ae5c368e","vault_content":{"item":"secrets","values":{"ibm":{"sw_repo_password":"r3P0?@5s"},"mysql":{"config":{"databases":{"database_1":{"users":{"user_1":{"password":"aew2#4trNB"},"user_2":{"password":"dwr3%%nGKMd"}}}}},"root_password":"fqt@9dbeft"}},"vault":"77757fd10ef26711542b34f6ae5c368e"}, "recovery_request" : "True"}' https://9.5.38.16:5443/v1/software_deployment/chef
      

      This request should return a 200 response code and a message confirming that your Chef runlist is updated to the previous cookbooks and roles.
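Adding the recovery_request property by hand to a long JSON string is error-prone, so it can help to script it. The following is a minimal sketch: the function name add_recovery_flag is hypothetical, and it assumes that the payload is a single-line, non-empty JSON object.

```shell
#!/bin/sh
# Hypothetical helper: read a single-line JSON object on stdin and append
# the recovery_request property before the final closing brace.
add_recovery_flag() {
  sed 's/}[[:space:]]*$/,"recovery_request":"True"}/'
}

# Example (placeholders as used in this document):
# payload=$(printf '%s\n' "$software_deployment_json" | add_recovery_flag)
# curl -i -H "Authorization: Bearer <pattern_manager_access_token>" \
#   -d "$payload" https://<camc_endpoint_ip>:5443/v1/software_deployment/chef
```

Because the pattern is anchored to the end of the line, only the outermost closing brace is changed, and nested objects are left untouched.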

    Your deployed instances are now registered with the new Chef server. You can interact with them in the Managed services web interface.

Troubleshooting

When you run the curl commands to re-register your virtual machine, each request returns a JSON object. The following examples show issues that might be reported in this JSON message.

HTTP/1.1 400 BAD REQUEST
Date: Mon, 11 Dec 2017 19:23:51 GMT
Server: Apache
x-pm-request-id: a5ba0df88f264cbaa940a0503a440b73
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Content-Length: 86
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: accept, authorization, content-type
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
X-Frame-Options: SAMEORIGIN
Connection: close
Content-Type: application/json

{
  "message": "Vault e781c96d6a6b0f069b7e823877e045f3 item secrets already exists"
}

This error message indicates that the vault has already been created, which means that step 1 of Re-register your virtual machines has already been performed.

HTTP/1.1 400 BAD REQUEST
Date: Mon, 11 Dec 2017 19:19:58 GMT
Server: Apache
x-pm-request-id: a66f497afa8f4fe883eae340518ee2f5
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Content-Length: 80
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: accept, authorization, content-type
Access-Control-Allow-Methods: GET, POST, PUT, DELETE
X-Frame-Options: SAMEORIGIN
Connection: close
Content-Type: application/json

{
  "message": "vault_content was specified for a vault that does not exist."
}

This error message indicates that the provided vault ID is either misspelled or has not been created yet. Make sure that step 1 of Re-register your virtual machines has been performed.
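When several instances are re-registered in a script, the two errors above can be told apart automatically from the output of curl -i. The following is a sketch only: the function name classify_response and the result labels are hypothetical, and the matching relies on the message wording shown in the examples above.

```shell
#!/bin/sh
# Hypothetical helper: read a full "curl -i" response (headers and body)
# on stdin and print a short label that describes the outcome.
classify_response() {
  response=$(cat)
  # Take the three-digit status code from the first line, e.g. "HTTP/1.1 400 BAD REQUEST".
  status=$(printf '%s\n' "$response" | sed -n '1s/^HTTP\/[0-9.]* \([0-9][0-9][0-9]\).*/\1/p')
  case "$status" in
    200) echo "ok" ;;
    400)
      case "$response" in
        *"already exists"*) echo "vault-exists" ;;   # step 1 was already performed
        *"does not exist"*) echo "vault-missing" ;;  # check the vault ID, run step 1
        *) echo "bad-request" ;;
      esac ;;
    *) echo "unexpected" ;;
  esac
}

# Example (placeholders as used in this document):
# curl -i ... https://<camc_endpoint_ip>:5443/v1/vault_item/chef | classify_response
```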