Troubleshooting Service Composer issues
This topic lists known issues related to Service Composer or Portal in Cloud Automation Manager.
-
If the logged-in user cannot import services from GitLab and the following error occurs, the user does not have access to the Helm repositories:
"Activity_id: ibmchartc7e3fb29, The template ibm-nodejs-sample does not exist."
As a resolution, grant the logged-in user access to the Helm repository and then import the service from GitHub, GitLab, or Bitbucket Server.
-
When you use special characters in a JSON value that is used as an input parameter for a template, the provisioning fails with an error from IAAS.
For example:
"amis": { "today": "\"Thursday, January 24, 2019 11:01 AM\"" }
As a resolution, remove the special characters and retry provisioning. The following example has the special characters removed:
"amis": { "today": "Thursday, January 24, 2019 11:01 AM" }
- If a service instance is stuck in the IN_PROGRESS state for a very long time without any backend activity, delete the service instance entry from the database. This situation might arise because of an unplanned restart of any of the Cloud Automation Manager pods or because of other related issues. Use the Cleanup Service Instance in inprogress state API to delete the service instance. The API section provides information about any ResourceLockError that might occur during the cleanup process.
-
If you do not see any content on the left-hand side palette of the composer tab, the cause is a connectivity issue between Service Composer and the Business Process Manager, IBM Cloud Orchestrator, or Helm plugins.
IAAS sample error message:
Bad return code from provider: { statusCode: 500, body: '{"error":{"statusCode":500,"message":"Internal Server Error"}}' }
Business Process Manager Plugin/Provider Error
[2019-05-08T06:29:47.422] [ERROR] http_utils - { Error: connect ETIMEDOUT <IP_ADDRESS>:9443 at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1191:14) errno: 'ETIMEDOUT', code: 'ETIMEDOUT', syscall: 'connect', address: '<IP_ADDRESS>', port: 9443 } Error: Error stack Logged at: at Logger.(anonymous function) [as error]
As a workaround, disable the Business Process Manager plugin that is causing the failure. For the procedure to disable a plugin, see Enabling and disabling Business Process Manager and IBM Cloud Orchestrator.
-
Even after the service is deployed and is in a running state, the IBM Multicloud Manager or IBM Kubernetes Services cluster does not show up in the Multi Cloud Manager controller's cluster list. The service deployment might have taken long enough that the Multi Cloud Manager controller token that was passed as a parameter on service deploy has since been refreshed. In this case, the IBM Multicloud Manager cluster is unable to register with the Multi Cloud Manager controller. To verify that this is the issue, check that the token on the Multi Cloud Manager controller instance has the same value as the one that you used when you deployed the Cloud Automation Manager Multi Cloud Manager service. As a workaround for this issue, try any of the following methods:
- Method 1:
- Redeploy the service.
- Ensure that the Multi Cloud Manager controller token stays unchanged for the time it takes for the service to deploy the new cluster.
- Attach the token to the controller.
- Estimate the duration of service deployment.
- Set the refresh of the controller token beyond that time interval.
-
Method 2:
- Deploy the new cluster by using the Cloud Automation Manager IBM Multicloud Manager templates.
- Use the Loads_Images_and_Deploys_Chart_on_ICP service to install the Multi Cloud Manager chart.
- Register the cluster with the controller. Use Services/MCM/MCM_on_ICP/MCM_Klusterlet-Loads_Images_and_Deploys_Chart_on_ICP/MCM_Klusterlet-Loads_Images_and_Deploys_Chart_on_ICP.json from the Git repository.
This two-step cluster registration separates the long-running step, which is the deployment of the cluster, from the Multi Cloud Manager chart install and connect operation. Because the Multi Cloud Manager controller token is passed in during the service deploy, and this service deploy takes only minutes to deploy and run, the token is very unlikely to be invalidated during the service deployment. If the issue persists, redeploy just the chart service with the new token rather than redeploying the entire cluster.
-
When you keep multiple browser instances open for the same deployed instance (instance details or associated instance) that is in "wait" state, manually refresh the browser to see the latest updates.
-
Service may fail with unexpected EOF or may fail with both connection timeout and unexpected EOF. As a workaround, do the following steps:
- Ensure that Cloud Automation Manager 3.1.2.0 ifix 1 or a newer version is installed. For instructions, see Install instructions.
-
Set the TERRAFORM_PARALLELISM_COUNT environment variable on the running cam-orchestration deployment of IBM Cloud Automation Manager:
-
Using the user interface:
- Log in to the IBM Multicloud Manager user interface and go to Workloads > Deployments.
- Search for the cam-orchestration deployment.
- On the right-hand side, select edit from the overflow menu.
-
Edit the deployment by adding the following element to the env: array and then click Submit.
{ "name": "TERRAFORM_PARALLELISM_COUNT", "value": "1" }
-
To verify whether your deployment is successful, double-click the cam-orchestration deployment link.
- In the Overview tab > Pods section, check whether the cam-orchestration pod has restarted successfully.
-
Using the command line:
- SSH to the Cloud Automation Manager deployment machine.
-
Run the following command to edit the cam-orchestration deployment:
kubectl edit deployment cam-orchestration -n services
-
Under the - env: section, add the following lines.
Note: Make sure that you keep the same indentation as the other name-value pairs in this section.
name: TERRAFORM_PARALLELISM_COUNT
value: "1"
- Save these changes.
- Run the following command to ensure that the cam-orchestration pod has come up successfully after the change:
kubectl get pods -n services | grep cam-orch
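Both procedures above amount to adding one name-value pair to the env array of the cam-orchestration deployment. The following minimal Python sketch shows that change on an in-memory manifest. The function name set_env_var and the trimmed sample manifest are assumptions made for illustration; the sketch does not contact the cluster and is not part of Cloud Automation Manager.

```python
import copy


def set_env_var(deployment, name, value):
    """Return a copy of a Deployment manifest with `name` set in every container's env array."""
    updated = copy.deepcopy(deployment)
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        env = container.setdefault("env", [])
        for entry in env:
            if entry.get("name") == name:
                # The variable already exists: overwrite its value.
                entry["value"] = str(value)
                break
        else:
            # The variable is missing: append a new name-value pair.
            env.append({"name": name, "value": str(value)})
    return updated


# Hypothetical, heavily trimmed manifest; a real deployment has many more fields.
sample = {
    "spec": {"template": {"spec": {"containers": [
        {"name": "cam-orchestration", "env": [{"name": "EXISTING", "value": "x"}]}
    ]}}}
}

patched = set_env_var(sample, "TERRAFORM_PARALLELISM_COUNT", 1)
print(patched["spec"]["template"]["spec"]["containers"][0]["env"])
```

The value is stored as the string "1", matching the quoted value in the steps above, because environment variable values in a deployment are strings.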
- If you have service instances from versions prior to 3.1.2.0, then you cannot see the action details or logs of the previous attempts. However, you can fetch and view the details by using APIs. For more information about the API, see Service Instance GET API.
- If you have multiple service instances in parallel and a 401 issue occurs, apply the IBM Multicloud Manager fix to remove the limits for auth pods. For more information about this issue, see auth-idp pod restarts multiple times.
- When you deploy a service that has non-permissible special characters in the value of any of its parameters, the service instance creation fails with an error. For example, a service deployment fails because of a period (.) in the key of a MongoDB parameter.
- MongoDB does not allow you to create a map with a key that has a dot in its name. To work around this MongoDB limitation, use a list with key names that have hardcoded text without a dot.
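One way to read the workaround above is to turn a map with dotted keys into a list of entries whose key names are fixed text, so that the dotted name is stored as a value instead of a key. The following Python sketch illustrates this under that assumption. The function names, the entry field names "key" and "value", and the sample parameter are hypothetical and do not come from Cloud Automation Manager.

```python
def find_dotted_keys(value, path=""):
    """Recursively collect the paths of map keys that contain a period."""
    found = []
    if isinstance(value, dict):
        for key, child in value.items():
            here = f"{path}/{key}"
            if isinstance(key, str) and "." in key:
                found.append(here)
            found.extend(find_dotted_keys(child, here))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            found.extend(find_dotted_keys(item, f"{path}/{index}"))
    return found


def map_to_entry_list(mapping):
    """Convert a map into a list of entries with fixed key names ("key", "value")."""
    return [{"key": key, "value": child} for key, child in mapping.items()]


# Hypothetical parameter value with a dotted key.
params = {"settings": {"db.host": "10.0.0.5", "port": 27017}}

print(find_dotted_keys(params))
params["settings"] = map_to_entry_list(params["settings"])
print(find_dotted_keys(params))
```

After the conversion, the only map keys left are "settings", "key", and "value", none of which contains a dot.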
-
If you use the same bind secret across multiple provisioned instances, the first bind instance is created correctly. From the second instance onwards, though a bind instance gets created, the following error occurs in the IBM Multicloud Manager user interface:
ServiceBroker returned failure; bind operation will not be retried: Status: 400; ErrorMessage: <nil>; Description: <nil>; ResponseError: <nil>;
- When you create a service instance with a bind action in the IBM Multicloud Manager user interface, you cannot terminate the instance from that user interface without removing the associated bind instances. Though you can terminate it from the Cloud Automation Manager user interface, the bind action remains in the active state.
- For the IBM Cloud Kubernetes Service template in Service Composition, do not select or enter values for datacenter and other networking attributes whenever the value of machine_type is "free". This selection is not supported because the datacenter parameter and the other networking attributes are ignored by the Terraform provider for free clusters.
- If you are logged in to Cloud Automation Manager for a very long time, its Access-Token on the browser expires. If you try to deploy a Service Instance after the Access-Token expires, the deployment fails with a 401 error. As a resolution, do the following steps:
- Log out of Cloud Automation Manager.
- Log in to Cloud Automation Manager again.
- Retry the service instance deployment.
- When you drag and drop Helm charts in the composer, parameter details are not fetched properly. To resolve this error, synchronize the IBM Multicloud Manager Helm repositories in the IBM Multicloud Manager user interface. For more information, see Managing Helm repositories.
- Whenever an email notification is sent, the SMTP user name value is also used as the sender email address value. However, with services such as SendGrid, the SMTP user name is sometimes not a valid email address. In that case, although Send a test email results in the message Test email sent successfully., no email is actually sent or received.
-
Terminating a service instance might go to an error state whenever the service instance has more than one Terraform template or Helm chart, or a combination of Terraform templates and Helm charts. A deployment failure in a template or Helm chart can cause the entire service instance termination to fail. However, the success of the destroy action of individual templates or Helm charts depends on their sequence in the composition. For example, consider the following behavior whenever a Helm chart and a Terraform template are placed in sequence and a failure occurs in the Helm chart deployment:
- If Terraform template is positioned after the Helm chart in the sequence, it gets destroyed but the termination fails at the Helm chart. The rest of the sequence is not tried and the service instance termination fails.
-
If the Helm chart is positioned after the Terraform template in the sequence, it is never attempted as the Helm failure occurs before the destroy action of Terraform template.
To resolve this issue, clean up Helm charts from backend. For the Terraform template, you can clean from backend or from the template tab.
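The two cases above are consistent with a termination that destroys activities in reverse composition order and stops at the first failure. The following Python sketch models that behavior. The reverse-order, stop-at-first-failure model is inferred from the two cases and is an assumption, not a description of the product's implementation; the function and activity names are hypothetical.

```python
def terminate(sequence, failing):
    """Destroy activities in reverse composition order, stopping at the first failure."""
    destroyed = []
    failed_at = None
    not_attempted = []
    remaining = list(reversed(sequence))
    for index, activity in enumerate(remaining):
        if activity in failing:
            # The destroy action fails here; nothing after it is tried.
            failed_at = activity
            not_attempted = remaining[index + 1:]
            break
        destroyed.append(activity)
    return {"destroyed": destroyed, "failed_at": failed_at, "not_attempted": not_attempted}


# Terraform template positioned after the Helm chart: it is destroyed, then Helm fails.
print(terminate(["helm_chart", "terraform_template"], {"helm_chart"}))

# Helm chart positioned after the Terraform template: Helm fails first, Terraform is never attempted.
print(terminate(["terraform_template", "helm_chart"], {"helm_chart"}))
```

In both runs the Helm chart is left behind, and in the second run the Terraform template is left behind as well, which is why the cleanup has to be done from the backend or from the template tab.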
-
The following limitation occurs after you upgrade to Cloud Automation Manager 2.1.0.2 or higher:
- If a service is created with plan parameters and tags, you cannot create a service instance from IBM Multicloud Manager user interface. Use Command Line option to create such service instances from IBM Multicloud Manager.
- Services and Service Instances that were created before the Cloud Automation Manager 2.1.0.2 release do not work with Cloud Automation Manager 2.1.0.2. As a resolution, delete them and create them again in Cloud Automation Manager 2.1.0.2.
- Helm chart limitations:
- If you observe the following error message in the activity logs, ensure that the Helm tiller is configured on your IBM Kubernetes Service cluster:
Error: could not find tiller
- Helm deployments assume that you have precreated the required persistent volumes; otherwise, they use dynamic provisioning (for example, GlusterFS).
- When you click the link for helm release in the View a deployed service instance page, you are forwarded to the IBM Multicloud Manager management console, and not to the Deployed templates page in Cloud Automation Manager.
- Helm release deployments are not visible from the deployed templates list view in Cloud Automation Manager. They are only visible from the Service instance details page and IBM Multicloud Manager helm releases page.
-
When you rectify an invalid primary email server and redeploy a service that is already deployed, email notification may fail with an error message. An example error message is as follows:
serviceblueprint_email.emailnot379ed899: Creating...
connection_used: "" => ""
fail_quietly: "" => "true"
message: "" => "Test"
sender_name: "" => "CAMadmin"
sequence_no: "" => ""
status: "" => ""
status_error_message: "" => ""
status_message: "" => ""
subject: "" => "Test"
to.#: "" => "1"
to.0: "" => "abc@in.ibm.com"
Error applying plan:
1 error(s) occurred:
serviceblueprint_email.emailnot379ed899: 1 error(s) occurred:
serviceblueprint_email.emailnot379ed899: unexpected EOF
Terraform does not automatically rollback in the face of errors ....
....
Do the following steps to resolve the error:
- Log in to Cloud Automation Manager user interface.
- Delete the primary and secondary SMTP connections.
- Open a new tab in the same browser without closing the browser.
-
Open any REST Client extension in the browser and call the following API:
URL: https://<CAM-IP>:30000/cam/connections/call_proxy?service=composer&httpmethod=POST&relative_url=smtpconnections
Header: Content-Type: application/json
Method: POST
Payload for the primary connection:
{
  "name": "PRIMARY",
  "description": "Primary SMTP Connection",
  "connection_parameters": [
    { "name": "username", "value": "<primary@somesmtpserver.com>" },
    { "name": "password", "value": "<password>" },
    { "name": "hostname", "value": "<smtp.somesmtpserver.com>" },
    { "name": "port", "value": "<587>" }
  ]
}
Payload for the secondary connection:
{
  "name": "SECONDARY",
  "description": "Secondary SMTP Connection",
  "connection_parameters": [
    { "name": "username", "value": "<secondary@somesmtpserver.com>" },
    { "name": "password", "value": "<password>" },
    { "name": "hostname", "value": "<smtp.somesmtpserver.com>" },
    { "name": "port", "value": "<587>" }
  ]
}
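The request described above can also be assembled programmatically. The following Python sketch only builds the request object with the documented URL, header, method, and payload shape; it does not send anything. The function name, the host cam.example.com, and the credential values are placeholders invented for illustration. Authentication is not handled here, because the procedure above relies on the session of the browser in which you are already logged in.

```python
import json
import urllib.request


def build_smtp_connection_request(cam_host, name, description,
                                  username, password, hostname, port):
    """Build (but do not send) the POST request that creates an SMTP connection."""
    url = (
        f"https://{cam_host}:30000/cam/connections/call_proxy"
        "?service=composer&httpmethod=POST&relative_url=smtpconnections"
    )
    payload = {
        "name": name,
        "description": description,
        "connection_parameters": [
            {"name": "username", "value": username},
            {"name": "password", "value": password},
            {"name": "hostname", "value": hostname},
            # The port is sent as a string, as in the sample payloads.
            {"name": "port", "value": str(port)},
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values only; replace them with your own host and SMTP details.
req = build_smtp_connection_request(
    "cam.example.com", "PRIMARY", "Primary SMTP Connection",
    "primary@somesmtpserver.com", "example-password",
    "smtp.somesmtpserver.com", 587,
)
print(req.get_method(), req.full_url)
```

Calling the function a second time with the name SECONDARY and the secondary server details produces the request for the secondary connection.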
- If you remove the mapping of a service parameter in the Composition tab > Parameters tab of the template activity, the parameter is not deleted from the main Parameters tab > Service Parameters section of the service and is still available when you create a service instance. If you do not want to see it when you create a service instance, you must delete it explicitly from the main Parameters tab > Service Parameters section.
- There might be failures in the termination of a template instance in spite of a "Terminated" message in the user interface. After you terminate a service instance, check the template instance logs to reconfirm.
- Whenever an error occurs during category creation, check for API related errors in the log file.
- When you order a service, do not map the same parameter to two different decisions in the Additional Parameters page. This will cause both decisions to follow the same path.
- If you do not provide the instance name in the composition tab for a template, then an error message is displayed in the Source Code tab during publish. To avoid such errors, whenever you add a template in the composition tab, provide a value for the instance name.
-
Examine the following logs for Service Composer or Portal related issues:
- cam-orchestration
- cam-portal-api
- cam-iaas
-
If a Cloud Automation Manager orchestration has multiple templates and a template fails in the workflow, the subsequent templates are not considered for deployment either.