Frequently asked questions (FAQs) about on-premises containers

Find answers to frequently asked questions that are related to IBM® Sterling Intelligent Promising containers, IBM OMS Gateway Operator, IBM Sterling Intelligent Promising Operator, truststores, external services, and more.

Operators

Can I install IBM Sterling Intelligent Promising on any Kubernetes cluster, for example Red Hat® OpenShift® Container Platform, vanilla Kubernetes, or cloud-managed clusters such as EKS, GKE, or AKS?: Yes, IBM Sterling Intelligent Promising is designed to be platform-agnostic and you can install it on any Cloud Native Computing Foundation (CNCF) Kubernetes cluster.

I am working with a Kubernetes or Red Hat OpenShift Container Platform cluster in a restricted environment where internet access is not available and the Operator Lifecycle Manager (OLM) cannot be used. Is an airgap setup supported? If yes, how do I install IBM Sterling Intelligent Promising?: Yes, an airgap setup is supported for IBM Sterling Intelligent Promising. For more information, see Operator Utility overview.

The Sterling Intelligent Promising on-premises container deployment involves two Operators, IBM Sterling Intelligent Promising Operator and IBM OMS Gateway Operator. Is it possible and advisable to install them in two different namespaces?: No. It is not advisable to install the two Operators in different namespaces. For more information, see Installing the Operators by using command-line interface (CLI).

I am trying to upgrade the IBM Sterling Intelligent Promising Operator and IBM OMS Gateway Operator version from v1 to v2, but I noticed that no installation plan is created for the corresponding subscription of version v2. Under normal circumstances, a new installation plan is generated automatically when an updated Operator version is available. What might be the reasons for the installation plan not being created in this scenario?

This issue can occur when a monitoring, security, or service mesh tool injects an init container into, or modifies the behavior of, pods in the cluster, including the pods that are managed by Operator Lifecycle Manager (OLM), such as the catalog source pods.

If such an injection occurs, it can interfere with the ability of the catalog Operator to detect updates to the Operator images, which prevents the automatic creation of an installation plan when a new version becomes available.

To resolve this issue, ensure that no third-party tools are modifying catalog source pods or injecting init containers into them. After the init container is removed or the injection is disabled, OLM detects the updated Operator version and generates the installation plan.

While installing an Operator in the Red Hat OpenShift Container Platform cluster, I noticed that the Operator Hub lists multiple versions, including older ones alongside the latest release. In this scenario, is it possible to install a specific older version of the Operator instead of the latest one? If so, what is the recommended approach to help ensure that the correct version is installed and managed properly? And how do I upgrade those Operators?: Yes, you can choose to install a specific version of the Operator if needed. However, a best practice is to use the latest available version to take advantage of the most recent features, improvements, and enhancements. Staying up to date helps ensure the best compatibility and support experience. For more information about the latest version, see the On-premises containers enhancement section for the latest release in What's new.

I have configured the Operator upgrade policy to Automatic, so the Operator gets updated to the latest version whenever a new release is available. However, I noticed that while the Operator version upgrades automatically, applications such as Inventory, Promising, Utility, and Optimization are still using the old image versions. Is this a potential problem? How can it be managed effectively?: No. Backward compatibility with some releases is supported. However, it is recommended to use application images that are compatible with the Operator version. For more information, see List of Operator versions and application image tags.

Truststores and security

I have multiple truststores for middleware services such as Cassandra, Kafka, and Elasticsearch. How do I configure my SIPEnvironment to support multiple truststores so that each service uses the correct trust configuration? What are some best practices for securely managing multiple trust sources, for example, certificates and truststores?

The SIPEnvironment supports a single, unified trust configuration. To handle multiple truststores, you must merge the individual truststores into a single truststore that includes the certificates for all required middleware services, such as Cassandra, Kafka, and Elasticsearch. You must then provide this merged truststore to Sterling Intelligent Promising. For more information, see security parameter.

When you merge the truststores, ensure that you complete the following actions.

Use a single password for the merged truststore.
Provide this password to Sterling Intelligent Promising through the appropriate configuration, for example, an environment variable or a secret.
Validate that all required certificates are correctly imported and trusted.

I understand that there are three possible ways to provide the trust certificate to Sterling Intelligent Promising: by using Secrets, ConfigMaps, or mounted storage. From a best practices and security perspective, which method is recommended as the most appropriate and why?: From a security and operational best practices perspective, using Kubernetes Secrets is the recommended method for providing trust certificates to Sterling Intelligent Promising. That said, you are free to choose the method that best fits your operational setup or constraints. For more information, see additionalMounts parameter.
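
As an illustration of the Secret-based approach, a minimal sketch of a standard Kubernetes Secret that holds a CA certificate might look like the following. The Secret name, namespace, key name, and certificate body are hypothetical placeholders. How the Secret is then referenced from the SIPEnvironment is covered by the additionalMounts parameter documentation and is not shown here.

```yaml
# Minimal sketch of a standard Kubernetes Secret; all names are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: sip-trust-certs        # hypothetical name
  namespace: sip               # hypothetical namespace
type: Opaque
stringData:
  # Hypothetical key; the value is a PEM-encoded CA certificate (placeholder body).
  middleware-ca.crt: |
    -----BEGIN CERTIFICATE-----
    <base64-encoded certificate body>
    -----END CERTIFICATE-----
```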

Once the truststore is built and mounted into the Sterling Intelligent Promising deployment, is it necessary to recreate or regenerate the truststore during every upgrade or redeployment of Sterling Intelligent Promising? Under what conditions, if any, is a truststore rebuild required?: No, it is not necessary to re-create or regenerate the truststore during every Sterling Intelligent Promising upgrade or redeployment, as long as the trust configuration remains unchanged. For more information about the conditions to rebuild a truststore, see Conditions to rebuild a truststore.

One of the trust certificates for my external middleware services, such as Cassandra, Kafka, or Elasticsearch, has expired. What steps should I take to safely replace the expired certificate with the renewed one in my Sterling Intelligent Promising setup? How can I ensure that all dependent components or pods pick up the new certificate correctly?: Mount the renewed trust certificate by using the same method that you originally used, whether Secrets, ConfigMaps, or mounted storage. After you mount the updated certificate, rebuild the truststore to include the renewed certificate. For more information about the annotation to rebuild a truststore, see apps.sip.ibm.com/import-certificate-to-truststore annotation.

I learned that the final truststore is created at /sip-external-certs/client.truststore.jks and the store type is always JKS. Is the truststore type fixed to JKS, or can it be configured to use a different type such as PKCS12?: The truststore type is configurable. The available options are PKCS12 and JKS. For more information, see security parameter.

In the context of Sterling Intelligent Promising deployment, I noticed that a trustJavaCACerts flag is available. What are the implications of setting it to true or false?: Configure the ssl.trust.trustJavaCACerts property to true or false based on whether you want your servers to trust the certificates in the default Java TrustStore. For more information, see security parameter.

Middleware or external services

I am relying on an Operator to set up middleware services such as Cassandra, Kafka, and Elasticsearch. How can I ensure that the middleware data is saved so that it persists even after a restart?

If you are relying on the Operator to set up middleware services, it is essential to configure persistent storage for each middleware service. The configuration of persistent storage ensures that the data is retained across pod restarts or node failures. This is typically achieved by defining the storage property within the SIPEnvironment for each middleware component.

Example:

  storage:
    name: persist_PVC
    accessMode: ReadWriteMany
    capacity: 40Gi
    storageClassName: default

For more information, see storage in external services and storage properties.

What versions of middleware components such as Cassandra, Kafka, and Elasticsearch are officially supported by Sterling Intelligent Promising on-premises containers?

The following middleware versions are officially supported.

Cassandra 4.1.0
Kafka 3.5.0
Elasticsearch 7.17.9

For more information, see Installing middleware services in a production environment.

After I install SIPEnvironment, I need to change the endpoint for one of the middleware services, such as Cassandra, Kafka, or Elasticsearch. What is the recommended way to update middleware configurations after the installation of Sterling Intelligent Promising?: Changing the endpoint of a middleware service after Sterling Intelligent Promising is installed is not supported and is discouraged. After Sterling Intelligent Promising is installed, each middleware service is populated with operational data specific to its role in the platform. These components work in coordination, and their data often holds interdependent state. Modifying the endpoint of any one middleware service can lead to data inconsistencies, loss of critical state information, or system instability, because the newly configured middleware does not have the synchronized data that the other components expect.

Sterling Intelligent Promising officially supports Apache Cassandra as one of its middleware dependencies for handling data storage. I am exploring the use of Astra DB, a cloud-native Cassandra-as-a-Service provided by DataStax, as a potential replacement for a self-hosted Cassandra deployment. Is Sterling Intelligent Promising compatible with Astra DB?: No, Astra DB is not supported.

I observed that in development mode, Sterling Intelligent Promising is able to autocreate Kafka topics, but in production mode, autocreation does not happen. What is the reason behind this behavior?: In development mode, Sterling Intelligent Promising is configured to automatically create Kafka topics on demand to simplify the setup and reduce manual intervention during testing. This is enabled through relaxed Kafka configurations and internal logic that is designed to facilitate faster iteration and ease of use during development. However, in production mode, autocreation of Kafka topics is intentionally disabled to enforce stricter control, consistency, and reliability. Allowing automatic topic creation in production can lead to misconfigured topics, inconsistent data flows, or unintended side effects in message processing. Therefore, in production, Kafka topics are expected to be explicitly created and managed either ahead of time or through a provisioning mechanism, following deployment best practices.

During Sterling Intelligent Promising deployment, if the connection to any middleware service such as Kafka fails, the deployment itself fails. Is there a recommended way to validate connectivity to all required middleware services in the early phase, so issues can be detected and addressed before the actual deployment begins?: Yes, use the apps.sip.ibm.com/validate-external-services-connections annotation to validate the connectivity of middleware services. For more information, see Annotations used in Sterling Intelligent Promising Operator.

Can I configure Sterling Intelligent Promising to disable SSL or TLS and ensure that communication with external middleware services happens over nonsecure or plain text channels?: Yes, you can disable SSL or TLS by using the configuration parameter. For more information, see configuration parameter.

I installed Sterling Intelligent Promising in my environment. During its operation, Sterling Intelligent Promising interacts with various middleware services such as Kafka, PostgreSQL, and ActiveMQ, which are populated with configuration and runtime data. Later, Sterling Intelligent Promising is uninstalled, but the middleware services and their data remain intact in the environment. What considerations should I keep in mind regarding the existing middleware data and configuration?: Sterling Intelligent Promising does not perform any cleanup or deletion of data from the middleware services during uninstallation. All configuration and runtime data that is stored in services such as Cassandra, Kafka, and Elasticsearch remains intact even after Sterling Intelligent Promising is removed from the environment. If you intend to reuse or repurpose these middleware services for a fresh installation or for other applications, manually review and clean up the existing data to prevent conflicts, stale configurations, or unexpected behavior.

The credentials (username or password) for one of the external middleware services, such as Cassandra, Kafka, or Elasticsearch, have changed. What steps should I take for my Sterling Intelligent Promising container setup to safely use these updated credentials? How can I ensure that all dependent components or pods pick up the new credentials correctly?

You must update the secret that is used in the SIPEnvironment with the new credentials. For more information, see Creating a secret. Then, add the following annotation to ensure that all the pods restart and pick up the updated credentials.

apps.sip.ibm.com/restart
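
In a manifest, an annotation is set under metadata.annotations of the resource. The following fragment is a sketch only. It assumes that the annotation is applied to the SIPEnvironment resource, the resource name is hypothetical, and the annotation value is left as a placeholder because the expected value is not stated in this FAQ.

```yaml
# Fragment only; assumes the annotation goes on the SIPEnvironment resource.
metadata:
  name: sip                    # hypothetical resource name
  annotations:
    # Kubernetes annotation values are strings; replace the placeholder with
    # the value that the annotations reference documents.
    apps.sip.ibm.com/restart: "<value>"
```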

Storage

Do I need to create the required PVCs manually before installing Sterling Intelligent Promising? What happens if I trigger the Sterling Intelligent Promising installation without creating the necessary PVCs?: Yes. For more information, see Option to create Persistent Volume Claims manually.
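
For orientation, a manually created claim is an ordinary Kubernetes PersistentVolumeClaim. The following is a minimal sketch, not the product's prescribed manifest: the claim name and namespace are hypothetical, and the access mode, capacity, and storage class simply mirror the values from the storage example in this FAQ.

```yaml
# Minimal sketch of a standard Kubernetes PersistentVolumeClaim.
# Name and namespace are hypothetical; sizes are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sip-persist-pvc        # hypothetical name
  namespace: sip               # hypothetical namespace
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 40Gi
  storageClassName: default
```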

In my environment, I plan to use dynamic Persistent Volumes (PVs) for Sterling Intelligent Promising deployment. Given that the trust certificate needs to be available before the deployment starts, how can I ensure that the certificate is placed on the dynamically provisioned volume before the deployment begins? Is there a recommended approach for handling this scenario?: You must manually ensure that the trust certificate is placed on the dynamically provisioned volume before starting the Sterling Intelligent Promising deployment, as the application does not handle this process automatically. For more information, see Creating a Kubernetes Persistent Volume manually.

In the context of Sterling Intelligent Promising deployment, what exactly is stored in the Persistent Volume (PV)? Which components or data require persistent storage?: Persistent Volumes in a Sterling Intelligent Promising deployment are used to store the truststore and the data for middleware services such as Cassandra, Kafka, and Elasticsearch, especially in development mode and when storage is configured. For more information, see SIPEnvironment custom resource manifest.

Logging

I am using Sterling Intelligent Promising on a Kubernetes cluster, and by default, the logs from the Sterling Intelligent Promising pods are available through standard output. What other approaches are available to collect the logs?: Sterling Intelligent Promising supports two logging modes: console, which is the default and writes to standard output (stdout), and Kafka, which is used for centralized log collection. For more information, see log parameter.

Upgrade

In a setup where I deploy application services such as Inventory Visibility, Promising, Optimizer, and Utility Service, how can I override or change the default image version for a specific component?

You can override the image version for services at the following three levels by using image tags and digests.

SIPEnvironment or global level
Service group level
Application server or backend server level

For more information, see Image configuration precedence.

If the Operator version and the application image versions are out of sync, can this lead to compatibility issues or unexpected behavior? What is the recommended version compatibility matrix between the Operator and the application images, and how should upgrades be managed to ensure stability and compatibility across the stack?: Yes, because each new release may include coordinated changes across both the Operator and application images. The best practice is to keep the Operator and application images up to date with the latest supported versions to ensure stability and compatibility. For more information, see List of Operator versions and application image tags.

During a recent upgrade, I attempted to update the Sterling Intelligent Promising Operator to the latest version. However, the upgrade failed due to unforeseen issues and the environment was unstable. To ensure business continuity, I want to roll back to the previous working Operator version so that the system can continue functioning as expected while the upgrade issues are investigated. How can I do that?: You can roll back to the previous version of the Sterling Intelligent Promising Operator as long as the application images are not updated. Before rolling back, ensure that no application components are upgraded to versions that depend on the newer Operator, as this can lead to compatibility issues. It is recommended to avoid such scenarios by thoroughly testing upgrades in lower environments first. Only proceed with production upgrades once everything is validated. If any issues arise, even in lower environments, it is best to report them promptly and reach out to support for guidance.

During a recent upgrade, I attempted to update the application images to the latest version. However, the upgrade failed due to unforeseen issues and the environment was unstable. To ensure business continuity, I want to roll back to the previous working application images so that the system can continue functioning as expected while the upgrade issues are investigated. How can I do that?: Rolling back application images to older versions is not supported once the application images are upgraded to a newer version, as the application upgrade may involve irreversible changes to the database and other internal components. It is recommended to avoid such situations by thoroughly testing upgrades in lower environments first. Only proceed with production upgrades after validating stability and compatibility. If issues occur, even in test environments, report them immediately and contact support for guidance before proceeding further.

Certificate management

I need to configure the SIPEnvironment to use a custom TLS certificate of my choice at the OMS Gateway layer, replacing the default certificate. How can I achieve this?: You can configure a custom TLS certificate at the OMS Gateway layer. For more information, see Custom TLS certificate configuration in OMS Gateway.

Can I configure the SIPEnvironment to use custom certificates for specific services instead of relying on a shared or default certificate?: No, the SIPEnvironment does not support configuring custom certificates at the individual service level. Custom certificates can be applied only at the gateway or ingress level, not for specific services within the deployment.

Does the SIPEnvironment support automatic handling of certificate expiry, rotation, and renewal?: Yes. The CertificateManager focuses on automating certificate regeneration, improving the management of ownership and expiry tracking to ensure smooth and uninterrupted operation. It features automated certificate renewal, regeneration of the certificates ensuring zero downtime, centralized CA secret, timely status tracking, and more. For more information, see Key features and benefits of CertificateManager.

I need to publish Sterling Intelligent Promising container services through a custom domain, for example sip.example.com, for integration or external access. How do I configure the SIPEnvironment to support this?: To configure a custom domain, define the customDomains property within the ingress section of the common configuration. For more information, see Configuring customDomains in SIPEnvironment.

Resilience and scalability

Currently, I have a single-cluster deployment of Sterling Intelligent Promising in a development or staging environment. As I prepare to move to production, I want to transition to a multicluster setup to ensure high availability, fault tolerance, and disaster recovery across regions or data centers. How should I proceed?: Use the multicluster support in Sterling Intelligent Promising containers to transition from a single-cluster to a multicluster setup. For more information, see Moving from single to multiple clusters.

General

I want to use a different authentication system instead of the default gateway authentication provided by the product. Can I do that? If yes, how can I disable the built-in authentication to use my own system?: Yes, you can disable the built-in gateway authentication to integrate your own authentication system. For more information, see omsGateway parameter.

During product installation, onboarding jobs are run as Kubernetes jobs. If these jobs fail, the installation does not proceed. After fixing the issue, how do I restart these jobs since Kubernetes jobs do not restart automatically after failing?: You can restart the failed jobs by using the apps.sip.ibm.com/restart annotation. For more information, see Annotations used in Sterling Intelligent Promising Operator.

I must ensure that some resources, such as Secrets, ConfigMaps, and storage volumes, are mounted only to the intended pods or containers, without exposing them cluster-wide or to unintended workloads. What best practices should be followed to maintain security, resource isolation, and clean configuration management?: You can dynamically mount your resources by using additionalMounts based on labels. For more information, see additionalMounts parameter.

I deployed SIPEnvironment in development mode and am currently facing an issue. I want to enable debug mode to collect more detailed logs. What is the recommended way to do this for effective troubleshooting?: You can enable logging in debug mode. For more information, see log parameter.