Running CDC on OpenShift Virtualization Clusters
As enterprises modernize their infrastructure, integrating traditional data replication tools like IBM® InfoSphere CDC (Change Data Capture) into virtualized environments becomes increasingly important. While CDC is not containerized, it can be installed and operated on OpenShift Virtualization (RHOSv) VMs, enabling compatibility with cloud-native platforms without requiring containerization.
Deployment Architecture
In a typical deployment, the CDC engine runs inside an OpenShift Virtualization VM, while the Access Server (AS) and Management Console (MC) are hosted externally (outside the OpenShift cluster). This hybrid setup enables CDC to run within RHOSv while maintaining external connectivity for monitoring and administration.
Key Configuration Steps
- Download OpenShift CLI Tools – Download both oc and virtctl tools from your OpenShift cluster. These tools are required for managing cluster resources and interacting with VMs.
- Expose CDC Port via LoadBalancer Service – Create a Kubernetes LoadBalancer service using virtctl to allow external access to the CDC instance running inside a VM. This exposes the CDC port and generates an external hostname for communication with the AS and MC. Run this command to create the LoadBalancer service:
    ./virtctl expose virtualmachineinstance <vm-name> \
      --name <service-name> \
      --type LoadBalancer \
      --port <external-port> \
      --target-port <cdc-port>
To verify the service, run:
    ./oc get svc <service-name>
Note: A separate LoadBalancer service must be created for each CDC instance.
- Configure Load Balancer Idle Timeout and Access Server Keep-Alive Timeout – To ensure reliable and continuous communication, the Load Balancer Idle Timeout must be properly configured in relation to the keep-alive settings used in CDC. CDC replication involves three primary types of communication:
- Communication between CDC source and target engines
Even when the source and target engines are running on the same VM, their communication may still be routed through the external network. By default, each instance creates a Comms.INI file in its instance configuration directory with a keep-alive timeout set to 20 seconds. To ensure uninterrupted replication between CDC engines, the Load Balancer Idle Timeout must be set to more than 20 seconds.
- Communication between AS and CDC source and target engines
AS does not set a keep-alive timeout by default. To avoid idle disconnections, create a Comms.INI file in the AS installation directory with the same 20-second value:
    [SETTING]
    KEEP_ALIVE_TIMEOUT=20
Important: The file name is case-sensitive on Linux and must be exactly Comms.INI.
With this configuration, as long as the Load Balancer Idle Timeout is greater than the AS’s keep-alive timeout, the connection between AS and CDC engines remains active.
- Communication from MC to CDC source and target engines (triggered by user actions)
Connections initiated by the MC do not implement a keep-alive mechanism. Consequently, the Load Balancer's idle timeout (configured within the OpenShift cluster) determines how long these connections remain open during periods of inactivity. For instance, with a 600-second idle timeout, the MC disconnects from the datastore after 10 minutes of inactivity. This does not affect ongoing replication between CDC engines; only the MC user experience is impacted, and you can reconnect from the MC at any time.
To keep datastore connections active manually, you can retrieve events from both the source and target. Clicking a subscription is not sufficient; the action must involve actual CDC agent activity.
For AWS environments, you can configure the idle timeout by adding this annotation to your service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "600"
Note: This setting is specific to AWS. Configuration may differ in other cloud providers.
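The relationship between the keep-alive timeout and the Load Balancer Idle Timeout described above can be sketched in a few lines of Python. This is only an illustration, not part of the CDC product: the function names write_comms_ini and idle_timeout_is_safe are hypothetical, the target directory is supplied by the caller, and the strict "greater than" comparison is taken from the wording of this section.

```python
from pathlib import Path

# 20 seconds is the default keep-alive value this section gives for CDC instances.
KEEP_ALIVE_DEFAULT = 20


def write_comms_ini(directory, keep_alive=KEEP_ALIVE_DEFAULT):
    """Write a Comms.INI file (exact case) containing the keep-alive setting.

    Hypothetical helper: 'directory' stands in for the AS installation
    directory, which the caller must supply.
    """
    path = Path(directory) / "Comms.INI"  # the name is case-sensitive on Linux
    path.write_text(f"[SETTING]\nKEEP_ALIVE_TIMEOUT={keep_alive}\n")
    return path


def idle_timeout_is_safe(idle_timeout, keep_alive=KEEP_ALIVE_DEFAULT):
    """Return True when the idle timeout is strictly greater than the keep-alive."""
    return idle_timeout > keep_alive
```

With these assumptions, idle_timeout_is_safe(600) returns True, while idle_timeout_is_safe(20) returns False because the idle timeout must exceed the keep-alive value rather than equal it.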