Standard CDC Engine Configurations Supported with Cloud
One of the most common questions is: "Can CDC work with source and/or target servers hosted in the cloud?" The answer is:
- CDC supports any environment (virtual or bare metal, on-premises or cloud-based) that meets the software requirements for the product. Perform the standard verification that the hardware and the database version are supported. As with any other environment, also validate that the port requirements for CDC are met. CDC's software requirements are fairly standard, so most cloud offerings can deliver environments that satisfy them.
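The port validation mentioned above can be scripted. The following is a minimal sketch in Python, not a CDC utility: it only tests whether a TCP connection can be opened from the machine where it runs. The function names are hypothetical, and any host name or port number you pass in must come from your own CDC instance and access server configuration.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port in ports to True (reachable) or False (not reachable)."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}
```

For example, `check_ports("cdc-target.example.com", [11001, 11002])` returns a dictionary keyed by port. The host and ports in that call are placeholders only. A `False` result means the connection could not be opened from this machine, which may point to a firewall or security-group rule rather than to CDC itself.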
Other common questions are: Can CDC replicate from on-premises to cloud? Can CDC replicate within a cloud? Each of these questions leads to others, such as whether the data in flight from on-premises to cloud can be encrypted. It can, and there are multiple ways to accomplish this.
There is a wide variety of configurations that CDC can support with respect to cloud environments, as illustrated below:
All of the above configurations combine the benefits of Change Data Capture technology, such as low-impact, near-real-time database replication, with secure data delivery and all the advantages of cloud.
The following is a non-exhaustive set of examples given to illustrate some of the possibilities. One of the simplest configurations is when CDC is replicating within the cloud as illustrated below:
As mentioned above, you would need to verify that the standard hardware and software requirements are met. Refer to the following link for the latest supported CDC platforms and databases for IIDR 11.4.
The most common requirement for data replication with cloud is replicating from on-premises to cloud, as illustrated in the following diagram:
There are two main deployment models that can be used to implement the above scenario. The options are:
- CDC target engine is installed on the on-premises server and utilizes a remote JDBC connection
- CDC target engine is installed in the cloud
Option one would look like:
In the above, the data moves from the on-premises server to the cloud database server via remote JDBC. A common requirement when replicating from on-premises to cloud is encrypting the data in flight. There are three main techniques for encrypting the data in flight over a remote JDBC connection:
- Use a VPN
- Encrypt the data in flight via SSH
- Use SSL (for DB2 LUW and Oracle target databases)
For details on how to configure SSH or SSL for the above scenarios, refer to the following document.
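As an illustration of the SSH technique, the sketch below builds a generic OpenSSH local port-forward command and the JDBC URL that would point at the local end of the tunnel. This is ordinary SSH port forwarding, not a CDC-specific procedure. The host names, user, ports, and database name are all hypothetical, and the DB2-style URL is only one example of a JDBC URL.

```python
def ssh_tunnel_command(local_port: int, db_host: str, db_port: int,
                       ssh_user: str, ssh_host: str) -> list:
    """Build the argv for an SSH local port forward.

    -N : do not run a remote command, only forward ports
    -L : forward local_port on this machine to db_host:db_port,
         as seen from ssh_host
    """
    forward = f"{local_port}:{db_host}:{db_port}"
    return ["ssh", "-N", "-L", forward, f"{ssh_user}@{ssh_host}"]


def tunneled_jdbc_url(local_port: int, database: str) -> str:
    """Return a DB2-style JDBC URL that targets the local end of the tunnel."""
    return f"jdbc:db2://localhost:{local_port}/{database}"


# Hypothetical values for illustration only.
command = ssh_tunnel_command(50001, "db.internal", 50000,
                             "cdcuser", "gateway.example.com")
url = tunneled_jdbc_url(50001, "SAMPLE")
```

With the tunnel running, a JDBC client on the on-premises server connects to `localhost` on the forwarded port, and SSH carries the traffic encrypted to the cloud side.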
The following diagram illustrates the configuration of the CDC source on-premises, and the CDC target in the Cloud:
In the above scenario, the same requirement exists for encrypting the data in flight from the on-premises server to the cloud DB server. There are multiple options for securing the data in flight. The most common option is to use a VPN. If you do not want to use a VPN, then either SSH or SSL can be used to encrypt the data in flight between the CDC source and target engines.
Specialized CDC Applies
CDC has a specialized apply to target IBM Cloudant. When targeting Cloudant from on-premises, you install both the CDC source and the CDC target on-premises, and the target connects to Cloudant using REST APIs. The connection to Cloudant is secured with HTTPS.
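To make the idea of a REST connection over HTTPS concrete, the sketch below prepares (but does not send) a request against the `_bulk_docs` endpoint of Cloudant's CouchDB-compatible API. It does not represent how the CDC apply is implemented internally. The account URL, database name, and documents are hypothetical, and authentication is omitted.

```python
import json
import urllib.request


def build_bulk_docs_request(account_url: str, database: str,
                            docs: list) -> urllib.request.Request:
    """Prepare a POST to <account_url>/<database>/_bulk_docs over HTTPS."""
    if not account_url.lower().startswith("https://"):
        # Refuse plain HTTP so documents are never sent unencrypted.
        raise ValueError("account_url must use https://")
    url = f"{account_url.rstrip('/')}/{database}/_bulk_docs"
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical account, database, and document for illustration only.
request = build_bulk_docs_request(
    "https://example-account.cloudant.com",
    "orders",
    [{"_id": "order-1001", "status": "SHIPPED"}],
)
```

Passing the prepared request to `urllib.request.urlopen` would send it. A real call would also need credentials for the Cloudant account.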
CDC has a specialized apply to target Hadoop using the WebHDFS REST API. The connection to Hadoop can be secured using HTTPS. Additionally, the CDC WebHDFS apply supports Hadoop installations that are configured to use Kerberos security. This gives you the ability to stream change data from key operational systems into a big-data system in the cloud for timely and reliable insight and improved customer engagement.
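WebHDFS addresses files through URLs of the form `<scheme>://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below only builds such a URL, defaulting to HTTPS. It is an illustration of the REST addressing scheme, not the CDC apply itself. The helper name, host, port, and path are hypothetical, and Kerberos authentication is out of scope here.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str,
                secure: bool = True, **params) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and operation."""
    if not path.startswith("/"):
        raise ValueError("path must be an absolute HDFS path")
    scheme = "https" if secure else "http"
    # The operation is always passed as the 'op' query parameter.
    query = urlencode({"op": op, **params})
    # quote() percent-encodes unsafe characters but keeps '/' separators.
    return f"{scheme}://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical NameNode host, port, and path for illustration only.
status_url = webhdfs_url("namenode.example.com", 9871,
                         "/data/cdc/orders", "GETFILESTATUS")
```

Extra keyword arguments become additional query parameters, so `webhdfs_url(host, port, path, "CREATE", overwrite="true")` appends `&overwrite=true` to the URL.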
How to Configure and Operationally Control CDC within Cloud
Your deployment options also depend on how you want to configure and operate CDC. The key consideration is whether you want to use the Management Console (MC) GUI to administer your replication infrastructure. The MC GUI runs on MS Windows and connects to the CDC agents through access server. You can run MC on a Windows machine outside of the cloud as long as the required ports are open to your CDC cloud agents. In most environments, however, it is not possible to open the required ports into the cloud environment. In that case, you have the following options:
Option one:
- Install MC on a Windows machine outside of the cloud
- Have access server installed outside of the cloud
- Connect from access server to the CDC agent in the cloud via VPN or alternatively via SSH
Option two:
- Install MC on a Windows machine outside of the cloud
- Have access server installed inside the cloud
- Connect from MC to access server in the cloud via VPN or alternatively via SSH
If your entire CDC replication infrastructure is within the cloud, you could choose to also install access server within the cloud and use CHCCLP for all configuration and operational control of CDC.
If all the CDC source and target engines (agents) are installed on-premises, then you would also install access server and MC on-premises. In that case there are no restrictions, since the only item connecting into the cloud is a remote database connection.
11 June 2020