In Arthur’s previous proof of concept, he set up an auto scaling group using Amazon’s Elastic File System (EFS) to provide a high availability and disaster recovery solution for IBM MQ. In this proof of concept, we will take the same AWS setup and modify it to use a Ceph storage cluster as our storage system instead of EFS. Although we will be using AWS as our cloud provider here, in theory the topics covered could be applied to any cloud system.
Ceph is an open source clustered storage solution. It is not tied to any particular cloud provider and so can be used with AWS, OpenStack, Bluemix, etc. Although this proof of concept uses Ceph, we will not be discussing how to set up a Ceph storage cluster; instead we assume you already have a storage cluster set up and are now configuring AWS and MQ to use it. I also assume that you have read Arthur’s previous blog post and understand the concepts covered there. I will not be explaining auto scaling groups or their configuration in detail here – just the differences and problems I had to solve in order to convert Arthur’s example to work with Ceph.
I’ve included links to the CloudFormation template and scripts that I used to create this proof of concept. Note that you may not be able to take the scripts and run them without first modifying them to use your own Ceph storage cluster. You can find the example template and configuration files in this GitHub gist.
The Ceph Storage Cluster
The diagram below shows the storage cluster that I had set up prior to creating the auto scaling group.
My Ceph cluster contained six machines spread over three different zones. Each machine ran one Ceph object storage daemon (OSD); one machine additionally ran the Ceph monitor (which controls the Ceph OSDs) and the Ceph Dashboard webserver. The Ceph Dashboard is a handy GUI webserver I found, written in Python, that provides simple statistics about a Ceph cluster. It was useful for seeing the status of the six nodes and also what effect “destroying” one of the nodes, to simulate the sudden loss of a machine/zone, had on the cluster.
When choosing between the Ceph file system (CephFS) and Ceph block storage, I opted for Ceph block storage in this proof of concept. The reason was that, for this initial investigation into using Ceph with MQ, I wanted to be sure that the file system underneath MQ was one I knew would work with MQ. Using block storage, I had control over the type of file system created on the storage and so could choose one that I knew was supported; with CephFS the file system is provided for you, so additional investigation would be needed to ensure that it works with MQ. Although we are using block storage in this example, it is likely we will revisit Ceph in the future and try CephFS with MQ.
Creating the MQ with Ceph Image
Once the storage cluster was up, I created the image that would be used on the EC2 instances. This image would need the following installed on it in order to be usable (a rough install sketch follows the list):
- IBM MQ
- Ceph client (to connect to the Ceph storage cluster)
- Two configuration scripts that will be executed by the auto scaling group upon instance creation:
  - config.mqsc
  - aws-configure-mq-ceph.sh
- AWS command line
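As promised above, here is a rough sketch of installing the client-side prerequisites on the image. It assumes a yum-based distribution such as Amazon Linux and stock package names; IBM MQ itself is installed from IBM’s own packages and is not shown:

```bash
# Sketch only: install the Ceph client tools and the AWS CLI on the image.
# Package names assume a yum-based distribution; adjust for your OS.
sudo yum install -y ceph-common   # provides the rbd client used to map block devices
sudo yum install -y awscli        # used to fetch the Ceph configuration files from S3
```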
The config.mqsc script is the same as in Arthur’s proof of concept and simply configures the queue manager ready for client applications to connect in via the PASSWORD.SVRCONN channel. The aws-configure-mq-ceph.sh script, however, is new and is split into four main sections:
First, the script uses the AWS command line to retrieve two Ceph configuration files from S3. These files are needed by the Ceph client to connect to the Ceph storage cluster; they are uploaded to S3 ahead of running the script.
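A minimal sketch of this step, assuming a hypothetical bucket name and the standard locations the Ceph client looks in:

```bash
# Sketch: fetch the Ceph client configuration and keyring from S3.
# The bucket name is a placeholder; /etc/ceph is where the client expects these files.
aws s3 cp s3://my-ceph-config-bucket/ceph.conf /etc/ceph/ceph.conf
aws s3 cp s3://my-ceph-config-bucket/ceph.client.admin.keyring /etc/ceph/ceph.client.admin.keyring
```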
Second, we use the configuration files to connect to the Ceph cluster and ask it to allocate block storage for our MQ data root (if it doesn’t already exist), giving it a supplied label. We then map the storage as a block device on our instance, ready to be used.
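Sketched with the rbd tool (the pool name, image name, and size are assumptions standing in for the supplied label and parameters the real script uses):

```bash
# Sketch: create the RBD image only if it does not already exist, then map it.
rbd info rbd/mq-data >/dev/null 2>&1 || rbd create rbd/mq-data --size 10240  # size in MB
# Mapping exposes the image as a local block device and prints its path, e.g. /dev/rbd0
DEVICE=$(sudo rbd map rbd/mq-data)
```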
Third, we attempt to mount the storage device. If that fails, we assume it is because no file system exists on it yet, so we create a file system that MQ supports. Once created, we mount the file system at /var/mqm.
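A sketch of that mount-or-format logic; ext4 is an assumption here, as is the final chown (MQ expects /var/mqm to be owned by the mqm user, but the exact ownership handling in the real script may differ):

```bash
# Sketch: mount the device, formatting it first if no file system exists yet.
mkdir -p /var/mqm
if ! mount "$DEVICE" /var/mqm; then
    mkfs -t ext4 "$DEVICE"       # first boot: the image is blank, so create a file system
    mount "$DEVICE" /var/mqm
fi
chown -R mqm:mqm /var/mqm        # MQ's data directory should be owned by the mqm user
```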
Finally, with the file system mounted under /var/mqm, we create the MQ data root on it before creating and starting a queue manager (unless one already exists). We then run the configuration script against this queue manager to configure it for client connections.
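Roughly, the queue manager steps look like this (the queue manager name QM1 is an assumption; config.mqsc is the script described above):

```bash
# Sketch: create the queue manager only if it does not already exist, then start it.
dspmq -m QM1 >/dev/null 2>&1 || crtmqm QM1
strmqm QM1
# Apply the MQSC configuration that opens the PASSWORD.SVRCONN channel for clients
runmqsc QM1 < config.mqsc
```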
Explaining the CloudFormation template
Once I had the image I needed to create the EC2 instances, I created the CloudFormation template. I used Arthur’s version as a baseline but removed all of the sections on creating and configuring the EFS volume, which obviously was not going to be used in this proof of concept. I also altered the UserData to pass in the new parameters and added those new parameters as requirements to run the CloudFormation template. Ceph requires additional ports (6789, plus the port range 6800–7100), so I opened those in the security group.
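For illustration, here are the equivalent port openings expressed as AWS CLI calls; the security group ID and CIDR are placeholders, and in the template itself this is done declaratively in the security group’s ingress rules:

```bash
# Sketch: open the Ceph monitor port and the OSD port range within the network.
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 6789 --cidr 10.0.0.0/16       # Ceph monitor
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 6800-7100 --cidr 10.0.0.0/16  # Ceph OSDs
```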
The final change I made revolved around solving the problem of “how do I get the required Ceph files to the Ceph client?”. A Ceph client requires two files in order to connect to the storage cluster: a configuration file telling it where the storage cluster is, and a key file allowing it to connect. The solution I chose was to use AWS S3 storage with IAM Roles to control access.
The S3 bucket (with its two objects) was created before running the CloudFormation template; the IAM Role and policy (permissions) to access the S3 bucket are created in the CloudFormation template. By applying an IAM Role with policies to an instance, we are able to connect to the S3 bucket via the AWS CLI without having to supply usernames/passwords or access keys. This allowed the instance, on creation, to retrieve the files it needed for the Ceph client to connect to the storage cluster.
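You can see this mechanism from inside a running instance: the AWS CLI picks up temporary credentials from the instance metadata service automatically, so S3 access works with no keys configured. The bucket name is a placeholder, and this sketch assumes the original (v1) metadata endpoint:

```bash
# Sketch: check the attached IAM role, then list the bucket with no stored credentials.
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/  # prints the role name
aws s3 ls s3://my-ceph-config-bucket/   # works via the role's temporary credentials
```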
Once all of the necessary changes were made, I ran the template in AWS to create the stack. Once created, the auto scaling group and instance worked the same as in Arthur’s blog; the only difference in operation was that the MQ data root was mounted on a file system stored in a Ceph cluster as opposed to Amazon’s EFS.
In conclusion, this proof of concept was designed to show that MQ can be used with Ceph, which I’m happy to say it can. Because Ceph is accessed over a network, your Ceph cluster could be anywhere, as long as AWS can reach it, and in theory Ceph could work on any platform, whether on-prem or cloud. Although there is still some investigation to be performed before you select Ceph as your storage cluster technology of choice (for example performance, cost, etc.), it is good to know that Ceph can work with MQ and as such is viable as a storage technology you can use to replicate MQ data across multiple availability zones in your HA infrastructure.