In this post I would like to share my thoughts on providing a highly available environment for TXSeries systems. I will also talk about the high availability products that help in building such a solution.
A TXSeries system is, in essence, an instance of a CICS region, and that region becomes a single point of failure if your business logic (your applications) runs in it alone. A simple solution is to configure TXSeries with a high availability (HA) product such as HACMP (on AIX) or a container-based HA product (on Solaris). Here you define a primary and a backup server, and the HA product takes care of routing incoming requests to the backup server when the primary is down, whether the shutdown is planned or unplanned. This keeps your business applications available to end users. However, there are pitfalls with this solution. One is that there is downtime while the switchover from the primary to the backup server happens. End users can be dropped from your system during that window and have to reconnect after a short interval. Another is that the primary server may go into a hang state in which it cannot service any more requests, or its CPU consumption may be so high that it services incoming requests very slowly. Hence this might NOT be an acceptable solution for a true 24x7 environment.
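To make the switchover pitfall concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the class name `FailoverPair`, its fields, and the timings are all hypothetical, and it does not reflect how HACMP or any other HA product is actually implemented or configured.

```python
from dataclasses import dataclass


@dataclass
class FailoverPair:
    """Toy primary/backup pair (hypothetical, not an HA product API)."""
    primary_failed_at: float   # time in seconds at which the primary goes down
    switchover_seconds: float  # assumed time the HA product needs to activate the backup

    def route(self, arrival: float) -> str:
        """Return which server handles a request arriving at `arrival` seconds."""
        if arrival < self.primary_failed_at:
            return "primary"
        if arrival < self.primary_failed_at + self.switchover_seconds:
            # Requests arriving during the switchover window are not served.
            return "rejected"
        return "backup"


pair = FailoverPair(primary_failed_at=100.0, switchover_seconds=30.0)
outcomes = [pair.route(t) for t in (50.0, 110.0, 140.0)]
print(outcomes)
```

The request arriving in the middle of the switchover window is the one the end user sees as an outage, which is the downtime described above.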
The other solution I have in mind is an interesting one, because it provides not only high availability but also a scalable environment for your applications. I am talking about the workload management (WLM) tool coupled with the high availability (HA) products. The WLM tool is provided along with the TXSeries product at no extra charge and is available only on AIX. Together, these tools provide a robust solution for running your applications in a TXSeries environment.
The architecture of this solution is as follows. First, you need a network dispatcher that sprays incoming requests across all the physical servers connected to it. These servers primarily host your COR regions (Client Owning Regions), which in turn send each request to one of the AOR regions (Application Owning Regions). Your business logic runs on the AOR regions, which service the end user requests. A major advantage I see in this solution is that the system is always available and the end user sees no downtime. The solution also balances the incoming workload better, because the WLM tool not only re-routes requests to another AOR region when one AOR region is down, but also routes requests based on the health of each AOR region.
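The routing decision a COR makes in this architecture can be sketched roughly as below. This is an illustration only, not the actual TXSeries WLM algorithm: the `Aor` class, the `load` health score, the `max_load` threshold, and the region names are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Aor:
    """Hypothetical view of one Application Owning Region as seen by a COR."""
    name: str
    up: bool
    load: float  # assumed health score: 0.0 is idle, 1.0 is saturated


def pick_aor(regions: List[Aor], max_load: float = 0.9) -> Optional[Aor]:
    """Skip regions that are down or unhealthy, then prefer the least loaded one."""
    candidates = [r for r in regions if r.up and r.load < max_load]
    if not candidates:
        return None  # no healthy AOR is available for this request
    return min(candidates, key=lambda r: r.load)


regions = [
    Aor("AOR1", up=False, load=0.10),  # down: never chosen
    Aor("AOR2", up=True, load=0.35),
    Aor("AOR3", up=True, load=0.95),   # up but overloaded: skipped
]
chosen = pick_aor(regions)
print(chosen.name if chosen else "no healthy AOR")
```

The point of the sketch is that a region which is up but hung or CPU-bound is avoided as well, which the plain primary/backup setup cannot do.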
You might also want to read one of my earlier posts, where I go into detail on some of the better ways of doing workload management for TXSeries systems.
I hope some of this helps you build a robust, highly available solution with the TXSeries CICS product. I would be glad to hear your thoughts, so please feel free to share any comments.
India CICS User Group