This article identifies a possible design option for implementing high availability. The main design goal for implementing high availability is 100% up time which typically requires redundant hardware and software to manage the environment. This article does not take into consideration disaster recovery plans which introduce other complexities about location of the redundant equipment. This article does not address human errors which often contribute to system failures. This article assumes that only two physical servers are available. Additional comment s will be made to discuss what can be done to improve performance if more than two servers are available.
As indicated above one of the most common methods to ensure high availability is to introduce redundancy into the overall architecture of the implementation. Out of the Box Cognos provides high availability within the application servers in the case of failure but in order to provide complete failover other hardware and software capabilities can be introduced to provide additional assurance. Listed below is a description on the various software and hardware components that make up a production ready high availability solution.
This design pattern is based on a modular design. Each physical server in the module is installed with the same software stack. In this example the software components include IBM Cognos BI Server , IBM WebSphere® Application Server, IBM HTTP Server, IBM DB2 Enterprise Server Edition, and IBM Tivoli® System Automation for Multiplatforms (Tivoli SA MP).
Each server in the module is installed with the same Cognos components but all of the components will not necessarily be turned on for each server. The following is a review of the cognos components and then a description of how they are implemented for high availability with the concept of nodes.
Gateway: This component receives requests from users, encrypts passwords, gathers the information needed to submit the request to a Cognos server, and passes the request to a dispatcher for processing. For high availability Tivoli SA MP software is used to implement a virtual service IP address that is referenced by users to connect to the gateway on the corporate network. A resource group is created and managed by Tivoli to detect if either the http server or the machine itself is not accessible.
Report service: This service manages report requests and presents output to users through a Web portal called Cognos Connection
Content Manager: This service manages the storage of application data such as security, configuration data, models, metrics, report specifications, and report output. It is needed to publish packages, retrieve or store report specifications, manage scheduling information, and manage the Cognos namespace
For high availability we will introduce the idea of a BI module. I t represents a combination of cognos components, high availability software and or hardware. The BI module must contain one BI type 1 node and one BI type 2 node, and it can contain between zero and either two or four BI extension nodes.
With this design we also have introduced a concept called tier. Tiers are a logical grouping of services. In our case the gateway tier is the http service, application tier is the report processing tier and in this case the database engine as well. The data tier is a high speed network mounted file system that contains the databases used by the application tier.
The three types of nodes in a BI module follow:
BI type 1 node: The primary role of this node is to handle user requests through the Cognos gateway and to process report requests. The number of reports processed is based on the weight associated with the Cognos dispatcher and depends on the number of BI extension nodes in the module. This node also hosts a standby Cognos Content Manager and provides failover support for the content store. Every BI module includes one type 1 node.
BI type 2 node: The primary role of this node is to host the active Cognos Content Manager, to host the content store database and the audit database, and to process report requests. The number of reports processed is based on the weight associated with the Cognos dispatcher and depends on the number of BI extension nodes in the module. The gateway on this node provides failover support for the gateway on the type 1 node. Every BI module includes one type 2 node.
BI extension node: An optional node introduced if higher performance is required. The primary role of this type of node is report processing, and it is configured to have a maximum report processing capacity. The gateway is installed but not used, and the Content Manager is installed but not started. Every BI module includes between zero and either two or four extension nodes. As you add more extension nodes to the module, a higher number of users can generate reports concurrently A mutual failover high availability configuration is implemented for the BI module using native Cognos BI Server functionality to manage the active Cognos Content Manager and Tivoli SA MP to manage high availability resource groups for the Cognos gateway and BI module database resources.
The primary role of the type 1 node is to handle user requests through the Cognos gateway and to process report requests. The resource group for the Cognos gateway is initially assigned to this node, and its Content Manager is in a standby state. If the server fails or becomes unable to connect to the corporate network, the Tivoli SA MP software detects that the type 1 node is unavailable and automatically performs the failover. The failover and subsequent failback follows this sequence of events:
Failover scenarios: A mutual failover high availability configuration is implemented for the BI module using native Cognos BI Server functionality to manage the active Cognos Content Manager andTivoli SA MP is used to manage high availability resource groups for the Cognos gateway and BI module database resources.
If the type 1node fails, the Tivoli SA MP software detects the failure and transfers the Cognos gateway resources to the type 2 node. If the type 2 node fails, the Cognos BI Server application detects the failure and designates the type 1 node as the active Content Manager. Tivoli SA MP also detects the failure and transfers the BI module database resources to the type 1 node. If an extension node fails, no failover is needed because report processing is available in the other servers.
Both file systems mounted in the BI module are located on the external storage allocated to the BI module. The /coghome and /cogfs file systems contain the DB2 instance that manages the Cognos content store and audit databases, and both are mounted on the type 2 node during normal operations. If the type 2 node fails over, the type 1 node mounts the /coghome and /cogfs file systems and manages the active content store and audit databases .
The figure below shows the minimum configuration possible, in which there is one type 1 node, one type 2 node, and no extension nodes. As this diagram shows, the gateway service IP is assigned to the type 1 node, and therefore user requests are routed to the gateway on the type 1 node. The Content Manager on the type 2 node is active. The Content Manager on the type 1 node is in a standby state, and polls the active Content Manager at a regular interval to determine whether the active Content Manager is still available. The report service is active on both nodes.
Design considerations for improving performance
As mentioned above introducing the BI extension node into the architecture will allow for improved performance for the following reasons. Disabling the Report Service on the type 2 node would be a more optimal configuration since the content store process consumes both cpu and memory as the workload increases. The BI extension node could then share report requests with the type 1 node instead of the type 2 node. The fail over scenarios would work the same as above, except for the fact that the BI Extension node will be handling the Report Request if the type 1 node fails.
A hardware load balancer could also be introduced in front of the gateway servers so multiple gateway servers can process the http requests. To obtain better performance the data tier should also include the dbms and the content manager server process as well. Stay tuned for another post on design options for performance. For more details on this topic please refer to links below.
Note: Most of this material was obtained from the IBM Smart Analytics Business Intelligence Module Guide (LC27-3637-00).
For more information on these kinds of architectures please refer to Administration and Security 10.1.0 at http://publib.boulder.ibm.com/infocenter/cbi/v10r1m0/index.jsp
And IBM Smart Analytics Business Intelligence Module Guide at: