Comment lines by Alexandre Polozoff: Consider multiple cells for redundancy and availability

Alexandre Polozoff (polozoff@us.ibm.com), WebSphere Services Consultant, IBM
Alexandre Polozoff is a Master Inventor and Senior Certified IT Specialist in the IBM Software Services for WebSphere Performance Technology Practice for the WebSphere suite of products. In this role, he works with IBM customers on various high volume and performance related engagements. Mr. Polozoff has an extensive 20 year background in network and telecommunications management, application development, and troubleshooting. He has also published papers and speaks at various conferences on performance engineering best practices.

Summary:  A multiple cell strategy within your IBM® WebSphere® Application Server environment enables you to address planned (and unplanned) maintenance while still providing 24x7 availability. This content is part of the IBM WebSphere Developer Technical Journal.

Date:  04 Nov 2009
Level:  Intermediate

Planning your backup plan

Having a "B" cell is much like having a "plan B."

A two- (or more) cell strategy lets you divert traffic to an alternate cell (cell B) while maintenance is applied to a primary cell (cell A). Similarly, if problems are discovered once cell A is reactivated with the changes applied, it can simply be shut down so that all traffic flows to cell B. This approach is also useful for new application deployments, fixpacks, testing new configuration parameter settings, and so on.

Multiple cells can share the same physical servers. Instead of building out the existing cluster or cell, creating a cell B on the same nodes provides redundancy without the cost of additional hardware. However, deploying each cell on dedicated hardware, while more expensive, reduces the chance that a hardware failure will affect more than one cell.

Multiple cells provide the ability to selectively enable specific infrastructural components to participate in an active production configuration. Through careful control and configuration, various parts of the infrastructure can be removed from the production environment on a planned or unplanned basis.

The objectives of multiple cell configurations are to enable you to:

  • Easily move users from one running environment to another.
  • Minimize (or eliminate) downtime when taking down a part of the environment for planned or unplanned maintenance.
  • Easily revert to a previously known configuration, should a catastrophic failure occur in the primary production environment.
  • Prevent accidental changes to an active configuration running in production.

The load balancer in IBM WebSphere Application Server is configured to send traffic to cell A, cell B, or both. If a cell needs to be removed for maintenance or an upgrade, the load balancer can be directed to send traffic only to the cell that is still running. If the environment is expanded to more than two cells, the same basic strategy applies.


Figure 1. WebSphere Application Server load balancer

Safety and security through scripting

All changes to your WebSphere Application Server configuration should be executed through wsadmin scripts. Your scripts should be able to:

  • Enable (make active) a configuration at the load balancer.
  • Disable (make inactive) a configuration at the load balancer.
  • Make changes to an inactive configuration.
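The three capabilities above can be sketched as a small model. This is a minimal illustration only, written as standalone Python 3 rather than wsadmin Jython. It assumes the load balancer's routing can be represented as a set of active cell names; `LoadBalancerState` and `with_maintenance` are hypothetical names, and a real script would call the load balancer's own administrative interface and wsadmin instead of updating an in-memory set.

```python
class LoadBalancerState:
    """Hypothetical stand-in for the load balancer's routing configuration."""

    def __init__(self, cells, active=()):
        self.cells = set(cells)
        self.active = set(active)
        unknown = self.active - self.cells
        if unknown:
            raise ValueError("unknown cells: %s" % sorted(unknown))

    def _check(self, cell):
        if cell not in self.cells:
            raise ValueError("unknown cell: %s" % cell)

    def enable(self, cell):
        """Make a cell active (start routing traffic to it)."""
        self._check(cell)
        self.active.add(cell)

    def disable(self, cell):
        """Make a cell inactive, but never remove the last active cell."""
        self._check(cell)
        if self.active == {cell}:
            raise RuntimeError("refusing to disable the last active cell")
        self.active.discard(cell)


def with_maintenance(state, cell, change):
    """Take a cell out of rotation, apply a change, then reactivate it.

    If change() raises, the cell stays inactive so that the remaining
    cells keep serving traffic while the problem is investigated.
    """
    state.disable(cell)
    change(cell)
    state.enable(cell)
```

For example, `with_maintenance(state, "A", apply_fixpack)` would route all traffic to cell B while the hypothetical `apply_fixpack` function runs against cell A.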

Identify active cells and configurations

An important part of the process is knowing which cell or configuration is active (cell A, cell B, or both). There are a couple of ways you can do this:

  • Easiest: A static object on an HTTP server can identify which cell is active. This can be a text object that names the active configuration. The scripts can access the static object (for example, through the use of curl) and use it to determine the active configuration. The static object can be a simple text file built by the script as it activates various parts of the runtime environment, or as it analyzes each of the configuration files.
  • More thorough: Write a custom servlet to dynamically determine and display the current active configuration. This will be able to pull information directly from the various configuration files to determine which cell is active, which grid the applications are pointed to, what version of the application is running, what the version of the coherence data is, and so on. This also enables easier auditing of the environment to ensure that everything is properly configured in production. Of course, this will involve some development effort.
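The static-object approach might look like the following sketch. It assumes a marker file made of `key=value` lines with an `active_cells` entry; that file format, the key names, and the helper names are all illustrative assumptions, not something the article prescribes. The code is standalone Python 3 using only the standard library, in place of the curl call mentioned above.

```python
from urllib.request import urlopen


def parse_marker(text):
    """Parse 'key=value' lines; blank lines and '#' comments are ignored."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("malformed marker line: %r" % line)
        info[key.strip()] = value.strip()
    return info


def active_cells(info):
    """Return the cell names listed under the assumed 'active_cells' key."""
    raw = info.get("active_cells", "")
    return [name.strip() for name in raw.split(",") if name.strip()]


def fetch_marker(url, timeout=5):
    """Download the marker from the HTTP server and parse it."""
    with urlopen(url, timeout=timeout) as response:
        return parse_marker(response.read().decode("utf-8"))
```

A script could then call `active_cells(fetch_marker(url))`, where `url` points at wherever the marker file is published, before deciding which servers are safe to change.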

Some monitoring should be provided on the active configuration as a matter of course, and an informational alert should be sent to those responsible for the infrastructure when a change occurs. This enables immediate analysis in case the change was accidental or otherwise unplanned.
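One way to sketch such a check is to compare the active configuration between two polls and send a message only when it differs. The function names here are hypothetical, and `read_active` and `notify` are placeholders for whatever lookup and alerting mechanisms your environment already provides.

```python
def describe_change(previous, current):
    """Return a human-readable summary of the change, or None if unchanged."""
    previous, current = set(previous), set(current)
    if previous == current:
        return None
    parts = []
    added = sorted(current - previous)
    removed = sorted(previous - current)
    if added:
        parts.append("activated: " + ", ".join(added))
    if removed:
        parts.append("deactivated: " + ", ".join(removed))
    return "Active configuration changed (" + "; ".join(parts) + ")"


def check_once(previous, read_active, notify):
    """Poll once, alert on a difference, and return the new active set."""
    current = set(read_active())
    message = describe_change(previous, current)
    if message is not None:
        notify(message)
    return current
```

A scheduler would call `check_once` at a regular interval, passing the result of each call as `previous` on the next one.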

Preventing changes to active servers

Your scripting should also incorporate some intelligence about which configuration is active, so that changes are not made on active servers and production downtime is avoided. Scripts should use the information captured above to identify the active configuration and map it to the specific servers that are active.

You could provide an override flag to enable actions on active servers, but using it should require, at the very least, some level of user ID and password authorization.
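A guard of this kind could be sketched as follows. The cell-to-server mapping, the server names, and the `authorized` callback are all hypothetical; in practice the mapping would come from your configuration, and the callback would perform a real user ID and password check.

```python
# Hypothetical mapping of cells to the servers that belong to them.
CELL_SERVERS = {
    "A": ["serverA1", "serverA2"],
    "B": ["serverB1", "serverB2"],
}


def active_servers(active_cells, cell_servers):
    """Map the active cells to the set of servers currently taking traffic."""
    servers = set()
    for cell in active_cells:
        servers.update(cell_servers[cell])
    return servers


def guard_change(targets, active_cells, cell_servers,
                 override=False, authorized=None):
    """Raise PermissionError if any target server is active.

    The override is honored only when an authorization callback is
    supplied and returns True.
    """
    live = active_servers(active_cells, cell_servers)
    blocked = sorted(set(targets) & live)
    if not blocked:
        return
    if override and authorized is not None and authorized():
        return
    raise PermissionError(
        "targets are serving traffic: " + ", ".join(blocked))
```

Calling `guard_change` at the top of every change script means that a mistyped cell name results in an error rather than in a change to production.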

Reverting to a previous configuration

The easiest way to back out to a previous deployment and configuration of the production environment is to not immediately make any changes to the “inactive” configuration after a new configuration is made active. Leave the inactive configuration in a warm ready state (that is, the servers are left running even though no traffic is flowing through that configuration). If it is determined that the new (or current) active configuration is not working as desired, you can simply point the load balancers to the previous configuration and make the inactive configuration active again. This enables troubleshooting in production without affecting the availability of the environment.
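The bookkeeping behind this rule can be sketched in a few lines. The `Deployment` class is a hypothetical model rather than a WebSphere API: it only tracks whether the standby configuration still holds the previous known configuration, which is the condition that makes a quick revert possible.

```python
class Deployment:
    """Tracks the active and standby configurations (hypothetical model)."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby
        # True only while the standby is the untouched previous configuration.
        self.standby_is_previous = False

    def switch_over(self):
        """Activate the standby; the old active becomes the warm fallback."""
        self.active, self.standby = self.standby, self.active
        self.standby_is_previous = True

    def modify_standby(self):
        """Record that the standby was changed and is no longer a fallback."""
        self.standby_is_previous = False

    def revert(self):
        """Point traffic back at the previous configuration, if still intact."""
        if not self.standby_is_previous:
            raise RuntimeError(
                "standby no longer holds the previous configuration")
        self.active, self.standby = self.standby, self.active
        self.standby_is_previous = False
```

The design choice worth noting is that `revert` refuses to run once the standby has been modified, which mirrors the advice to leave the inactive configuration alone until the new one has proven itself.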


Summary

Providing redundancy is one strategy for sites with high availability requirements. Scripts that perform repeatable and testable tasks provide confidence that changes made in one cell can be propagated to the other cells. Scripts also manage where users are directed, whether to a particular cell or a group of cells, and ensure that the shift is accomplished seamlessly. Upgrades and other maintenance activities can also be conducted with assurance that a viable back-out strategy is available. With adequate hardware isolation, any unplanned outage can be contained to one cell. Likewise, migrations and fixpack updates can be conducted independently on each cell without affecting the others.

In other words, coming up with a plan B should always be a part of plan A.

