Level: Intermediate
Mahi R. Inampudi (inampudi@us.ibm.com), Lead IT Architect, IBM
Murali Narasimhadevara (muralin@us.ibm.com), Senior IT Architect and Webmaster, IBM
10 Jan 2006
Learn how one team, the IBM intranet portal
team, upgraded the IBM internal enterprise applications
infrastructure. This article, the first in a series,
explains the problems to be solved, the proposed solutions,
and how the team uses the features of WebSphere®
Extended Deployment to achieve their goals.
Introduction
Enterprise hosting centers face growing challenges in
meeting Service Level Agreements (SLAs), while carefully
managing infrastructure costs. This series explores the
need for and implementation of autonomic computing
methodologies. You'll learn how you can optimize your
resource usage, and better control costs, by following
this team's success. The IBM intranet portal team chose
IBM's WebSphere Extended Deployment to upgrade IBM's
internal enterprise applications infrastructure. This
series discusses the architecture, deployment model, and
lessons learned from the deployment of WebSphere
Extended Deployment and application resiliency design
patterns.
If you're a CIO, IT Architect, or IT Infrastructure
Manager looking to deliver better resiliency, reduce
costs, and increase application availability, then this
article is for you. By introducing autonomic computing
into your environment and using the features of
WebSphere Extended Deployment, you can optimize your
resource usage and better control infrastructure costs.
You can also improve your ability to meet SLAs.
Service level agreement (SLA): A contract that defines an
understanding between a service provider and their
customer. It sets expectations for performance, and
defines the procedure and reports needed to track
compliance. An SLA should contain the service to be
performed, performance expectations of the service
provider, process for reporting problems with service,
time frame for problem resolution, penalties for
noncompliance, escape clauses, and more.
This article explains the benefits of IBM's recent
deployment of WebSphere Extended Deployment into its
intranet production environment. It outlines the
business problems, and solutions, that guided the team's
selection of products. To better understand this
journey into autonomic computing technology, we'll first
give an overview of the environment and business
requirements behind the project.
IBM's corporate intranet
IBM employees use the corporate intranet portal, or set
of applications, to work, collaborate, and learn in an
On Demand Workplace environment. (See the On
Demand Workplace series for more information.)
The intranet portal is based on dynamic infrastructure,
and provides profiled delivery of content and access to
new tools and applications to help employees manage
their workflow, collaborate, and gain knowledge. The
intranet portal is a framework where many
business-critical applications are aggregated using
WebSphere Portal, WebSphere Extended Deployment,
Tivoli® products, DB2®, and more.
Business problems
The IBM intranet infrastructure handles high volumes of
traffic, averaging 30 million requests a day, while
maintaining a sub-second transaction response time for
many applications. Enterprise applications within the
On Demand Workplace share common services and hosting,
with special dedicated systems for high volume and
critical applications. Some applications, for example,
had a dedicated infrastructure to support the peak load
that the application receives between 10 a.m. and 2 p.m.
every day.
Special events, such as executive webcasts or
collaborative online jams, drove new capacity
requirements. A side effect of deploying applications on
dedicated hardware is an unacceptable amount of white
space with low utilization on some systems. And, due to
sharing of key services such as common front-end proxy
infrastructure, back-end DB2 resources, and Lightweight
Directory Access Protocol (LDAP) directory lookups,
problems in any particular service layer were affecting
the overall availability of most applications.
A comprehensive strategy was needed to solve the
following business problems.
- Capacity sizing: Over-provisioning of infrastructure, indeterminate
sizing methods, peak traffic during special events,
costly capacity sizing processes
- Utilization: Low resource utilization, excess white space in CPU
and memory
- System administration: Manual and error-prone administrative methods for
creating additional resources, slow response in
procuring additional capacity and application
deployment, end-to-end monitoring with SLAs not
being met
- Resiliency: Cascading failures with no application isolation or
control, availability expectations challenged
- Cost concerns: Hosting costs, maintenance, and support
To solve these problems, the team explored using
autonomic computing principles and IBM's new breed of
products such as Virtualization Engine (VE), WebSphere
Extended Deployment, Enterprise Workload Manager (eWLM),
and Tivoli Intelligent Orchestrator. With the goals of
shared infrastructure and dedicated resources at the
portfolio level, the team examined the pros and cons of
deploying all intranet-related applications into a
virtualized environment, and sharing resources across
all applications. The latest release of WebSphere
Extended Deployment would solve many of the business problems.
WebSphere Extended Deployment Version 6.0 enables a
Business Grid, a dynamic, goals-directed,
high-performance application environment for running
mixed application types and workload patterns in
WebSphere applications. This technology extends the
capabilities of the WebSphere platform, helping you deal
with IT scalability and performance challenges.
Solutions
When deploying WebSphere Extended Deployment, our goal
was to solve the problems outlined in Table 1 with the
associated solutions.
Table 1. Problems and solutions
| Problem | Solution |
|---|---|
| Lack of application resiliency | Allow intranet applications to recover from unexpected problems, and provide high availability with the WebSphere Extended Deployment On Demand Router (ODR) layer. |
| Inefficiency | Increase server utilization by virtualizing computing resources into pools, which are shared among applications and portfolios based on application criticality and usage on demand. Achieved by combining separate WebSphere clusters into a single huge cluster, and assigning proper service policies. |
| High cost | Server consolidation and reduced system administration will provide cost savings, but won't compromise the capacity of any intranet portfolio applications. Consolidate all hosted WebSphere clusters across business organizations into a single entity, driving down overall hardware costs by 25-35%. |
| Poor autonomic computing | Provision and deprovision WebSphere Application Server servers for an application based on its resource usage and peak usage. This lets other critical applications use the virtual resources that WebSphere Extended Deployment creates. |
| Poor monitoring | Monitor all portfolio applications in a single dashboard approach, making it easier to see the big picture and pinpoint problems quickly in an environment with numerous applications, leading to higher availability. |
Architecture
In the past, the IBM intranet WebSphere infrastructure
included multiple dedicated WebSphere clusters. The
rationale for dedicated clusters was: criticality of the
application, a need to isolate applications from one
another, and the load of some of the applications.
Dedicated clusters made the cost of the entire portfolio
infrastructure very high because of extra maintenance
and required hardware. The actual capacity is usually
more than what the portfolio needs, and there is
considerable white space on all the dedicated servers
that existed to support peak traffic requirements. With
WebSphere Extended Deployment, the team was assured of
resource availability for critical applications on an
all-shared hosting environment, and reduction in total
hardware resources.
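The white-space argument above can be illustrated with a small sketch. The hourly load figures below are hypothetical (they are not measurements from the IBM intranet), and the function names are ours. The point is only that dedicated clusters must each be sized for their own peak, while a shared pool needs to cover the peak of the combined load.

```python
# Hypothetical load per time slot, in abstract "server units".
# These figures are illustrative only and do not come from the article.
hourly_load = {
    "app_a": [2, 2, 8, 3],  # peaks in the third slot
    "app_b": [1, 6, 2, 1],  # peaks in the second slot
    "app_c": [3, 1, 1, 7],  # peaks in the fourth slot
}


def dedicated_capacity(loads):
    """Each application gets its own cluster, sized for its own peak."""
    return sum(max(series) for series in loads.values())


def shared_capacity(loads):
    """One shared pool, sized for the peak of the combined load."""
    combined = [sum(slot) for slot in zip(*loads.values())]
    return max(combined)


dedicated = dedicated_capacity(hourly_load)
shared = shared_capacity(hourly_load)
print(f"dedicated clusters: {dedicated} units, shared pool: {shared} units")
```

Because the applications in this made-up data peak at different times, the shared pool needs far fewer units than the sum of the dedicated clusters. If all applications peaked in the same slot, the two numbers would be equal and pooling alone would save nothing.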
Typical WebSphere Extended Deployment architecture
includes the ODR component in front of the back-end
application servers (supports both WebSphere and
non-WebSphere application servers). ODR weighs each type
of URI and calculates how many resources it typically
needs for processing. It also decides if the
application is consuming more than it needs to for a
given resource, and thereby detects application failures
needing corrective action.
Figure 1 shows our WebSphere
Extended Deployment architecture.
Figure 1. WebSphere Extended
Deployment Architecture diagram
A typical flow through the architecture in Figure 1:
- Application clients such as Web browsers make
calls to URLs.
- WebSphere Caching Proxy server either serves the
response from its local cache for static
content, or forwards it to the Web servers or ODRs.
- WebSphere Caching Proxy rules decide whether the
request goes to the IBM HTTP Server cluster, or
ODR cluster, depending on whether the URL is a
static or dynamic request.
- If the request is a dynamic servlet request, the ODR,
acting as a proxy server, decides which
back-end node's application server should handle
the request. The decision is based on the data
it captures regularly to determine the resource
utilization on back-end nodes, and how much
capacity (such as CPU cycles) this URL or
transaction might require for processing.
- WebSphere Extended Deployment's administrative
console's run-time reports and charts show a lot
of information on the run-time status of all
back-end nodes and performance of ODRs. The
performance information at the transaction or
URI level would help with problem determination.
For example, if a URI that involves LDAP calls
slowed down considerably, the response
times shown on the admin console's charts would tell
developers or administrators that there is probably
an LDAP issue at that moment.
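The routing decisions in this flow can be sketched as a toy model. This is not the ODR's actual algorithm or any WebSphere API: the `Node` class, the `route` function, the static-suffix rule, and the per-URI cost table are all hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A back-end node with a hypothetical CPU budget."""
    name: str
    capacity: float      # total CPU units the node offers
    used: float = 0.0    # CPU units currently in use

    @property
    def free(self) -> float:
        return self.capacity - self.used


# Assumed rule: URLs with these suffixes are treated as static content.
STATIC_SUFFIXES = (".html", ".css", ".js", ".gif", ".jpg")


def route(url: str, cache: set, nodes: list, uri_cost: dict) -> str:
    """Return the name of the tier or node that handles the request."""
    if url in cache:
        return "caching-proxy"   # answered from the proxy's local cache
    if url.endswith(STATIC_SUFFIXES):
        return "http-server"     # static request goes to the HTTP servers
    # Dynamic request: estimate its cost, then pick the node with the
    # most free capacity that can still absorb that cost.
    cost = uri_cost.get(url, 1.0)
    candidates = [n for n in nodes if n.free >= cost]
    if not candidates:
        return "no-capacity"
    target = max(candidates, key=lambda n: n.free)
    target.used += cost
    return target.name


nodes = [Node("node1", capacity=10.0, used=7.0),
         Node("node2", capacity=10.0, used=2.0)]
cache = {"/images/logo.gif"}
uri_cost = {"/portal/search": 3.0}

for url in ("/images/logo.gif", "/css/site.css", "/portal/search"):
    print(url, "->", route(url, cache, nodes, uri_cost))
```

The sketch keeps only the two ideas named in the flow: static versus dynamic routing at the proxy, and a placement choice based on observed utilization plus an estimated per-URI cost.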
Service policies
Making sure that critical applications had the required
resources, when needed, was a high-priority goal for our
team. A predictive feature of WebSphere Extended
Deployment is the use of service policies. This
section briefly describes WebSphere Extended Deployment
service policies, and how the policies helped meet the goal.
As mentioned in Business problems,
dedicated infrastructure for critical
applications leaves servers greatly
underutilized during non-peak hours. Removing the need
for that dedicated infrastructure is a great benefit of
WebSphere Extended Deployment. WebSphere Extended
Deployment can govern all the critical applications by
providing them the required resources when they're
needed. This becomes very important when there is
resource contention between heavily used, less-critical
applications versus very critical applications. When
machines are overused, critical applications are
serviced, while applications given lesser precedence
receive fewer resources.
Service policies are a factor in controlling application
placement. Placement lets administrators prioritize the
work and define the business or performance goals of
applications. Just like other autonomic computing
designs, WebSphere Extended Deployment includes policy
definitions. Architects, working with business teams,
define the application's business goals within the
service policies by assigning the importance of each
application and creating a business goal. Service
policies work with the ODR to meet the application's
business goals. The information is fed into WebSphere
Extended Deployment, which uses the information to
protect critical application resources on shared
environments, allowing the infrastructure to still
achieve the desired SLAs.
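The effect of importance under contention can be sketched with a deliberately simple strict-priority model. This is an assumption made for illustration, not the product's placement or flow-control algorithm, and the application names, importance values, and demand figures are hypothetical.

```python
def allocate(capacity, demands):
    """Grant capacity in order of importance (1 = most critical).

    demands is a list of (application, importance, demand) tuples.
    Returns a dict mapping each application to the capacity it receives.
    """
    allocation = {}
    remaining = capacity
    for app, _importance, demand in sorted(demands, key=lambda d: d[1]):
        granted = min(demand, remaining)
        allocation[app] = granted
        remaining -= granted
    return allocation


# Hypothetical contention: 15 units demanded, only 10 available.
demands = [
    ("reporting", 3, 6),
    ("directory", 1, 5),
    ("collaboration", 2, 4),
]
allocation = allocate(10, demands)
print(allocation)
```

In this model the most critical application is fully served first and the least critical one absorbs the shortfall, which matches the behavior described above: when machines are overused, applications given lesser precedence receive fewer resources.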
Figures 2 and 3
show the process our team used to create mappings
between applications in the infrastructure and the
governing policies that need to be applied to WebSphere
Extended Deployment.
Figure 2 shows different service
policies that are created for the intranet portal, and
the importance levels. During the application boarding
plan, each application owner or architect needs to
answer questions to help the CIO Technology Team decide
which service level is right for a given application.
The questions cover the business criticality of the
application, the revenue generated by the application, and
technical information such as a list of transactions and
the expected average response time of each
transaction.
Figure 2. Applications and service
level mapping
Figure 3 shows the types of service
policies and the different range of response times for
the transactions that fall under a given service level.
For example, sample application A, identified as
Platinum, could have three different transactions, each
falling under three different buckets of response time
expectations. Transaction 1 URIs are expected to give
an average response time of 500 msec, transaction 2
URIs fall under the 1500 msec bucket, and transaction 3
URIs fall under the 3000 msec bucket.
Figure 3. Service levels and response
time bucket mapping
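The response-time buckets described for sample application A can be written down as a small data structure and checked against measurements. The goal values (500, 1500, and 3000 msec) come from the text. The dictionary layout, the `goal_report` function, and the sample measurements are hypothetical and are not the format WebSphere Extended Deployment uses.

```python
from statistics import mean

# Average response-time goals (msec) for sample application A (Platinum).
PLATINUM_GOALS_MSEC = {
    "transaction_1": 500,
    "transaction_2": 1500,
    "transaction_3": 3000,
}


def goal_report(goals, samples):
    """Map each transaction to (observed average, goal met?)."""
    report = {}
    for txn, goal in goals.items():
        observed = mean(samples[txn])
        report[txn] = (observed, observed <= goal)
    return report


# Hypothetical measurements in msec, for illustration only.
samples = {
    "transaction_1": [420, 480, 510],
    "transaction_2": [1400, 1700, 1600],
    "transaction_3": [2500, 2900, 2700],
}

report = goal_report(PLATINUM_GOALS_MSEC, samples)
for txn, (observed, met) in report.items():
    print(f"{txn}: average {observed:.0f} msec, goal met: {met}")
```

Note that a single slow request does not break an average-based goal: in the made-up data, one transaction 1 sample exceeds 500 msec, yet the average stays within the bucket.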
Health policies
This section briefly introduces WebSphere Extended
Deployment health policies. WebSphere Extended
Deployment provides a health management and monitoring
system. While the WebSphere Extended Deployment
environment saves infrastructure costs by increasing
server resource utilization, it also encourages shared
infrastructure for applications. Some applications
occasionally behave oddly and suddenly
overuse resources, such as CPU or memory. Under
certain conditions, some applications also show
memory leaks. WebSphere Extended Deployment's health
management protects applications and infrastructure from
such common scenarios using health policies.
A health policy configuration defines preventative
and detection-based policies to ensure the vitality of
your server environment. A system restart is used to
flush out the environment. WebSphere Extended Deployment
provides four different types of health conditions that
can be monitored:
- Age-based
- Workload
- Excessive memory
- Excessive response time
Depending on the reaction mode configured, WebSphere Extended
Deployment can either simply monitor, or send event
notification e-mail to developers or
administrators. When in "supervise" mode,
WebSphere Extended Deployment creates corrective actions
that need to be approved by the administrator. When in
"automatic" mode, WebSphere Extended
Deployment can automatically take actions, such as
restarting the server or taking thread dumps. For IBM
intranet applications, the excessive memory usage health
policy is configured.
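The difference between the two reaction modes can be sketched as follows. This is a toy model rather than the product's health controller: the `ReactionMode` enum, the `react` function, the 85% threshold, and the action strings are all assumptions made for illustration.

```python
from enum import Enum


class ReactionMode(Enum):
    SUPERVISE = "supervise"   # corrective actions need administrator approval
    AUTOMATIC = "automatic"   # corrective actions run without approval


def react(mode, server, memory_pct, threshold_pct=85.0, approved=False):
    """Return the actions taken for one excessive-memory health check.

    threshold_pct is a hypothetical limit, not a product default.
    """
    if memory_pct <= threshold_pct:
        return []  # condition not breached, nothing to do
    actions = [f"notify admins: {server} memory at {memory_pct:.0f}%"]
    if mode is ReactionMode.AUTOMATIC or approved:
        actions.append(f"restart {server}")
    else:
        actions.append(f"await approval to restart {server}")
    return actions


supervised = react(ReactionMode.SUPERVISE, "server1", 92.0)
automatic = react(ReactionMode.AUTOMATIC, "server1", 92.0)
print(supervised)
print(automatic)
```

In supervise mode the restart is only proposed until an administrator approves it (modeled here by the `approved` flag), while in automatic mode the same breach leads straight to the restart.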
Conclusion
This first part in our series introduced how the IBM
intranet portal team strove to achieve autonomic
computing in an on demand environment. This article
explained the business problems and solutions outlined
by the team, and their rationale for choosing WebSphere
Extended Deployment to solve specific problems within
the intranet framework. We also introduced the high
level architecture, and some of the cost-saving
features, of WebSphere Extended Deployment.
The next installment in our series will go into details
about WebSphere deployment tasks, and will provide
lessons learned and best practices. A subsequent
article will discuss some of the autonomic application
resiliency design patterns for application developers.
Stay tuned!
Acknowledgements
The authors would like to thank Brian K. Martin, Anthony
R. Tuel, Wolfgang Segmuller, Priyanka Jain, and Keith
Smith from the IBM WebSphere Software Group for their
help in deploying WebSphere Extended Deployment in the
IBM intranet infrastructure.
About the authors
Mahi R. Inampudi is the lead IT architect for IBM's On
Demand Workplace expertise location system (BluePages).
Other responsibilities include the architecture and
solution design for several of IBM's internal offerings
and collaborating with the CIO office and IBM Research
helping design applications using the latest SOA
methods. Recent interests include leveraging newer
technologies, such as WebSphere Extended Deployment, the
Rational product suite, and IBM's intraGrid architecture.
Murali Narasimhadevara is a Senior IT architect with
the IBM CIO office. Murali is also the Senior Webmaster
for the IBM intranet, and has been helping develop it
for the past eight years. He has extensive experience in
building and managing high volume Web sites, application
and Web server administration with a focus in WebSphere,
performance/capacity planning, and enterprise
application design. His areas of interest are in
autonomic and utility computing for managing Web infrastructures.