
Design for Scalability - an Update

Willy Chiu (wchiu@us.ibm.com), Vice-President, High Volume Web Sites, Software Group (AIM Division)

Summary:  This paper describes component selection and management techniques you can use to make your Web site ready to adapt to increasing traffic. These techniques are the product of IBM's experience working with customers seeking to improve the performance and availability of some of the world's largest Web sites. Updated September 2001.

Date:  17 Apr 2001
Level:  Introductory

Abstract

Optimizing for scalability remains a significant challenge for e-businesses as they balance the demands for availability, reliability, security, and high performance. Vendors are responding with infrastructure options and supporting hardware and software platforms that address these requirements. This update identifies current products and emerging trends that are most likely to improve the scalability of your e-business infrastructure.


Executive summary

The first (December 1999) version of this paper recommended techniques for selecting and managing e-business infrastructure components with the objective of optimizing for scalability and high performance. In that context, we introduced the significance of workload patterns and how different patterns yield different scalability challenges, introduced a process for classifying a Web site and applying scaling techniques, and described the best practices that we had discovered in our work to date with large IBM customers. A lot has happened in the last 18 months. We present this update to:

  • Confirm our continued understanding that the fundamentals of our recommended methodology for classifying workload and applying scaling techniques remain sound and useful.
  • Refresh the discussion of scaling techniques to include new, available technologies such as edge servers, the Java™ 2 Platform, Enterprise Edition (J2EE) technologies, and the HVWS Simulator.
  • Illustrate how latency, a key issue in scalability and performance, relates to workload patterns.
  • Provide new statistics that demonstrate the effectiveness of certain scaling techniques.
  • Identify the common mistakes that inhibit successful scaling.
  • Provide guidance on the implementation of two or three physical tiers.
  • Identify emerging trends that will affect scalability.

The significance of designing for scalability and high performance cannot be overstated. We know from analysts that slow performance costs e-business sites many millions of dollars per month. Volumes are growing as well. For example, the 2001 Wimbledon site had nearly twice the number of unique users, and three times the number of page views, as the 2000 site. An e-commerce site we work with saw its holiday season peak hits grow from just under 1 million hits per hour in 1999 to over 5 million hits per hour in 2000. We work with an online trading site whose page views per day grew from approximately nine million to approximately 16 million over the same period. Page views per day at an auction site grew from approximately 65 million to approximately 200 million over the same period.

You know that the success of your company's e-business depends on your organization's ability to design and implement an infrastructure that yields the measures of high performance, availability, and reliability expected to support the business objectives for revenue and customer satisfaction. The infrastructure consists of hardware, software, and network components you select for their ability to meet your needs of today and tomorrow.

Scalability refers to a component's ability to adapt readily to a greater or lesser intensity of use, volume, or demand while still meeting business objectives. Understanding the scalability of the components of your e-business infrastructure and applying appropriate scaling techniques can greatly improve availability and performance. Scaling techniques are especially useful in multi-tier architectures when you evaluate components associated with the edge servers, the Web presentation servers, the Web application servers, and the data and transaction servers.

The objective of this paper is to introduce scalability and scaling techniques and to help you understand that to optimize for the success of your company's e-business, you must evaluate all new and upgraded components of your infrastructure for scalability. IBM's IT experts have been working with customers to analyze many of the world's largest Internet and intranet sites, including IBM's own, to determine which attributes affected scalability and to help customers implement scalable Web sites. This paper will help you understand your workload patterns and classify your site. You'll learn which scaling techniques are best for specific components and how other large customers have realized the benefits of scaling using IBM's middleware products, such as WebSphere®, MQSeries®, DB2®, and Tivoli®.


Introducing workload patterns and Web site classifications

The IT infrastructures that comprise most high-volume Web sites (HVWSs) present unique challenges in design, implementation, and management. While actual implementations vary, Figure 1 below shows a typical e-business infrastructure comprised of several tiers. Each tier handles a particular set of functions, such as serving content (Web servers such as the IBM HTTP Server), providing integration business logic (Web application servers such as the WebSphere Application Server), or processing database transactions (transaction and database servers, such as CICS® and DB2).


Figure 1. Multi-tier infrastructure for e-business.

IBM's IT experts have been working with IBM customers to architect and analyze many of the world's largest Web sites. Figure 2 below shows how IBM's HVWS team defines the life cycle of a Web site. As the HVWS team accumulates experience and knowledge, it compiles white papers aimed at helping IT professionals like you understand and meet the new challenges presented during one or more of the phases. These white papers are available at High Volume Web Sites Zone.


Figure 2. Life cycle of a Web site.

Figure 2 above shows that knowing your workload is the foundation of our recommended best practices for high-volume Web sites. It is key during all phases of the life cycle.

Most high-volume Web sites experience volumes that can vary widely on a seasonal or other cyclical basis, or that exhibit burstiness as a result of sudden and unpredictable changes in user demand. Figure 3 below shows how the volumes of four different actual Web sites vary by as much as a factor of 5 to 10.


Figure 3. Some typical Web site loads over a 24-hour period.

An example of a bursty site would be an online trading site at the opening of the market, or a sporting site during a very popular event. Such variation underscores the challenge of planning for volumes, and suggests that planning for average volumes is unlikely to be effective. Other factors that define the workload pattern include the volume of page views and transactions, the volume and type of searches, the complexity of transactions, the volatility of the data, and security considerations.
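To make the point about averages concrete, the following minimal sketch computes a peak-to-average ratio from hourly hit counts. The hourly figures are invented for illustration and do not come from any of the sites discussed in this paper; the function name is likewise hypothetical.

```python
from statistics import mean


def peak_to_average(hourly_hits):
    """Return (peak, average, peak/average ratio) for a list of hourly hit counts."""
    if not hourly_hits:
        raise ValueError("hourly_hits must not be empty")
    peak = max(hourly_hits)
    average = mean(hourly_hits)
    return peak, average, peak / average


# Hypothetical 24-hour profile (hits per hour): quiet overnight, with a burst
# in one hour, such as a market opening. These numbers are made up.
SAMPLE_DAY = [100_000] * 9 + [1_000_000] + [300_000] * 6 + [100_000] * 8

peak, average, ratio = peak_to_average(SAMPLE_DAY)
print(f"peak={peak:,} average={average:,.0f} ratio={ratio:.1f}")
```

In this made-up profile, a site sized for the average hour would face more than five times that load during the peak hour.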

Figure 4 below shows how seasonality can affect retail sites; notice the peak during the December holiday season and how it grew from 1999 to 2000.


Figure 4. Example of a retail site with seasonal peaks, growing from year to year.

Figure 5 below is an example of the broad range of hits per day versus page views per day.


Figure 5. Examples of metrics for page hits per day.

Workload patterns vary, and sites with similar patterns can be classified into site types. We've identified five distinct workload patterns and corresponding Web site classifications. Appendix A contains a guide for determining your workload pattern and selecting scaling techniques.

Publish/subscribe Web sites provide users with information

Sample publish/subscribe sites include search engines; media sites, such as weather.com and numerous newspapers and magazines; and event sites, such as those for the Olympics and the Wimbledon championships. Site content changes frequently, driving changes to page layouts. While search traffic is low in volume, the number of unique items sought is high, resulting in the largest number of page views of all site types. As an example, the Sydney Olympics site successfully handled a peak volume of 1.2 million hits per minute using IBM's WebSphere Edge Server. The Wimbledon 2000 site successfully handled a peak volume of 430,000 hits per minute using IBM's WebSphere Edge Server. The Wimbledon 2001 site handled 208.5 million page views, three times the number of the 2000 site, as well as almost twice the number of unique users. Security considerations are minor compared to other site types. Data volatility is low. This site type processes the fewest transactions and has little or no connection to legacy systems.

Online shopping sites let users browse and buy

Sample sites include typical retail sites where users buy books, clothes, and even cars. Site content can be relatively static, such as a parts catalog, or dynamic where items are frequently added and deleted as, for example, promotions and special discounts that come and go. Search traffic is heavier than the publish/subscribe site, though the number of unique items sought is not as large. Data volatility is low. Transaction traffic is moderate to high, and almost always grows. The typical daily volumes for many large retail customers running on IBM's WebSphere Commerce Suite range from less than one million hits per day to over 50 million hits per day, with a range from 100,000 transactions per day to three million transactions per day for the higher-volume sites; of the total transactions, typically between 1% and 5% are buy transactions. When users buy, security requirements become significant and include privacy, nonrepudiation, integrity, authentication, and regulations. Shopping sites have more connections to legacy systems, such as fulfillment systems, than the publish/subscribe sites, but generally less than the other site types.

Customer self-service sites let users help themselves

Sample sites include banking from home, tracking packages, and making travel arrangements. Home banking customers typically review their balances, transfer funds, and pay bills. Data comes largely from legacy applications and often comes from multiple sources, thereby exposing data consistency issues. Security considerations are significant for home banking and purchasing travel services, less so for other uses. Search traffic is low volume; transaction traffic is moderate, but growing rapidly.

Trading sites let users buy and sell

Of all site types, trading sites have the most volatile content, the highest transaction volumes (with significant swings), and the most complex transactions, and they are extremely time sensitive. Auction sites are characterized by highly dynamic bidding against items with predictable lifetimes. Products like IBM's WebSphere Application Server have the performance features that enable these sites to meet customer demand. Trading sites are tightly connected to the legacy systems, for example, using IBM's MQSeries for connectivity. Nearly all transactions interact with the back-end servers. Security considerations are high, equivalent to online shopping, with an even larger number of secure pages. Search traffic is low volume.

Using Web services, business-to-business sites buy from and sell to each other

These sites include dynamic programmatic links between arms-length businesses (where a trading partner agreement might be appropriate). One business is able to discover another business with which it may want to initiate transactions. Supply chain management is an example. Data comes largely from legacy applications and often comes from multiple sources, thereby exposing data consistency issues. Security requirements are equivalent to online shopping. Transaction volume is moderate, but growing; transactions are typically complex, connecting multiple suppliers and distributors.


Introducing scalability

Most e-businesses face the same challenges, the most significant of which are unpredictable growth and the ability to have solutions ready for unknown problems. If your Web site is typical, it most likely started with displaying company information and has evolved to processing simple, if not complex, transactions. You know now that the skills required to display information are different from those required to process transactions. Failure to optimize graphics, frequent table scans and joins of multiple tables, and the resulting I/O bottlenecks combine to degrade performance. Site availability is stressed by unpredictable traffic and inadequate discipline regarding systems management. The problems you're facing may be compounded by poor application design and systems that are poorly configured, underpowered, or both.

Meeting such challenges requires unprecedented flexibility and capacity for change in all areas, especially your IT operation. We believe the success of future e-businesses may be tied to the selection of components that can be individually and/or collectively adjusted to meet variable demands. Such flexibility is called scalability and is a feature your team needs to understand and measure for each component within your infrastructure. Scalability is related to the features of performance (response time) and capacity (operations per unit of time) but should not be considered synonymous.

Scaling a multi-tiered infrastructure from end to end means managing the performance and capacities of each component within each tier. The basic objectives of scaling a component/system are to:

  • Increase the capacity or speed of the component.
  • Improve the efficiency of the component/system.
  • Shift or reduce the load on the component.

As one increases the scalability of one component, the result may change the dynamics of the site service, thereby moving the "hot spot" or bottleneck to another component. The scalability of the infrastructure depends on the ability of each component to scale to meet increasing demands. Figure 6 below illustrates the relationship between performance curves, response time, and the scaling target.


Figure 6. Scalability/performance curves.

Notice the "Original Performance Curve" is unable to meet the acceptable level of response time for the scaling target. By scaling components within a Web site, we can improve the performance of the site (see the Improved Performance Curve in Figure 6 above).
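The way a hot spot moves from one component to another can be illustrated with a deliberately simple model in which end-to-end throughput is limited by the slowest tier. The tier names follow this paper, but the capacity figures and the function name are assumptions chosen only for illustration, and the model assumes ideal linear scaling.

```python
def bottleneck(capacities):
    """Return (tier, capacity) for the tier with the lowest capacity.

    `capacities` maps a tier name to the requests per second it can sustain.
    In this simplified model the slowest tier caps end-to-end throughput.
    """
    tier = min(capacities, key=capacities.get)
    return tier, capacities[tier]


# Hypothetical capacities in requests per second.
tiers = {
    "edge server": 5000,
    "web application server": 800,
    "database server": 1200,
}

before = bottleneck(tiers)

# Scale the application tier, for example by clustering three machines
# (assuming ideal, linear scaling for the sake of the example).
tiers["web application server"] *= 3

after = bottleneck(tiers)
print(before, after)
```

Tripling the application tier in this toy model raises site capacity only from 800 to 1,200 requests per second, because the database server becomes the new bottleneck.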

Table 1 below introduces the scaling techniques and relates them to the scaling objectives. For example, if your objective is to increase the speed of a component, you would consider using a faster or special machine and/or creating a machine cluster. Altering the load on a component is often less straightforward. For example, an item in a cache can be served up faster than an item in a database. If a large number of requests can be handled using a cache instead of the database, the overall load on the database is reduced, affording greater scalability for the entire system. Frequently, the techniques that reduce load on one component actually make other components more efficient, thus compounding the scaling effect.

Table 1. How scaling techniques relate to scaling objectives.

ID  Scaling Technique  Increase Capacity/Speed  Improve Efficiency  Shift / Reduce Load
1  Use a faster machine  x
2  Create a machine cluster  x
3  Use a special machine  x  x
4  Segment the workload  x  x
5  Batch requests  x
6  Aggregate user data  x
7  Manage connections  x
8  Cache data and requests  x  x
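The caching example described before the table lends itself to simple arithmetic. This sketch, under the assumption that every cache miss results in exactly one database request, shows how the cache hit ratio determines the load that still reaches the database. The request rate and hit ratios are illustrative values, not measurements.

```python
def database_load(request_rate, cache_hit_ratio):
    """Requests per second that still reach the database.

    Assumes each cache miss causes exactly one database request.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return request_rate * (1.0 - cache_hit_ratio)


# Illustrative request rate of 1,000 requests per second.
for hit_ratio in (0.0, 0.5, 0.85):
    remaining = database_load(1000, hit_ratio)
    print(f"hit ratio {hit_ratio:.0%}: {remaining:.0f} requests/s reach the database")
```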

Six steps to scaling your infrastructure

We recommend you consider this high-level approach to classifying your Web site and learning which scaling techniques could be applied. The approach is systematic, but you and your best IT architects will need to improvise and adapt the approach to your situation. There are six steps:

  1. Understand the application environment.
  2. Categorize your workload.
  3. Determine the components most impacted.
  4. Select the scaling techniques to apply.
  5. Apply the techniques.
  6. Reevaluate.

Knowing your workload pattern (publish/subscribe and customer self-service, for example) determines where to focus your scalability efforts, and which scaling techniques to apply. For example, a customer self-service site such as an online bank needs to focus on transaction performance, and the scalability of databases that contain customer information used across sessions. These considerations would not typically be significant to a publish/subscribe site.

Step 1. Understand the application environment

For existing environments, the first step is to identify all components and understand how they relate to each other. The most important task is to understand the requirements and flow of the existing application(s) and what can or cannot be altered. The application is key to the scalability of any infrastructure, so a detailed understanding is mandatory to scale effectively. At a minimum, your analysis must include a breakdown of transaction types and volumes as well as a graphic view of the components in each tier.

Figure 7 below can help you determine where to focus your scalability planning and tuning efforts. The figure shows where latency is greatest for representative customers in three of the workload patterns, and in which tier you should concentrate for each. For example, for online banking, most of the latency typically occurs in the database server, whereas the application server typically experiences the greatest latency for online shopping and trading sites. The way applications manage traffic between tiers significantly affects the distribution of latencies between the tiers, which suggests that careful analysis of application architectures is an important part of this step and could lead to reduced resource utilization and faster response times. You should collect metrics for each tier, and make behavior predictions for your users for each change you implement. WebSphere Application Server has application programming interfaces that provide detailed data useful for monitoring application performance.
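Once you have collected per-tier timings, a latency breakdown by tier is straightforward to compute. The sketch below assumes you already have the average time a request spends in each tier; the numbers are invented and the function name is hypothetical. It does not use any WebSphere API.

```python
def latency_shares(tier_latency_ms):
    """Return (tier, fraction of total latency) pairs, largest share first."""
    total = sum(tier_latency_ms.values())
    if total <= 0:
        raise ValueError("total latency must be positive")
    shares = {tier: ms / total for tier, ms in tier_latency_ms.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical average milliseconds per request spent in each tier.
measured = {
    "web server": 50,
    "application server": 150,
    "database server": 300,
}

breakdown = latency_shares(measured)
for tier, share in breakdown:
    print(f"{tier}: {share:.0%}")
```

The tier at the top of the sorted list is where tuning effort is likely to pay off first.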


Figure 7. How latency varies based on workload pattern and tier.

As you analyze requirements for a new application, you have the opportunity to build scaling techniques into your infrastructure. New applications offer you the opportunity to consider all that is new in the areas of each component type, such as open interfaces and new devices, the potential to achieve unprecedented transaction rates, and the ability to employ rapid application development practices. Each technique affects application design; similarly, application design impacts the effectiveness of the technique. To achieve proper scale, application design must consider potential scaling effects. In the absence of known workload patterns, you'll need to follow an iterative, incremental approach.

Step 2. Categorize your workload

All of the site types, including yours, are considered to have high volumes of dynamic transactions. Your site type will become clear as you evaluate your site for the other characteristics that pertain to transaction complexity, volume swings, data volatility, security, and others. If you need help, refer to Tables 2 and 3 in the Appendix.

Step 3. Determine the components most affected

This step involves mapping the most important site characteristics to each component. Once again, from a scalability viewpoint, the key components of the infrastructure are the edge servers, the Web application servers, security services, transaction and data servers, and the network. Table 4 in the Appendix specifies the significance of each workload characteristic to each component. As you can see, the effect on each component is different for each workload characteristic.

Step 4. Select the scaling techniques to apply to scale the workload

It is worth the best efforts of your IT architects to collect the information needed to make the best scaling decision. Only when the information gathering is as complete as it can be is it time to consider matching scaling techniques to components. Manageability, security, and availability are critical factors in all design decisions. Techniques that provide scalability but compromise any of these critical factors cannot be used.

Here's a summary of the eight scaling techniques.

  1. Use a faster machine
    This technique applies to the edge servers, the Web presentation server, the Web application server, the directory and security servers, the existing transaction and data servers, the network, and the Internet firewall. The goal is to increase the ability to do more work in a unit of time by processing tasks more rapidly. A faster machine can be achieved by upgrading the hardware or software. However, one of the issues is that software capabilities can limit the hardware exploitation and vice versa. Another issue is that due to hardware or software changes, changes may be needed to existing system management policies.

  2. Create a cluster of machines
    This technique applies to the Web presentation server, the Web application server, and the directory and security servers. The primary goal here is to service more client requests. Parallelism in machine clusters typically leads to improvements in response time. Also, system availability is improved due to failover safety in replicas. The service running in a replica may have associated with it state information that must be preserved across client requests, and thus needs to be shared among machines. State sharing is probably the most important issue with machine clusters and can complicate the deployment of this technique. IBM WebSphere's workload balancing feature uses an efficient data sharing technique to support clustering. Issues such as additional system management for hardware and software can also be challenging.

  3. Use appliance servers
    This technique applies to the edge servers, the Web presentation server, the directory and security servers, the network, and the Internet firewall. The goal is to improve the efficiency of a specific component by using a special purpose machine to perform the required action. These machines tend to be dedicated machines that are very fast and optimized for a specific function. Examples are network appliances and routers with cache, such as the IBM WebSphere Edge Server. Our experience with the Wimbledon 2001 Web site demonstrated tremendous benefits from caching, which can reduce HTTP traffic to the presentation servers by up to 85%. Some issues to consider regarding special machines are the sufficiency and stability of the functions and the potential benefits in relation to the added complexity and manageability challenges. It's worth noting, however, that the newer generation of devices is increasingly easy to deploy and manage; some are even self-managed.

  4. Segment the workload
    This technique applies to the Web presentation server, the Web application server, the data server, the intranet firewall, and the network. The goal is to split up the workload into manageable chunks thereby obtaining more consistent and predictable response time. The technique also makes it easier to manage which servers the workload is being placed on. Combining segmentation with replication often offers the added benefits of providing an easy mechanism to redistribute work and scale selectively as business needs dictate. An issue with this technique is that in order to implement the segmentation, one needs to be able to characterize the different workloads serviced by the component. After segmenting the workload, additional infrastructure is required to balance physical workload among the segments, for example, the use of the IBM WebSphere Edge Server.

  5. Batch requests
    This technique applies to the Web presentation server, the Web application server, the directory and security servers, the existing business applications, and the database. The goal is to reduce the number of requests sent between requesters and responders (such as between tiers or processes) by allowing the requester to define new requests that combine multiple requests. The benefits of this technique arise from the reduced load on the responders by eliminating overhead associated with multiple requests. It also reduces the latency experienced by the requester due to the elimination of the overhead costs associated with multiple requests. One issue is that there may be limits to the reuse of requests because of inherent differences among request types (for example, a Web front end differs from a voice response front end). This can lead to increased costs of supporting different request types.

  6. Aggregate user data
    This technique applies to the Web presentation server, the Web application server, and the network. The goal is to allow rapid access to large customer data controlled by existing system applications and support personalization based on customer specific data. When accessing existing customer data spread across existing system applications, the existing applications may be overloaded, especially when the access is frequent. This can degrade response time. To alleviate this problem, the technique calls for aggregating customer data into a customer information service (CIS). A CIS that is kept current can provide rapid access to the customer data for a very large number of customers; thus, it can provide the required scalability. An issue with a CIS is that it needs to scale very well to support large data as well as to field requests from a large number of application servers (requesters).

  7. Manage connections
    This technique applies to the Web presentation server, the Web application server, and the database. The goal is to minimize the number of connections needed for an end-to-end system, as well as to eliminate the overhead of setting up connections during normal operations. To reduce the overhead associated with establishing connections between each layer, a pool of preestablished connections is maintained and shared among multiple requests flowing between the layers. For instance, most application servers provide database connection managers to allow connection reuse. It is important to note that a session may use multiple connections to accomplish its tasks, or many sessions may share the same connection. This is called connection pooling in the WebSphere connection manager. The key issue is maintaining a session's identity when several sessions share a connection. Reusing existing database connections conserves resources and reduces latency for application requests, thereby helping to increase the number of concurrent requests that can be processed. Managing connections properly can improve scalability and response time. Administrators must monitor and manage resources proactively to optimize component allocation and use.

  8. Cache
    Caching is a key technique to reduce hardware and administrative costs and to improve response time. Caching applies to the edge server, the Web presentation server, the Web application server, the network, the existing business applications, and the database. The goal is to improve the performance and scalability by reducing the length of the path traversed by a request and the resulting response, and by reducing the consumption of resources by components. Caching techniques can be applied to both static and dynamic Web pages. A powerful technique to improve performance of dynamic content is to asynchronously identify and generate Web pages that are affected by changes to the underlying data. Once these changed pages are generated, they must then be effectively cached for subsequent data requests. There are several examples of intelligent caching technologies that can significantly enhance the scalability of e-business systems. The key issue with caching dynamic Web pages is determining what pages should be cached and when a cached page has become obsolete.
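As an illustration of technique 7 (manage connections), the following is a minimal connection pool sketch. It is not the WebSphere connection manager; the `ConnectionPool` class and its methods are hypothetical, and an in-memory SQLite database stands in for a real data server.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A fixed-size pool of preestablished database connections."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # Pay the connection setup cost once, up front.
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        """Borrow a connection and return it to the pool when done."""
        conn = self._idle.get(timeout=timeout)  # raises queue.Empty on timeout
        try:
            yield conn
        finally:
            self._idle.put(conn)

    def idle_count(self):
        """Number of connections currently available to borrow."""
        return self._idle.qsize()


# SQLite in memory is a stand-in for a data server; check_same_thread=False
# lets pooled connections be handed to different threads.
pool = ConnectionPool(
    size=2,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

with pool.connection() as conn:
    result = conn.execute("SELECT 1 + 1").fetchone()[0]

print(result, pool.idle_count())
```

Bounding the pool size also caps the number of connections the data server must hold open, which is part of the goal stated for this technique.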

Rather than buying hardware that can handle exponential growth that may or may not happen, consider specific approaches for these two types of servers:

  • For application servers, the main technique for growth path is to add more machines. It is therefore appropriate to start with the expectation of more than one application server with a dispatcher in front, such as IBM's WebSphere Edge Server. Adding more machines then becomes painless and far less disruptive.
  • For data servers, get a server that is initially oversized; some customers run at just 30% capacity. This avoids the problem in some environments where the whole site can only use one data server. Another scaling option when more capacity is needed is to partition the database into multiple servers.
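The growth path for application servers can be sketched as a dispatcher that rotates requests across a set of servers and accepts new servers at run time. This is a toy round-robin example, not the algorithm used by WebSphere Edge Server; the class and server names are hypothetical.

```python
import threading


class RoundRobinDispatcher:
    """Rotate requests across servers; servers can be added while running."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0
        self._lock = threading.Lock()

    def add_server(self, server):
        """Add a server to the rotation."""
        with self._lock:
            self._servers.append(server)

    def pick(self):
        """Return the server that should handle the next request."""
        with self._lock:
            server = self._servers[self._next % len(self._servers)]
            self._next += 1
            return server


dispatcher = RoundRobinDispatcher(["app1", "app2"])
first_four = [dispatcher.pick() for _ in range(4)]

dispatcher.add_server("app3")  # grow the tier without changing clients
next_three = [dispatcher.pick() for _ in range(3)]
print(first_four, next_three)
```

Because clients address only the dispatcher, a new application server joins the rotation without any client-side change.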

Most sites we studied separate the application server from the database server. They place front-end Web serving and commerce application functions on less expensive, commodity machines and the data server on more robust and secure but more expensive systems. The trend with many publish/subscribe sites is to put the Web server on eServer pSeries or PCs and to put the application and databases on larger systems such as high-end pSeries or zSeries systems. In many accounts, the most important performance tuning factor becomes the data server. Many commerce sites do not have large databases and achieve improved performance by caching most or all of the databases in memory.

Step 5. Apply the technique(s)

Apply the selected technique(s) in a test environment first to evaluate not only the performance / scalability impact to a component, but also to determine how each change affects the surrounding components and the end-to-end infrastructure. You do not want a situation where improvements in one component result in an increased (and unnoticed) load on another component. Figure 8 below illustrates the typical relationship between the techniques and the key infrastructure components. By using this figure, you can identify the key technique for each component. In many cases, all techniques cannot be applied because one or more of the following is true:

  • You cannot afford to invest in the techniques, even if they would help.
  • You do not perceive a need to scale as much as the techniques would allow.
  • Your cost / benefit analysis shows that the technique will not result in a reasonable payback.

The IT architect must therefore have a process for applying these techniques in different situations so that the best return is achieved. This mapping is a starting point and shows the components to focus on first based on your workload.


Figure 8. Scaling techniques applied to components.

Step 6. Reevaluate

As with all performance related work, tuning will be required. The goals are to eliminate bottlenecks, scale to a manageable status those that can't be eliminated, and work around those that can't be scaled. One of IBM's large customers scaled its peak load 34 times, and achieved improved peaks as high as 12 million hits per hour using a combination of these techniques. Some of their tuning actions and the benefits realized were:

  • Increasing Web server threads raised the number of requests that could be processed in parallel.
  • Adding indexes to the data server reduced I/O bottlenecks.
  • Changing defaults for several operating system variables allowed threaded applications to use more heaps.
  • Caching significantly reduced the number of requests to the data server.
  • Increasing the number of edge/appliance servers improved load balancing.
  • Upgrading the data server increased throughput.

Such results demonstrate the potential benefits of systematically applying scaling techniques and continuing to tune. Figure 9 below shows performance data from a recent scaling study for a Web site that uses IBM WebSphere Application Server.
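The reevaluation loop described in this step can be sketched as repeatedly locating the component with the least capacity after each tuning action. The sketch below is a simplification under stated assumptions: the component names and capacity figures are hypothetical, and it treats every request as passing through every component, which caching deliberately avoids. Only the 12 million hits per hour figure comes from the text.

```python
from typing import Dict, Tuple


def hits_per_second(hits_per_hour: float) -> float:
    """Convert an hourly hit rate to a per-second rate."""
    return hits_per_hour / 3600.0


def find_bottleneck(capacity: Dict[str, float]) -> Tuple[str, float]:
    """Return the component with the lowest capacity (requests/second).
    In a serial request path, that component caps end-to-end throughput."""
    name = min(capacity, key=lambda k: capacity[k])
    return name, capacity[name]


# The 12 million hits per hour cited above, expressed per second.
target = hits_per_second(12_000_000)

# Hypothetical measured capacities in requests per second.
capacity = {"edge": 6000.0, "web": 4000.0, "application": 3500.0, "data": 2500.0}

name, limit = find_bottleneck(capacity)
print(f"target {target:.0f}/s, bottleneck: {name} at {limit:.0f}/s")

# After a hypothetical data server upgrade, reevaluate: the bottleneck moves.
capacity["data"] = 5000.0
print(find_bottleneck(capacity))
```

The second call illustrates why tuning is iterative: relieving one bottleneck exposes the next one.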


Figure 9. Scaling a WebSphere online trading site.

Additional techniques

One of the most important techniques for high-volume sites with complex transaction workloads is to push as much work as possible toward the network. This allows the back end server to be tuned to handle critical transaction workload. Horizontally-scaled commodity servers can be used for page serving. An effective caching strategy can improve nearly every site's scalability.

IBM implemented a reusable infrastructure for its sporting event sites and has refined many practices over time. Caching is extensive. Scores are built into static pages that can be cached in the network, either in routers or ISP caches. When the scores change, the affected pages are dynamically updated and the caches invalidated. In general, caching provides the quickest response time and lowers the system load, since network-cached pages can be served directly to users without a hit to the Web server.

Other practices considered most significant are flattening of product catalogs and fast-path shop design. Catalog pages are flattened into static Web pages that can be cached like the other static pages for the event site. For every 100 visitors browsing the site, only 1 to 5 actually buy anything. 95% of the time, browsing is done without accessing the catalog database. The frequently accessed pages are even cached out in the network, so the Web server load is lower too.
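The score-page practice described above, where pages are rendered once, served from cache, and invalidated when the underlying data changes, can be sketched in a few lines. This is an in-process illustration only, not the event-site implementation: the `StaticPageCache` class, the `render_score_page` function, and the sample match data are hypothetical, and a real deployment would invalidate caches in the network rather than a local dictionary.

```python
from typing import Callable, Dict


class StaticPageCache:
    """Cache rendered pages and re-render only after invalidation."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self._render = render
        self._pages: Dict[str, str] = {}
        self.render_count = 0  # how often the expensive render ran

    def get(self, key: str) -> str:
        if key not in self._pages:
            self._pages[key] = self._render(key)
            self.render_count += 1
        return self._pages[key]

    def invalidate(self, key: str) -> None:
        # Drop the stale page; the next get() renders a fresh copy.
        self._pages.pop(key, None)


# Hypothetical score data standing in for the back-end source.
scores = {"match-1": "2-1"}


def render_score_page(match_id: str) -> str:
    return f"<html><body>{match_id}: {scores[match_id]}</body></html>"


cache = StaticPageCache(render_score_page)
cache.get("match-1")
cache.get("match-1")          # served from cache, no second render

scores["match-1"] = "3-1"     # the score changes...
cache.invalidate("match-1")   # ...so the affected page is invalidated
page = cache.get("match-1")   # and rendered again on the next request
print(cache.render_count, page)
```

The render counter makes the benefit visible: three requests cost two renders here, and the ratio improves as more requests arrive between score changes.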

Managing runtime performance has special challenges. More than ever, this task requires a perspective that considers components from the front-end browsers to the back-end database servers. The end-to-end perspective must be understood and shared by the IT management, operations staff, application developers, and Web site designers. Also required are stated business goals, thoughtful performance objectives, and thorough performance measurements.

We've developed an end-to-end methodology that aligns the system performance with the underlying business goals. The methodology combines proactive monitoring, analysis, alert, and predictive mechanisms. To read about our end-to-end performance methodology, see the HVWS white paper Managing Web Site Performance.

An e-business must plan for success and look to the future. IBM developed the High-Volume Web Site (HVWS) Simulator to estimate the performance of complex configurations. The HVWS Simulator is an analytic queuing model that estimates the performance of a Web server based on workload patterns, performance objectives, and specified hardware and software. The results can be used as guidelines for configuration sizing. Performance results are displayed in sufficient detail to allow users to assess the adequacy of a given configuration for their requirements, and to provide insight into where the bottlenecks are likely to occur. This allows the simulator to be useful for capacity planning, evaluation of infrastructure and workload changes, projecting Web site scalability, and reducing the cost of prototyping.
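To give a feel for what an analytic queuing model computes, the sketch below uses the textbook M/M/1 single-server queue. This is an assumption made for illustration: the paper does not describe the internals of the HVWS Simulator, which models far more than one server, and the arrival and service rates here are hypothetical.

```python
from typing import Dict


def mm1_metrics(arrival_rate: float, service_rate: float) -> Dict[str, float]:
    """Steady-state metrics for an M/M/1 queue.

    arrival_rate: average requests arriving per second
    service_rate: average requests the server can complete per second
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # The queue grows without bound; no steady state exists.
        raise ValueError("arrival rate must be below service rate")
    utilization = arrival_rate / service_rate
    return {
        "utilization": utilization,
        # Mean time in system (waiting plus service), in seconds.
        "mean_response_time": 1.0 / (service_rate - arrival_rate),
        # Mean number of requests in the system.
        "mean_in_system": utilization / (1.0 - utilization),
    }


# Hypothetical server: completes 100 requests/s, receives 80 requests/s.
for load in (50.0, 80.0, 95.0):
    m = mm1_metrics(load, 100.0)
    print(f"load {load:.0f}/s: utilization {m['utilization']:.2f}, "
          f"response {m['mean_response_time'] * 1000:.0f} ms")
```

Even this simple model shows the behavior that makes capacity planning necessary: response time rises slowly at moderate utilization and very steeply as utilization approaches 100%.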

By monitoring and combining traffic load predictions with capacity estimates, we can achieve a level of automation for self-managed systems. A self-managed system should be able to handle a wide variety of workload patterns with satisfactory performance and to recover gracefully from the stress of unexpected workloads. Therefore, it is important to monitor system performance in real time, collect necessary data, and feed that data to predictive tools that can predict workload increases ahead of time. WebSphere Application Server now has application programming interfaces that provide detailed data useful for monitoring the performance of applications.
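As a rough illustration of feeding monitored data to a predictive step, the sketch below fits a straight-line trend to recent load samples and flags when the projected load would exceed a share of known capacity. The least-squares trend is a stand-in chosen for brevity, not the predictive mechanism used by IBM tools, and the sample values, capacity, and headroom factor are hypothetical.

```python
from typing import Sequence


def linear_forecast(samples: Sequence[float], steps_ahead: int) -> float:
    """Fit a least-squares line to evenly spaced samples and
    extrapolate steps_ahead intervals past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


def needs_capacity(samples: Sequence[float], steps_ahead: int,
                   capacity: float, headroom: float = 0.8) -> bool:
    """True if the projected load exceeds headroom * capacity."""
    return linear_forecast(samples, steps_ahead) > capacity * headroom


# Hypothetical requests/second measured over four monitoring intervals.
recent_load = [100.0, 120.0, 140.0, 160.0]
print(linear_forecast(recent_load, steps_ahead=2))
print(needs_capacity(recent_load, steps_ahead=2, capacity=240.0))
```

A straight line will miss sudden spikes such as a promotion, which is why the text pairs prediction with real-time monitoring.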

Finally, managing quality of service can increase the availability of resources to prioritized requests. This involves allocating limited sharable resources to the requests that need them most by classifying and qualifying requests based on business policies. For example, you may prioritize users who are buying or paying ahead of users who are browsing or searching. This technique is directly applicable to online shopping (registered shoppers or frequent shoppers), self-service, and online trading sites. You may also distinguish between requests based on complexity. For example, separate the typical simple transactions (product display) from the complex ones (payment authorization, tax/shipping/discount calculations).
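The buy-before-browse policy described above can be sketched with a priority queue. This is a minimal illustration under assumptions: the request kinds, their priority values, and the `RequestQueue` class are hypothetical, and a production system would apply such a policy in a load balancer or application server rather than in a single in-memory queue.

```python
import heapq
import itertools
from typing import Any, List, Tuple

# Hypothetical business policy: lower number means served first.
PRIORITY = {"pay": 0, "buy": 0, "search": 1, "browse": 1}


class RequestQueue:
    """Serve higher-priority request kinds first, FIFO within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, Any]] = []
        self._seq = itertools.count()  # tie-breaker preserving arrival order

    def submit(self, kind: str, payload: Any) -> None:
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), kind, payload))

    def next_request(self) -> Tuple[str, Any]:
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
queue.submit("browse", "user-a")
queue.submit("buy", "user-b")
queue.submit("search", "user-c")
queue.submit("pay", "user-d")

served = [queue.next_request() for _ in range(len(queue))]
print(served)
```

The sequence counter keeps ordering fair within a class, so a paying user never waits behind a later-arriving paying user.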


Common pitfalls

For over two years now, the HVWS team has been working closely with customers to analyze their e-business infrastructures across all phases of their life cycles, from planning and architecture through development, test, and deployment. This section summarizes the most common mistakes customers make when implementing their e-business infrastructures.

Planning phase

Customers often have weak linkages between their business planning and IT shops, resulting in poorly defined scalability targets and growth plans. For example, when a business team plans a major ad campaign, it must give the IT organization sufficient notice to prepare the infrastructure to meet demand. One customer experienced what we call "e-panic" when it had to rapidly install 26 new T1 lines to handle a special promotion for which it had originally thought one T1 line would suffice.

Customers sometimes fail to scale their business processes as well as their e-business infrastructure. We worked with a "click and mortar" business that couldn't deliver holiday season goods bought through their Web site because their inventory was insufficient to fulfill their total orders. One health insurance company wisely deferred deployment of an e-business application until its back office business process could be scaled to anticipated volumes.

Architecture phase

Some IT shops try to "reinvent the wheel." Instead of focusing on their business problem and their applications, they develop and implement e-business middleware. We've seen homegrown workload balancers, HTTP servers, and Web application servers. This can introduce major cost, maintainability, and scalability issues. With the popularity of J2EE standard platforms, such as WebSphere, reinventing occurs less frequently.

Customers who don't design for scalability limit their options when volumes increase. It is difficult to project volumes, and because volumes can grow quickly, it is best to have a scalable architecture that can cope with unexpected growth. One large customer launched a new site and, on the first day, exceeded its projection for the first year. Many customers have had reasonable success by projecting new site traffic based on existing site traffic; others have found projecting traffic based on business expectations to be effective. We recommend that you plan for workload balancing, and deploy a workload balancer from day one, even if you only start with a single server. You can add capacity easily when growth occurs.
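The value of deploying a workload balancer in front of even a single server is that clients never address a server directly, so capacity can be added without changing them. The sketch below shows the idea with simple round-robin selection. It is an illustration only: the `RoundRobinBalancer` class and the server names are hypothetical, and real balancers also track server health and load.

```python
from typing import Iterable, List


class RoundRobinBalancer:
    """Hand out servers in rotation; servers can be added at any time."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers: List[str] = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._next = 0

    def add_server(self, server: str) -> None:
        self._servers.append(server)

    def pick(self) -> str:
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


# Day one: a single server behind the balancer.
balancer = RoundRobinBalancer(["app1"])
day_one = [balancer.pick() for _ in range(2)]

# Growth occurs: add capacity without touching any client.
balancer.add_server("app2")
after_growth = [balancer.pick() for _ in range(4)]
print(day_one, after_growth)
```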

Design phase

Customers don't design Web pages with a performance objective. The latest consultant studies indicate that many consumers are unwilling to wait more than eight seconds for a page to download. Well-designed Web pages load faster and enable a site to handle more users. For specific design suggestions, see the HVWS white paper Design Pages for Performance.
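One way to turn the eight-second figure into a design objective is to convert it into a page weight budget for a given connection speed. The arithmetic below is a rough sketch under stated assumptions: the 56 kbit/s link speed is a nominal modem rate chosen for illustration, and the calculation ignores latency, connection setup, and server time, so real budgets need to be smaller.

```python
def page_budget_bytes(max_seconds: float, link_bits_per_sec: float,
                      overhead_fraction: float = 0.0) -> int:
    """Largest page (in bytes) that downloads within max_seconds.

    overhead_fraction is the share of the link lost to protocol overhead.
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    usable_bits_per_sec = link_bits_per_sec * (1.0 - overhead_fraction)
    return int(max_seconds * usable_bits_per_sec / 8)


def download_seconds(page_bytes: int, link_bits_per_sec: float) -> float:
    """Transfer time for a page over the link, ignoring latency."""
    return page_bytes * 8 / link_bits_per_sec


# Assumed nominal modem speed; the eight seconds comes from the text.
MODEM_BPS = 56_000
budget = page_budget_bytes(8, MODEM_BPS)
print(budget, "bytes at nominal speed")
print(page_budget_bytes(8, MODEM_BPS, overhead_fraction=0.2), "bytes with 20% overhead")
print(download_seconds(150_000, MODEM_BPS), "seconds for a 150,000-byte page")
```

A budget expressed in bytes gives page designers a number they can check during development, before any stress test is run.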

Development and test phases

The key point about these phases is that there are no shortcuts in the e-business environment. In fact, due to the complexities of the multi-tier infrastructure, and the external visibility of performance and availability problems, even more rigor is required.

Many customers struggle to adhere to a development and test process that provides adequate quality on the aggressive schedules demanded by the Internet. Too often we see compromises in the testing phase, especially in stress testing. Stress testing is a powerful tool in ensuring the scalability of a Web site, and in identifying and fixing bottlenecks that inhibit scaling.

We have tested scalability with one of our online trading customers to ensure that their new, J2EE-based architecture would scale to very high volumes. We also conducted a recent scalability test with a bank, including testing connected to their production data servers. This proved to have unique value in that it identified bottlenecks typically not found in a controlled test environment, for example, production firewall settings and limitations in bandwidth to the data servers.
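The core of a stress test is simple: issue many concurrent requests and record how latency behaves as concurrency rises. The sketch below drives a local function with a thread pool to show the shape of such a harness. It is an illustration under assumptions: `fake_handler` is a hypothetical stand-in for a real request to the site under test, and the request counts and concurrency levels are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def timed_call(handler: Callable[[int], None], arg: int) -> float:
    """Run one request and return its latency in seconds."""
    start = time.perf_counter()
    handler(arg)
    return time.perf_counter() - start


def stress(handler: Callable[[int], None], total_requests: int,
           concurrency: int) -> Dict[str, float]:
    """Issue total_requests calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda i: timed_call(handler, i),
                                  range(total_requests)))
    latencies.sort()
    # Simple 95th-percentile estimate by index into the sorted latencies.
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "requests": len(latencies),
        "p95_seconds": latencies[p95_index],
        "max_seconds": latencies[-1],
    }


def fake_handler(request_id: int) -> None:
    # Hypothetical stand-in for an HTTP request to the system under test.
    time.sleep(0.002)


for workers in (1, 10, 20):
    result = stress(fake_handler, total_requests=100, concurrency=workers)
    print(workers, result)
```

Reporting a high percentile alongside the maximum matters because averages hide the slow requests that users actually notice.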

Deployment phase

Inadequate change control procedures cause site outages and slowdowns. In one instance, a bank's site applied an untested application change, and customers' balances became visible to other users. In another case, configuration changes at a bank resulted in limits being set on the number of users able to access the system through a firewall. Many customers were unable to log onto their accounts.

Both of these IT shops had been around long enough to appreciate the importance of change control, but had made unwise exceptions to their policies due to the time pressures and complexities of their Web environments.

Complex, multi-tier Web environments require cross-functional cooperation and processes. To manage availability and performance across a multi-tier, multiplatform environment, groups that may have operated independently in the past need to work closely. A customer transaction's availability and performance depends on networks, firewalls, and often both UNIX® and mainframe servers. Operational procedures need to span all of these components, covering a complete end-to-end perspective.


Choosing between two and three tiers

Multi-tier e-business infrastructures provide opportunities for improved scalability and performance not found in the early days of e-business. These opportunities come with complexities and challenges that we are just beginning to understand. Figure 10 below summarizes the variety of multi-tier infrastructures in use by many of IBM's large customers.

Most modern application architectures require flexibility and thus are usually divided into logical layers for presentation logic, business logic, and data serving. The infrastructure of most large sites comprises two or more physical tiers with application function distributed among participating servers. Depending on workload pattern, presentation and business logic may coreside in one tier, with data serving in a separate tier, or each may have its own tier, creating a three-tier infrastructure. Scaling is easier with a multi-tier architecture. As customers scale for increased volumes, they are implementing their applications on the J2EE computing platform, which facilitates implementation of scalable multi-tier architectures.

At the highest level, we know that the two-tiered infrastructure is simple, performs well, and is easiest to maintain. In many two-tier implementations, customers place both presentation logic and business logic on the first tier. However, with just one firewall between the Internet and the business logic, the two-tiered design has potential security exposures. These can be mitigated by implementing a three-tier structure, moving the business logic to a middle tier behind another firewall. However, as each tier is added, there are scalability and performance considerations. In some cases, it may make sense to consolidate all logical layers into a single large system complex.

The optimal solution for your site depends on your workload characteristics and application environment. A publishing site such as weather.com has a lot of presentation logic and data, but not much business logic. An online trading site typically has moderate presentation logic and a lot of business and data logic. The flexibility of having multiple tiers and multiple servers creates opportunities, challenges, and complexities. WebSphere is exceedingly well suited to provide the flexibility and robustness needed for these multi-tier alternatives.


Figure 10. Examples of two-, three-, and four-tiered infrastructures.

Emerging standards and technologies

Emerging standards and technologies will affect scalability, some in ways we are just starting to understand.

When scaling to new volumes, customers are implementing their Web applications using J2EE as the standard enterprise computing platform, which makes it easier to implement scalable multi-tier architectures. Customers are looking at Linux as a way to create more open solutions at reduced cost. Many are considering the implications of Web services. Web services is an evolving standard way to expose applications or resources on a network. It consists of enabling technologies that allow processes and subprocesses to be located and accessed over the Web. Web services exploit e-business infrastructures by allowing businesses to focus on their core competencies and use partners to supply the remaining capabilities they need to operate. At a recent Web services show, 60% of the IT executives surveyed expected to use Web services for internal applications, such as retrieving customer information from a customer relationship management system or passing transactions from front-office systems to back-office systems. Customers are beginning to implement some of these standards. For example, XML is increasingly used for managing content and communicating between tiers. We recently implemented an architecture to access DB2 tables using Web services and DB2 XML extenders.
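As a small illustration of XML used for communication between tiers, the sketch below serializes a customer lookup request on one tier and parses it on another. It is a hypothetical message format invented for this example: the element names (`customerLookup`, `customerId`, `fields`, `field`) are assumptions and do not correspond to any Web services standard or to the DB2 XML extenders.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_request(customer_id: str, fields: List[str]) -> str:
    """Front-office tier: encode a lookup request as an XML string."""
    root = ET.Element("customerLookup")
    ET.SubElement(root, "customerId").text = customer_id
    wanted = ET.SubElement(root, "fields")
    for name in fields:
        ET.SubElement(wanted, "field").text = name
    return ET.tostring(root, encoding="unicode")


def parse_request(message: str) -> Dict[str, object]:
    """Back-office tier: decode the XML string into plain values."""
    root = ET.fromstring(message)
    return {
        "customer_id": root.findtext("customerId"),
        "fields": [f.text for f in root.iter("field")],
    }


message = build_request("C-1001", ["name", "balance"])
print(message)
print(parse_request(message))
```

Because both tiers agree only on the message format, either side can be reimplemented or moved to another platform without changing the other.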

Self-managing servers are a key new IBM offering in response to our customers' challenge of managing their complex infrastructures comprised of multiple tiers and multiple servers. IBM's new eServer z900 and z/OS offerings will be able to reallocate processing power to a given application, based on the workload demands of the moment, thus enabling tremendous capacity expansion and minimally disruptive scalability.

Pervasive computing devices such as cellular phones, PDAs, and handheld computers are becoming prevalent around the world. In geographies such as the Far East and Europe, the wireless realm has been exploding. This area is also growing in the U.S., although at a more moderate pace. Consultants estimate that by 2003 wireless devices will represent 2 billion Internet access points, more than 60% of all Internet devices. Our customers are asking us how to best structure their applications to add support for new devices easily and with minimum disruption, and about the scalability of systems that will support millions of wireless users. The delivery of video and music to wireless devices is also being explored and its performance is being tested. This application will create some interesting performance and scalability challenges. IBM is participating in this exciting new area with the WebSphere Everyplace Server (WES). WES provides an integrated, robust solution designed to ease the entry into pervasive computing and facilitate future scaling as needed.


Summary

Designing, integrating, and managing scalable multi-tier infrastructures can be complex, challenging tasks. However, knowledge, techniques, standards, products, and best practices are increasingly available to help you. This paper provides updated information on IBM's experiences with large customers who face these same challenging tasks, many of whom are enjoying the benefits that come with improved scalability and performance.

The techniques introduced in this paper are:

  • Use a faster machine
  • Create a machine cluster
  • Use an appliance server
  • Segment the workload
  • Batch requests
  • Aggregate user data
  • Manage connections
  • Cache data and requests

A combination of clustered machines, connection management, and caching has enabled many large customers to scale their Web sites to handle peak loads. We've seen increases in peak loads as high as 34-fold, with improved peaks as high as 12 million hits per hour. Other customers have seen 40-50% improvements in the download time for their Web pages by applying techniques directed at page performance.

While scaling end-to-end infrastructures is becoming more science than art, it still presents the best hope for responding to and managing unpredictable demands on your Web site. With thorough knowledge of such features of your current infrastructure as workload patterns, Web site components, skills, and budget, your IT architects should begin now to factor scalability considerations and scaling techniques into their plan for the future.

The challenges to architect a scalable infrastructure are many and real, and they keep coming, almost at Internet speed. The corresponding opportunities to use new devices and techniques are just as plentiful and real, as are the business opportunities for those who do it right. IBM has products and services that can help you implement the scalable infrastructure needed to make your company's e-business succeed inside and out.


HVWS white papers about best practices

As the HVWS team accumulates experience and knowledge, it compiles white papers intended to help you understand and meet the new challenges presented during one or more of the phases of a Web site's life cycle. These papers are available at the High Volume Web Sites Zone.

Here are abstracts for the HVWS white papers:

Manage Web site performance, May 2001
To fine-tune Web site performance, you must consider all infrastructure components, from the browser to the database servers and legacy systems. This paper introduces a methodology for managing end-to-end infrastructure performance and identifies best practices and tools that help implement the methodology.

Charles Schwab puts growth plan to the test, May 2001
This paper describes a joint project between IBM and Charles Schwab and Co., Inc. to develop an architecture for Schwab's Web site that could cope with soaring growth. The paper includes a description of the configuration, the test itself, and the test results.

Plan for growth, November 2000
Learn whether your Web site can satisfy future demands and evaluate potential workload and infrastructure changes. This paper also introduces the concept of configuring a Web site based on an analysis of how different components combine to best meet the performance objectives of your particular workload pattern, potentially reducing the costs of prototyping and stress testing. Includes specific example data and graphs, plus sample scenario scripts you can reuse to break down users' behavior at online shopping, banking, and trading sites.

Design pages for performance, May 2000
Need help figuring out how to reduce the time it takes to download your Web pages? Find out how to cut download times and improve resource utilization by following the design advice here, gleaned from optimizing efforts at high-volume sites.

Web site personalization, February 2000
This paper introduces current and future techniques for personalizing your Web site. Techniques for maximizing the performance of personalized Web sites, such as content caching, are also discussed.


Appendix A. Understanding your workload

This appendix contains a summary of the five HVWS workload patterns and three tables that you may wish to use when categorizing your Web site and determining which characteristics most affect the components of your infrastructure.

Table 2. Workload patterns and Web site classifications.

| | Publish / Subscribe | Online Shopping | Customer Self-Service | Trading | Web Services / B2B |
|---|---|---|---|---|---|
| Categories / Examples | Search engines; Media; Events | Exact inventory; Inexact inventory | Home banking; Package tracking; Travel arrangements | Online stock trading; Auctions | eProcurement |
| Content | Dynamic change of the layout of a page, based on changes in content, or need; Many page authors and page layout changes frequently; High volume, non user specific access; Fairly static information sources | Catalog either flat (parts catalog) or dynamic (items change frequently, near real time); Few page authors and page layout changes less frequently; User specific information: user profiles with data mining | Data is in legacy applications; Multiple data sources, requirement for consistency | Extremely time sensitive; High volatility; Multiple suppliers, multiple consumers; Transactions are complex and interact with back end | Data is in legacy applications; Multiple data sources, requirement for consistency; Transactions are complex |
| Security | Low | Privacy, nonrepudiation, integrity, authentication, regulations | Privacy, nonrepudiation, integrity, authentication, regulations (Banking); Low for others | Privacy, nonrepudiation, integrity, authentication, regulations | Privacy, nonrepudiation, integrity, authentication, regulations |
| Percent Secure Pages | Low | Medium | Medium | High | Medium |
| Cross-session Info | No | High | Yes | Yes | Yes |
| Searches | Structured by category; Totally dynamic; Low volume | Structured by category; Totally dynamic; High volume | Structured by category; Low volume | Structured by category; Low volume | Structured by category; Low to moderate volume |
| Unique Items | High | Low to Medium | Low | Low to Medium | Moderate |
| Data Volatility | Low | Low | Low | High | Moderate |
| Volume of transactions | Low | Moderate to High | Moderate and growing | High to Very High (Very large swings in volume) | Moderate to Low |
| Legacy Integration / Complexity | Low | Medium | High | High | High |
| Page Views | High to Very High | Moderate to High | Moderate to Low | Moderate to High | Moderate |

If your application has a characteristic not included in Table 3 below, add it if you believe scalability would be affected.

Table 3. Characterizing your workload.

| | Publish / Subscribe | Online Shopping | Customer Self-Service | Trading | Web Services / B2B | Your Workload |
|---|---|---|---|---|---|---|
| High Volume, Dynamic, Transactional, Fast Growth | Yes | Yes | Yes | Yes | Yes | Yes |
| Volume of User-specific Responses | Low | Low | Medium | High | Medium | |
| Amount of Cross-session Information | Low | High | High | High | High | |
| Volume of Dynamic Searches | Low | High | Low | Low | Medium | |
| Transaction Complexity | Low | Medium | High | High | High | |
| Transaction Volume Swing | Low | Medium | Medium | High | High | |
| Data Volatility | Low | Low | Low | High | Medium | |
| Number of Unique Items | High | Medium | Low | Medium | Medium | |
| Number of Page Views | High | Medium | Low | Medium | Medium | |
| Percent Secure Pages (privacy) | Low | Medium | Medium | High | High | |
| Use of Security (authentication, integrity, nonrepudiation) | Low | High | High | High | High | |
| Other Characteristics | | | | | | High |

Table 4 below is generalized and may not match your specific workload. Remember, if you added a characteristic to the previous table you will need to add that characteristic to this table as well.

Table 4. Determine the components most affected.

| | Edge Servers | Web Presentation Server | Web Application Server | Security Servers | Transaction Servers | Data Servers | Network |
|---|---|---|---|---|---|---|---|
| High volume, dynamic, transactional, fast growth | High | High | High | High | High | High | High |
| High percent user-specific responses | Low | Low | High | Low | High | High | Low |
| High percent cross-session information | Low | Med | High | Low | Low | Med | Low |
| High volume of dynamic searches | Med | High | High | Low | Med | High | Med |
| High transaction complexity | Low | Low | High | Med | High | High | Low |
| High transaction volume (swing) | Low | Med | High | Low | High | High | Low |
| High data volatility | Low | High | High | Low | Med | High | Low |
| High number unique items | Low | Low | Med | Low | High | High | Low |
| High number page views | High | High | Low | Low | Low | Low | High |
| High percent secure pages (privacy) | Low | High | Low | High | Low | Low | Low |
| High security | Low | High | High | High | High | Low | Low |
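One way to use Table 4 is to look up the rows for the characteristics your workload rates as high and see which components accumulate the most impact. The sketch below encodes the Table 4 ratings and does that lookup. The numeric scoring (Low = 1, Med = 2, High = 3) and the summing of scores are assumptions made for illustration; the paper presents the ratings qualitatively and does not prescribe a scoring scheme. The example workload takes the characteristics rated High for Trading in Table 3.

```python
from typing import Dict, List, Tuple

COMPONENTS = [
    "Edge Servers", "Web Presentation Server", "Web Application Server",
    "Security Servers", "Transaction Servers", "Data Servers", "Network",
]

# Table 4 ratings, one letter per component in the order above
# (L = Low, M = Med, H = High).
TABLE_4: Dict[str, str] = {
    "user-specific responses": "L L H L H H L",
    "cross-session information": "L M H L L M L",
    "dynamic searches": "M H H L M H M",
    "transaction complexity": "L L H M H H L",
    "transaction volume swing": "L M H L H H L",
    "data volatility": "L H H L M H L",
    "unique items": "L L M L H H L",
    "page views": "H H L L L L H",
    "secure pages": "L H L H L L L",
    "security": "L H H H H L L",
}

# Assumed scoring, not part of the paper.
SCORE = {"L": 1, "M": 2, "H": 3}


def rank_components(characteristics: List[str]) -> List[Tuple[str, int]]:
    """Sum the assumed scores for each component over the given
    characteristics and return components from most to least affected."""
    totals = dict.fromkeys(COMPONENTS, 0)
    for characteristic in characteristics:
        levels = TABLE_4[characteristic].split()
        for component, level in zip(COMPONENTS, levels):
            totals[component] += SCORE[level]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Characteristics rated High for the Trading pattern in Table 3.
trading = [
    "user-specific responses", "cross-session information",
    "transaction complexity", "transaction volume swing",
    "data volatility", "secure pages", "security",
]
ranking = rank_components(trading)
for component, score in ranking:
    print(f"{component}: {score}")
```

If you add a characteristic of your own to Tables 3 and 4, add a matching entry to `TABLE_4` so the lookup reflects it.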

Contributors

These HVWS team members contributed to this update: Willy Chiu, Paul Dantzig, Harish Grama, Rich Grega, Linda Legregni, and Joe Spano.

The HVWS team is grateful to the Super Scalable Architecture team: Maggie Archibald, Michael Conner, Paul Dantzig, Daniel Dias, Greg Flurry, Parag Gondhalekar, Leonard Hand, Richard McDonald, and Mark Palmer.


Special notice

The information contained in this document has not been submitted to any formal IBM test and is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
