WebSphere Commerce is the industry-leading solution for web retail. This article assumes that you are familiar with WebSphere Commerce and focuses on the performance aspects of the product. Architecturally, WebSphere Commerce is a J2EE application deployed in the WebSphere Application Server (hereafter called Application Server) infrastructure, and it makes full use of Application Server scalability and performance features such as clustering and dynacache. Dynacache is a feature of Application Server that WebSphere Commerce uses to improve the performance of retail web sites.
WebSphere Commerce sites heavily leverage dynacache to reduce database roundtrips, and thus gain an important performance boost. WebSphere Commerce uses dynacache to cache whole pages (JSPs), page fragments, and commands. Dynacache is an in-process cache. Figure 1 illustrates that each member of the Application Server cluster contains its own instance of dynacache, including copies of each cache entry. Each WebSphere Commerce instance contains its own cache within its JVM.
Figure 1. Each member of the WebSphere Application Server cluster has its own instance of dynacache
Each instance of dynacache contains similar, although not necessarily identical (more on this later), cache entries. A dynacache-based Application Server topology therefore has to keep the many dynacache instances in sync. If cache entry "a" in Figure 1 is invalidated by one of the cluster members, dynacache informs the other cluster members by dispatching an invalidation message via the WebSphere Application Server Data Replication Service (DRS). Failure to do so would leave many shoppers viewing stale data.
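The invalidation flow described above can be sketched as a small model. The following Python sketch is an illustration only: the `ClusterMember` class and its methods are hypothetical names, and direct method calls stand in for DRS messages. It is not the dynacache or DRS API.

```python
class ClusterMember:
    """One application server JVM with its own in-process cache (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}          # this member's private copy of the cache entries
        self.peers = []          # the other members of the cluster
        self.messages_sent = 0   # invalidation messages dispatched by this member

    def put(self, key, value):
        self.cache[key] = value

    def invalidate(self, key):
        """Invalidate locally, then notify every peer (stands in for a DRS message)."""
        self.cache.pop(key, None)
        for peer in self.peers:
            peer.receive_invalidation(key)
            self.messages_sent += 1

    def receive_invalidation(self, key):
        self.cache.pop(key, None)


def build_cluster(size):
    members = [ClusterMember(f"server{i}") for i in range(size)]
    for member in members:
        member.peers = [p for p in members if p is not member]
    return members


cluster = build_cluster(4)
for member in cluster:
    member.put("a", "<html>product page</html>")   # every member holds its own copy of "a"

cluster[0].invalidate("a")
stale = [m.name for m in cluster if "a" in m.cache]
print(stale, cluster[0].messages_sent)   # prints: [] 3
```

In this simplified model, one invalidation on a four-member cluster produces three messages, so the message count per invalidation grows with the size of the cluster.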
When planning for high-volume sites, planners must consider the impact of invalidation traffic on network bandwidth and plan accordingly.
WebSphere eXtreme Scale is a distributed shared cache technology. Figure 2 shows the conceptual difference between the dynacache-based and eXtreme Scale-based caching topologies. The eXtreme Scale-based topology has a single logical instance of the cache that is shared among WebSphere Commerce servers. Because this cache is shared across servers, each server does not need its own copy of the same pages and fragments. Instead, a single cache entry is created on the first request for that page or fragment, and is then available to all WebSphere Commerce servers sharing the cache.
Figure 2. Logical eXtreme Scale-based topology
The following sections discuss how eXtreme Scale can improve performance for key scenarios frequently experienced by high-volume WebSphere Commerce customers. We will discuss how eXtreme Scale can potentially reduce the impact of a full or partial site restart for end users. We will also discuss the cache invalidation scenarios needed by retailers to reflect catalog updates, and consider how eXtreme Scale potentially improves the end user experience during these events.
WebSphere Commerce and WebSphere eXtreme Scale integration
WebSphere Commerce V7.0 Feature Pack 1 supports integration with eXtreme Scale V7.0. eXtreme Scale V7.0 supports many caching topologies. However, not all of them are supported by Feature Pack 1. WebSphere Commerce support is focused on the eXtreme Scale dynacache plugin component as a configuration choice for WebSphere Commerce customers. eXtreme Scale is not included as part of Feature Pack 1.
Figure 3 shows the type of topology supported by Feature Pack 1:
- The eXtreme Scale grid is installed on a separate physical machine or partition (for example, LPAR) from WebSphere Commerce to avoid CPU, memory, and network contention.
- IBM Power™ and AIX customers can install WebSphere Commerce and eXtreme Scale on different LPARs. For customers using Power virtualization features, such as micropartitions, consider the CPU, memory, and network requirements for each LPAR. Sharing critical system resources among LPARs may have implications for performance.
- WebSphere Commerce communicates with eXtreme Scale via the eXtreme Scale dynamic cache provider.
- Consider possible optimizations for WebSphere Commerce and eXtreme Scale placement, such as sharing the same frame (p-series architecture) to further reduce network latency (shown in Figure 3).
- We recommend that eXtreme Scale JVMs run in different LPARs than those containing WebSphere Commerce servers.
Figure 3. WebSphere Commerce and eXtreme Scale topology supported by WebSphere Commerce V7.0 Feature Pack 1
Optimizing site recovery time
WebSphere Commerce sites, as with most IT assets, may need to take components offline periodically for maintenance. Likewise, the site may want to keep components in reserve for failover or peak events when the site might need additional capacity.
Bringing cold components into a working site presents an interesting challenge when caching is involved. The component (typically an LPAR, JVM, or application) performs optimally when the cache is warm. However, if the component was offline, the cache may have been lost entirely or become significantly stale.
This requires the component to work harder until the cache reaches its optimal state. During this period, the component may not support its full capacity because it uses extra resources, such as mid-tier CPU and database capacity, to generate responses that will eventually reside in the cache.
Cache warm-up after a cold start is particularly problematic if the component is immediately thrust into a heavy workload that requires it to respond at full capacity. For example, a JVM that is restarted during a Black Friday sales event rejoins a farm that is handling the peak load of the year.
Improving startup performance under heavy load
As discussed earlier, restarting parts of a WebSphere Commerce site under heavy load presents special challenges. Without a populated cache to boost the site performance, end users who are sent to the newly started portion of the site might experience response time degradation. Also, the mid-tier CPU and the database would see increased utilization.
It is important for the retail site to rebuild the cache and bring site response times and utilization to a steady state quickly.
The key difference between a dynacache-based topology and an eXtreme Scale-based topology is that the latter has only a single cache instance, which is populated by multiple application server JVMs. Because a WebSphere Commerce site using eXtreme Scale shares this single cache, the cache warms up faster after a restart.
In this example, the team tested an extreme situation: restarting the entire site prior to releasing a full load against it. This simulates some failover scenarios involving active-passive datacenter concepts.
Figures 4 and 5 compare a traffic surge to a newly started site using WebSphere Commerce with traditional dynacache versus WebSphere Commerce with the new eXtreme Scale central caching capability. Figure 4 shows the dynacache case. The red arrow indicates the point where steady state is considered to be reached. The spikes in I/O are produced by garbage collection of cached objects stored in the disk offload file.
Figure 4. Restart CPU utilization with dynacache
Figure 5 shows CPU utilization on the WebSphere Commerce machine for a restart scenario with eXtreme Scale. The disk I/O activity is at a low level with minor spikes due to Application Server logging. The red arrow indicates the point where steady state is considered to be reached.
Figure 5. Restart CPU utilization with eXtreme Scale
Overall, in our cold start tests, we observed that with eXtreme Scale the WebSphere Commerce site tends to reach steady state in about 60% of the time required with dynacache, which is roughly 1.7 times as fast. In our tests, we used four WebSphere Commerce cluster members. A larger WebSphere Commerce cluster would likely see a bigger benefit.
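As a quick check of the arithmetic, the short Python sketch below converts the "about 60% of the time" observation into a speedup factor and a time reduction. The variable names are illustrative, and the 0.60 value is simply the approximate figure quoted above.

```python
# Fraction of the dynacache warm-up time that the eXtreme Scale run needed
# (the "about 60%" figure observed in the cold start tests).
time_fraction = 0.60

speedup = 1 / time_fraction      # how many times as fast steady state is reached
time_saved = 1 - time_fraction   # reduction in time to reach steady state

print(f"{speedup:.2f}x as fast, {time_saved:.0%} less time")
# prints: 1.67x as fast, 40% less time
```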
Improvement in average response time is another notable benefit of employing eXtreme Scale to help with the cold start situation. This is shown in Figure 6.
Figure 6. Comparison of average response time for dynacache-based and eXtreme Scale-based topologies
The eXtreme Scale topology shows less statistical scatter in response times, and overall consistently faster response times. In our testing, we observed up to 25% improvement in average response times. This produces a better end user experience for the shoppers and will tend to improve revenue (due to fewer shoppers leaving the site) during major high-volume events.
Improving startup performance under light load and load ramp-up
This scenario is usually not associated with a high-volume event. However, it is still common for web retailers to have to bring down, maintain, and restart JVMs. When one of the JVMs is shut down, two important things happen:
- Live shopper sessions (remember that in this scenario the site is under load) fail over to the other JVMs. These JVMs need to take up the load that was being served by the JVM that is being maintained.
- The restarted WebSphere Commerce JVM loses its cache and disk offload file. It is possible to flush the cache contents to disk, but this is generally not desirable because the cache contents can become stale during the maintenance window. The "flush to disk on stop" approach also has performance implications.
Once the JVM is restarted after maintenance, the HTTP plug-in will route a proportion of requests to that JVM. The cache instance for that JVM is cold (not populated with cache entries). Given that the restart occurs under lighter load conditions, the system has sufficient resources to operate until the cache is warm. However, a proportion of shoppers whose requests are directed to this JVM by the HTTP plug-in may still experience slower response times until the cache is warm.
Figures 7 and 8 show CPU and disk I/O utilization for the WebSphere Commerce server in the dynacache-based and eXtreme Scale-based topologies, respectively. For the server running the dynacache-based topology, you can clearly see a disturbance in CPU and disk utilization. The server running the eXtreme Scale topology remains virtually undisturbed by the JVM restart.
Figure 7. Comparison of CPU utilization and disk I/O for the WebSphere Commerce server with dynacache during server restart
Figure 8. Comparison of CPU utilization and disk I/O for the WebSphere Commerce server with eXtreme Scale during server restart
The response time curve for the eXtreme Scale-based topology is undisturbed. For the eXtreme Scale topology, all JVMs share a single cache instance that remains undisturbed during JVM restart. eXtreme Scale, therefore, brings significant benefit during a warm restart. Shoppers get a superior end user experience, and retailers are consequently able to realize higher revenue during operational procedures.
Optimizing invalidation performance
Another important set of scenarios reflects the fact that many web retailers need to periodically invalidate the contents of their cache. Caches need to be regularly invalidated for a number of reasons. The most common of these are:
- Need to display accurate inventory levels. Customers frequently have the WebSphere Commerce site integrated via a live feed with an ERP that stores inventory information.
- Need to display the most up-to-date price information.
These cache invalidation scenarios generally fall into two categories:
- Full cache invalidation: In this case, the web retailer chooses to periodically manually invalidate the entire cache under load and rebuild it.
- Partial, continuous cache invalidation: In this case, the web retailer has a live price or inventory feed that periodically invalidates a proportion of the cache.
We will now discuss the benefits that eXtreme Scale brings in both of these situations.
Invalidating the entire cache
Figure 9 shows a comparison of I/O rate on the database server for both the dynacache-based and eXtreme Scale-based topologies. Database I/O rate is an important indicator of overall site performance. As the cache is completely invalidated, all customer requests are sent to the database server for processing. The database server can bottleneck on disk I/O, resulting in reduced throughput and poor response times.
Figure 9. Comparison of I/O recovery during full cache invalidation scenario
When the cache is invalidated, the database I/O rate for both topologies rapidly increases to a high level. The I/O rate for the eXtreme Scale-based topology recovers faster and settles at a lower level than for the dynacache-based topology. In the eXtreme Scale topology, the single cache instance is populated by several JVMs. Once a particular cache entry is refreshed by any cluster member, it is immediately available to all of the other cluster members. This causes the cache to become populated faster than in the dynacache case. In the dynacache-based topology, a larger number of WebSphere Commerce cluster members leads to greater degradation, because every member competes for the single database instance while it warms its own cache.
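Why the shared cache warms up faster can be illustrated with a toy simulation. The Python sketch below is not the test harness behind Figure 9: the function name, the page count, the request rate, and the uniformly random routing are all assumptions chosen for illustration. A cache miss stands in for a request that has to be built from the database.

```python
import random


def db_hits_per_interval(pages, servers, shared, intervals, requests_per_interval, seed=1):
    """Return the number of cache misses (database requests) in each interval
    after a full invalidation. All parameters are illustrative assumptions."""
    rng = random.Random(seed)   # same seed gives both topologies the same request stream
    caches = [set() for _ in range(servers)]
    timeline = []
    for _ in range(intervals):
        misses = 0
        for _ in range(requests_per_interval):
            page = rng.randrange(pages)       # page requested by a shopper
            server = rng.randrange(servers)   # cluster member that receives the request
            # Shared topology: every member reads and writes the same cache.
            cache = caches[0] if shared else caches[server]
            if page not in cache:             # miss: build the page from the database
                cache.add(page)
                misses += 1
        timeline.append(misses)
    return timeline


per_member = db_hits_per_interval(pages=100, servers=4, shared=False,
                                  intervals=5, requests_per_interval=400)
shared_cache = db_hits_per_interval(pages=100, servers=4, shared=True,
                                    intervals=5, requests_per_interval=400)
print("per-member caches:", per_member)
print("shared cache:     ", shared_cache)
```

In this model the shared cache can miss at most once per page, while the per-member topology can miss once per page on every member, so the gap widens as the cluster grows.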
Continuously invalidating part of the cache
To simulate this scenario, we developed an algorithm that invalidated 25% of the cache entries every twenty minutes, similar to the effect of a live inventory feed. As with the previous scenario, where we invalidated the entire cache, the database server I/O rate is a good indicator of overall site performance. Figures 10 and 11 show database server I/O for the dynacache-based and eXtreme Scale-based topologies, respectively.
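The invalidation algorithm itself is not shown in this article. A minimal sketch of one way to invalidate a fixed fraction of entries follows, assuming the cache can be modeled as a Python dict. The function name, the random choice of entries, and the `page-N` key format are assumptions; a real test would trigger dynacache invalidations rather than delete dict keys.

```python
import random


def invalidate_fraction(cache, fraction=0.25, rng=None):
    """Delete a random `fraction` of the entries in `cache` (a dict).
    Returns the list of invalidated keys."""
    rng = rng or random.Random()
    count = int(len(cache) * fraction)          # number of entries to invalidate
    victims = rng.sample(list(cache), count)    # pick distinct keys at random
    for key in victims:
        del cache[key]
    return victims


# Hypothetical cache of 200 rendered pages.
cache = {f"page-{i}": "<html>...</html>" for i in range(200)}
removed = invalidate_fraction(cache, fraction=0.25, rng=random.Random(7))
print(len(removed), "entries invalidated,", len(cache), "remain")
# prints: 50 entries invalidated, 150 remain
```

In a load test, a scheduler would call such a function at a fixed interval (every twenty minutes in the test described above) while the site is under load.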
Figure 10. Comparison of database server I/O recovery with dynacache
Figure 11. Comparison of database server I/O recovery with eXtreme Scale
In our test, the overall I/O rate (yellow line in Figure 10 and Figure 11) for the dynacache-based topology remains consistently high at about 800 I/O operations per second. The test site remains bottlenecked on the database server I/O and bound by performance characteristics of the database server disk.
The database I/O for eXtreme Scale-based topology is able to recover and drop to a much lower level of about 300 I/O operations per second. The test site is bottlenecked on database I/O for short periods of time following each partial invalidation event, but recovers quickly. Again, this good result is due to the fact that with eXtreme Scale, multiple JVMs work to populate a single cache instance. The cache is populated faster, allowing the database server I/O rate to relax.
WebSphere Application Server dynacache with disk offload can provide WebSphere Commerce sites with excellent performance characteristics. There are, however, a number of important advantages to deploying WebSphere eXtreme Scale together with WebSphere Commerce. These include:
- Potentially up to 25% reduction in average response time in many scenarios.
- Less statistical fluctuation in response time produces a more consistent end user experience.
- Potentially up to 40% improvement in time to reach steady state after full or partial site restart, or after full cache invalidation.
- Simplified tuning and operational maintenance: with eXtreme Scale, you do not need to tune the size of the disk offload file or the file system cache performance.
- Reduced I/O volume to high-speed disk - eXtreme Scale keeps data in RAM.
- Coherent and consistent cache. With eXtreme Scale, only one version of a cache entry is cached:
  - Same version of the page is always shown.
  - Pages and page fragments are invalidated only once, rather than once per JVM.
  - Edge-caching with Akamai is facilitated due to the fact that each page or page fragment will have the same last-modified date.
WebSphere eXtreme Scale works best for caching WebSphere Commerce pages, page fragments, and commands. For best performance, do not cache distributed maps in eXtreme Scale.
Some customers are especially likely to benefit from integrating their site with eXtreme Scale:
- High-volume customers with large clusters, many WebSphere Commerce JVMs, and large caches will see the most benefit.
- Customers whose cached content consists mainly of pages and page fragments will see the best results.
- WebSphere Commerce customers performing a high volume of cache invalidations due to promotions, inventory, or price feeds.
- WebSphere Commerce customers who are unable to leverage DRS for cache invalidation due to network bottlenecks.
Customer results may vary due to differences in scenario and hardware environment. We encourage customers to evaluate adding WebSphere eXtreme Scale to their WebSphere Commerce site.