Meet the experts: Stacy Joines and Gary Hunt on WebSphere performance

This question-and-answer article features two WebSphere performance experts: Stacy Joines on WebSphere performance, and Gary Hunt on WebSphere Business Integration Server Foundation V5.1 and WebSphere InterChange Server.


Stacy Joines (joines@us.ibm.com), Senior Technical Staff Member, IBM

Stacy Joines is a Senior Technical Staff Member and the performance lead with the IBM Software Services for WebSphere consultancy. She has 17 years of experience in software development, and specializes in performance consulting within the WebSphere product line. Stacy is co-author of Performance Analysis for Java Web Sites.



Gary Hunt (ghunt@us.ibm.com), Senior Consulting IT Architect, IBM

Gary Hunt works in the IBM Software Services for WebSphere Performance Focused Technology Practice as the expert on WebSphere InterChange Server and WebSphere Business Integration-related performance areas. In his current role, Gary has been involved in numerous performance-related customer engagements.



04 August 2006 (First published 06 July 2005)

Introduction

In this article, WebSphere® performance experts answer your questions on WebSphere V5 and V6, WebSphere Business Integration Server Foundation V5.1, and WebSphere InterChange Server performance practices. Topics include high-performance topologies, design and implementation strategies for high performance applications, testing and capacity planning, and general performance best practices.

The article starts with questions for Stacy, then questions for Gary:


Stacy on WebSphere V5 and V6 performance

Question: What are some of the key tuning parameters for WebSphere Application Server?

Answer: This is a common question, and the advice always begins with, "Before tuning anything, baseline the system against the out-of-the-box settings." Two reasons for this are:

  • The out-of-box tuning settings are pretty good and work well for many applications.
  • If you make tuning adjustments later, the baseline tells you whether you're making progress or regressing after the changes.
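As a rough illustration of the second point, the comparison against a baseline can be sketched in a few lines. This is only a sketch under stated assumptions: the function name, the throughput figures, and the 5% noise threshold are hypothetical and do not come from WebSphere Application Server or its tools.

```python
def compare_to_baseline(baseline_tps, tuned_tps, noise=0.05):
    """Return (relative_change, verdict) for a tuned run versus the baseline.

    noise is a hypothetical run-to-run variation threshold; smaller changes
    are treated as inconclusive rather than as progress or regression.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    change = (tuned_tps - baseline_tps) / baseline_tps
    if change > noise:
        verdict = "progress"
    elif change < -noise:
        verdict = "regression"
    else:
        verdict = "inconclusive"
    return change, verdict


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from any real system.
    baseline = 100.0  # transactions per second with out-of-the-box settings
    for label, tps in [("change A", 118.0), ("change B", 91.0), ("change C", 102.0)]:
        change, verdict = compare_to_baseline(baseline, tps)
        print(f"{label}: {change:+.1%} -> {verdict}")
```

Changing one setting at a time and repeating the same test keeps each comparison attributable to a single adjustment.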

Assuming you baseline the system and it doesn't match your requirement or expectations, what next? Before you begin tweaking parameters inside WebSphere Application Server (hereafter called Application Server), it's time well spent to find out why the system is not performing well.

Application Server comes with several tools to help you determine what's really happening inside the Java™ Virtual Machine (JVM). The Tivoli® Performance Viewer (TPV), included free with Application Server, allows customers to monitor key resources such as the JVM, the Web and EJB containers, and remote connection pools. It is a handy resource for determining how your application is using the resources available, and can also provide information pointing to misbehaving remote resources. For example, if the database connection pool shows wait times to obtain a connection, it may mean the remote database is overworked, or you need to tune the database, or maybe you simply need to add a few more connections to an undersized pool. Likewise, Application Server provides the Runtime Performance Advisor feature, which gives the administrator feedback on potential tuning issues in the system.

For more complex tuning issues, you might turn to a monitoring tool, such as those provided by Tivoli and other vendors. These tools do more analysis of the system at runtime, and can support in-depth bottleneck analysis.

The WebSphere Application Server Information Center contains a list of tuning parameters, and depending on your version, may also include a tuning parameter "hot list" that describes the most commonly adjusted parameters. This is a good resource for understanding the basics of adjusting memory and resources pools within the product.

Some other excellent resources include IBM® Redbooks, such as the WebSphere Application Server V6 Scalability and Performance Handbook. Also, you might want to check out IBM WebSphere: Deployment and Advanced Configuration, which contains some performance tuning advice.

Finally, don't lose sight of the application's impact on performance. Before moving into full performance testing, consider using a code profiling tool to analyze the application for potential inefficiencies and bottlenecks. IBM Rational® and other vendors provide such tools to help you observe the application's behavior and to find potential trouble spots.

Question: We are looking at choosing the most appropriate platform for deploying a J2EE application with WebSphere Application Server. We are looking for the most appropriate models of the hardware available. Are there any benchmarks which we can refer to, to get these details?

Answer: WebSphere Application Server runs on a variety of platforms. Selecting the right one for your deployment depends on many factors, including available platform-specific skills in your organization and your long-term growth plans. Also, as you plan for capacity, consider any requirements for failover or high availability.

Your IBM account team has access to the IBM Techline service to help with initial capacity planning. The customer fills out a detailed questionnaire describing their requirements (throughput, user visits, special application features, and so on). The Techline team provides a rough estimate of the hardware required to satisfy those requirements. It is a very handy service, and keeps the customer from trying to map benchmark results to the specific needs of their applications.

Of course, any capacity plan is only as good as the information it is fed. If the anticipated peak page requests are significantly more or less than specified in the questionnaire, for example, this impacts the hardware the site requires. Also, a capacity estimate is no substitute for performance testing prior to deployment. Many factors (application code, remote resources such as databases and LDAP servers, the network itself, and so on) impact performance and capacity. Performance testing well in advance of deployment, and production monitoring afterward, provide the best information for gauging actual capacity.

Question: For WebSphere Application Server, what are the realistic limits on the cluster membership, such as the number of application servers in the cluster, for a site using HttpSession replication?

Answer: WebSphere Application Server does not have hard limits on the size of clusters. Also, there is no ideal topology that fits all situations. However, here are some practical issues to keep in mind as you design your deployment topology.

For larger clusters, consider tuning services such as the WebSphere Application Server Data Replication Service (DRS) to reduce the overall traffic generated between application server instances. If your applications use these services, large data flows or high volumes of traffic will increase network traffic as instances sharing information increase. However, these services are highly tunable, and some options (such as push/pull) reduce the overall data flows between instances when they can be applied. This allows more instances to share information efficiently.

Likewise, for instances sharing HttpSession persistence data, the HttpSession service is rich with tuning options to reduce the amount of data passed between the application and the persistent store. Time-based writes, saving only updated session information, and other options reduce the data flowing between the application and the persistent store. Also, if using the memory-to-memory persistence option with DRS, you might experiment with push/pull and client/server to determine which gives your site the best performance. Of course, reducing the overall size of your HttpSession objects is a highly effective tuning technique.

Beyond data sharing and network traffic considerations, you must consider manageability in the cluster design. Is the site so large (in terms of nodes, applications, and so on) that the administrator is finding it difficult to accommodate everyone's needs? Do you have a plan for rolling out maintenance and new releases to the site safely and without impacting other applications?

Also, the 6.0 product release introduced the new WebSphere Application Server High Availability (HA) service. This functionality allows the members of a cluster to work cooperatively to maintain the availability of the cluster if one member experiences an outage. Keep in mind the impact to this collaboration as your cluster size grows and/or your network burden increases.

Question: How many application server instances should I place on one node?

Answer: If you want your applications to perform, the application server instance's JVM should not swap out of physical memory. So, if you plan to start more than one application server instance per node, the physical memory of the node should be sufficient to contain the maximum heapsize of all the application server instances it hosts, native heaps, other applications, and the operating system.
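The memory check described above is simple arithmetic, and a minimal sketch of it might look like the following. All names and figures are hypothetical; in particular, the per-JVM native memory allowance and the operating system reserve are placeholders that you would replace with values measured on your own node.

```python
def node_memory_headroom_mb(physical_mb, max_heaps_mb, native_per_jvm_mb, os_and_other_mb):
    """Return physical memory left after budgeting for every JVM on the node.

    max_heaps_mb      -- maximum heap size of each application server instance
    native_per_jvm_mb -- assumed native (non-heap) memory allowance per JVM
    os_and_other_mb   -- assumed reserve for the OS and other applications
    A negative result means the node would be expected to swap under load.
    """
    jvm_total = sum(max_heaps_mb) + native_per_jvm_mb * len(max_heaps_mb)
    return physical_mb - jvm_total - os_and_other_mb


if __name__ == "__main__":
    # Illustrative node: 8 GB of RAM, instances with 1 GB maximum heaps.
    for count in (3, 4, 6):
        headroom = node_memory_headroom_mb(8192, [1024] * count, 256, 2048)
        status = "fits" if headroom >= 0 else "over budget"
        print(f"{count} instances: headroom {headroom} MB ({status})")
```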

Also, CPU is a consideration. Deploying two CPU-hungry applications to the same box may create problems. To perform well, your important applications need enough CPU capacity. If you must share resources among multiple applications, consider WebSphere Application Server Extended Deployment (XD) to help you manage resources to support the performance requirements of your priority applications.

Question: Where can I get more information on 64-bit WAS performance?

Answer: You can refer to WebSphere Application Server and x86-64 platforms: 64-bit performance overview that was recently released by the WebSphere Application Server performance team.


Gary on WebSphere Business Integration Server Foundation V5.1

Question: It used to be easy to find a list of MQSeries reason codes with explanations and resolutions. I can't find them now. Where are they?

Answer: You can find them in the following documents or Web sites:

Question: Regarding WebSphere Business Integration Server Foundation, what business process implementation alternatives have the biggest effect on performance?

Answer: Since WebSphere Business Integration Server Foundation (hereafter called Server Foundation) has much higher throughput for microflows (non-interruptible business processes) than macroflows (long running business processes), implementations that use microflows have better throughput capabilities than implementations that use macroflows.

Obviously, there will be cases where long running business processes are required. For such cases, implement any "sub-processes" as microflows, if possible. The parent macroflow should call the "sub-process" microflow as a child process.

Question: Where can information be found regarding tuning WebSphere InterChange Server?

Answer: The WebSphere Business Integration performance team has written about tuning parameters with the biggest effect on performance when developing WebSphere InterChange Server releases. See their write up in Introduction to WebSphere InterChange Server V4.2.2 Performance Tuning.

Question: What are the most important tuning parameters for WebSphere InterChange Server?

Answer: The most important tuning parameters and solution characteristics of a WebSphere InterChange Server based integration solution are poll frequency and poll quantity, event sequencing, target application throughput, JVM heap size, and speed of the disk subsystem. Each of these aspects is described below:

  • Poll frequency and poll quantity on the source adapter - Poll frequency and poll quantity control how fast events are passed into the server portion of WebSphere InterChange Server (hereafter called InterChange Server) from the source side adapter. The default of these parameters for most adapters yields a throughput of 2.5 events per second, which is below the throughput requirement of most high throughput integration interfaces. The throughput requirement of an interface should be established, the throughput capabilities of an interface should be measured, and poll frequency and poll quantity should be set appropriately to match the capabilities and requirements of the integration interface.
  • Event sequencing - Event sequencing is a function in InterChange Server that ensures that events are processed in a certain order. However, certain situations can exist with event sequencing that can put severe limits on performance. For example, if event sequencing is enabled, but all event IDs are identical or missing, the effect is that InterChange Server has been instructed to process events one at a time in a single threaded fashion. If maximum performance is required, be sure to confirm the business requirement for event sequencing. If event sequencing is required, make sure that event IDs are supplied that allow InterChange Server to process events in parallel as appropriate.
  • Target application response time and threading - In InterChange Server, collaborations wait for the response from the target application. As a result, the throughput capability of an integration interface is strongly tied to the throughput capability of the target application. In fact, the throughput of most integration interfaces is limited by the throughput capability of the target application.

    Throughput is affected by two aspects: response time and parallel processing capabilities (threading). In general, a request made by an integration interface to a target application is fairly complex and can take seconds to process. If the target application takes exactly one second to process a request and can only process one transaction at a time, the associated integration interface will be limited to a throughput rate of 1 event per second. You can improve response times by increasing the speed of host systems or by refactoring code to improve efficiency. However, such activities typically yield only small improvements (around 10%). Improvements in threading of the target application can improve throughput by multiples. For example, if the previous target application were enhanced to process five requests at the same time, throughput would improve fivefold, to 5 events per second. Thus, efforts to improve threading capabilities typically yield larger improvements in throughput than response time improvements.

  • JVM heap size - The rule of thumb for Java heap size configuration is to run the heap at less than or equal to 50% utilization. This configuration provides a balance between how often the garbage collector runs and how long it takes the garbage collector to collect garbage. More specifically, if the Java heap is too small, the garbage collector will not take long to run, but will run more often than it should. If the Java heap is too large, the garbage collector will not run very often, but it will take longer when it does run.

    You should monitor the size of the Java heap and the level of heap utilization. Adjust the heap size parameters that control the minimum and maximum heap size so that the Java heap runs around 50% utilized. This creates the best balance between the rate and length of garbage collection cycles.

  • Speed of disk subsystem - InterChange Server makes extensive use of WebSphere MQ and a relational database. It is important that you tune these components for the best possible performance. Both WebSphere MQ and DB2® components recommend that they run on a high speed disk subsystem for best performance. Also, any portions of a solution that execute single threaded and make disk accesses, either directly or through WebSphere MQ or the database, will be significantly affected by the speed of the disk subsystem. For example, if InterChange Server listeners are configured for single threading, this portion of InterChange Server can easily become a performance bottleneck if the host disk subsystem is not as fast as possible. Minimally, the host disk subsystem should have RAID capabilities to provide the required disk system speed.
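The throughput arithmetic behind the polling and threading items above can be sketched as follows. This is a simplified model, not InterChange Server behavior: it assumes poll frequency is expressed as the interval between polls, that every poll delivers a full batch, and that the target application scales linearly with its number of concurrent requests. The function names and sample values are hypothetical and are not actual adapter defaults.

```python
import math


def adapter_rate(poll_quantity, poll_interval_s):
    """Upper bound on events per second that the source adapter passes in."""
    return poll_quantity / poll_interval_s


def target_rate(concurrent_requests, response_time_s):
    """Events per second the target application can absorb."""
    return concurrent_requests / response_time_s


def interface_rate(poll_quantity, poll_interval_s, concurrent_requests, response_time_s):
    """The interface runs no faster than its slowest stage."""
    return min(adapter_rate(poll_quantity, poll_interval_s),
               target_rate(concurrent_requests, response_time_s))


def poll_quantity_for(required_rate, poll_interval_s):
    """Smallest poll quantity that meets a required event rate."""
    return math.ceil(required_rate * poll_interval_s)


if __name__ == "__main__":
    # One request at a time, one second each: 1 event per second.
    print(target_rate(1, 1.0))
    # Five requests in parallel, one second each: 5 events per second.
    print(target_rate(5, 1.0))
    # Polling 25 events every 10 seconds caps the adapter at 2.5 events per second.
    print(interface_rate(25, 10.0, 5, 1.0))
    # Poll quantity needed for 20 events per second at a 10 second interval.
    print(poll_quantity_for(20, 10.0))
```

Under this model, raising poll quantity beyond what the target application can absorb only moves the backlog further downstream, which is why the text recommends measuring the interface before setting these values.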

Question: What is the suggested approach for resolving a performance problem in WebSphere InterChange Server?

Answer: InterChange Server consists of a series of components that work on events that are queued immediately in front of the component. In general, the bottleneck of an InterChange Server based solution is found by monitoring these queues for "build-up" of events. The component that is the bottleneck is the component immediately following the queue that is experiencing build-up.

Here is a list of the InterChange Server components and a general recommendation for remediation of observed bottlenecks:

  • Source adapter agent - Source adapters are configured to monitor an incoming work queue and pass incoming events to the server. If the source adapter agent is a bottleneck, events will build up in the work queue that the agent is monitoring. For example, in the case of the JDBC adapter, this queue will be the "event table", and in the case of the MQ adapter, this queue will be an MQ queue.

    If such build-up is observed, adjust the poll quantity and poll frequency parameters so that the agent passes events to the server at a higher rate. If the agent is not keeping up with the setting of poll frequency and poll quantity, deploy a second source adapter if there is not a requirement for event sequencing.

  • Listener (source side) - Listeners monitor the transport queue and pass incoming events to the source side adapter controller. If WebSphere MQ is used as the source side transport, the transport queue will be a WebSphere MQ queue. If events are observed to be building up in the transport queue, but not in the queue in front of the source side adapter controller or the queue in front of the collaboration thread pool, then increase the number of listener threads. The listener can be a bottleneck if it is single threaded, and if InterChange Server is not running on a high speed disk subsystem.
  • Source adapter controller - The source adapter controller has an in-memory "mailbox" that it monitors to pick up events and pass them to the corresponding collaboration. You can monitor this "queue" using the InterChange Server Web-based monitoring tools. If events are building up in this queue, but not in the queue in front of the collaboration thread pool, then increase the number of threads configured for the source adapter controller.

    If events are building up in the transport queue (in front of the listener), increase the size of the thread pool for both the listener and the source adapter controller.

  • Collaboration - Each collaboration has an in-memory "mailbox" that it monitors to pick up and process events. You can monitor this queue by viewing the collaboration statistics using the InterChange Server System Manager. If events are building up in this queue, increase the number of collaboration threads and target application threads.

    Note: Since a collaboration waits for the response from the target application, do not set the number of collaboration threads higher than the number of target application threads. If events are building up in a collaboration's queue and the collaboration is configured with the same number of threads as the target application, the solution is running at its maximum throughput. In this case, adjust the poll frequency and poll quantity to limit the rate at which events are passed into the end-to-end solution. To raise the throughput of the solution beyond this point, increase the throughput of the target application.
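The queue-monitoring approach above can be sketched as a small routine that takes queue depth samples and names the component behind the build-up. This is one possible reading of the guidance, not an InterChange Server API: the queue labels, the sample data, and the growth threshold are all hypothetical, and the depth samples would have to be collected by hand from the monitoring tools mentioned above. When several queues grow at once, the sketch reports the most downstream one, on the assumption that upstream build-up is a side effect.

```python
# Queues in upstream-to-downstream order, paired with the component that drains each.
PIPELINE = [
    ("work queue", "source adapter agent"),
    ("transport queue", "listener"),
    ("adapter controller mailbox", "source adapter controller"),
    ("collaboration mailbox", "collaboration"),
]


def is_building_up(depth_samples, min_growth=10):
    """True if the queue depth grew by at least min_growth over the samples."""
    return len(depth_samples) >= 2 and depth_samples[-1] - depth_samples[0] >= min_growth


def find_bottleneck(samples_by_queue, min_growth=10):
    """Return the component following the most downstream growing queue, or None."""
    bottleneck = None
    for queue, component in PIPELINE:
        if is_building_up(samples_by_queue.get(queue, []), min_growth):
            bottleneck = component  # later (more downstream) matches win
    return bottleneck


if __name__ == "__main__":
    # Illustrative depth samples taken at regular intervals.
    samples = {
        "work queue": [5, 4, 6, 5],
        "transport queue": [10, 40, 85, 130],
        "adapter controller mailbox": [2, 3, 2, 2],
        "collaboration mailbox": [1, 0, 1, 1],
    }
    print(find_bottleneck(samples))  # listener
```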

Question: What is the best way to monitor heap utilization in the JVM?

Answer: The best way to monitor heap utilization in the Java Virtual Machine (JVM) is to enable verbose garbage collection (verbose gc) output and to analyze the results. In recent versions of WebSphere Application Server, the administrator enables verbose gc by selecting the verbose gc check box in the administrator's console, and restarting the JVM. See the WebSphere Application Server Information Center for details specific to your version.

The IBM JVM also supports detailed garbage collection tracing via the -Xtgcn option, where n specifies the tracing level of interest. Enable tracing parameters by specifying these properties as command line options in the JVM Process panel of the WebSphere Application Server administrator's console. Note that JVM tracing often produces copious output, and is not usually recommended as a normal runtime setting, but is useful for problem determination.

For more information on IBM JVM settings, see IBM JVM Diagnostic Guides. These informative guides are a must-read for anyone involved with JVM tuning or diagnostics. They explain the parameters available, discuss how to read and interpret verbose gc output, and provide helpful insight on how the JVM functions. They also describe the garbage collection trace settings. If you're wondering what a pinned object really is and why you might care, these guides are an excellent starting point.

Verbose gc output reports many key garbage collection statistics, including the current heap size, the size of free space (at gc time), the size of currently referenced objects, the amount of memory released during the garbage collection cycle, the time in garbage collection, and the frequency of garbage collection. These garbage collection statistics often enable the detection of various memory issues, including memory leaks, large object issues, frequency of garbage collection, and heap fragmentation.
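Once the heap size and free space figures have been read from the verbose gc output, the utilization arithmetic is straightforward. The sketch below assumes you have already extracted those numbers; it does not parse any particular verbose gc format. The function names and sample values are hypothetical. The 50% target is the rule of thumb given earlier for InterChange Server, and the leak check is a crude hint, not a diagnosis.

```python
def heap_utilization(heap_size, free_after_gc):
    """Fraction of the heap still in use after a garbage collection cycle."""
    return (heap_size - free_after_gc) / heap_size


def heap_size_for_target(live_data, target_utilization=0.5):
    """Heap size at which live_data would occupy target_utilization of the heap."""
    return live_data / target_utilization


def looks_like_leak(used_after_gc):
    """True if used-after-GC memory rose on every cycle in the series."""
    return len(used_after_gc) >= 3 and all(
        later > earlier for earlier, later in zip(used_after_gc, used_after_gc[1:])
    )


if __name__ == "__main__":
    # Illustrative values in MB: a 512 MB heap with 128 MB free after GC.
    print(heap_utilization(512, 128))        # 0.75, above the 50% rule of thumb
    print(heap_size_for_target(384))         # 768.0 MB would bring it to 50%
    print(looks_like_leak([100, 120, 150, 190]))
    print(looks_like_leak([100, 90, 110, 95]))
```

A steadily rising used-after-GC series is worth investigating, but a short series can rise for benign reasons such as cache warm-up, so graphing a longer run remains the more reliable check.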

The best way to analyze the output of verbose gc is to produce a graph from the output. WebSphere Application Server provides some basic JVM size graphing via the Tivoli Performance Viewer (TPV). These graphs function independently of the verbose gc setting. Also, many tools exist to graph verbose gc output. We'll mention two such tools available on IBM Alphaworks that you can use to graph verbose gc from the IBM JVM:


About Meet the experts

Meet the experts is a monthly feature on the developerWorks WebSphere Web site. We give you access to the best minds in IBM WebSphere, product experts who are waiting to answer your questions. You submit the questions, and we post answers to the most popular questions.
