IBM Workplace Collaboration Services provides a wide range of powerful, integrated collaboration features such as email, calendar and scheduling, presence awareness and instant messaging, online learning, team spaces, Web conferencing, and document and Web content management.
During the development of Workplace Collaboration Services 2.5, one of our key considerations was performance. We wanted to ensure that, starting with its early releases, Workplace Collaboration Services was built to deliver high performance and reliability in this and future releases. In this first article of a two-part series, we explain the process we used to accomplish this goal. We describe the enhancements we made to our standard product development process that helped us produce and release a high-performing and scalable product. Our goal (in addition to informing you about the rigorous testing processes we perform on our products) is to present detailed steps and methodologies that have proven successful for us, and that you can adopt and adapt to your own software development needs. In the second article of this series, we'll share the development best practices that had the most impact in reaching our goals. These best practices may also help you in your J2EE-based application development efforts.
This article assumes you're an experienced product manager, architect, application developer, tester, or performance engineer, ideally with some familiarity with Workplace Collaboration Services or other IBM Workplace products.
A critical element of any product development practice is the methodology that defines the entire product development cycle. Figure 1 shows the high-level steps that product development teams (including test and performance) generally follow within IBM (and that also represent good product development practices in the software industry):
Figure 1. Standard product development process
The high-level workflow illustrated in Figure 1 shows the steps and phases necessary for successful product delivery and release. The timeline for the workflow runs from left to right, where the first step is to understand the product business requirements and overall product goals. After these are identified and agreed upon, the requirements are broken down into various use case scenarios and functional requirements that our customers will leverage in their production environments. When these have been designed, reviewed, and accepted, coding begins.
As Figure 1 shows, as each requirement is defined, there's a corresponding verification step where test strategy, design, and execution are performed. Associated with coding is unit testing, which is most often performed by the developers themselves before integrating the code into the current code base. Unit testing verifies the internal design, tests the function of programs and units of code, and tests paths and conditional coverage. The functional test team develops and executes a set of tests to verify that the use case scenarios can be successfully completed. These efforts include ensuring that the system meets the user's requirements, that the system performs its function consistently and accurately, and that the program logic of a component performs according to specifications. Functional testing is more limited than systems testing in that it focuses only on whether or not a component works correctly in an isolated configuration.
Systems testing focuses on all components on all platforms in an integrated environment. Its objective is to ensure that the system functions together with all the components of its environment. Evaluations are performed with the following criteria in mind: integration, interoperability and RAS (reliability, availability, and serviceability). For maximizing effectiveness of efforts, systems testing doesn't start until a high percentage of functional tests have been completed.
At around the same time as systems testing, performance testing and analysis (not shown in Figure 1) get underway. Performance testing verifies that the applications can deliver the required response times, transaction throughput, and resource utilization, all within acceptable boundaries. The objectives of performance testing include setting product performance goals, advocating design optimization, and quantifying product performance and scalability.
Based on our product development and release experiences, we made it a priority during the development of Workplace Collaboration Services 2.5 to perform performance analysis and verification earlier in the product development cycle. Moving the recognition, implementation, and verification of performance goals earlier in the cycle enabled the product development team to make more progress toward our business requirements.
To do this, we enhanced our standard testing procedure to include performance analysis and code-based corrections much earlier in the product development cycle. The resulting process is shown in Figure 2:
Figure 2. Enhanced product development process for Workplace Collaboration Services 2.5
This enhanced approach took the original product development methodology and "pushed down" relevant and applicable performance analysis and correction steps earlier in the cycle. Finding code and product issues earlier in the cycle has a large overall payoff in terms of reduced development time and costs. The same holds true for performance analysis: relevant analysis and correction steps executed earlier in the cycle help detect and resolve performance issues sooner, when they are less costly to fix.
The overall "philosophy" of this approach can be stated as follows: start small by analyzing specific areas of code, and gradually build to a complete system configuration with multiple simulated users. To do this, we:
- Used a unique toolset that simulated user workload execution and captured various types of system information.
- Ensured that end user response time, throughput, and system utilization were within previously established performance requirements.
- Reviewed database and directory services (LDAP) integration as part of an entire system, including constant memory allocations or memory allocations not freed over time.
- Addressed "garbage collection" problems to reduce allocated bytes for each transaction (less held and less to release).
- Watched for block waits for directory services and database (wait states can have a significant impact on performance).
- Monitored consumption of system resources.
The following sections describe the steps involved in our enhanced Workplace Collaboration Services 2.5 development process.
Performance and "single click" metric requirements
As Figure 2 shows, for Workplace Collaboration Services 2.5 we started with a set of performance requirements that the product needed to achieve. This process is similar to defining business requirements, the first step shown in Figure 1, except that the focus is on product performance.
Next, we made a large adjustment to the use cases step, resulting in a step we called "single click metric requirements." "Single click" refers to typical user actions initiated by a single click from the user interface. Examples include saving a calendar appointment, joining a Web conference, and creating a chat room. (We discuss single click metric analysis in more detail later in this article.)
Coding and performance unit testing
As with our standard procedure, the next step involves coding. For Workplace Collaboration Services 2.5, we enhanced the unit test process to facilitate performance data capture and analysis. This included enhancing the unit test process to collect key metrics as part of test execution. We could also develop specialized tests for developers that executed key lower-level functions.
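As an illustration of collecting metrics as part of unit test execution, the following is a minimal Java sketch, not the tooling we used. The `UnitMetrics` class and its metric names are hypothetical, and the heap figure is only a rough approximation taken from `Runtime`, not a profiler-grade allocation count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: runs a unit of work and records simple metrics for it. */
final class UnitMetrics {

    /** Runs the task once and returns elapsed time plus an approximate heap delta. */
    static Map<String, Long> measure(Runnable task) {
        Runtime rt = Runtime.getRuntime();
        long heapBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        long heapAfter = rt.totalMemory() - rt.freeMemory();

        Map<String, Long> metrics = new LinkedHashMap<>();
        metrics.put("elapsedNanos", elapsedNanos);
        // Rough indicator only: a garbage collection during the run can make this negative.
        metrics.put("approxHeapDeltaBytes", heapAfter - heapBefore);
        return metrics;
    }

    public static void main(String[] args) {
        // Example unit of work (an assumption for illustration): build a string.
        Map<String, Long> metrics = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
        });
        System.out.println(metrics);
    }
}
```

A unit test could call `measure` around the function under test and record the returned values alongside its pass/fail result.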
After the code was considered sufficiently developed and stable, we began the profiling process. The code did not have to be declared "code complete" for this step to begin. It just had to have a sufficient amount of the framework and logic in place to be successfully executed and reflect the expected functionality. Profiling involves using tools to execute and analyze the code against the set of single click requirements defined earlier. Developers executed the identified use cases, and with the support of a code-profiling tool, captured and analyzed the feature capabilities.
Key features of our profiling tools included:
- Ability to capture totals from an entire transaction in an end-to-end view, enabling comparison across feature areas.
- CPU breakdown of the transaction.
- Tracing of directory services and database calls, with the ability to review the number of calls to a given method.
- Breakdown of memory usage per user, with allocated bytes vs. live bytes (memory utilized after allocation).
- Fast response to requests.
To ensure "apples-to-apples" comparisons, we used the same tools across our teams. Also, it was important to have the same tool used across the development and performance teams to share information in a common and familiar format.
One of the distinguishing features between profiling and integrated performance testing (see the following section) is that profiling analyzes the characteristics of a single user in a base Workplace Collaboration Services environment.
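To make the idea of reviewing the number of calls to a given method concrete, here is a hedged Java sketch that counts calls through a dynamic proxy. It is not how our profiling tools work internally; `DirectoryService`, `lookupEmail`, and `CallCounter` are hypothetical names introduced only for this example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a directory services (LDAP) lookup interface. */
interface DirectoryService {
    String lookupEmail(String userId);
}

/** Counts calls per method name for any interface it wraps. */
final class CallCounter {
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    /** Wraps the target in a dynamic proxy that counts each interface method call. */
    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> type, T target) {
        return (T) Proxy.newProxyInstance(
            type.getClassLoader(),
            new Class<?>[] {type},
            (proxy, method, args) -> {
                counts.computeIfAbsent(method.getName(), k -> new AtomicInteger())
                      .incrementAndGet();
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow the target's own exception
                }
            });
    }

    /** Returns how many times the named method was called through a proxy. */
    int count(String methodName) {
        AtomicInteger c = counts.get(methodName);
        return c == null ? 0 : c.get();
    }

    public static void main(String[] args) {
        CallCounter counter = new CallCounter();
        DirectoryService dir =
            counter.wrap(DirectoryService.class, id -> id + "@example.com");
        dir.lookupEmail("alice");
        dir.lookupEmail("bob");
        System.out.println("lookupEmail calls: " + counter.count("lookupEmail"));
    }
}
```

Running `main` prints `lookupEmail calls: 2`. The same wrapper could be placed around a data access interface to count database calls for a single user action.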
Integrated performance testing
The final step in our Workplace Collaboration Services 2.5 development process was integrated performance testing. This is similar to the systems testing described earlier. Integrated performance testing involves working in a complete system configuration (modeling an example production site) with a substantial and varied body of content that includes instantiated users, buddy lists, team spaces with data content, and so on, along with simulated user activities that mimic the actions of thousands of users.
The performance test team used the profiler tool to analyze performance bottlenecks identified during testing. The team also verified that applications and features could deliver the required response time, throughput, and resource utilization within agreed-upon limits. This gave us opportunities to improve the final product through both integrated performance testing and single click analysis.
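The following Java sketch shows, in miniature, what simulating concurrent users and checking response time and throughput can look like. It is an assumption-laden illustration rather than our workload toolset: `SimulatedLoad` is a hypothetical class, the user action is a placeholder, and a real test would drive the deployed system over the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical miniature load driver: one thread per simulated user. */
final class SimulatedLoad {

    /** Outcome of one run. */
    static final class Result {
        final int completed;         // total user actions executed
        final long maxResponseNanos; // slowest single action observed
        final long wallNanos;        // wall-clock time for the whole run

        Result(int completed, long maxResponseNanos, long wallNanos) {
            this.completed = completed;
            this.maxResponseNanos = maxResponseNanos;
            this.wallNanos = wallNanos;
        }

        double throughputPerSecond() {
            return completed / (wallNanos / 1e9);
        }
    }

    static Result run(int users, int actionsPerUser, Runnable action) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                tasks.add(() -> {
                    long worst = 0;
                    for (int i = 0; i < actionsPerUser; i++) {
                        long start = System.nanoTime();
                        action.run();
                        worst = Math.max(worst, System.nanoTime() - start);
                    }
                    return worst; // slowest action for this simulated user
                });
            }
            long wallStart = System.nanoTime();
            long max = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) { // blocks until all users finish
                max = Math.max(max, f.get());
            }
            return new Result(users * actionsPerUser, max, System.nanoTime() - wallStart);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("load run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder action: a real test would perform a single click operation.
        Result r = run(8, 100, () -> Math.sqrt(System.nanoTime()));
        System.out.printf("actions=%d max=%d ns throughput=%.0f/s%n",
            r.completed, r.maxResponseNanos, r.throughputPerSecond());
    }
}
```

The measured values can then be compared against the agreed-upon limits for response time and throughput.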
As we mentioned earlier, single click actions involve users performing a particular activity, such as sending an email message. Defining how these single click actions need to perform is a critical step in the overall performance testing process. To do this, we needed to establish the following:
- Metrics to collect. These metrics were identified as part of the performance requirements phase and laid the groundwork for the entire testing effort. The metrics were based on key areas in the current product that we considered "vulnerable." To address these potential issues, we decided to focus on reduced instruction count and byte allocations.
- Product goals. Associated with metric definition was setting the goals we wanted the product to achieve. These included number of directory services calls, number of database calls, amount of memory used, and path length (number of instructions executed). For example, we set a manageable limit on the number of database calls allowed, and then compared that limit to actual product performance. We also set the limit of directory services calls to a very low number because waiting on I/O (input/output) can be costly and we wanted to be sure we didn't overwhelm existing shared resources in a customer's production environment. Limiting the instruction count results in lower CPU utilization and better throughput. Memory is another costly resource, both in allocation and later in cleanup, so we set ceilings on the amount of memory that could be allocated per single click analysis.
- Resources to draw from. We created a virtual team of developers with representation from each feature area. They were able to start the effort of execution and analysis when the code features were reasonably complete and usable. They continued their efforts through the start of systems testing, when the goal was to start achieving stability in the code base by restricting changes.
- Scope of analysis. The Workplace product family is a large one, spanning many development teams. With the tools provided (and known customer key paths), feature representatives were asked to select a manageable number of transactions to monitor for their areas. We set the limit at approximately five transactions for analysis of each feature area.
- Methodology. We needed to decide how to deal with "warm" vs "steady state" costs of the system. The general methodology followed by all teams was to log in, log out, log in, and then measure. This is considered a best practice approach.
- Information organization. We set up a central location (a Notes database) for analysis results storage. The product team had access to this database to review results. Various colored flags were developed to visually illustrate how each measured metric compared to its goal. We ran weekly tests and posted updated results in this database. We also needed to implement a way to report our observations and improvements. To do this, we designed a chart for tracking and reporting. Figure 3 is an example:
Figure 3. Performance chart
Figure 3 shows the four metrics that we tracked for each single click analysis: number of instructions, memory, number of directory services (LDAP) calls, and database calls. Each of these metrics is plotted on its own axis. In our charts, green circles represent starting baselines, red circles represent target goals, and yellow circles show the actual metrics measured. By comparing the positions of yellow circles to red ones, we can see at a glance how the product is currently performing compared to our goals.
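The comparison of a measured value against its baseline and goal can be sketched in a few lines of Java. This is an illustration only: the `MetricReport` class, the three status names, and every number below are hypothetical, and the sketch assumes that lower is better for all four metrics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical status flags for a measured metric (lower values are better). */
final class MetricReport {

    enum Status { GOAL_MET, IMPROVED_NOT_AT_GOAL, NOT_IMPROVED }

    /** Classifies a measurement relative to its starting baseline and target goal. */
    static Status classify(long baseline, long goal, long actual) {
        if (actual <= goal) {
            return Status.GOAL_MET;
        }
        if (actual < baseline) {
            return Status.IMPROVED_NOT_AT_GOAL;
        }
        return Status.NOT_IMPROVED;
    }

    public static void main(String[] args) {
        // Made-up values for one single click action: {baseline, goal, actual}.
        Map<String, long[]> metrics = new LinkedHashMap<>();
        metrics.put("instructions", new long[] {9_000_000, 4_000_000, 5_500_000});
        metrics.put("memoryBytes", new long[] {800_000, 300_000, 280_000});
        metrics.put("ldapCalls", new long[] {6, 1, 6});
        metrics.put("databaseCalls", new long[] {40, 10, 12});

        for (Map.Entry<String, long[]> e : metrics.entrySet()) {
            long[] v = e.getValue();
            System.out.println(e.getKey() + ": " + classify(v[0], v[1], v[2]));
        }
    }
}
```

Each status could be mapped to a colored flag when results are posted to the shared database.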
Benefitting from our experience
Our primary goal in writing this article is to offer ideas and suggestions that you can incorporate into your own application development efforts. For example, we highly recommend that you establish performance metrics and goals (including the priority and allocation of resources) as early as possible in the development cycle. This includes defining the requirements to be achieved and ensuring that all teams agree to these goals. And make sure resources are allocated early in the cycle to define the use cases for single click analysis.
Also, consider which tool(s) to use for profiling. Many different profiling tools are available, depending on your code platform. IBM Rational offers a number of tools to meet these needs. Other tools that can help include JUnit (a unit testing framework for Java, useful for running repeatable timing tests) and ArcFlow. If profiling tools are not available to you, consider other ways of obtaining test results. For example, you can use a timer and write the output to a text file as a stop-gap measure.
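A stop-gap timer of this kind might look like the following Java sketch. It is a minimal example under stated assumptions: `StopgapTimer`, the log file name, and the sample action are hypothetical, and the warm-up runs stand in for the "log in, log out, log in, and then measure" methodology described earlier.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

/** Hypothetical stop-gap: time one action and append the result to a text file. */
final class StopgapTimer {

    /**
     * Runs the action warmupRuns times without measuring, then times one more run
     * and appends "label<TAB>elapsed us" to the log file.
     */
    static long timeAndLog(String label, Runnable action, int warmupRuns, Path logFile) {
        for (int i = 0; i < warmupRuns; i++) {
            action.run(); // warm caches and lazily initialized code before measuring
        }
        long start = System.nanoTime();
        action.run();
        long elapsedMicros = (System.nanoTime() - start) / 1_000;

        String line = label + "\t" + elapsedMicros + " us";
        try {
            Files.write(logFile, Collections.singletonList(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return elapsedMicros;
    }

    public static void main(String[] args) {
        Path log = Paths.get("timings.txt"); // assumed output location
        long micros = timeAndLog("build-string", () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        }, 2, log);
        System.out.println("measured " + micros + " us, appended to " + log);
    }
}
```

Because each run appends a line, the text file accumulates a simple history that can be compared from week to week.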
In conclusion, we'd like to leave you with the following key points:
- Finding performance issues earlier in the cycle greatly benefits the entire development cycle (and ultimately the finished product itself).
- Allocating more time to single user analysis is also beneficial.
- Don't wait for the application to be considered done before starting performance analysis.
- With this approach, we greatly reduced the time required to reach our goals and enabled the integrated performance test team to complete additional product performance and scalability analysis.
- Don't fix anything you don't know is a problem.
- You can expect multiple payback points from your efforts, beyond the current development project.
The process we used for developing and testing Workplace Collaboration Services 2.5 has proved very successful, and will remain part of our development process for our upcoming releases. But as with any process, we're always looking for ways to improve it. For instance, we are in the process of performing an end-of-cycle analysis, reviewing the problems reported and determining which are future candidates for "functional pushdown" and inclusion in the next product release. We're also reviewing the current list of single click metric use cases to determine whether they need continued analysis or can be deprioritized to make room for other analysis efforts. And we plan to update the list of single click metric use cases based on upcoming new product features. In addition, we will work with other product development teams on methodology and improving their product delivery results.
We hope you have found this article useful. In part 2 of this series, we'll look at several best practices that can help you plan your J2EE-based application development and testing.
- To learn more about IBM Workplace Collaboration Services, see its product page.
- Visit the IBM Rational products page to read about IBM Rational profiling and performance tools.
- Read the developerWorks: Lotus article, "New features in IBM Workplace Collaboration Services 2.5," to learn about the new features provided in IBM Workplace Collaboration Services 2.5, the latest addition to the IBM Workplace product family.
- For a complete list of supported platforms for Workplace Collaboration Services, see the IBM Workplace Collaboration Services documentation.
Carol Zimmet, Lead Technical Analyst on IBM's Workplace, Portal, and Collaboration Performance Team, is striking a balance between managing multiple projects and developing performance analysis skills on the new Workplace platform. She continues to evangelize on behalf of performance, both promoting product accomplishments and advocating on behalf of customer requirements. Previously, Carol served on the development and test teams for many different Lotus products.




