The objective of the Linux Web serving testing effort in the IBM Linux Technology Center (LTC) is to uncover Linux kernel defects. The emphasis is on workloads relevant to real-world enterprise user environments using Web servers/application servers, and on improving Linux kernel stability, scalability, and compatibility with Web servers/application servers. Identifying defects in the Web servers and application servers themselves is not the primary focus.
There are two major kinds of servers available for Web serving: the Web server and the application server. I refer to them collectively in this article using the term Web serving.
- A Web server serves pages viewed in a Web browser by processing the requests over the HTTP protocol.
- An application server is a generalized server that exposes business logic to client applications through various protocols, possibly including HTTP. It provides more complex and powerful functions than a Web server does, such as session management, load balancing, messaging, transaction management, security, etc. In some sense, an application server is a superset of a Web server.
We selected quite a few Web servers and application servers for our Linux kernel testing environment, including Apache, Jakarta-Tomcat, IBM® WebSphere® Application Server, and Jboss. Most of these are open source projects and can be downloaded for free (see Resources for links to more information on these servers).
The testing effort on the 2.5/2.6 kernel using Web servers and application servers as the test workload has been much more extensive than on the 2.4 kernel. In testing the 2.4 kernel, Apache and WebSphere Application Server were the only two servers used as part of integration-testing scenarios. The Web Performance Tool (WPT) was the major Web test tool used. The Web serving tests were executed when there was a major change in the kernel or, on an ad hoc basis, when software verification was requested.
In the 2.5/2.6 kernel testing, we developed much more solid and complete test plans (see Resources for links to the 2.5 test plan and execution plans on SourceForge). The plans clearly define the test scope, test methods, and test timeline. Web servers and application servers are widely used as test workloads in the integration, focus, and user simulation tests.
In addition to using more servers, we used several different Web client test tools, including WPT, Hammerhead, Httperf, and PagePoker, to simulate different types of user environments. All of the server and client tools were run for different durations (24 hours and 96 hours) against the latest available kernel on a regular basis.
Moreover, the test hardware was not limited to an Intel-based single-processor system. The tests were done on 1-way, 4-way, and 8-way IBM eServer™ xSeries® machines, as well as on a 64-bit IBM PowerPC® system. Kernel-related defects were opened in the Linux kernel bug tracking system.
Web serving plays a major role in the enterprise world. Significant improvements and changes have been made on the 2.6 kernel to favor enterprise applications. New hardware support, software support, and internal kernel improvements give the 2.6 kernel better scalability and stability. The 2.6 kernel performs much better than the 2.4 kernel under heavy load across a number of CPUs and a large amount of memory. Some of the key features in 2.6 that will benefit enterprise applications include:
Linux supports a wide range of hardware platforms. The 2.6 kernel supports new architectures, such as the 64-bit PowerPC, the 64-bit AMD Opteron, and embedded processors.
Hyper-threading, an innovation from Intel, is a major hardware enhancement supported by the 2.6 kernel. Basically, hyper-threading can create multiple virtual processors based on a single physical processor using simultaneous multi-threading technology (SMT); multiple application threads can be run simultaneously on one processor. To take full advantage of it, applications need to be multithreaded.
Hyper-threading offers many benefits to Web servers and application servers. It can increase the number of transactions that can be processed, provide faster server response time, and enable servers to handle larger workloads and more user requests. Currently, Intel Pentium 4 Xeon processors have hyper-threading hardware built-in.
NUMA (Non-Uniform Memory Access)
NUMA is another major feature that has been added in the Linux 2.6 kernel to improve system performance. In the traditional model for multiprocessor support (symmetric multiprocessing, or SMP), each processor has equal access to memory and I/O. The high contention rate of the processor bus becomes a performance bottleneck. The NUMA architecture can increase processor speed without increasing the load on the processor bus. In NUMA systems, each processor is close to some parts of memory and further from others. Processors are arranged in smaller regions called nodes. Each node has its own processors and memory; the nodes can talk to each other. It is quicker for processors to gain access to memory in a local node than in different nodes. Minimizing the inter-node communications can improve the system performance.
To support NUMA hardware, the Linux kernel adopts a series of enhancements in several areas, including the scheduler, multi-path I/O, a user-level API that lets a user see how processor and memory resources are allocated, and internal kernel APIs that let the kernel subsystems understand the NUMA topology. NEC Azusa, IBM x440, and IBM NUMA-Q are examples of NUMA machines.
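As a rough illustration of what user-level topology information looks like, the following sketch reads the node-to-CPU mapping from sysfs. It assumes the `/sys/devices/system/node/nodeN/cpulist` files found on current NUMA-aware kernels, which is not necessarily what the 2.6.0 test kernels exposed; the function names are hypothetical.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a sysfs CPU list such as "0-3,8-11" into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list (e.g. a memory-only node) yields no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(root="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]}; the sysfs path is an assumption."""
    nodes = {}
    for path in sorted(glob.glob(os.path.join(root, "node[0-9]*"))):
        node_id = int(os.path.basename(path)[len("node"):])
        with open(os.path.join(path, "cpulist")) as f:
            nodes[node_id] = parse_cpulist(f.read())
    return nodes
```

On a machine without NUMA information in sysfs, `numa_nodes()` simply returns an empty dictionary.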
In the 2.6 kernel, more types of devices have been supported. The 2.6 kernel has also expanded the limitation of the major number from 255 to 4095 and has allowed more than one million subdevices per type. This should give high-end enterprise systems sufficient support.
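The new limits follow from a wider device number. As a worked example, assuming the 32-bit internal device number of the 2.6 kernel with 12 bits for the major number and 20 bits for the minor number (a detail taken from the kernel sources rather than from the text above), the arithmetic can be sketched as follows; the helper names are illustrative only.

```python
# Assumed layout: 12 bits of major number, 20 bits of minor number.
MAJOR_BITS = 12
MINOR_BITS = 20


def make_dev(major, minor):
    """Pack a major/minor pair into a single 32-bit device number."""
    if not 0 <= major < (1 << MAJOR_BITS):
        raise ValueError("major number out of range")
    if not 0 <= minor < (1 << MINOR_BITS):
        raise ValueError("minor number out of range")
    return (major << MINOR_BITS) | minor


def split_dev(dev):
    """Unpack a device number into (major, minor)."""
    return dev >> MINOR_BITS, dev & ((1 << MINOR_BITS) - 1)


max_major = (1 << MAJOR_BITS) - 1     # 4095, the new upper limit
minors_per_major = 1 << MINOR_BITS    # 1,048,576 subdevices per type
```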
The 2.6 kernel adopts a new thread library, the Native POSIX Thread Library (NPTL). This new library is based on a 1:1 threading model (one kernel thread per user thread) and provides full POSIX compliance. A test done by Red Hat indicates that on an old IA-32 dual 450MHz PII Xeon system, 100,000 threads could be created and destroyed in 2.3 seconds (with up to 50 threads running at any one time) using NPTL.
NPTL gives the kernel a major performance boost for multi-threaded applications in an SMP environment. It is especially valuable for heavily multi-threaded enterprise-level applications, such as Java® applications, as well as Web servers and application servers.
Another threading improvement in the 2.6 kernel is that the number of PIDs that can be allocated has increased from 32,000 to 1 billion. This change improves application start-up performance on heavily loaded systems, where the low PID limit of the 2.4 kernel could become a bottleneck when applications requested large numbers of PIDs.
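The shape of the thread create/destroy test described above can be sketched as follows. This is not the Red Hat test program: it is a minimal Python approximation in which interpreter overhead dominates, so the timings are not comparable to the figure quoted. The function name and its parameters are hypothetical.

```python
import threading
import time
from collections import deque


def churn_threads(total, max_alive=50):
    """Create and join `total` threads, keeping at most `max_alive` unjoined.

    Returns (elapsed_seconds, number_of_threads_that_ran).
    """
    ran = 0
    lock = threading.Lock()

    def worker():
        nonlocal ran
        with lock:
            ran += 1

    live = deque()
    start = time.perf_counter()
    for _ in range(total):
        if len(live) == max_alive:
            live.popleft().join()  # retire the oldest thread before starting another
        thread = threading.Thread(target=worker)
        thread.start()
        live.append(thread)
    while live:
        live.popleft().join()
    return time.perf_counter() - start, ran
```

Calling `churn_threads(100_000, max_alive=50)` mirrors the thread count and concurrency limit of the test quoted above.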
The O(1) scheduler was accepted into the official Linux 2.5 kernel tree in 2002. The O(1) scheduler increases Linux scalability and overall performance by improving throughput with large numbers of processes, especially on large SMP. O(1) scales well with a large number of tasks and CPUs and has strong affinity, to avoid tasks bouncing between the CPUs. The O(1) scheduler also allows for load-balancing across CPUs and NUMA-aware load-balancing.
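The reason the scheduler's cost does not grow with the number of tasks can be illustrated with a simplified sketch. It assumes the commonly described design of one run queue per priority level plus a bitmap of non-empty levels, with 140 levels; the text above does not spell out this mechanism, the class and method names are hypothetical, and details such as the active/expired arrays and time slices are omitted.

```python
from collections import deque

NUM_PRIOS = 140  # assumed number of priority levels; lower number = higher priority


class PrioArray:
    """Toy priority array: per-priority FIFO queues plus a bitmap of busy levels."""

    def __init__(self):
        self.bitmap = 0
        self.queues = [deque() for _ in range(NUM_PRIOS)]

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Return the next task; the work is bounded by NUM_PRIOS, not by task count."""
        if self.bitmap == 0:
            return None
        # Index of the lowest set bit = highest-priority non-empty queue.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].popleft()
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task
```

Whether ten or ten thousand tasks are queued, `pick_next` performs one bitmap lookup and one dequeue.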
Block I/O Layer
The Block I/O Layer in the 2.6 kernel has been rewritten to improve kernel scalability and performance. The global I/O request lock in 2.4 has been removed. The new block I/O structure (bio) in the 2.6 kernel allows I/O requests larger than PAGE_SIZE. Most of the problems seen in 2.4 were caused by the use of the buffer head and kiobuf, and the new layer addresses them. The I/O scheduler was completely rewritten, and SCSI support has also seen major improvements.
Asynchronous I/O is new in the 2.6 kernel. It provides ways for enterprise applications such as Web servers and databases to scale up without resorting to complex internal pooling mechanisms for network connections.
In addition to these enhancements, there are some other remarkable changes and new features worth mentioning. For example, the 2.6 kernel provides support for several new file systems, including JFS, XFS, NFS v4, and the Andrew File System (AFS). New networking protocols and features such as Stream Control Transmission Protocol (SCTP), Internet Protocol Security (IPSec), improved IPv6 support, and IP Payload Compression (IPComp) provide Linux 2.6 kernel users better network security and transmission quality.
Not all of the enhancements provided by the 2.6 kernel will apply to each enterprise application. Some of them do have specific hardware and software requirements. However, most of the enhancements listed here are general kernel improvements that will help Linux break the enterprise barrier.
In this section, I will discuss how the Web serving tests were done, including the hardware environment, selected Web servers/application servers and Web test tools, and the testing strategy with typical test scenarios. The following discussion is based on the 2.6 kernel.
There were four Web serving servers used in Linux 2.6 kernel testing. Two were Web servers (Apache and Jakarta-Tomcat), and the other two were application servers (WebSphere Application Server and Jboss).
Apache is the market leader of Web servers. The Netcraft Web Server Survey found that more than 64% of the Web sites on the Internet are using Apache. It is an open source project.
Jakarta-Tomcat is an open source servlet container with a JSP environment available under the Apache license. Jakarta-Tomcat has a built-in Web server and can also be used with other Web servers in a production environment.
The WebSphere Application Server is an enterprise-level application server for dynamic e-business applications. The J2EE technology and Web services are the foundation of the server. The IBM WebSphere Application Server provides high performance and an extremely scalable transaction engine across most of the operating systems. More and more WebSphere applications are being migrated from a traditional UNIX operating system to Linux for lower costs with similar performance.
The Jboss Application Server is also an open source application server with a full J2EE personality. Started as an open source EJB container, Jboss is now targeted to become an enterprise-ready application server.
Quite a few Web test tools and benchmarks are available online. The following are the four open source tools we mainly used to simulate Web-client stress in our 2.6 kernel test environment (see Resources for links to more information on these):
- Httperf is a tool for measuring Web server performance. The Httperf tool can control the rate at which requests are issued, the total connection number, and the time-out limit.
- Hammerhead is a stress test tool designed to test Web servers. Hammerhead can initiate multiple connections from IP aliases and simulate numerous (256+) users at any given time.
- PagePoker is a Perl package that defines a browser agent with features for testing Web servers. PagePoker comes with three scripts for different uses, including multiple clients, stress testing, and benchmarking.
- Web Performance Tool (WPT) is an IBM-developed Web test tool.
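As an illustration of the three Httperf controls mentioned above (request rate, total number of connections, and time-out), the following sketch assembles an Httperf command line. The host name and the numeric values are placeholders, not the settings used in our tests, and the helper function is hypothetical.

```python
def httperf_command(server, uri="/", port=80, rate=100, num_conns=10000, timeout=5):
    """Build an httperf argument list; all default values are illustrative."""
    return [
        "httperf",
        f"--server={server}",
        f"--port={port}",
        f"--uri={uri}",
        f"--rate={rate}",            # connections initiated per second
        f"--num-conns={num_conns}",  # total connections for the run
        f"--timeout={timeout}",      # seconds to wait before counting a failure
    ]


if __name__ == "__main__":
    # "web.example.test" is a placeholder host name.
    print(" ".join(httperf_command("web.example.test", rate=50)))
```

The resulting list can be passed to `subprocess.run` on a machine where Httperf is installed.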
In addition to the previous tools discussed for Web serving testing, IBM has a tool called Trade3, which is the WebSphere end-to-end benchmark and performance sample application. The Trade3 benchmark models an online stock brokerage application and provides a real-world workload driving WebSphere performance components and features.
The Web serving tests attempted to create user scenarios that viewed the system as a whole. The test duration started with 24 hours for the first run. The second run increased the time to 96 hours, with the third and fourth runs lasting seven days and 14 days, respectively. All the scenarios based on different combinations of server and client tools were executed on up to 8-way IBM xSeries and pSeries® servers. System utility monitoring tools were used to record the kernel stress level.
Figure 1 shows how several different test tools went to different Web servers or application servers. The different test tools tried to simulate different types of user environments.
Figure 1. Test environment
Figure 2 shows the stress test environment using the IBM WebSphere product and benchmark tool, Trade3, which simulates an online stock brokerage environment.
Figure 2. Stress test environment
The following sections represent a snapshot of the Web serving testing with typical scenarios we used on the 2.4/2.6 kernel. A typical Apache/WPT test on an 8-way SMP IBM xSeries system demonstrates the dramatically improved performance on the 2.6 kernel without impacting the service quality.
- Machine: IBM xSeries Netfinity 8500R 8681-7RY
- CPU: (8) Pentium III-700MHz
- Memory: 9 GB
- Swap space: 2 GB
- Linux distribution: Red Hat 7.3
- Web server: Apache Http Server 2.0.47
- Web test tool: WPT 1.9.4
- Two tests were done on the same system with the same configuration. The only difference was the kernel version.
- The automated Web test tool, WPT, was used to simulate Web clients. 30 virtual clients were created, and each client had two threads.
- Within the same duration, the Apache server served six times more Web pages on the 2.6 kernel than on the 2.4 kernel. The mean time for page processing on the 2.6 kernel was only 1/5 of that on the 2.4 kernel.
- No unsuccessful connections (failed connections, early server closes, request write failures, or timeouts) occurred during either of the two 24-hour runs.
| Kernel | Average CPU utilization | Average memory utilization | Average swap utilization | Total Web pages served | Pages served per second | Mean processing time (ms) | Unsuccessful connections |
|---|---|---|---|---|---|---|---|
| 2.4.18-smp | 100% (user: 7.38%, system: 92.62%) | 6.41% | 0% | 8,845,147 | 102.37 | 294.44 | 0 |
| 2.6.0-test5 | 99.42% (user: 39.35%, system: 60.07%) | 35.96% | 0% | 53,827,939 | 623.00 | 57.71 | 0 |
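The headline figures can be checked directly from the table. The short calculation below assumes each run lasted exactly 24 hours, as stated above; the variable names are illustrative.

```python
SECONDS_PER_RUN = 24 * 60 * 60  # each run lasted 24 hours

# Values copied from the results table.
runs = {
    "2.4.18-smp": {"pages": 8_845_147, "mean_ms": 294.44},
    "2.6.0-test5": {"pages": 53_827_939, "mean_ms": 57.71},
}


def pages_per_second(pages, seconds=SECONDS_PER_RUN):
    """Average throughput over the whole run."""
    return pages / seconds


old, new = runs["2.4.18-smp"], runs["2.6.0-test5"]
throughput_ratio = new["pages"] / old["pages"]   # roughly 6: about six times the pages
latency_ratio = new["mean_ms"] / old["mean_ms"]  # roughly 0.2: about 1/5 of the mean time

if __name__ == "__main__":
    for name, run in runs.items():
        print(f"{name}: {pages_per_second(run['pages']):.2f} pages/s")
    print(f"throughput ratio: {throughput_ratio:.2f}")
    print(f"latency ratio: {latency_ratio:.3f}")
```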
Figure 3. Web pages served vs. time
We've shown that, using a typical test scenario (Apache/WPT on an 8-way SMP IBM xSeries system), the Apache server has better scalability and performance on the 2.6 kernel compared to the 2.4 kernel. On the same system under the same workload, the Apache server with 2.6.0-test5 kernel more effectively used system resources and served six times more Web pages than the 2.4.18 kernel did. This real data demonstrates that a variety of features and changes have helped the 2.6 kernel offer better scalability and performance and become more mature for enterprise-level applications.
- The 2.5 test plan and execution plan are both available on SourceForge.
- Defects found during testing were reported in the Linux kernel bug tracking system.
- Hammerhead is a Web server stress test tool.
- Web Performance Tool (WPT), also called Akstress, was developed by IBM. WPT has been retired because it overlaps with the functionality provided by existing IBM tools, including Rational Suite Test Studio and the WebSphere Studio Workload Simulator.
- Trade3 is IBM's WebSphere-based end-to-end benchmark and performance sample application.
- The servers used for testing were Apache, Jakarta-Tomcat, IBM WebSphere Application Server, and Jboss.
- For more on additions in the 2.6 kernel, read the presentation by Ulrich Weigand entitled "What's new in Linux 2.6?"
- The Wonderful World of Linux 2.6, by Joseph Pranevich, is a wonderful compilation of features in the new kernel (though, admittedly, with a bias toward i386 Linux).
- "Improving Linux kernel performance and scalability" (developerWorks, January 2003), discusses testing performed by several members of the LTC, covering a diverse set of workloads, including disk and block I/O, SMP scalability, and others.
- "Putting Linux reliability to the test" (developerWorks, December 2003) presents the results of extensive reliability testing conducted on the 2.4 kernel.
- Find more resources for Linux developers in the developerWorks Linux zone.