Level: Introductory
Nikolay Yevik (yevik@us.ibm.com), Linux on POWER Technical Consultant, IBM
10 Feb 2005
This article introduces some of the important performance tuning issues for the IBM JVM for Linux on iSeries and pSeries. At the time of this writing, IBM provides JDK 1.3.1 32-bit and JDK 1.4.1, in both 32-bit and 64-bit flavors, for Linux on IBM iSeries and pSeries. Information in this article applies to IBM JDK 1.3.1 and JDK 1.4.1 for Linux on IBM iSeries and pSeries, but specifically targets JDK 1.4.1 SR2 as the latest IBM JDK release.
Writing performance-efficient Java code
The IBM JVM Diagnostics Guides for JDK 1.3.1 and JDK 1.4.1 are heavily referenced in this article.
This section offers general guidelines for writing performance-efficient Java code. It covers how to reduce object creation and garbage collection (GC), and how to use JNI, synchronization, and data structures efficiently.
Avoiding object creation and GC
Whenever possible, avoid creating objects to prevent associated performance costs of calling the constructor, and subsequent cost of GC when an object reaches the end of its lifecycle.
Consider these guidelines:
- Use the primitive variable types instead of the object types whenever possible. For example, use int instead of Integer.
- Cache frequently used short-lived objects so that the same objects are not repeatedly recreated and then left for the GC to collect.
- When manipulating strings, use StringBuffer instead of string concatenation due to the immutable nature of string objects, and the need to create an extra string object that eventually must undergo GC.
- Avoid excessive writing to the Java console to reduce the cost of string objects manipulations, text formatting, and output.
- Implement connection pools to the database and reuse connection objects, rather than repeatedly opening and closing connections.
- Use thread pooling. Avoid incessant creation and discarding of thread objects, especially if using threads in abundance.
- Avoid calling GC from within your code through the System.gc() call. GC is a "stop the world" event, meaning that all threads of execution are suspended except for the GC threads themselves. If you must call GC, do it during a non-critical or idle phase.
- Avoid allocating objects within loops; doing so can keep objects alive on the Java heap longer than necessary.
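The StringBuffer guideline above can be illustrated with a minimal sketch. The class and method names here are hypothetical and chosen only for illustration. Both methods return the same text, but the first creates a new string object on each pass through the loop, while the second appends to a single buffer.

```java
// Minimal sketch: building a string in a loop with and without StringBuffer.
// StringBuilding and its method names are hypothetical, for illustration only.
class StringBuilding {
    // Each += produces a new String object; the previous one becomes garbage.
    static String joinWithConcatenation(String[] parts) {
        String result = "";
        for (int i = 0; i < parts.length; i++) {
            result += parts[i];
        }
        return result;
    }

    // One StringBuffer receives every append; only toString() creates a String.
    static String joinWithBuffer(String[] parts) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < parts.length; i++) {
            buffer.append(parts[i]);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] parts = { "Linux", " on ", "POWER" };
        System.out.println(joinWithConcatenation(parts));
        System.out.println(joinWithBuffer(parts));
    }
}
```

How much this saves depends on the number of iterations; for a handful of short strings the difference is unlikely to be measurable.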
Java Native Interface (JNI)
Writing portions of the application, especially heavily used portions, in native code and linking it with Java is usually intended to improve performance. However, communication between JVM and native code is generally slow, thus too many JNI calls can degrade performance. Native operations should be grouped together whenever possible to reduce the number of JNI calls.
Handling exceptions natively in the JNI code itself, though unavoidable sometimes, leads to performance degradation. In such cases, the ExceptionCheck() function should be used because it is less computationally expensive than ExceptionOccurred(). The latter has to create an object to be referred to, as well as a local reference.
Synchronization
To reduce contention in the JVM and operating system, use synchronized methods only when necessary. Avoid calling synchronized methods inside a loop.
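As a rough sketch of the loop advice, the hypothetical counter below shows two ways of adding an array of values. The first calls a synchronized method on every iteration; the second does the arithmetic on a local variable and enters the synchronized method once. The class and its method names are assumptions made for this example.

```java
// Sketch of a hypothetical shared counter; names are illustrative only.
class SyncCounter {
    private long total = 0;

    // Every call acquires and releases this object's monitor.
    synchronized void add(long value) {
        total += value;
    }

    synchronized long getTotal() {
        return total;
    }

    // Pattern to avoid: the monitor is acquired once per loop iteration.
    void addAllLockPerElement(long[] values) {
        for (int i = 0; i < values.length; i++) {
            add(values[i]);
        }
    }

    // Alternative: sum into a local variable, then acquire the monitor once.
    void addAllSingleLock(long[] values) {
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        add(sum);
    }
}
```

The two methods are not identical in behavior: with the second, other threads never observe intermediate totals, which is acceptable only if the application does not rely on them.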
Data structures
As a general rule, avoid using a more complex data structure where a simpler one will suffice. For example, instead of vectors use arrays. Use the most efficient way to search and insert elements into a data structure, such as adding and deleting from the end of a vector for better performance.
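The sketch below combines both points: it stores values in a plain int array rather than a vector, and it adds and removes only at the end, so no elements are ever shifted. The class name and the fixed capacity are assumptions for this example; it fits cases where the maximum size is known in advance.

```java
// Sketch of an array-backed stack of ints; IntArrayStack is a hypothetical name.
class IntArrayStack {
    private final int[] items;
    private int size = 0;

    // Capacity is fixed at construction; this sketch does not grow the array.
    IntArrayStack(int capacity) {
        items = new int[capacity];
    }

    // Adds at the end: no existing element has to move.
    void push(int value) {
        if (size == items.length) {
            throw new IllegalStateException("stack is full");
        }
        items[size++] = value;
    }

    // Removes from the end: again, no element has to move.
    int pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return items[--size];
    }

    int size() {
        return size;
    }
}
```

Because the elements are primitives, this also avoids creating an Integer object for each stored value.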
Compilation options to increase performance
Compile your Java code with the -O optimization flag. Code optimization provides several benefits:
- obfuscates the code and makes it harder to reverse-engineer
- significantly enhances source code security
- significantly decreases the size of your Java program
- improves run-time performance
Environment settings to increase performance
Spinloop
Currently, adjusting the SPINLOOP variables and the timeslice values shows the biggest performance gains. The IBM_LINUX_SPINLOOP value is the number of times that a process can spin on a busy lock before blocking. There are three SPINLOOP variables available for adjustment (each a number from 0 to 100):
- IBM_LINUX_SPINLOOP1
- IBM_LINUX_SPINLOOP2
- IBM_LINUX_SPINLOOP3
The benchmark testing performed on a 16-way LPAR suggests the following settings to be optimal:
- IBM_LINUX_SPINLOOP1=96
- IBM_LINUX_SPINLOOP2=85
- IBM_LINUX_SPINLOOP3=85
As with any other environment variable, these variables need to be set in the shell instance where the JVM process will run, so that the settings can be read by the JVM into its global variables table.
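When one Java program launches another JVM, the same variables can be placed in the child's environment instead of a shell. The following is a sketch under assumptions: it relies on ProcessBuilder, which was added in Java 5 and is therefore not available in the JDK 1.3.1 and 1.4.1 releases discussed here, and the class and method names are hypothetical. The values are the benchmark settings quoted above.

```java
import java.util.Map;

// Sketch only: SpinloopLauncher is a hypothetical helper, and ProcessBuilder
// requires Java 5 or later (an assumption beyond the JDKs in this article).
class SpinloopLauncher {
    // Builds, but does not start, a child process whose environment
    // carries the three SPINLOOP settings from the 16-way LPAR benchmark.
    static ProcessBuilder withSpinloopSettings(String[] command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        Map<String, String> env = builder.environment();
        env.put("IBM_LINUX_SPINLOOP1", "96");
        env.put("IBM_LINUX_SPINLOOP2", "85");
        env.put("IBM_LINUX_SPINLOOP3", "85");
        return builder;
    }
}
```

A caller would pass the command line of the JVM to launch, for example new String[] { "java", "MyApp" } with MyApp standing in for a real main class, and then call start() on the result.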
Sysctl
As of SLES 8, running kernel 2.4.19, the Linux kernel offers an option for setting the minimum and maximum CPU timeslices. These are set with the sysctl command. It is highly recommended that the sysctl value sched_yield_scale be set to 1 for Java performance.
Paths
The CLASSPATH variable should have the most often-used Java libraries in front of the search path. The same applies to LIBPATH and LD_LIBRARY_PATH variables for most often-used JNI shared libraries.
User limits settings
To achieve the best performance it is important that the user who runs the JVM process has the user settings appropriately configured. These parameters can be set either:
- Temporarily, for the duration of the login shell session, with the ulimit command
- Permanently, by adding a corresponding ulimit statement to one of the files read by the login shell (~/.profile, for example) or to a shell-specific user resource file, or by editing /etc/security/limits.conf
Some of the most important settings recommended to be set to unlimited are:
- Data segment size: ulimit -d unlimited
- Maximum memory size: ulimit -m unlimited
- Stack size: ulimit -s unlimited
- CPU time: ulimit -t unlimited
- Virtual memory: ulimit -v unlimited
For Java applications that open many socket connections and keep them open, it is preferable to raise the number of file descriptors for the user above the default value by using ulimit -n, or by setting the nofile parameter in /etc/security/limits.conf.
GC and Java heap
The Garbage Collector is one of the most important JVM components influencing JVM performance. The general discussion of GC and heap size tuning in the IBM JVM Diagnostics Guides for JDK 1.3.1 and JDK 1.4.1 applies to the IBM JVM on Linux, including Linux on POWER, with the exception of some Linux-specific points discussed below.
The maximum heap size, which is controlled by -Xmx, can be set to a higher number on the 32-bit IBM JVM for Linux than on the 32-bit IBM JVM for AIX, due to differences in memory models between the two operating systems. If the -Xmx option is not specified, then the default setting applies (half of the real storage with a minimum of 16 MB and a maximum of 512 MB).
If the initial heap size is not specified explicitly with the -Xms option, it defaults to 4 MB.
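The default rule for the maximum heap size can be written out as a small calculation, and the heap limits a running JVM actually received can be checked through the standard Runtime methods. This is a sketch: the class and method names are hypothetical, and the calculation only restates the rule quoted above rather than reading anything from the JVM.

```java
// Sketch only: HeapSettings and defaultMaxHeapMb are hypothetical names.
class HeapSettings {
    // Restates the default maximum heap rule from the text:
    // half of real storage, no less than 16 MB and no more than 512 MB.
    static long defaultMaxHeapMb(long realStorageMb) {
        long half = realStorageMb / 2;
        return Math.max(16, Math.min(512, half));
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        // Values reported by the running JVM, in whole megabytes.
        System.out.println("Max heap (MB):     " + rt.maxMemory() / mb);
        System.out.println("Current heap (MB): " + rt.totalMemory() / mb);
        System.out.println("Free in heap (MB): " + rt.freeMemory() / mb);
        System.out.println("Rule applied to 256 MB of real storage: "
                + defaultMaxHeapMb(256) + " MB");
    }
}
```

Running the class once without heap options and once with explicit -Xms and -Xmx values is a quick way to confirm that the settings reached the JVM.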
For more information on GC and Java heap tuning, see "Debugging Performance Problems: JVM Performance" in the IBM JVM Diagnostics Guides for JDK 1.3.1 and JDK 1.4.1. The chapters "Understanding the Garbage Collector" and "Garbage Collector Diagnostics" might be valuable.
JIT
JIT is the most important JVM component in terms of performance. For general IBM JVM JIT discussion, see the "Understanding the JIT" section of the JVM Diagnostics Guide. For Linux specific details on JIT performance, see the JIT section of "Linux Problem Determination" and "JIT Diagnostics."
Monitoring JVM
IBM JVM for Linux performance problem determination, JVM monitoring, and tools are discussed in detail in the "Linux Problem Determination" chapter of the JVM Diagnostics Guide.
The following chapters may be of additional value:
- Tracing Java Applications and the JVM
- Using the JVM monitoring interface (JVMMI)
- Using the Reliability, Availability, and Serviceability interface
- Using the JVMPI
- Using third-party tools
Linux threading models and JVM
Differences in threading model implementations influence JVM performance on different Linux distributions. See the "Linux Problem Determination" chapter of the JVM Diagnostics Guide for details.
Another issue to be aware of is a thread floating stack limitation on Linux, as discussed in the "Floating Stacks Limitation" subsection of the JVM Diagnostics Guide.
SLES 8 and IBM JDK 1.4.1
Users of the SLES 8 Linux distribution should be aware of performance issues between the SLES 8 kernel scheduler and IBM JDK 1.4.1 for Linux, which stem from the internal implementation of the SLES 8 scheduler. Read more on the issue in the "Known Limitations on Linux" section of the "Linux Problem Determination" chapter in the JVM Diagnostics Guide.
Glossary
- GC: Garbage Collector
- JDK: Java Development Kit, includes JRE and development tools
- JIT: Just-In-Time Compiler
- JRE: Java Run-Time Environment, no development tools
- JVMMI: Java Virtual Machine Monitoring Interface
- JVM: Java Virtual Machine
- JVMPI: Java Virtual Machine Profiling Interface
- NPTL: Native POSIX Threads Library
- OS: Operating System
- RHEL AS: Red Hat Enterprise Linux Advanced Server Edition
- SLES: SUSE Linux Enterprise Server
- SR: Service Refresh
About the author
Nikolay Yevik, a Linux on POWER consultant in IBM's Solutions Enablement group, has more than 5 years of experience working on UNIX platforms, performing development work in C, C++, and Java. He has master's degrees in Petroleum Engineering and Computer Science. Nikolay can be reached at yevik@us.ibm.com.