The IBM Developer Kit for version 5.0 of the Java platform marks a significant step forward for Java developers, with advances in the language features as well as major enhancements to the underlying execution technology. This article, the first in a six-part series, provides an overview of some of the major changes and improvements that IBM has made to its virtual machine technology, including generational garbage collection, the sharing of class data, and improvements in monitoring and debugging tools and APIs. The series will also include an overview of the additional security providers IBM has included in its implementation of the Java platform. But before we look at the improvements in IBM implementations, we'll take a look at the advances in Java 5.0 itself.
Enhancements to Java 5.0
Java 2 Standard Edition (J2SE 5.0) introduces the greatest number of feature enhancements to the Java Class Library (JCL) API and the Java Virtual Machine (JVM) specification since the introduction of the Java 2 platform. These features are available in all 5.0 implementations from all vendors of Java technology and are focused largely in two areas: ease of development and monitoring and management.
The ease-of-development features in the 5.0 release are designed to allow you to carry out simple constructions with less code and to build in more compile-time checks to help you find problems earlier in the development cycle. Here's a quick rundown:
- Compile-time type safety with generics: Generics are similar to C++ templates. General or generic classes are independent of a particular type, and subsequently introduce type safety by the use of a parameterized type when instantiated. The use of a parameterized type with a generic class allows compile-time type safety checking and is used by the collections classes delivered in the Java 5.0 platform.
- Enhanced for loops: This new language construct, which is similar to the for each loop in other languages, simplifies the process of iterating over collections and arrays by removing the need to use explicitly defined iterators and index variables.
- Auto-boxing of primitives: This feature simplifies the process of inserting primitive types into collections objects by removing both the need to box Java primitive types (such as int) into their corresponding wrapper class (such as java.lang.Integer) and the subsequent need to unbox when removing them.
- Type-safe enumerations: This feature introduces Java language support for enumerated types, offering a more powerful and type-safe solution than employing static final declarations.
- Support for importing constants: This facility allows static methods and fields to be imported, avoiding the need to use fully qualified class names when accessing static members.
- Java Language Metadata (annotations): This feature allows developers to add annotations to code. Annotations serve as modifiers that can be added to packages, classes, interfaces, methods, or field declarations. This information is stored as part of both the source and class files and is obtainable by tools or by a Java application through the Java Reflection API. This additional information is used by tools for the purposes of documentation, compiler checking, and code analysis.
- Concurrency utilities: This feature introduces basic building blocks for developing concurrent classes, including thread pools and thread-safe collections, and introduces low-level locking primitives, including semaphores and atomic variables.
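To make the list above concrete, here is a minimal sketch that touches most of these features in one small class. The class, enum, and annotation names (Java5FeatureTour, Level, Reviewed) are hypothetical and exist only for illustration; the java.* types are the standard 5.0 APIs.

```java
import static java.lang.Math.max;

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; only the java.* APIs are real.
final class Java5FeatureTour {

    // Type-safe enumeration instead of "static final int" constants.
    enum Level { LOW, HIGH }

    // A custom annotation, retained at run time so reflection can read it.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Reviewed { String by(); }

    // Generics, the enhanced for loop, unboxing, and a static import.
    @Reviewed(by = "editor")
    static int largest(List<Integer> values) {
        int best = Integer.MIN_VALUE;
        for (int v : values) {        // each Integer is unboxed to int
            best = max(best, v);      // Math.max, via the static import
        }
        return best;
    }

    // Reads the annotation back through the Reflection API.
    static String reviewerOfLargest() throws Exception {
        return Java5FeatureTour.class
                .getDeclaredMethod("largest", List.class)
                .getAnnotation(Reviewed.class)
                .by();
    }

    // Concurrency utilities: a thread pool plus an atomic counter.
    static int countInParallel(int tasks) throws Exception {
        final AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(new Callable<Integer>() {
                    public Integer call() {
                        return counter.incrementAndGet();
                    }
                }));
            }
            for (Future<Integer> f : results) {
                f.get();              // wait for every task to finish
            }
        } finally {
            pool.shutdown();
        }
        return counter.get();
    }

    public static void main(String[] args) throws Exception {
        List<Integer> values = new ArrayList<Integer>();
        values.add(3);                // auto-boxing: int becomes Integer
        values.add(42);
        values.add(7);
        System.out.println(largest(values));
        System.out.println(Level.valueOf("HIGH"));
        System.out.println(reviewerOfLargest());
        System.out.println(countInParallel(10));
    }
}
```

Without generics and auto-boxing, the same largest method would need casts from Object and explicit calls to intValue(), and a wrongly typed element would only be discovered at run time.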
Monitoring and management features
J2SE's new monitoring and management features are designed to make it much easier to monitor the status of the Java runtime. You can tap these capabilities from Java code and JMX using the monitoring and management API or from C code using the JVM Tools Interface (JVMTI):
- Monitoring and management API: This feature enables Java programs or remote agents to both monitor the "health" of the virtual machine and observe other system-level activities and events. You can develop autonomic and self-adapting systems by exploiting these features.
- JVM Tools Interface: The JVMTI is being introduced as a more lightweight, flexible replacement for the JVM Profiling Interface (JVMPI) and is a C-based interface for writing tools for development and runtime monitoring.
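As a rough illustration of the Java side of these capabilities, the following sketch reads a few values from the platform MXBeans in java.lang.management. The class name VmHealthProbe and the output format are our own choices, not part of any API, and the values reported will differ between runtimes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.RuntimeMXBean;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class; the MXBean interfaces are standard in 5.0.
final class VmHealthProbe {

    // Collects a small "health" snapshot of the running virtual machine.
    static String snapshot() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();

        StringBuilder sb = new StringBuilder();
        sb.append("vm=").append(runtime.getVmVendor())
          .append(' ').append(runtime.getVmName()).append('\n');
        sb.append("uptimeMillis=").append(runtime.getUptime()).append('\n');
        sb.append("heapUsedBytes=").append(heap.getUsed()).append('\n');
        sb.append("heapCommittedBytes=").append(heap.getCommitted()).append('\n');
        sb.append("liveThreads=").append(threads.getThreadCount()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

The same beans can also be reached by a remote agent through JMX, which is what makes them useful for external monitoring as well as for self-adapting code.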
Value-added enhancements from IBM: An overview
The specification and API changes added in 5.0 through the Java compiler, the JCL API, and the JVM specification affect all new implementations of the Java platform; in addition, Java vendors are permitted to produce and deliver their own value-added enhancements to their Java deliverables. IBM delivers its enhancements in two forms: IBM-produced Java language extensions and improvements to the IBM implementation of the Java runtime environment.
Java language extensions
The IBM value-added Java language extensions cover three main components: the object request broker (ORB), XML, and security. These components are IBM-provided code and are developed and supplied to provide features required by customers and certain IBM products:
- The ORB: IBM led the development and inclusion of RMI-IIOP into J2SE as a replacement for RMI-JRMP. Since then, IBM has continued to produce its own implementation to fulfill customer requirements and to ensure interoperability between releases, particularly for the WebSphere Application Server.
- XML and XSLT: IBM produces an implementation of the XML/XSLT specification based on the Apache Xerces Java and Xalan Java open source projects, to which IBM is a major contributor. The IBM packages contain additional XML APIs not specified in the Java 5.0 specification, as well as the Xerces Native Interface and XML Schema API.
- Security: IBM provides a wide range of security services through the standard Java APIs. The security components contain IBM implementations of various security algorithms and mechanisms. In addition to the basic security providers, IBM also provides for FIPS compliance and support for cryptographic hardware, where available. In addition, the iKeyman utility is provided to manage keys and certificates.
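Because these security services are exposed through the standard Java APIs, the providers that a given runtime ships can be inspected with java.security.Security. The sketch below is one way to do that; the class name ProviderLister is hypothetical, and the provider names it prints depend entirely on the vendor and configuration of the runtime, so none are assumed here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;

// Hypothetical helper class; it uses only standard java.security calls.
final class ProviderLister {

    // Names of all installed providers, in preference order.
    static String[] providerNames() {
        Provider[] providers = Security.getProviders();
        String[] names = new String[providers.length];
        for (int i = 0; i < providers.length; i++) {
            names[i] = providers[i].getName();
        }
        return names;
    }

    // Name of the provider that actually supplies a digest algorithm.
    static String providerFor(String digestAlgorithm)
            throws NoSuchAlgorithmException {
        return MessageDigest.getInstance(digestAlgorithm)
                .getProvider().getName();
    }

    public static void main(String[] args) throws Exception {
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName() + ": " + p.getInfo());
        }
        System.out.println("SHA-256 comes from " + providerFor("SHA-256"));
    }
}
```

Application code that asks for an algorithm by name, as providerFor does, keeps working unchanged when a different provider is installed ahead of the default one.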
Java runtime improvements
The IBM implementation of the Java runtime has undergone a vast development effort for 5.0, affecting all three of the major runtime components: the virtual machine (VM), the garbage collector (GC), and the just-in-time (JIT) compiler. The effort was applied with two main goals: superior application execution performance coupled with improved reliability, availability, and serviceability (RAS) characteristics.
These improvements have been achieved both through a move to a common code base for all of the Java runtime flavours (Micro Edition (ME), Standard Edition (SE) and Enterprise Edition (EE)) for which IBM produces ports and by the introduction of individual enhancements to each of the three components.
We'll discuss all of these changes relating to the Java runtime in more detail over the remainder of this article and in subsequent articles in this series.
A common code base
IBM has long produced all three editions of the Java platform. As of the J2SE 5.0 release, all the underlying components of IBM implementations of the Java runtime are built from a common code base.
The common code base is structured using a system of framework engines and pluggable configurations, allowing maximum code sharing while catering to any functional differences required by each of the Java editions. This improves the IBM J2SE SDK's memory footprint, start-up time, and performance characteristics and provides both scalability and serviceability enhancements to the IBM J2ME SDK. Both editions benefit from greater levels and wider ranges of tests being run against the same common code, leading to improved stability and reliability characteristics.
Garbage collector improvements
In addition to the move to a common garbage collection framework with pluggable configurations, there have been four additional major improvements to the GC component: movement from a conservative to a type-accurate collector, introduction of a parallel collector, introduction of a generational and concurrent collector, and a rework of the verbose GC logging facility.
The previous implementations of the IBM J2SE SDK contained a conservative garbage collector. Conservative collectors are so called because they assume that every value held inside a thread's Java stack or a thread's registers could potentially be a reference to a Java object.
This means that the garbage collector has to trace each of these values and determine if they point to an object on the Java heap, which can and does lead to scenarios where objects are marked as being referenced when they are not -- the value on the Java stack or in the registers may in fact be a simple long value that happens to also point to a location where an object resides. In these scenarios, a condition known as retained garbage occurs, as objects that are not actually live are kept between garbage collections. This leads to applications occupying a larger memory footprint than they strictly require.
A second effect of the use of a conservative collector is the need to pin and dose Java objects. Objects are pinned when they are referenced from JNI (native) code. Objects become dosed when they are referenced (either intentionally or unintentionally, as per retained garbage) from a Java stack or register. Pinning and dosing prevent the object from being moved on the Java heap during the compaction of Java objects by the garbage collector. The object cannot be moved because the references to it in the Java stacks and the registers cannot be updated with its new location; in the case of retained garbage, such an update would actually change a value that does not in fact reference a Java object.
The inability to move these objects during the compaction phase of garbage collection prevents the fragmentation of objects on the Java heap from being fully eliminated, leaving residual fragmentation. This problem prevents objects from being allocated even if there is apparently enough free memory on the Java heap and leads to the need to run an application with a larger Java heap than should be required.
Both of these problems are solved by moving to a type-accurate collector, which maintains its own well-described, type-accurate stacks. This removes the problem of retained garbage, as we are fully aware whether a value does, or does not, reference a Java object. As we are able to edit these type-accurate stacks, we also remove the problem of residual fragmentation, as objects are no longer prevented from moving during compactions.
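The difference between the two approaches can be shown with a toy model. This is only a sketch under simplifying assumptions, not how the IBM collector is implemented: the heap is reduced to a set of object addresses, a stack is an array of slots, and the names RootScanSketch and Slot are made up for the illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only; a real collector works on raw memory, not on sets.
final class RootScanSketch {

    // One stack slot: a raw 64-bit value plus the type information
    // that only a type-accurate collector has available.
    static final class Slot {
        final long value;
        final boolean isReference;

        Slot(long value, boolean isReference) {
            this.value = value;
            this.isReference = isReference;
        }
    }

    // Conservative scan: any value that looks like an object address
    // is treated as a reference, whatever its real type.
    static Set<Long> conservativeRoots(Slot[] stack, Set<Long> heapAddresses) {
        Set<Long> roots = new HashSet<Long>();
        for (Slot slot : stack) {
            if (heapAddresses.contains(slot.value)) {
                roots.add(slot.value);
            }
        }
        return roots;
    }

    // Type-accurate scan: only slots known to hold references count.
    static Set<Long> accurateRoots(Slot[] stack, Set<Long> heapAddresses) {
        Set<Long> roots = new HashSet<Long>();
        for (Slot slot : stack) {
            if (slot.isReference && heapAddresses.contains(slot.value)) {
                roots.add(slot.value);
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        Set<Long> heap = new HashSet<Long>();
        heap.add(0x1000L);
        heap.add(0x2000L);

        Slot[] stack = {
            new Slot(0x1000L, true),   // a genuine object reference
            new Slot(0x2000L, false),  // a plain long that looks like one
            new Slot(7L, false)        // an ordinary number
        };

        System.out.println(conservativeRoots(stack, heap).size()); // prints 2
        System.out.println(accurateRoots(stack, heap).size());     // prints 1
    }
}
```

The object at 0x2000 is the retained garbage: the conservative scan keeps it alive, and could not safely move it either, because rewriting that slot would corrupt what is really a long value.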
Compaction, when it occurs, is the most time-consuming phase of running a garbage collection cycle. To reduce the duration of compaction, parallel compaction has been introduced into the IBM implementation of Java 5.0 technology.
Parallel compaction allows multiple threads to assist in moving objects on the Java heap to coalesce large numbers of small free spaces into a small number of larger spaces. This is done by using a thread for each CPU available to the process and segmenting the Java heap into a number of nominal regions, with threads taking complete ownership of the compaction of each of the regions.
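The region-per-thread idea can be sketched in a few lines. This is a deliberately simplified model and not the IBM implementation: the heap is an int array in which 0 means a free slot and any other value is a one-slot object, each region is compacted independently, and no references are fixed up afterward. The names ParallelCompactionSketch, compactRegion, and compact are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: 0 is a free slot, any other value is a live object.
final class ParallelCompactionSketch {

    // Slides the live slots of one region to the start of that region,
    // so its free space ends up as a single contiguous block.
    static void compactRegion(int[] heap, int from, int to) {
        int free = from;
        for (int i = from; i < to; i++) {
            if (heap[i] != 0) {
                int object = heap[i];
                heap[i] = 0;
                heap[free++] = object;
            }
        }
    }

    // Splits the heap into regions and compacts them on a pool with one
    // thread per available CPU; each task owns exactly one region.
    static void compact(final int[] heap, int regionSize) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<Future<?>>();
            for (int start = 0; start < heap.length; start += regionSize) {
                final int from = start;
                final int to = Math.min(start + regionSize, heap.length);
                pending.add(pool.submit(new Runnable() {
                    public void run() {
                        compactRegion(heap, from, to);
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get();   // wait until every region is done
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] heap = {1, 0, 2, 0, 0, 3, 0, 4};
        compact(heap, 4);
        System.out.println(Arrays.toString(heap));
        // prints [1, 2, 0, 0, 3, 4, 0, 0]
    }
}
```

Because each thread has complete ownership of its region, the threads never touch the same slot and need no locking while they move objects.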
Generational and concurrent collector
The generational and concurrent (or gencon) collector exploits the weak generational hypothesis that "objects die young" by creating a two-generation Java heap. By separating newer and older objects into the two generations, collections can be concentrated on the younger objects. The young generation, or nursery, uses a semi-space copying collector, and the old, or tenured, generation uses a concurrent mark sweep collector.
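One way to watch a generational collector at work is to churn through short-lived objects and read the collection counts that the runtime reports through the standard GarbageCollectorMXBean interface. The sketch below does that; the class name GcActivityProbe is hypothetical, the collector names printed are whatever the runtime chooses to report, and whether any collection happens at all depends on heap size and JIT optimizations. On the IBM runtime, the gencon policy is selected with the -Xgcpolicy:gencon command-line option.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe class; it uses only the standard management API.
final class GcActivityProbe {

    // Sum of the collection counts of every collector the VM reports.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 when undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates many objects that become garbage almost immediately.
    static long churn(int objects) {
        long checksum = 0;
        for (int i = 0; i < objects; i++) {
            byte[] shortLived = new byte[1024];
            shortLived[0] = (byte) i;
            checksum += shortLived[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(1000000);
        long after = totalCollections();
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": "
                    + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
        System.out.println("collections during churn: " + (after - before));
    }
}
```

With a two-generation heap, a workload like this would be expected to trigger mostly nursery collections, since almost nothing survives long enough to be tenured.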
Verbose GC logging update
To improve the RAS characteristics of the GC component, the GC logging mechanism has been updated in two ways. The logger provides more detailed information and presents it in an XML format rather than as flat text; it also traces a subset of this data into an in-memory buffer for first failure data capture.
The move to an XML-based structure for the verbose GC output allows easier data viewing through simple XML readers such as a Web browser and allows easier parsing by the various verbose GC analysis tools available.
The in-memory trace buffer, which is "snapped" to file on a failure scenario or on a user request, provides basic information about Java memory usage and the state of GC even if verbose GC has not been enabled, and it greatly improves the quality of first failure data capture information.
Although the JIT compiler is largely a black box to users of the Java platform, it is the largest contributor to application execution performance of the Java runtime. By converting Java bytecode into optimized machine code at runtime, it greatly increases the speed at which Java methods are run.
A number of improvements and changes have been made between 1.4.2 and 5.0 to improve the performance of the IBM JIT compiler, while reducing the impact of the act of JIT compilation on the running application. The major changes are listed in Table 1:
Table 1. Improvements in IBM's JIT compiler
|Version 1.4.2|Version 5.0|
|---|---|
|Compiles synchronously on the Java thread executing the Java method|Uses a separate asynchronous compilation thread|
|Methods are compiled on demand|Methods are queued for compilation|
|The native stack is used for Java methods|A separate Java stack is maintained|
|A single compilation optimization is available|Five levels of compilation optimization are available|
|Methods can only be compiled once|Recompilation can occur|
|On-stack replacement of methods containing loops is possible|No on-stack replacement|
|JIT is disabled when |JIT continues to compile at reduced optimization levels in debug mode|
Of these improvements, the major new features are asynchronous compilation, multiple levels of optimization, and profiling-driven recompilation.
The JIT compilation of a Java method now occurs asynchronously on a dedicated thread separate from the calling thread. This means that the thread that triggered the compilation of a particular Java method by calling it no longer has to block while the compilation completes before being able to run the method. The method is instead added to a compilation queue, and the thread is allowed to continue running and execute the non-compiled version of the method. Once the method is compiled, the next call to the method runs the JIT compiled version. To ensure that frequently used methods are compiled in preference, the queue is constantly reordered and reprioritised.
Asynchronous compilation can greatly improve performance at startup on multiprocessor machines, as the majority of application startup is single threaded, and the compilation of Java methods is carried out by a separate processor.
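The hand-off between the calling thread and the compilation thread can be modeled with a blocking queue. The following is a simplified sketch, not the IBM JIT design: the threshold of three calls is arbitrary, "compilation" is reduced to flipping a latch, the reordering of the queue is left out, and the names AsyncCompileSketch, Method, and invoke are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only; queue reprioritization is intentionally omitted.
final class AsyncCompileSketch {

    static final int THRESHOLD = 3;   // arbitrary, for illustration

    static final class Method {
        final String name;
        final AtomicInteger calls = new AtomicInteger();
        final CountDownLatch compiled = new CountDownLatch(1);

        Method(String name) {
            this.name = name;
        }

        boolean isCompiled() {
            return compiled.getCount() == 0;
        }
    }

    private final BlockingQueue<Method> queue =
            new LinkedBlockingQueue<Method>();
    private final Thread compiler;

    AsyncCompileSketch() {
        compiler = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Method m = queue.take();
                        m.compiled.countDown();  // stands in for code generation
                    }
                } catch (InterruptedException e) {
                    // shutdown requested; let the thread end
                }
            }
        }, "compilation-thread");
        compiler.setDaemon(true);
        compiler.start();
    }

    // Runs on the application thread and never waits for the compiler.
    String invoke(Method m) {
        if (m.calls.incrementAndGet() == THRESHOLD) {
            queue.add(m);             // hand off and keep running
        }
        return m.isCompiled() ? "compiled" : "interpreted";
    }

    void shutdown() {
        compiler.interrupt();
    }

    public static void main(String[] args) throws InterruptedException {
        AsyncCompileSketch jit = new AsyncCompileSketch();
        Method hot = new Method("hot");
        for (int i = 0; i < THRESHOLD; i++) {
            System.out.println(jit.invoke(hot));
        }
        hot.compiled.await(1, TimeUnit.SECONDS);
        System.out.println(jit.invoke(hot));
        jit.shutdown();
    }
}
```

The call that crosses the threshold may still report "interpreted", because the application thread carries on while the compilation thread works; only a later call is guaranteed to see the compiled version.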
Multiple levels of optimization
The JIT compiler now has the ability to compile Java methods at one of five levels of optimization. By the use of a separate sampling thread, the compiler determines how much time is being spent in a particular Java method and therefore determines whether it needs to be JIT compiled, and to what level of optimization it should be compiled. The most frequently executed Java methods are compiled at the higher optimization levels, which provides the greatest performance gains but also has the greatest cost in terms of time to complete compilation and, potentially, in terms of memory requirement. The less frequently used methods are compiled at lower optimization levels, which are designed to complete quickly and produce a significant performance gain.
The JIT compiler has the ability to recompile Java methods should a higher optimization level be required because of changes in the way the application is running. At the highest optimization level, the recompilation is performed based on dynamic profiling data automatically generated from the executing code. This provides information on how the method itself is actually being used. After collecting this data for a short period, the method is then recompiled using this data and optimized accordingly.
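The relationship between sampling, optimization levels, and recompilation can be sketched as follows. Everything specific in this model is an assumption made for illustration: the level names, the sample-count thresholds, and the class names are invented, and the real compiler's heuristics are not described in this article.

```java
// Toy model only: level names and thresholds are invented.
final class TieredRecompileSketch {

    enum OptLevel { NONE, LEVEL_1, LEVEL_2, LEVEL_3, LEVEL_4, LEVEL_5 }

    // Hypothetical number of samples needed to reach each level,
    // indexed by OptLevel.ordinal().
    private static final int[] THRESHOLDS = {0, 10, 50, 200, 1000, 5000};

    // Picks the highest level whose threshold the sample count meets.
    static OptLevel levelFor(int samples) {
        OptLevel chosen = OptLevel.NONE;
        for (OptLevel level : OptLevel.values()) {
            if (samples >= THRESHOLDS[level.ordinal()]) {
                chosen = level;
            }
        }
        return chosen;
    }

    static final class Method {
        OptLevel level = OptLevel.NONE;
        int samples;
        int compilations;

        // Called each time the (simulated) sampling thread finds this
        // method running; recompiles when a higher level is justified.
        void recordSample() {
            samples++;
            OptLevel wanted = levelFor(samples);
            if (wanted.compareTo(level) > 0) {
                level = wanted;
                compilations++;
            }
        }
    }

    public static void main(String[] args) {
        Method m = new Method();
        for (int i = 0; i < 60; i++) {
            m.recordSample();
        }
        System.out.println(m.level);         // prints LEVEL_2
        System.out.println(m.compilations);  // prints 2
    }
}
```

A method that is sampled only a handful of times never pays for an expensive compilation, while a method that keeps showing up in samples is recompiled step by step toward the highest level.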
Virtual machine improvements
In the IBM Java 5.0 SDK, the virtual machine, like the other components, has had a great many improvements. The two most prominent are the implementation of shared classes and new features in profiling and debugging.
Shared classes were previously only available on the IBM implementations of the Java platform for z/OS and OS/390. That implementation of shared classes has since been deprecated and replaced with a new implementation on all platforms.
The new implementation maintains a static class data cache in shared memory that is shareable between all the IBM-produced Java runtimes that implement the new shared classes functionality; this cache persists between invocations of the Java runtime. The shared classes functionality applies to all JCL and classpath-based classes and can easily be made to apply to classes loaded by custom class loaders using a simple API. This provides savings in terms of memory footprint and in terms of startup time once the cache has been populated. This feature is particularly relevant in scenarios where there is more than one Java runtime on a machine or in an environment where the runtime is likely to be restarted often.
Profiling and debugging
In addition to supporting the Java Debug Wire Protocol (JDWP), JVM Profiling Interface (JVMPI), and JVM Tools Interface (JVMTI), the IBM debugger implementation has two additional features: high-speed debug and hot-code replace.
High-speed debug allows the JIT compilation of Java methods to occur even while the Java debugger is running and for that compilation to occur at almost full optimization. This is useful when debugging large Java applications -- those running on a J2EE stack, for example -- where the debugging performance would be debilitating with the JIT disabled.
Hot-code replace allows you to dynamically make changes to the source code while under the debugger and to immediately run the new code without restarting the application.
Having the ability to run code at near full speed and dynamically change that code under the debugger allows greatly improved application development productivity.
Reliability, availability, and serviceability improvements
IBM implementations of the Java platform have always provided the infrastructure and tooling to monitor and debug failures in both Java applications and the Java runtime itself. Many improvements have been built into the Java runtime in this area, including continuous low-level tracing to internal buffers as an aid to first failure data capture, a new option for monitoring native memory usage, and a strong JNI code validator, as well as a reworking of the dump and trace engines. The Java 5.0 implementation also adds a Java-based tooling API for interrogating system dump files that allows tools developers to access information about objects, threads, locks, and the like from a dump without requiring knowledge of the JVM's internal structures.
The trace engine has been restructured and an internal "flight recorder" added. This constantly traces key VM and JCL trace points into per-thread, wrapping buffers. GC data is also traced into a separate wrapping buffer to ensure that the data is easily obtainable and to provide a reasonable history of GC cycles.
In addition to the internal tracing of the VM, the method trace functionality provides the ability to trace method entry and exit events of Java code, both JCL-provided and application code. This does not require any changes to the application code and provides timestamp, thread ID, and parameter information.
As part of the rearchitecture of the dump engine, the number of events on which dumps can be triggered has been extended from 3 to 14. Dumps can now be generated on events like VM stop, class loading and unloading, thread start and stop, GC cycles, and exceptions being thrown, caught, and left uncaught. These new capabilities, combined with the ability to filter events, give much greater flexibility as to when dumps can be created and what types of dumps can be created by the Java runtime.
The DTFJ Tooling API
The Diagnostic Toolkit and Framework for Java (DTFJ) API is a Java-based API for accessing postmortem information from the system dump of a Java process. This allows tool writers to access information from the dump about the system, the process, the Java VM, and the Java application without having to understand how the relevant structures are laid out in memory. This makes the ability to write postmortem tooling far more accessible.
The delivery of the IBM Developer Kit for Java 5.0 introduces a vast number of features and functionalities. While some of those enhancements, including performance and reliability improvements, are transparently provided to users migrating to version 5.0 from previous releases, other enhancements require invocation. In the upcoming installments of this series, we will look in more depth at the technology behind some of the value-added enhancements from IBM, including garbage collection policies, shared classes, and debugging features, and see how they can be used.
- IBM SDKs and Runtimes: Visit this discussion forum, moderated by series lead Chris Bailey, for questions related to the IBM Developer Kits for the Java Platform.