OLTP (Online Transaction Processing)
OLTP is the computing activity of serving requests from interactive users who are waiting for an answer within an acceptable time, whereas batch processing refers to programs that run in the background and try to finish a given amount of work within a designated time window. The key performance metric for an OLTP system is the response time experienced by the end user, while for batch systems the reference metric is throughput: the number of tasks completed per unit of time. See H. H. Liu: Software Performance and Scalability: A Quantitative Approach – J. Wiley & Sons, for a more detailed discussion.
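To make the two metrics concrete, here is a minimal Java sketch that computes the mean and standard deviation of a set of response times, and the throughput of a batch window. The class name, method names, and sample values are hypothetical and chosen only for illustration; they are not taken from any measurement.

```java
import java.util.Arrays;

public class ResponseTimeStats {

    // Mean of the samples, in the same unit as the input (for example, milliseconds).
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(0.0);
    }

    // Population standard deviation: a lower value means more stable response times.
    static double stdDev(double[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double m = mean(samples);
        double sumOfSquares = 0.0;
        for (double s : samples) {
            sumOfSquares += (s - m) * (s - m);
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    // Throughput: number of tasks completed per second over the observation window.
    static double throughput(long completedTasks, double windowSeconds) {
        return completedTasks / windowSeconds;
    }

    public static void main(String[] args) {
        // Made-up response times in milliseconds (illustrative only).
        double[] responseMs = {120, 135, 110, 480, 125};
        System.out.printf("mean response time: %.1f ms%n", mean(responseMs));
        System.out.printf("standard deviation: %.1f ms%n", stdDev(responseMs));

        // Made-up batch run: 600 tasks completed in a 60-second window.
        System.out.printf("throughput: %.1f tasks/s%n", throughput(600, 60.0));
    }
}
```

A single long pause (the 480 ms sample here) raises the mean a little but the standard deviation a lot, which is why both figures are worth tracking for an interactive workload.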
One of the benefits of the Java platform is that it manages memory for you through garbage collection (GC). Unfortunately, a poorly tuned GC can result in suboptimal performance and scalability. The first step in GC tuning is choosing the GC policy according to the characteristics of the workload. Today's JVMs support multiple GC algorithms that you can control through command-line options. The IBM JVM supports four different policies through the -Xgcpolicy: option:
- optthruput - It is the default policy and is typically used for applications where raw throughput is more important than short GC pauses. The application is stopped each time that garbage is collected.
- optavgpause - Trades some throughput for shorter GC pauses by performing some of the garbage collection concurrently. The application is paused for shorter periods.
- gencon - Handles short-lived objects differently from long-lived objects. Applications that have many short-lived objects can see shorter pause times with this policy while still producing good throughput.
- subpool - Uses an algorithm similar to the default policy's but employs an allocation strategy that is more suitable for multiprocessor machines. This policy is recommended for SMP machines with 16 or more processors and is only available on IBM pSeries® and zSeries® platforms.
Ideally, optavgpause should be the best choice for pure OLTP systems, while optthruput is better suited to batch processing (see the articles Java technology, IBM style: Garbage collection policies, Part 1 and Part 2, on developerWorks for a deeper analysis of garbage collection policies).
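When you compare policies, it helps to confirm what the running JVM actually reports. The following minimal sketch uses the standard java.lang.management API to list the garbage collectors that are active, with their collection counts and accumulated collection times. The class name GcReport is hypothetical, and the collector names printed depend on the JVM implementation and the policy selected. On an IBM JVM you could launch it with, for example, java -Xgcpolicy:gencon GcReport; other JVMs may not accept this option.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {

    // Builds one line per collector exposed by the running JVM.
    // getCollectionCount() and getCollectionTime() return -1 when the value is undefined.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: collections=%d, time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print the report once; in a real test you would sample it before and after a run.
        describeCollectors().forEach(System.out::println);
    }
}
```

Sampling these counters before and after a test scenario gives a rough view of how much time was spent in garbage collection under each policy, which complements the response-time measurements.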
TPAe-based systems are actually a mix of the two workload profiles, and their behavior is characterized by a large number of transient objects. Based on these considerations, the Tivoli performance team decided to test the gencon policy, which can provide a good trade-off between short pause times and good overall throughput. The results of all the experiments we ran were very encouraging. As the following graph for one of those scenarios shows, the response time improved and became more stable, as indicated by a reduced standard deviation: