One of the primary reasons for Java's success in the Enterprise world is its ability to integrate seamlessly with multiple technologies. The ease of configuring Java to work with multiple databases and other middleware/backend applications often hides the details of the underlying hardware and operating system from the end-user. The software documentation describes some restrictions, and as long as you follow the specified installation guidelines, the applications run as expected.
However, for integration engineers, system architects, performance and sizing engineers, and administrators, this black-box approach can be counterproductive. Not only is it important to understand the "big picture", it is essential to understand why a platform-specific restriction is in place for tuning, debugging and sizing efforts. So if you have ever been told by a service technician to switch from one particular type of configuration to another, and would like to know more about why such a change is needed, this article is written for you. Before you get started, please read Getting more memory in AIX for your Java applications.
Specifically, we look at configuration changes recommended when using Java heaps larger than 1 GiB on an application based on 32-bit Java on AIX. With such large heaps, there is a chance that the Java Virtual Machine (JVM) and other in-process software attempt to use the same address space. We call this "Segment Collision", and as a real-world example we look at various problems that could arise due to misconfiguration when a Java-based application is connecting to a local DB2 server.
It is essential to understand that this is not a DB2-specific discussion. As another example, we discuss how WebSphere MQ installations can also run into the segment collision problem, and how to avoid such a situation. The focus should be on understanding why this problem can occur, and how to avoid it. Once you learn the concepts, they can be easily applied to any other middleware or database product.
The terms "32-bit Java" or "32-bit AIX" are used here to refer to the 32-bit memory model, that is, a memory model where pointers are 32 bits in size. The address space is 2^32 bytes, that is, 4 GiB, and the AIX operating system divides this address space into 16 segments of 256 MiB each. These segments are numbered 0 through F (hex). The advantage of numbering the first segment as "0" is that it becomes trivial to calculate the segment for a particular address. All that is needed is to look at the first digit of the 8-digit hexadecimal address (since 256 MiB in hex is 0x10000000). For example, address 0x65745640 belongs to segment 6.
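The segment lookup described above can be sketched in a few lines of Java. The class and method names here are illustrative only and do not come from AIX or the JVM; the sketch assumes the address is passed as a long so that values above 0x7FFFFFFF stay positive.

```java
// Illustrative helper (hypothetical names): maps a 32-bit address to its
// AIX segment number by looking at the top hexadecimal digit.
final class SegmentMath {
    /** Size of one segment: 256 MiB = 0x10000000 bytes = 2^28. */
    static final long SEGMENT_SIZE = 1L << 28;

    /** Returns the segment (0x0 through 0xF) that contains a 32-bit address. */
    static int segmentOf(long address) {
        // Keep only the low 32 bits, then drop the 28-bit offset within the segment.
        return (int) ((address & 0xFFFFFFFFL) >>> 28);
    }

    public static void main(String[] args) {
        long address = 0x65745640L; // the example address used in the text
        System.out.printf("0x%08X is in segment %X%n", address, segmentOf(address));
    }
}
```

Shifting right by 28 bits is the same as reading the first digit of the 8-digit hexadecimal address, which is why numbering the segments from 0 makes the calculation trivial.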
The Resources section later in this paper mentions several articles and other sources of information about the 32-bit memory model for AIX, and how Java uses it. From Getting more memory in AIX for your Java applications, recall that when you attempt to use more than 1 GiB of heap for Java in 32-bit AIX, Java switches to an allocation mechanism that uses mmap() instead of malloc(). Using svmon, you can tell which segments are being used for Java heap. Using MAXDATA, you can control exactly which segments are used by the JVM. Note, though, that all the segments needed for Java heap must be contiguous. That is, if you are looking for 5 segments for Java heap, you cannot ask for segments 5,6,7, 9 and A to be used; it has to be 5-9 or 6-A.
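As a rough illustration of the contiguity rule, the following sketch computes how many 256 MiB segments a heap of a given size needs and checks whether a proposed set of segments forms one unbroken run. The class and method names are hypothetical; nothing here calls into AIX or the JVM, it only mirrors the arithmetic in the paragraph above.

```java
import java.util.Arrays;

// Hypothetical helper that mirrors the segment arithmetic in the text.
final class HeapSegments {
    static final int SEGMENT_MIB = 256;

    /** Number of 256 MiB segments needed to hold a heap of the given size. */
    static int segmentsNeeded(int heapMiB) {
        return (heapMiB + SEGMENT_MIB - 1) / SEGMENT_MIB; // round up
    }

    /** True if the segment numbers form one unbroken run, such as 5,6,7,8,9. */
    static boolean isContiguous(int... segments) {
        int[] sorted = segments.clone();
        Arrays.sort(sorted);
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[i - 1] + 1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("1280 MiB needs " + segmentsNeeded(1280) + " segments");
        System.out.println("5,6,7,9,A contiguous? " + isContiguous(5, 6, 7, 9, 0xA));
        System.out.println("5..9 contiguous?      " + isContiguous(5, 6, 7, 8, 9));
    }
}
```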
Why would the address space usage of the JVM affect a different application like DB2? The answer is clear and simple: it cannot. The Process Address space is distinct, so whatever is done in the address space of the JVM is totally invisible to any other process address space (except for shared memory, as we will see in a moment). The Segment Collision can occur only between components that are "in-process". It is important to understand what "in-process" means, because it is central to the discussion for the rest of this article.
When the JVM loads a shared object (also known as a library in many operating systems), it is loaded "In-Process" (or "inproc"). The term "inproc" simply means that the shared object in question is mapped inside the address space of the JVM. This is an essential first step to use any of the code/data inside this shared object, and even the JVM itself is composed of a small "launcher" and multiple shared objects or libraries. For our discussion, we use the terms "library" and "shared object" interchangeably.
A library's code is usually mapped to segment D, and its data is mapped to segment F. If you look inside a typical Javacore, you would see something like this:
2XHLIBNAME /usr/WebSphere/AppServer/java/jre/bin/libjitc.a
3XHLIBSIZE filesize : 2735441
3XHLIBSTART text start : 0xD1717000
3XHLIBLDSIZE text size : 0x234EAC
3XHLIBLDORG data start : 0x36E8F708
3XHLIBLDDATASZ data size : 0xF6C4
2XHLIBNAME /usr/lib/libc.a
3XHLIBSIZE filesize : 6609447
3XHLIBSTART text start : 0xD01CA95C
3XHLIBLDSIZE text size : 0x1DA
3XHLIBLDORG data start : 0x36E8E95C
3XHLIBLDDATASZ data size : 0x0
------ Lines omitted -------
2XHLIBNAME /usr/lib/libC.a
3XHLIBSIZE filesize : 4303699
3XHLIBSTART text start : 0xD0A85C40
3XHLIBLDSIZE text size : 0x678D
3XHLIBLDORG data start : 0xF02F6640
3XHLIBLDDATASZ data size : 0xCB8
The "text" or code of the libraries is loaded in segment D (the first digit of the hexadecimal address, for example, 0xD1717000 for libjitc.a). The "data" is allocated either from segment 3 (because it uses malloc(), which uses segment 3 onwards in the large data model) or from segment F (which is dedicated to shared library data), as shown for the last entry in the listing.
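To check a long Javacore quickly, a small parser along the following lines could report the segment for each library's text and data start address. This is a sketch only: the class name is hypothetical, and it assumes the tags and field layout are exactly as in the excerpt above (3XHLIBSTART for text start, 3XHLIBLDORG for data start), which may differ in other JVM levels.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reports which segment a Javacore "text start" or
// "data start" line points into. Layout assumed from the excerpt above.
final class JavacoreSegments {
    // Matches lines such as "3XHLIBSTART text start : 0xD1717000".
    private static final Pattern START = Pattern.compile(
            "3XHLIB(?:START|LDORG)\\s+(text|data) start\\s*:\\s*0x([0-9A-Fa-f]{1,8})");

    /** Returns for example "text in segment D", or null for any other line. */
    static String describe(String line) {
        Matcher m = START.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        long address = Long.parseLong(m.group(2), 16);
        return String.format("%s in segment %X", m.group(1), address >>> 28);
    }

    public static void main(String[] args) {
        System.out.println(describe("3XHLIBSTART text start : 0xD1717000"));
        System.out.println(describe("3XHLIBLDORG data start : 0x36E8F708"));
        System.out.println(describe("3XHLIBLDORG data start : 0xF02F6640"));
    }
}
```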
Most databases and middleware applications provide an inproc variant of their client, wherein a library is loaded in the address space of the JVM. The library attaches to shared memory and uses it for further communication with the server. By the definition of "shared memory", the server must exist on the same machine, and this is usually a highly optimized, extremely efficient way to connect to a local server since communication is reduced to local reads and writes. However, it also means that several components can now contend for the same address space within the JVM process. This is not a problem in languages like C that leave memory (and address space) management to the programmer. But for Java, the address space is opaque as far as the programmer is concerned. When any loaded library attempts to allocate memory or otherwise manipulate the address space, the JVM depends on the underlying operating system to provide the demarcation between JVM-managed and other address space. This is where a "Segment Collision" can occur.
So what would cause a segment collision? There are (at least) two sources, based on whether the collision is unintentional or intentional.
The first source of segment collisions can be easy to fix but very difficult to debug. One or more of the libraries can have a bug, such as reusing a freed pointer or overwriting memory that does not belong to it (for example, writing outside the bounds of an array). This can result in sporadic crashes, dying threads, or worse. It is quite difficult to catch one of these in the act unless the misbehavior causes a SIGILL (illegal instruction, caused by bad code pointers), SIGSEGV (segmentation fault, caused by bad data pointers), or a similar fault condition.
Alternatively, a library can require specific parts of the address space for things like shared memory. So, if your favorite middleware demands that segment 9 be reserved for it, the Java heap in that JVM process cannot exceed 1.5 GiB. If you don't understand how this value was calculated, please refer to Getting more memory in AIX for your Java applications. More common is a misconfiguration, where a library is using segment 9 because no one changed the default configuration. The result of this kind of segment collision is much cleaner, usually an error message or a diagnostic code. Resolution involves a simple search of the appropriate documentation for the error message, and following the documented steps.
Problems due to defects have to be debugged, and the piece of code containing the problem has to be fixed. We do not discuss these further in this article. From here on, we concentrate on issues seen in the field that result from misconfiguration, and attempt to explain the rationale behind the proposed solutions/workarounds.
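The 1.5 GiB figure can be reproduced with a short calculation. The sketch below assumes, consistent with the numbers quoted in this article, that the Java heap starts at segment 3 and must stop before the reserved segment; the class name and the constant for the first heap segment are illustrative assumptions, not values read from AIX.

```java
// Hypothetical helper: largest contiguous heap that fits below a reserved
// segment, assuming the heap starts at segment 3 (an assumption that
// matches the figures quoted in this article).
final class HeapLimit {
    static final int FIRST_HEAP_SEGMENT = 3; // assumed starting segment
    static final int SEGMENT_MIB = 256;

    /** Maximum heap in MiB when the given segment is reserved for a library. */
    static int maxHeapMiB(int reservedSegment) {
        if (reservedSegment < FIRST_HEAP_SEGMENT || reservedSegment > 0xF) {
            throw new IllegalArgumentException("segment out of range: " + reservedSegment);
        }
        return (reservedSegment - FIRST_HEAP_SEGMENT) * SEGMENT_MIB;
    }

    public static void main(String[] args) {
        // Segment 9 reserved: segments 3 through 8 remain, 6 x 256 MiB.
        System.out.println("Segment 9 reserved: " + maxHeapMiB(9) + " MiB");
    }
}
```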
A Real-World Application : DB2
This section is hands-on, and we strongly urge you to try the various commands and techniques described in this article. We use DB2 as the database, and a home-brewed JDBC application in this section to illustrate the concept. A similar situation and workaround for WebSphere MQ is discussed in the next section, and it should be easy to translate the steps to any other software of your choice.
The "ShowDB" application used below is a trivial JDBC application whose pseudo-code is given in Figure 1. Please note that, if at all, this should be treated as an example of how NOT to write a JDBC application. We deliberately ignore pooling and other good practices in order to get the behavior we want to illustrate.
Figure 1. Pseudo-code for "ShowDB" application
Parameters: tableName and threadCount
do the following code threadCount times
    create a new thread that will
        open a JDBC connection to the DB2 server
        query the tableName for its contents, printing the data retrieved
        sleep for a few seconds
        close the JDBC connection
        exit thread
done
The application takes two arguments on the command-line: the name of the table and number of threads to be launched. If you specify the following:
java ShowDB Staff 50
it is going to create 50 threads that will simultaneously connect to the DB2 server, query the "STAFF" table, and print records from it. The information about type of driver, name of the DB2 server, and so on, is read from an input file. Switching from one JDBC driver type (for example, Type 2 to Type 4) to another is a simple matter of switching the input files.
We have taken two design decisions to trigger the behavior we want to see. One, we force each thread to create a new database connection, and keep it open until all records are printed. Two, we add a delay before closing the JDBC connections, so the application keeps the connections alive for several seconds. This gives us enough time to observe the process address space using svmon.
The traces shown below were taken on an AIX 5.1 system, running a DB2 Enterprise Server Edition v8.1 (DB2 Server) installation. The db2sampl program, available with all DB2 installations, was used to generate the sample database for our tests. Java 1.3.1 (32-bit) build ca131-20030630a was used for the JDBC application, and the driver string used was "COM.ibm.db2.jdbc.app.DB2Driver", which translates to a Type 2 or "App" driver (the JDBC driver string is defined in "Application Development Guide : Programming Client Applications", under the "Application Development" section on the DB2 Technical Support site).
Given the above configuration, how many JDBC connections can you establish concurrently? Let's try 5 first. So you try the following command:
java ShowDB Staff 5
But instead of output from all 5 threads, you see three occurrences of the following error message:
SQLException:
SQLState: 08001
Message: [IBM][CLI Driver] SQL1224N A database agent could not be started to service a request, or was terminated as a result of a database system shutdown or a force command. SQLSTATE=55032
Vendor: -1224
So out of 5 threads, three failed to establish a connection with the DB2 server. The other two were able to query the database and print the data correctly, so it cannot be a problem with the JDBC setup, or the database setup. What went wrong?
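An application can recognize this particular failure in code rather than by reading its output. The sketch below assumes that the "Vendor" value printed above is what SQLException.getErrorCode() returns, which is how such messages are commonly printed but is not confirmed by the ShowDB source; the class and method names are hypothetical.

```java
import java.sql.SQLException;

// Hypothetical helper: detects the SQL1224N condition shown above by its
// vendor error code. Assumes "Vendor: -1224" came from getErrorCode().
final class Sql1224Check {
    static final int SQL1224N = -1224;

    /** True if this exception, or any exception chained to it, is SQL1224N. */
    static boolean isAgentUnavailable(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if (cur.getErrorCode() == SQL1224N) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A stand-in exception built locally; no database is contacted.
        SQLException sample = new SQLException("SQL1224N (sample)", "08001", -1224);
        System.out.println("SQL1224N detected: " + isAgentUnavailable(sample));
    }
}
```

A check like this lets the application log a pointer to the shared memory segment limit instead of a generic connection failure.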
We could, of course, look at the description of SQL1224N at this stage, but let's try a little detour. Since we know that the topic in question is about address space usage, we use svmon to get an idea of what the address space of the JVM looks like. The svmon output for the JVM (process ID 68938 in the trace below) on our machine is the following:
$ svmon -P 68938 -m
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd LPage
68938 java 23652 1833 778 22118 N Y N
Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual
15015 d work shared library text - 12346 0 3 14162
0 0 work kernel seg - 4380 1822 775 4380
f2ad 3 work shmat/mmap - 3380 0 0 3380
f1ed f work shared library data - 108 0 0 108
1c59c 2 work process private - 49 2 0 49
10590 1 pers code,/dev/hd2:331803 - 8 0 - -
1c11e b work shmat/mmap - 2 0 0 2
72a5 e work shared text ovfl - 2 0 0 2
15177 c work shmat/mmap - 1 0 0 1
c2ce 7 work shmat/mmap - 0 0 0 0
1f27d 6 work shmat/mmap - 0 0 0 0
c24e 4 work shmat/mmap - 0 0 0 0
1c27e 9 work shmat/mmap - 0 0 0 0
17275 8 work shmat/mmap - 0 0 0 0
d22f 5 work shmat/mmap - 0 0 0 0
c40e a work shmat/mmap - 0 0 0 0
Note that the svmon output is larger than the one shown above, but for brevity, all svmon output will omit lines that have "Esid" set to "-" unless they are essential for the discussion.
Remember from Getting more memory in AIX for your Java applications that since you are using a heap size lower than 1 GiB, the JVM is not using mmap() for its heap allocation. Instead, it is using malloc() and has MAXDATA set to 8, so segments 3 through A will be used for Java heap. Segments B and C, however, have their "Inuse" values set to 2 and 1, respectively. Since the "Inuse" values are counted in 4 KiB pages, this means that 8 KiB of segment B and 4 KiB of segment C are being used from inside the JVM. Segment E is in use as well. But the JVM doesn't use any of these segments, so who could it be? Could it be the DB2 App driver?
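The conversion from "Inuse" pages to bytes can be automated when reading many traces. The following sketch parses one segment row in the layout shown above and converts the "Inuse" column to KiB; it assumes the last four columns are always Inuse, Pin, Pgsp, and Virtual, as in these traces, and the class name is hypothetical.

```java
// Hypothetical parser for one svmon segment row, in the layout shown above.
final class SvmonRow {
    static final int PAGE_KIB = 4; // Inuse is counted in 4 KiB pages

    final String esid;
    final int inusePages;

    SvmonRow(String esid, int inusePages) {
        this.esid = esid;
        this.inusePages = inusePages;
    }

    /** Parses a row such as "1c11e b work shmat/mmap - 2 0 0 2". */
    static SvmonRow parse(String line) {
        String[] t = line.trim().split("\\s+");
        if (t.length < 8) {
            throw new IllegalArgumentException("not a segment row: " + line);
        }
        // The description may span several words, so count columns from the end:
        // ... LPage Inuse Pin Pgsp Virtual
        return new SvmonRow(t[1], Integer.parseInt(t[t.length - 4]));
    }

    int inuseKiB() {
        return inusePages * PAGE_KIB;
    }

    public static void main(String[] args) {
        SvmonRow row = parse("1c11e b work shmat/mmap - 2 0 0 2");
        System.out.println("segment " + row.esid + " uses " + row.inuseKiB() + " KiB");
    }
}
```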
To check, try running with just one connection to the database, and you see the following in svmon:
$ svmon -P 68858 -m
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd LPage
68858 java 23214 1829 778 21680 N Y N
Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual
15015 d work shared library text - 12346 0 3 14162
0 0 work kernel seg - 4380 1822 775 4380
71c5 3 work shmat/mmap - 2960 0 0 2960
1c59c f work shared library data - 108 0 0 108
1c27e 2 work process private - 49 2 0 49
10590 1 pers code,/dev/hd2:331803 - 8 0 - -
1c11e b work shmat/mmap - 2 0 0 2
15177 c work shmat/mmap - 1 0 0 1
172d5 9 work shmat/mmap - 0 0 0 0
1f27d 4 work shmat/mmap - 0 0 0 0
d22f 8 work shmat/mmap - 0 0 0 0
c40e 6 work shmat/mmap - 0 0 0 0
c24e a work shmat/mmap - 0 0 0 0
7245 7 work shmat/mmap - 0 0 0 0
17275 5 work shmat/mmap - 0 0 0 0
With this run, we still see segments B and C in use, but segment E is no longer used. So with one connection fewer, one segment fewer is in use. If we make more segments available for these connections, can we have more connections? Our tiny sample application is not even using one segment of heap, so keeping 8 segments reserved is too much. So, using what you learned from Getting more memory in AIX for your Java applications, you change the JVM's memory model to the default by setting MAXDATA to 0, and try again:
LDR_CNTRL=MAXDATA=0x00000000 java ShowDB Staff
And it works. In fact, if you look at the svmon output, you will notice that you still are not using segments 8 through b and e:
$ svmon -P 44424 -m
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd LPage
44424 java 23654 1807 778 22119 N Y N
Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual
15015 d work shared library text - 12347 0 3 14162
0 0 work kernel seg - 4380 1796 775 4380
11f50 2 work process private - 3424 2 0 3424
1817a f work shared library data - 108 0 0 108
10590 1 pers code,/dev/hd2:331803 - 8 0 - -
51e7 7 work shmat/mmap - 2 0 0 2
151d7 4 work shmat/mmap - 2 0 0 2
21e0 6 work shmat/mmap - 2 0 0 2
1a218 5 work shmat/mmap - 2 0 0 2
15197 3 work shmat/mmap - 2 0 0 2
e50e c work shmat/mmap - 1 0 0 1
So you can actually go all the way up to 10 simultaneous connections. Try it, and you will see that with MAXDATA=0, 10 connections do work.
From the svmon traces, it seems that each JDBC connection uses one segment, and the Inuse value indicates that only 8 KiB out of 256 MiB is being used. One extra segment that has Inuse set to 1 is actually for DB2 tracing; it is present regardless of the number of JDBC connections, so it can be ignored.
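The observations so far fit a simple model, sketched below. It is inferred only from the traces in this article, not from AIX or DB2 documentation: segments 3 through C plus segment E (11 in all) appear to be attachable, MAXDATA reserves some of them for the heap, and one is taken by DB2 tracing. The class name and constants are assumptions made for illustration.

```java
// Hypothetical model inferred from the svmon traces in this article.
final class ConnectionEstimate {
    /** Segments 3..C plus E, as seen in the traces above (assumed). */
    static final int ATTACHABLE_SEGMENTS = 11;
    /** One segment is used by DB2 tracing regardless of connection count. */
    static final int TRACING_SEGMENTS = 1;

    /** Estimated local (shared memory) connections for a MAXDATA segment count. */
    static int maxLocalConnections(int maxdataSegments) {
        int free = ATTACHABLE_SEGMENTS - maxdataSegments - TRACING_SEGMENTS;
        return Math.max(free, 0);
    }

    public static void main(String[] args) {
        System.out.println("MAXDATA of 8 segments: " + maxLocalConnections(8) + " connections");
        System.out.println("MAXDATA of 0 segments: " + maxLocalConnections(0) + " connections");
    }
}
```

The model reproduces both results seen above: 2 connections with 8 segments reserved, and 10 connections with MAXDATA set to 0.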
So our little experiment has allowed us to go from 2 connections to 10 connections. While it is an improvement, there are two problems that make this an unacceptable solution. First, this solution means we are constrained to a very small Java heap, and Getting more memory in AIX for your Java applications gives reasons why using MAXDATA=0 is a bad idea. Second, a limit of just 10 simultaneous connections still sounds too small. So let's do now what we should have done in the first place: look up the information about the reported error.
If you recall, the failure message indicated an error code when a connection to the DB2 server could not be established. The error code was SQL1224N, so if you look in the Message Reference found on the DB2 Technical Support site, there are several potential reasons mentioned for this particular error. The one that looks most promising is:
The application is using multiple contexts with local protocol. In this case the number of connections is limited by the number of shared memory segments to which a single process can be attached. For example, on AIX, the limit is ten shared memory segments per process.
This is exactly what we observed in our experiments with MAXDATA. Reading further, it says:
If the application is using multiple contexts with local protocol, then either reduce the number of connections in the application, or switch to another protocol (for example, TCP/IP). For users on AIX version 4.2.1 or newer, the environment variable EXTSHM can be set to ON to increase the number of shared memory segments to which a single process can be attached.
Barring the obvious answer of reducing the number of connections, there are two options available to us here: use a different protocol or use EXTSHM. Let's explore both solutions.
The easiest way out is to use a Type 4 driver instead. With a Type 4 driver, you are not using shared memory for any local connection, so the number of connections is no longer limited by the available segments. To prove that no shared segments are in use, you can allocate a 2560 MiB fixed-size heap. This means Java is going to reserve all 10 segments for its use, leaving no segments for the JDBC driver to use. This would be impossible with a Type 2 driver, but it works fine with a Type 4 driver.
Type 4 drivers are available only from DB2 v8 onwards, so what happens if you are using DB2 v7? You can still configure the Type 2 driver to use the TCP/IP network protocol by cataloging the DB2 database against the local loopback Internet Protocol (IP) address. For more details, refer to "CATALOG TCPIP NODE …" in the Command Reference on the DB2 Technical Support site.
Alternatively, you can continue using shared memory segments and still have multiple connections. The trick is to use EXTSHM=ON.
The usage of EXTSHM is described in the AIX Documentation. One of the articles describing EXTSHM is at http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c78.htm and can be used as a starting point. If you recall, the InUse value for each of the segments being used by DB2 was just 2, indicating only 8 KiB of address space was needed. But each connection still occupied the full segment of 256 MiB. We need a way for multiple connections to be allocated from a single segment, and EXTSHM serves exactly that purpose.
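On the client side, one way to set EXTSHM for a single JVM without changing the login environment is to launch that JVM from a small wrapper. The sketch below uses the standard ProcessBuilder API; the wrapper class is hypothetical, it covers only the client process, and the server-side setting described in the DB2 documentation is still needed.

```java
// Hypothetical launcher: starts a child process with EXTSHM=ON in its
// environment. Only the client process is affected by this wrapper.
final class ExtshmLauncher {
    /** Prepares (but does not start) the given command with EXTSHM=ON. */
    static ProcessBuilder withExtshm(String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("EXTSHM", "ON");
        pb.inheritIO(); // let the child share this process's console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ExtshmLauncher <command> [args...]");
            return;
        }
        Process child = withExtshm(args).start();
        System.exit(child.waitFor());
    }
}
```

For example, `java ExtshmLauncher java ShowDB Staff 10` would run the sample application with EXTSHM set for that one JVM only.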
To see the effect of this change, we need to set the environment variable for both client and server. For information on how to do this, see the Application Development Guide : Programming Client Applications under the "Application Development" section on the DB2 Technical Support site. Once you make the change in the database, you can try this in the application to see if it works or not:
$ svmon -P 46014 -m
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd LPage
46014 java 24227 1812 778 22662 N Y N
Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual
15015 d work shared library text - 12438 0 3 14225
0 0 work kernel seg - 4380 1796 775 4380
21e0 3 work shmat/mmap - 3856 0 0 3856
1d4fd f work shared library data - 108 0 0 108
17517 2 work process private - 48 2 0 48
10590 1 pers code,/dev/hd2:331803 - 8 0 - -
18518 - work - 1 0 0 1
b62b 9 work shmat/mmap - 0 0 0 0
184d8 5 work shmat/mmap - 0 0 0 0
1a1f8 c mmap mapped to sid 18518 - 0 0 - -
828a 7 work shmat/mmap - 0 0 0 0
6506 8 work shmat/mmap - 0 0 0 0
1b2b9 4 work shmat/mmap - 0 0 0 0
22c0 6 work shmat/mmap - 0 0 0 0
1f27d a work shmat/mmap - 0 0 0 0
The above is for a run where 10 simultaneous connections were successfully established against the DB2 server. Since segment c is mapped to sid 18518, we have shown sid 18518 in the above trace. But the relevant information from svmon above is that the number of segments being used on the client side has come down drastically.
EXTSHM is not the recommended workaround, since some applications may break if EXTSHM is being used, and you still have not eliminated the usage of shared memory in the application. We show it here only for the sake of completeness. Also, keep in mind that the above svmon output is showing us implementation-specific information that may be completely modified in the future. The documentation accompanying DB2 is the definitive guide, and the above should only be used to understand the concept (as well as to learn a technique for finding out how, or if, an application is using shared memory).
By now it should be clear that the segment allocation problem occurs due to the segmented architecture of AIX and the way the JVM allocates heaps larger than 1 GiB. As another example, let us examine an application using MQSeries that can run into this issue.
When an application establishes a bindings (non-client) connection to a queue manager, MQSeries v5.2 (and before) uses segment 8 to attach shared memory and complete the connection. If segment 8 is unavailable, the connection will fail with reason code MQRC_Q_MGR_NOT_AVAILABLE (2059), and the application will generate an FDC file in the /var/mqm/errors directory showing a Probe Id of XY341019 and a Component of RetryConnectToSharedSubpool.
The insistence on segment 8 meant that Java could not go beyond 5 segments, or 1280 MiB, of heap. To accommodate situations where larger heaps were needed by Java applications, MQSeries 5.2 introduced a new attribute to the QueueManager: stanza of the /var/mqm/mqs.ini file. IPCCBaseAddress names an alternate segment for MQSeries shared memory (either 4, 5, 8, 9, 10, 11 or 12). Different queue managers can use different values:
QueueManager:
   Name=HOBBES
   Prefix=/var/mqm
   Directory=HOBBES
   IPCCBaseAddress=11
This would still require a calculation of Java heap size and MAXDATA values. An easier alternative is to use client transport instead. By the way, MQ 5.3 removes the dependency on shared memory altogether, by using EXTSHM. See the MQ Series manuals at http://www-3.ibm.com/software/integration/mqfamily/library/manualsa/ for more details.
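The calculation mentioned above can be sketched as follows. The code reads IPCCBaseAddress from stanza text and checks whether that segment falls inside the range a heap of a given size would occupy. Two assumptions are made and are not confirmed by the MQSeries manuals quoted here: the value is a decimal segment number (so 11 means segment B), and the Java heap starts at segment 3. All names are hypothetical.

```java
// Hypothetical check: does the MQSeries shared memory segment fall inside
// the Java heap? Assumes decimal segment numbers and a heap starting at
// segment 3, consistent with the figures used in this article.
final class IpccCheck {
    static final int FIRST_HEAP_SEGMENT = 3; // assumed starting segment
    static final int SEGMENT_MIB = 256;

    /** Extracts the IPCCBaseAddress value from stanza text, or -1 if absent. */
    static int ipccSegment(String stanza) {
        for (String line : stanza.split("\\R")) {
            String[] kv = line.trim().split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals("IPCCBaseAddress")) {
                return Integer.parseInt(kv[1].trim());
            }
        }
        return -1;
    }

    /** True if the segment lies within the contiguous heap range. */
    static boolean collides(int heapMiB, int segment) {
        int heapSegments = (heapMiB + SEGMENT_MIB - 1) / SEGMENT_MIB;
        int lastHeapSegment = FIRST_HEAP_SEGMENT + heapSegments - 1;
        return segment >= FIRST_HEAP_SEGMENT && segment <= lastHeapSegment;
    }

    public static void main(String[] args) {
        String stanza = "QueueManager:\n   Name=HOBBES\n   IPCCBaseAddress=11\n";
        int segment = ipccSegment(stanza);
        System.out.println("2048 MiB heap collides with segment " + segment
                + "? " + collides(2048, segment));
    }
}
```

Under these assumptions, the default of segment 8 allows a heap of up to 1280 MiB, which matches the limit stated above.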
When going beyond 1 GiB of Java heap on AIX, you are forced to take the placement of heap inside the JVM address space into your own hands. In many modern application environments, Java is only one of the various middleware technologies in use. We showed in this article how two well-known applications may need simple reconfiguration steps to ensure that they do not conflict with the modified JVM address space. We hope that the information in this article will help you understand the hows and whys of address space tweaking.
- The DB2 Technical Support site at http://www-3.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/v8pubs.d2w/en_main contains various guides relevant to this article.
- The SQL message definition quoted in this article can be found on the DB2 Technical Support site, in the "Administration" section, SQL Reference Vol 2 manual.
- The commands are defined in the Command Reference manual on the DB2 Technical Support site, under the "Core DB2 information" section.
- The description of EXTSHM can be found in the AIX Documentation; one of the articles describing EXTSHM is at http://publibn.boulder.ibm.com/doc_link/en_US/a_doc_lib/aixbman/prftungd/2365c78.htm and can be used as a starting point.
- The JDBC driver string is defined in Application Development Guide : Programming Client Applications under the "Application Development" section on the DB2 Technical Support site.
- MQ Series manuals can be found at http://www-3.ibm.com/software/integration/mqfamily/library/manualsa/.

Punit is an Advisory Software Engineer in the eServer Solution Enablement/Database Engineering group. His primary responsibilities are enabling database servers to use the latest AIX technologies. He has been working in database server development and performance for the last 8 years. Punit designed and developed the sample DR script for DB2 UDB v8 for AIX 5L v5.2 that is shipped in FP2 of DB2 UDB v8.1. Punit holds Bachelor's and Master's degrees in Computer Science. You can contact him at punit@us.ibm.com.
Sumit Chawla works for IBM's eServer division, providing Java support to ISVs. You can contact him at sumitc@us.ibm.com.