IBM Support

Garbage Collection and Connection Leak

Technical Blog Post


Body

The other day, we ran into a problem during one of our performance tests. From the database side, we observed a steady increase in connections over time. Naturally, we immediately suspected a newly introduced connection leak and started the hunt.
 
The very first thing we did was set the dbconnection logger to info level. The Maximo logs confirmed that these connections were recorded in the Maximo connection pool as used connections and had not been released. The stack traces did confirm that there were long-running connections, but the connections were obtained in code paths where we knew they would be closed properly, either by downstream code or by garbage collection. In other words, they were not real "connection leaks", but long-running connections that, for whatever reason, did not go away fast enough.
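
The idea behind this kind of logging can be sketched as follows. This is only an illustration of the technique (remember where and when each connection was handed out, then report the ones held too long), not Maximo's dbconnection logger. `ConnectionLeaseTracker` and every method on it are hypothetical names invented for this sketch.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker; NOT Maximo's actual dbconnection logger. */
final class ConnectionLeaseTracker {

    /** When a connection was handed out, and the call stack that asked for it. */
    static final class Lease {
        final Instant acquiredAt;
        final Throwable acquiredFrom;

        Lease(Instant acquiredAt, Throwable acquiredFrom) {
            this.acquiredAt = acquiredAt;
            this.acquiredFrom = acquiredFrom;
        }
    }

    // Keys are compared by identity, so two distinct connections never collide.
    private final Map<Object, Lease> inUse =
            Collections.synchronizedMap(new IdentityHashMap<>());

    /** Call when a connection is handed out; the Throwable captures the caller's stack. */
    void onAcquire(Object connection, Instant now) {
        inUse.put(connection, new Lease(now, new Throwable("connection acquired here")));
    }

    /** Call when the connection is returned to the pool. */
    void onRelease(Object connection) {
        inUse.remove(connection);
    }

    /** Leases that have been held for longer than the given threshold. */
    List<Lease> heldLongerThan(Duration threshold, Instant now) {
        List<Lease> result = new ArrayList<>();
        synchronized (inUse) { // required when iterating a synchronizedMap
            for (Lease lease : inUse.values()) {
                if (Duration.between(lease.acquiredAt, now).compareTo(threshold) > 0) {
                    result.add(lease);
                }
            }
        }
        return result;
    }
}
```

Printing `acquiredFrom` for each lease returned by `heldLongerThan` gives the code path that obtained the connection, which is the kind of evidence we used to tell a true leak from a connection that is merely slow to be released.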
 
Further analysis of the heap dump confirmed that these connections were held by Maximo connection objects that were waiting for garbage collection. Maximo was done with them. But why wasn't garbage collection doing its job?
 
We got the verbose GC log, and the graph looked ugly!
[image: verbose GC graph before the fix]

Only nursery collections were happening, and each one reclaimed only a small amount of memory. We enabled explicit GC and combined it with -Dsun.rmi.dgc.server.gcInterval=60000 to trigger forced global collections (the interval is in milliseconds, so roughly once a minute). Sure enough, a rerun of the test showed that the connection buildup no longer happened.
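
One way to check from inside the JVM whether anything other than nursery collections is running is to read the collection counters that the standard `java.lang.management` API exposes. The sketch below is a minimal illustration and was not part of our original investigation; `GcCounts` is a name made up for this example, and the collector names it prints depend on the JVM and GC policy in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for this post; reads the JVM's own GC counters. */
final class GcCounts {

    /** Collector name -> collections so far (-1 if the JVM does not report a count). */
    static Map<String, Long> snapshot() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("before: " + snapshot());
        System.gc(); // only a request; it has no effect when explicit GC is disabled
        System.out.println("after:  " + snapshot());
    }
}
```

If the counter for the global (old generation) collector stays flat over a long test while the nursery counter keeps climbing, the JVM is showing the same pattern we saw in the verbose GC graph.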
 
Maybe this was because the particular test used little memory and the GC policy was gencon, so the JVM thought it had enough memory and never bothered to run a global collection to reclaim the objects that the connections were linked to? But why hadn't we seen such a GC graph in earlier tests? Something must have changed that caused the objects to stay in memory longer, so that they were tenured out of the nursery, where only a global collection can reclaim them. This isn't good.

We know that in a perfect world, an MboSet should always be closed in code as soon as processing is done. But because an MboSet is normally shared with downstream logic, there are cases where the programmer does not know whether the MboSet can be closed immediately. In those cases, if the MboSet has not fetched everything from the underlying result set, the connection remains open until the garbage collector's finalizer gets to work. So garbage collection has to be effective; otherwise we will see a larger connection footprint.
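
The difference between closing a set explicitly and leaving it for the garbage collector can be sketched with plain Java. Nothing below is Maximo API: `FakePool`, `PooledCursor`, and `CloseDemo` are hypothetical stand-ins, with `PooledCursor` playing the role of a set that holds a pooled connection until it is closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a connection pool; counts connections checked out. */
final class FakePool {
    final AtomicInteger inUse = new AtomicInteger();

    PooledCursor open() {
        inUse.incrementAndGet();
        return new PooledCursor(this);
    }
}

/** Hypothetical stand-in for a set that holds a connection until closed. */
final class PooledCursor implements AutoCloseable {
    private final FakePool pool;
    private boolean closed;

    PooledCursor(FakePool pool) {
        this.pool = pool;
    }

    @Override
    public void close() {
        if (!closed) { // idempotent, so a second close by downstream code is harmless
            closed = true;
            pool.inUse.decrementAndGet();
        }
    }
}

final class CloseDemo {
    /** Releases the connection deterministically, even if processing throws. */
    static int processAndClose(FakePool pool) {
        try (PooledCursor cursor = pool.open()) {
            return pool.inUse.get(); // the connection is held while we process
        }
    }

    /**
     * Drops the reference without closing. This sketch has no finalizer, so the
     * count simply stays up; in the situation described above it would stay up
     * until a garbage collection finally ran the cleanup.
     */
    static void processAndForget(FakePool pool) {
        PooledCursor cursor = pool.open();
        // ... partial read of the data, then return without calling close() ...
    }
}
```

After `processAndClose` the pool's `inUse` count is back where it started, while after `processAndForget` it is one higher and nothing in the application code will bring it down again. Making `close()` idempotent is one way to soften the "who closes it?" problem when a set is shared with downstream logic.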
 
The memory problem manifested as a connection problem, which in this case actually helped us in the end. With the dbconnection logs, we were able to identify the code change that caused the objects to stay in memory longer. With that fixed, we saw a much friendlier GC graph:
[image: verbose GC graph after the fix]

Moral of the story: a connection footprint problem can be closely related to a memory problem. Relying on garbage collection to close DB connections is risky. Code that fetches an MboSet should be very careful about leaving the set unclosed.
 
 

UID

ibm11133919