Skip to main content

Sensible Sanitation -- Understanding the IBM Java Garbage Collector, Part 1: Object allocation

Sam Borman (sambo@uk.ibm.com), Software Engineer, IBM, Software Group
Sam Borman joined IBM in 1984 after working at seven other companies as a programmer and systems programmer in the UK, New Zealand, and France. He worked as a developer, and later as a development manager, in CICS. In 1990 he returned to a technical career working on CICSPlex/SM and later on DirectTalk. In 1999 he joined the Java Technology Centre where he took ownership of Garbage Collection. You can contact Sam Borman at sambo@uk.ibm.com.

Summary:  Find out how objects are allocated in the Java heap for garbage collection. This article describes the layout of an object and examines some of the data areas, such as the heap and the free list. The author also discusses direct allocation from the heap and thread local allocation, and gives some recommendations for controlling the heap size. This article is the first of three in a series on the IBM Java Garbage Collector (GC), a storage manager for the IBM Java development kits and runtime environments. The series covers storage areas used by GC; object allocation; garbage collection; how external interfaces work; and verbosegc and other command-line parameters. Information in this article reflects the Java 1.2.2 through 1.3.1 versions.

Date:  10 Aug 2002
Level:  Intermediate
Activity:  6106 views

Overview

Calls from other sub-components drives object allocation to one of the allocation interfaces, such as stCacheAlloc, stAllocObject, stAllocArray, or stAllocClass. The allocation interfaces allocate a given amount of storage from the heap, but have different parameters and semantics. The stCacheAlloc routine is specifically designed to deliver optimal allocation performance for small objects. Objects are allocated directly from a thread local allocation buffer that the thread has previously allocated from the heap. A new object is allocated from the end of this cache without the need to grab the heap lock, so it is very efficient. Objects allocated through the stAllocObject and stAllocArray interfaces will, if small enough (currently 512 bytes), also be allocated from the cache.

Data Areas

This section discusses objects and the heap, or storage, aspect of GC.

An object

Figure 1 shows the layout of an object on the heap. The parts of the object are explained below the figure.


Figure 1. Object layout
An Object
size + flags
The size + flags slot is 4 bytes on 32 bit architecture and 8 bytes on 64 bit architecture. The main purpose of this slot is to contain the length of the object. As all objects start on an 8 byte boundary, and the size is divisible by 8, the bottom 3 bits are not used for the size; we use them for flags to indicate different states of the object. As the size of objects is limited, the top 2 bits can be used for flags. Note that the mptr, not this slot, is grained on an 8 byte boundary. The flags are as follows:
  • Bit 1 is the swapped bit that is only used during compaction. Bit 1 is also the NotYetScanned bit that is only used during mark stack overflow. In addition, Bit 1 is also the multi-pinned bit used to indicate that this object has been pinned multiple times. During a GC, the multi-pinned bit will be removed and restored to allow other uses of this overloaded bit.
  • Bit 2 is the dosed bit. The dosed bit is set on if the object is referenced from the stack or registers. The object cannot be moved in this GC cycle; we cannot fix up the reference as it may not be a real reference but simply an integer that happens to have the same value as an object on the heap.
  • Bit 3 is the pinned bit. Pinned objects usually cannot be moved because they are referenced from outside the heap. Examples include Thread and ClassClass objects.
  • Bit 31 in 32-bit architecture, or bit 63 in 64-bit architecture, is the flat locked contention (flc) bit used by the locking (LK) component.
  • Bit 32 in 32-bit architecture, or bit 64 in 64-bit architecture, is the hashed bit used to denote an object that has returned its hashed value. This is required, because the hash value is the address of the object and you need to maintain this if you move the object.
mptr
The mptr slot is 4 bytes on 32-bit architecture and 8 bytes on 64-bit architecture. The mptr slot, not the size + flags, is grained on an 8 byte boundary. The mptr has one of two functions:
  • If the mptr slot is not an array, the mptr points to the method block, from which you can get to the class block. The class block tells you which class an object is an instantiation of. The class loader allocates the method block and class block, neither of which are in the heap.
  • If the mptr slot is an array, the mptr contains the number of array entries in this object.
locknflags
This slot is 4 bytes on 32-bit architecture and 8 bytes on 64-bit architecture, although only the lower 4 bytes are used. Used mainly to contain data for the LK component when locking, the locknflags also has some flags. The following flags are of interest:
  • Bit 2 is the array flag. If this bit is on, the object is an array and the mptr field will contain a count of the number of elements in the array.
  • Bit 4 is the hashed and moved bit. If this bit is on, it tells us that this object has been moved after it was hashed, and the hash value is in the last slot of the object.
Object data
This is where the object data starts, and the layout is object-dependent.

The size + flags, mptr, and locknflags are sometimes collectively known as the header.

The heap

The heap is a contiguous piece of storage obtained from the operating system at JVM initialization. Figure 2 shows the layout of the heap.


Figure 2. The heap
The Heap

Heapbase is the address of the start of the heap, and heaptop is the address of the end of the heap. Heaplimit is the address of the top of the currently used part of the heap and can expand and shrink. The size from heapbase to heaptop is controlled by the -Xmx option. If this option is not specified, the default will apply as follows:

Java Release 1.2.2 and 1.3.0Release 1.3.1
For Windows platforms, half the real storage with a minimum of 16M and a maximum of 2G-1For Windows platforms, half the real storage with a minimum of 16M and a maximum of 2G-1
For all other platforms, 64MFor OS/390 and AIX platforms, 64M
For Linux platforms, half the real storage with a minimum of 16M and a maximum of 512M-1

The initial size from heapbase to heaplimit is controlled by the -Xms option. If this option is not specified, the default will apply as follows:

Java Release 1.2.2 and 1.3.0Release 1.3.1
For Windows platforms, 4MFor Windows, AIX, and Linux platforms, 4M
For all other platforms, 1MFor OS/390, 1M

Setting the heap size

For the majority of applications, the default settings will work well. The heap expands until it reaches a steady state. The heap then remains in this state, which should give a heap occupancy (or the amount of live data on the heap at any given time) of 70%. At this level, the frequency of GCs and the pause time of GCs should be at an acceptable level.

For some applications, the default settings might not give the best results. Below are some problems that may occur, and some suggested actions to take. You can use verbosegc to help monitor the heap.

ProblemSuggested action
Frequency of GCs is too high until the heap reaches a steady stateSolution: use verbosegc to determine the size of the heap at a steady state and set -Xms to this value.
The heap is fully expanded and the occupancy level is greater than 70%Solution: Increase the -Xmx value so the heap is not more than 70% occupied. For best performance, try to make sure that the heap never pages. The maximum heap size should, if possible, fit in physical memory.
At 70% occupancy the frequency of GCs is too great.Solution: Change the setting of -Xminf. The default is 0.3, which will try to maintain 30% free space by expanding the heap. A setting of 0.4 increases this free space target to 40%, reducing the frequency of GCs.
Pause times are too long.Solution: Try using -Xgcpolicy:optavgpause (introduced in 1.3.1), which reduces pause times and makes them more consistent as the heap occupancy rises. There is a cost to pay in throughput. This cost varies and will be about 5%.

The following tips work well in practice:

  • Make sure the heap never pages; the maximum heap size must fit in physical memory.
  • Avoid finalizers -- you can never be guaranteed when a finalizer will run, and often they lead to problems. If you do use finalizers, try to avoid allocating objects in the finalizer method. A verbosegc trace shows if finalizers are being called.
  • Avoid compaction -- a verbosegc trace will show if compaction is occurring. Compaction is usually caused by requests for large memory allocations. Analyze requests for large memory allocations and avoid them if possible; if they are large arrays, try to split them into smaller pieces.

Alloc bits and mark bits

Figure 3 shows the heap in relation to the allocbits and markbits, which are two-bit vectors that indicate the state of objects on the heap. As all objects on the heap start on an 8-byte boundary, both vectors have 1 bit to represent 8 bytes of the heap, so each of these vectors is one 64th of the heap.


Figure 3. The heap with allocbits and markbits
The Heap with allocbits and markbits

As objects are allocated in the heap, a bit is set on in allocbits to indicate the start of the object. This process tells us where allocated objects are, but it doesn't tell us whether the object is alive or not. During the mark phase, a bit is set on in markbits to indicate the start of a live object. Figure 4 shows two objects on the heap with allocbits on for both objects. During the mark phase, Object 2 was found to be referenced, but Object 1 was not; so we have a markbit on for Object 2. Object 1 will be collected during the sweep phase.


Figure 4. Some objects in the heap
Some objects in the heap

The system heap

The system heap was introduced in Release 1.3.0, and contains only objects whose life-expectancy is the life of the JVM. The objects in this heap are the class objects for system and shareable middleware and application classes. The system heap is never garbage-collected because all objects in it will either be reachable during the lifetime of the JVM, or (in the case of shareable application classes) have been selected to be reused during the lifetime of the JVM. Figure 5 shows the system heap layout.


Figure 5. The system heap
The system heap

The system heap is a chain of noncontiguous pieces of storage. The initial size of the system heap is 128k in 32-bit architecture, and 8M in 64-bit architecture. If this fills, then the system heap will obtain another extent and chain the extents together.

The free list

Figure 6 shows the free list chain. The head of the list is in global storage and points to the first free chunk on the heap. Each chunk of free storage has a size field and a pointer that points to the next free chunk. The free chunks are in address order. The last free chunk has a NULL pointer.


Figure 6. The free list
The free list

Allocation

Two types of allocation, heap lock and cache, are discussed in this section.

Heap lock allocation

Heap lock allocation occurs when the allocation request is greater than 512 bytes, or when the allocation will not fit into the existing cache. As the name implies, heap lock allocation requires a lock and should be avoided, if possible, by using the cache. Below is some pseudo code for heap lock allocation in Release 1.3.1.


If size greater than 512 or enough space in cache
	try cacheAlloc
	return if OK
HEAP_LOCK
Do forever
	If there is a big enough chunk on freelist
		Take it
		goto Gotit
	else
		manageAllocFailure
		If any error
			goto GetOut
End do

Gotit:
Initialise object
GetOut:
HEAP_UNLOCK

First check the size of the allocation request, and if it is less than 512 or will fit in the existing cache, try to allocate using cache allocation. If you do not use cache allocation, or if the application failed to find free space, then you will get the HEAP_LOCK. The application now searches the freelist for a chunk of free storage big enough to satisfy the allocation request. If it finds free storage, it takes it and initializes the object, returning any remaining free storage to the freelist. If the remaining free storage is less than 512 bytes plus the header size (12 on 32-bit architecture and 24 on 64-bit architecture), you don't put it on the freelist. These small pieces of storage that are not on the freelist are known as dark matter, and will be recovered when the objects adjacent to them become free or when you compact the heap. If you can't find a big enough chunk of free storage, you now have an allocation failure and need to perform a GC. If the GC creates enough free storage, then you'll search the freelist again and pick up the free chunk. If GC did not find enough free storage, you will return out of memory. The HEAP_LOCK is released after we have either allocated the object or failed to find enough free space.

Hints

Hints was introduced in Release 1.3.0. In certain scenarios, in large heaps where the freelist has lots of small free spaces in an application that is allocating a lot of larger allocations, the heap lock allocation scheme has a problem. It always starts at the beginning and searches most of the long list to find a freespace big enough to satisfy an allocation. The quick freelist hint algorithm was introduced to solve this problem.

For all heap lock allocation attempts that walk the freelist, the following data is collected:

  • A search count of the number of chunks on the freelist examined before finding a freespace large enough to fit the desired allocation of size n.
  • The size of the largest freespace chunk in the freelist preceding the freespace used for allocation is also recorded. (In other words, the largest chunk not large enough to satisfy the request.)

When a freespace capable of satisfying the allocation is found and the search count is larger than 20, you should create a new active hint pointing into the freelist.

Active hints can now be used to start searching the freelist, at a point other than the beginning, depending on the size of the allocation request. Hints are dynamically updated as chunks are allocated from the freelist.

Cache allocation

Cache allocation is specifically designed to deliver optimal allocation performance for small objects. Objects are allocated directly from a thread local allocation buffer that the thread has previously allocated from the heap. A new object is allocated from the end of this cache without the need to grab the heap lock, and hence it is very efficient. The criteria for using cache allocation has changed over the releases, as follows:

  • 1.2.2 and 1.3.0 -- Use cache allocation if the size of the object is less than 384.
  • 1.3.1 -- Use cache allocation if the size of the object is less than 512, or the object will fit in the current cache block.

Figure 7 below shows a cache block on the heap.


Figure 7. A cache block on the heap
A cache block on the heap

The cache block is sometimes called a thread local heap (TLH). When allocating a TLH for a thread, we go through heap locked allocation and reserve a part of the heap that will be used exclusively by a single thread. All cache allocations can then be made into the TLH without the need for any locks. Note that the allocbit is not on for the TLH. We set allocbits for a TLH when the TLH is full or when a GC occurs. To increase the efficiency of allocating a TLH, we always take the next free chunk on the free list up to a maximum size, which has changed with releases as follows:

  • 1.2.2 and 1.3.0 -- 6K
  • 1.3.1 -- Variable from 2K to 164K, depending on the usage of the TLH

Figure 8 shows some objects that have been allocated in the TLH. Notice that objects are allocated from the back of the TLH, which can be done more efficiently than allocating from the front. There are still no allocbits set; they will be set for all objects in the TLH when the cache is full or when a GC occurs.


Figure 8. Some objects in a cache
Some objects in a cache

Stay tuned for garbage collection

The next article in this series describes how garbage collection works. It reviews the functions of the mark, sweep, compact collector, and how parallel mark and parallel sweep work. The article also discusses how the heap expands and shrinks, and describes the new Concurrent Mark collector introduced in Release 1.3.1.


Resources

About the author

Sam Borman joined IBM in 1984 after working at seven other companies as a programmer and systems programmer in the UK, New Zealand, and France. He worked as a developer, and later as a development manager, in CICS. In 1990 he returned to a technical career working on CICSPlex/SM and later on DirectTalk. In 1999 he joined the Java Technology Centre where he took ownership of Garbage Collection. You can contact Sam Borman at sambo@uk.ibm.com.

Comments (Undergoing maintenance)



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Sample IT projects
ArticleID=10211
ArticleTitle=Sensible Sanitation -- Understanding the IBM Java Garbage Collector, Part 1: Object allocation
publish-date=08102002
author1-email=sambo@uk.ibm.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).