Understand the Informix Server V11.7 defragmenter

A new tool for optimizing your Informix server

Starting with IBM® Informix®, Version 11.70.xC1, you can defragment a table, a fragment, or an index, including system catalogs. This reduces the number of extents in which the data in your partition resides. In this article, learn what the defragmenter is, understand some considerations for its use, and see usage examples.

Nelson Robert Corcoran (ncorcora@us.ibm.com), Software Architect, IBM

Nelson Corcoran photoNelson Corcoran has been a member of the Informix Kernel Team for 12 years, working in many different areas, including smartblobs, concurrency, scheduling, file systems, and others.



Kamini Jagtiani (jagtiani@us.ibm.com), Advisory Software Engineer, IBM

Kaminit Jagtiani photoKamini Jagtiani has worked at Informix for 10 years, specializing in compression, conversion/reversion, HDR, and other database kernel features.



18 November 2010

Also available in Chinese Russian

Why is the defragmenter needed?

As in other file systems, Informix partitions may have many extents, which are contiguous blocks of storage used to store a portion of the partition. In Informix, partitions reside in dbspaces. After a period of update activity, a dbspace may get highly fragmented, and an extent of a desired length may not be available, so a shorter, non-optimal length, extent may be used. Searching for a long extent in a fragmented dbspace can take a significant amount of time since a dbspace may contain many small free extents that are searched over; this time can be substantial. Short extents cause the server to stop prematurely if large block I/O (light scans) is occurring. In addition, the operation to map logical addresses to physical addresses consumes significant CPU resources and takes longer. For maximum efficiency, frequently accessed tables should have as few extents as possible. The operations that may take longer with short extents include: finding extents, mapping logical to physical addresses, light scans, archives, read-ahead, and checkpoint flush.

Merging extents to get fewer extents (defragmentation) is beneficial to customers who have a large number of extents in a partition and are likely to run into the partition page overflow problem. The Informix 11.70 introduces extent map overflow pages so the maximum number of extents is high (32,767); however, using extent map overflow pages reduces efficiency.

Users with knowledge of Version 7.3x may remember the oncheck -me utility. The oncheck -me utility was a limited-use defragmenter for 7.3x servers. It has a number of deficiencies. Its primary deficiency is that it takes an exclusive lock on the table. In addition, oncheck -me cannot recover from server crashes. With the 11.70 defragmenter, extents are merged while the partition is being read and updated, and full crash recovery is provided. Archives, reads, and updates can be run in parallel with the 11.70 defragmenter.


What does the defragmenter do?

The defragmenter merges multiple smaller extents to get longer contiguous extents for a partition, thereby reducing the total number of extents in a partition. If a contiguous extent of the appropriate size is available, then a partition always ends up with a single extent after the defragmenter completes, provided that you do not add extents to the partition while the defragmenter is running. The partition could be a data or index partition. The defragment operation is an online operation. The defragmenter does not take an exclusive lock while doing the defrag operation, hence the partition is available for other operations.

Informix 11.70 also provides compress, repack, and shrink. How does the 11.70 defragmenter differ from compress, repack, and shrink?

  • Compress reduces the size of data being stored, thereby creating unused space in the partition.
  • Repack moves rows from the end of the partition to the empty page space present in the upper end of the partition.
  • Shrink releases extents that have been emptied back to the dbspace.

So compress, repack, and shrink might be able to reduce the total number of extents for a partition, just like the defragmenter. However, compress, repack, and shrink operate at the row level; the defragmenter operates at the extent level.


How does the defragmenter work?

When the defragmenter is invoked on a partition, it first checks to ensure that the defragmenter is being run on a valid partition type and that no conflicting operations, such as compression, are running concurrently with the defragmenter. (Conflicting operations result in the defragmenter returning an error code.) The defragmenter then checks to see if defragmentation is necessary. If the partition has one or zero extents, then no defragmentation can be performed; the defragmenter returns successfully with no error code set.

When the defragment operation begins, an entry in the table accessed by the command onstat -g defragment is made. The partition is then "frozen" (in other words, only the current number of extents will be considered for defragmentation). Extents added while the defragmenter is running will not be eliminated. Note that only one defragment operation per dbspace is currently allowed.

The defragmenter then goes into a loop, wherein it first attempts to locate contiguous free space in the dbspace that covers two or more partition extents, and then allocates this space to the partition and moves the data from the old extents into the new extent. Ideally there is a large enough contiguous block of storage to cover all the extents in the partition. In this case, the result will be a single extent. Adding a chunk to the dbspace before running the defragmenter can help assure this. If the largest free extent cannot cover two or more partition extents, the defragmenter sets an error code and returns.

Free space at the end of the partition is considered as space that needs to be covered by the defragmenter. If this is undesirable, then run shrink before running the defragmenter on a partition.

Once the defragmenter finds a free extent that can cover two or more partition extents, it moves pages from the old extent into the new extent. The defragmenter uses the buffer pool for this movement in order to keep the data in sync with other uses. If buffer pool usage is a concern, then running the defragmenter at a non-peak time is recommended. After each page move, the partition is available for operations, such as updates. The "lock-out" time is kept to a bare minimum so that users cannot tell the defragmentation is running. Upon each page move, a short log record is written. When an original partition extent is emptied, the transaction is committed, and the move is finalized. Defragmenter transactions are short; there is no possibility of a long transaction occurring. Checkpoints are not blocked by the defragmenter.

The partition may contain empty pages. These usually occur at the end of the partition, and these pages are not copied.

For each defragment. a message appears in the online log at the start of defragmentation.

A crash while the defragmenter is running results in only the last transaction being aborted. In most cases, most of the work is committed and does not have to be redone. The defragmenter will not automatically start up again if a defragmentation operation was in progress at the time of crash. During fast recovery, all committed transactions are redone, partially completed transactions are aborted, and the dbspace is restored to a consistent state. The oncheck -ce command can be run on the dbspace and no 'holes' in the chunk should be present.

Use oncheck -pt to check on the number of partition extents after defragmentation.

Certain special operations on a partition conflict with the 11.70 defragmenter. These conflicting operations include: compress, repack, shrink, alter, online index build, truncate, and drop. The defragmenter may not be run on the partition partition, secondary server, or a temporary table.


Defragmenter usage examples

The defragmenter may be invoked using the execute function syntax in the dbadmin database or through the OAT interface. Specifying a table name results in all fragments of that table being defragmented. You may specify a partition number (partnum), in which case, only that partnum will be defragmented. In the 11.70 UC2 and later versions, you can specify an index name directly. In 11.70 UC1, you can defragment indexes but you must specify their partnum.

The execute function syntax can be accessed through dbaccess, or through a ESQL C or Java program.

Let's use the sales demo database tables provided with Informix as an example table. You can create this demo database by running the script dbaccessdemo_ud in the sqldist bin directory. (For additional information, see the IBM Informix DB Access Users Guide for information on how to set this up.) Unfortunately, tables created by dbaccessdemo_ud are short and sometimes may only have one extent in certain circumstances. To view the extents in this table run the following command:

oncheck -pt sales_demo:sales

Three extents can be seen in this table. If there was only one extent, then the defragmenter will not actually defragment it, but the defragmenter will return success in this case. For example, in the oncheck -pt output, you should see something like the output shown in Listing 1:

Listing 1. Extent list
Extents                       
Logical Page     Physical Page        Size     Physical Pages
	
     0           1:186118             8        8
     8           1:186132             8        8
     16          1:186168             16       16

This indicates that the sales table contains three extents. The first extent of length 8 begins at page 186118. Let's defragment this partition using a dbadmin command. In dbaccess, select the sysadmin database. In dbaccess, run the command:

Listing 2. Defragment the partition
execute function task ('defragment', 'sales_demo:sales');

or if you prefer or need to defragment an index or one fragment of a table do:

execute function task ('defragment partnum', '1049104');

You can find the partitions partnum in the oncheck -pt output.

After defragmentation another oncheck -pt can be used to verify the number of extents in
the partition. In this example, you should see one extent.

Extents                       
Logical Page     Physical Page        Size    Physical Pages
     0           1:186184             32      32

How to monitor the defragment operation

You can monitor the defragment operations by using the onstat option onstat -g defragment. This command must be run while the defragmenter is active on a partition to show the defragment operation in progress. Since the defragment operation may run quickly, especially for smaller tables, the defragment information may be available for only a short period of time.

The output of the command looks like this:

Listing 3. Output of onstat -g defragment
onstat -g defragment

Defrag info
id  table name          id  dbsnum  partnum  status                substatus  errnum
62  test:informix.tab1  55  3       3145730  SEARCHING_FOR_EXTENT  0          0  
62  test:informix.tab1  55  3       3145730  MERGING_EXTENTS       0          0  

The different types of status are:

SEARCHING_FOR_EXTENT, MERGING_EXTENTS, DEFRAG_COMPLETED, DEFRAG_FAILED

In the case of DEFRAG_FAILED- it gives the errnum. In all other cases errnum is 0.

Summary of limitations

Be aware of the following limitations:

  • Only one defragment operation is allowed on a partition at a time.
  • Only one defragment operation is allowed on a dbspace at a time.
  • The defragment operation aborts if one of the following operations is already occurring on the same partition:
    • Truncate
    • Drop
    • Compress, repack, shrink
    • Uncompress
    • Online index build
    • Alter

Conclusion

The Informix 11.70 server has a comprehensive set of tools that work together to optimize utilization of storage and ensure the highest access efficiency. These tools include compression, pack, shrink, and defragment. The defragmenter is one component of this tool set that is particularly useful for partitions residing in heavily updated dbspaces. If a table is used heavily and is frequently updated, then you should consider periodically running the defragmenter on it.

Resources

Learn

Get products and technologies

  • Build your next development project with IBM trial software, available for download directly from developerWorks.

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=587820
ArticleTitle=Understand the Informix Server V11.7 defragmenter
publish-date=11182010