Why is the defragmenter needed?
As in other file systems, Informix partitions may have many extents, which are contiguous blocks of storage used to store a portion of the partition. In Informix, partitions reside in dbspaces. After a period of update activity, a dbspace may get highly fragmented, and an extent of a desired length may not be available, so a shorter, non-optimal length, extent may be used. Searching for a long extent in a fragmented dbspace can take a significant amount of time since a dbspace may contain many small free extents that are searched over; this time can be substantial. Short extents cause the server to stop prematurely if large block I/O (light scans) is occurring. In addition, the operation to map logical addresses to physical addresses consumes significant CPU resources and takes longer. For maximum efficiency, frequently accessed tables should have as few extents as possible. The operations that may take longer with short extents include: finding extents, mapping logical to physical addresses, light scans, archives, read-ahead, and checkpoint flush.
Merging extents to get fewer extents (defragmentation) is beneficial to customers who have a large number of extents in a partition and are likely to run into the partition page overflow problem. The Informix 11.70 introduces extent map overflow pages so the maximum number of extents is high (32,767); however, using extent map overflow pages reduces efficiency.
Users with knowledge of Version 7.3x may remember the
oncheck -me utility was a limited-use defragmenter for 7.3x
servers. It has a number of deficiencies. Its primary deficiency is that
it takes an exclusive lock on the table. In addition,
oncheck -me cannot
recover from server crashes. With the 11.70 defragmenter, extents are
merged while the partition is being read and updated, and full crash
recovery is provided. Archives, reads, and updates can be run in parallel
with the 11.70 defragmenter.
What does the defragmenter do?
The defragmenter merges multiple smaller extents to get longer contiguous extents for a partition, thereby reducing the total number of extents in a partition. If a contiguous extent of the appropriate size is available, then a partition always ends up with a single extent after the defragmenter completes, provided that you do not add extents to the partition while the defragmenter is running. The partition could be a data or index partition. The defragment operation is an online operation. The defragmenter does not take an exclusive lock while doing the defrag operation, hence the partition is available for other operations.
Informix 11.70 also provides compress, repack, and shrink. How does the 11.70 defragmenter differ from compress, repack, and shrink?
- Compress reduces the size of data being stored, thereby creating unused space in the partition.
- Repack moves rows from the end of the partition to the empty page space present in the upper end of the partition.
- Shrink releases extents that have been emptied back to the dbspace.
So compress, repack, and shrink might be able to reduce the total number of extents for a partition, just like the defragmenter. However, compress, repack, and shrink operate at the row level; the defragmenter operates at the extent level.
How does the defragmenter work?
When the defragmenter is invoked on a partition, it first checks to ensure that the defragmenter is being run on a valid partition type and that no conflicting operations, such as compression, are running concurrently with the defragmenter. (Conflicting operations result in the defragmenter returning an error code.) The defragmenter then checks to see if defragmentation is necessary. If the partition has one or zero extents, then no defragmentation can be performed; the defragmenter returns successfully with no error code set.
When the defragment operation begins, an entry in the table accessed by the
onstat -g defragment is made. The
partition is then "frozen" (in other words, only the current number of
extents will be considered for defragmentation). Extents added while the
defragmenter is running will not be eliminated. Note that only one
defragment operation per dbspace is currently allowed.
The defragmenter then goes into a loop, wherein it first attempts to locate contiguous free space in the dbspace that covers two or more partition extents, and then allocates this space to the partition and moves the data from the old extents into the new extent. Ideally there is a large enough contiguous block of storage to cover all the extents in the partition. In this case, the result will be a single extent. Adding a chunk to the dbspace before running the defragmenter can help assure this. If the largest free extent cannot cover two or more partition extents, the defragmenter sets an error code and returns.
Free space at the end of the partition is considered as space that needs to be covered by the defragmenter. If this is undesirable, then run shrink before running the defragmenter on a partition.
Once the defragmenter finds a free extent that can cover two or more partition extents, it moves pages from the old extent into the new extent. The defragmenter uses the buffer pool for this movement in order to keep the data in sync with other uses. If buffer pool usage is a concern, then running the defragmenter at a non-peak time is recommended. After each page move, the partition is available for operations, such as updates. The "lock-out" time is kept to a bare minimum so that users cannot tell the defragmentation is running. Upon each page move, a short log record is written. When an original partition extent is emptied, the transaction is committed, and the move is finalized. Defragmenter transactions are short; there is no possibility of a long transaction occurring. Checkpoints are not blocked by the defragmenter.
The partition may contain empty pages. These usually occur at the end of the partition, and these pages are not copied.
For each defragment. a message appears in the online log at the start of defragmentation.
A crash while the defragmenter is running results in only the last
transaction being aborted. In most cases, most of the work is committed
and does not have to be redone. The defragmenter will not automatically
start up again if a defragmentation operation was in progress at the time
of crash. During fast recovery, all committed transactions are redone,
partially completed transactions are aborted, and the dbspace is restored
to a consistent state. The
oncheck -ce command
can be run on the dbspace and no 'holes' in the chunk should be present.
oncheck -pt to check on the number of
partition extents after defragmentation.
Certain special operations on a partition conflict with the 11.70 defragmenter. These conflicting operations include: compress, repack, shrink, alter, online index build, truncate, and drop. The defragmenter may not be run on the partition partition, secondary server, or a temporary table.
Defragmenter usage examples
The defragmenter may be invoked using the
execute function syntax in the dbadmin database
or through the OAT interface. Specifying a table name results in all
fragments of that table being defragmented. You may specify a partition
number (partnum), in which case, only that partnum will be defragmented.
In the 11.70 UC2 and later versions, you can specify an index name
directly. In 11.70 UC1, you can defragment indexes but you must specify
execute function syntax can be accessed
through dbaccess, or through a ESQL C or Java program.
Let's use the sales demo database tables provided with Informix as an
example table. You can create this demo database by running the script
dbaccessdemo_ud in the sqldist bin directory.
(For additional information, see the IBM Informix DB Access Users Guide
for information on how to set this up.) Unfortunately, tables created by
dbaccessdemo_ud are short and sometimes may
only have one extent in certain circumstances. To view the extents in this
table run the following command:
oncheck -pt sales_demo:sales
Three extents can be seen in this table. If there was only one extent, then
the defragmenter will not actually defragment it, but the defragmenter
will return success in this case. For example, in the
oncheck -pt output, you should see something
like the output shown in Listing 1:
Listing 1. Extent list
Extents Logical Page Physical Page Size Physical Pages 0 1:186118 8 8 8 1:186132 8 8 16 1:186168 16 16
This indicates that the sales table contains three extents. The first
extent of length 8 begins at page 186118. Let's defragment this partition
dbadmin command. In dbaccess, select
the sysadmin database. In dbaccess, run the command:
Listing 2. Defragment the partition
execute function task ('defragment', 'sales_demo:sales'); or if you prefer or need to defragment an index or one fragment of a table do: execute function task ('defragment partnum', '1049104'); You can find the partitions partnum in the oncheck -pt output. After defragmentation another oncheck -pt can be used to verify the number of extents in the partition. In this example, you should see one extent. Extents Logical Page Physical Page Size Physical Pages 0 1:186184 32 32
How to monitor the defragment operation
You can monitor the defragment operations by using the
onstat -g defragment. This
command must be run while the defragmenter is active on a partition to
show the defragment operation in progress. Since the defragment operation
may run quickly, especially for smaller tables, the defragment information
may be available for only a short period of time.
The output of the command looks like this:
Listing 3. Output of
onstat -g defragment
onstat -g defragment Defrag info id table name id dbsnum partnum status substatus errnum 62 test:informix.tab1 55 3 3145730 SEARCHING_FOR_EXTENT 0 0 62 test:informix.tab1 55 3 3145730 MERGING_EXTENTS 0 0 The different types of status are: SEARCHING_FOR_EXTENT, MERGING_EXTENTS, DEFRAG_COMPLETED, DEFRAG_FAILED In the case of DEFRAG_FAILED- it gives the errnum. In all other cases errnum is 0.
Summary of limitations
Be aware of the following limitations:
- Only one defragment operation is allowed on a partition at a time.
- Only one defragment operation is allowed on a dbspace at a time.
- The defragment operation aborts if one of the following operations is
already occurring on the same partition:
- Compress, repack, shrink
- Online index build
The Informix 11.70 server has a comprehensive set of tools that work together to optimize utilization of storage and ensure the highest access efficiency. These tools include compression, pack, shrink, and defragment. The defragmenter is one component of this tool set that is particularly useful for partitions residing in heavily updated dbspaces. If a table is used heavily and is frequently updated, then you should consider periodically running the defragmenter on it.
- IBM Informix 11.70 Information Center: Learn more about latest version of Informix
- Informix page on developerWorks: Learn more about Informix and grow your skills.
- developerWorks Information Management zone: Learn more about Information Management. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Participate in developerWorks blogs and get involved in the My developerWorks community; with your personal profile and custom home page, you can tailor developerWorks to your interests and interact with other developerWorks users.
Dig deeper into Information management on developerWorks
Get samples, articles, product docs, and community resources to help build, deploy, and manage your cloud apps.
Experiment with new directions in software development.
Software development in the cloud. Register today to create a project.
Evaluate IBM software and solutions, and transform challenges into opportunities.