Guidelines for data reorganization

You must consider several factors before you reorganize your data.

You must run the REORG utility only when you determine that data needs to be reorganized. If application performance is not degraded, you might not need to reorganize data. Even when some statistics indicate that data is becoming unorganized, a REORG utility job is not always required, unless the lack of organization exceeds a specified threshold.

In the following situations, data reorganization is advisable:

Data is in REORG-pending status

When table spaces or partitions are in REORG-pending (REORP) status, you cannot select, insert, update, or delete data. You must reorganize table spaces or partitions when REORG-pending status imposes this restriction.

You can use the DISPLAY DATABASE RESTRICT command to identify the table spaces and partitions that need to be reorganized.

Data is in advisory REORG-pending status

After you change table or index definitions, consider reorganizing data to improve performance. After you change data types or column lengths by using ALTER TABLE statements, Db2 places the table space that contains the modified data in advisory REORG-pending (AREO*) status. The table space is in AREO* status because the existing data is not immediately converted to its new definition. Reorganizing the table space prevents possible performance degradation.

Recommendation: When data is in REORG-pending or AREO* status, use the REORG utility with the SCOPE PENDING option to automatically reorganize partitions. With this option, you do not need to first identify which partitions need to be reorganized or to customize the REORG control statement.

Data is skewed

When you use partitioned table spaces, you might sometimes find that data is out of balance, or skewed. When data is skewed, performance can be negatively affected because of contention for I/O and other resources. You might also have a situation in which some partitions are approaching their maximum size, and other partitions have excess space.

You can correct the skewed data by redistributing the data across partitions.For more information, see Redistributing data in partitioned table spaces.

Data is unorganized or fragmented

When data becomes unorganized or fragmented, you need to consider reorganizing your table spaces and index spaces.

You need to consider the following situations to evaluate when data reorganization is necessary:

Unused space

In simple table spaces, dropped tables use space that is not reclaimed until you reorganize the table space. Consider running REORG if the percentage of space that is occupied by rows of dropped tables is greater than 10%. The PERCDROP value in the SYSIBM.SYSTABLEPART catalog table identifies this percentage.

Page gaps
Indexes can have multiple levels of pages. An index page that contains pairs of keys and identifiers and that points directly to data is called a leaf page.

Deleting index keys can result in page gaps within leaf pages. Gaps can also occur when Db2 inserts an index key that does not fit onto a full page. Sometimes Db2 detects sequential inserts and splits the index pages asymmetrically to improve space usage and reduce split processing. You can improve performance even more by choosing the appropriate page size for index pages. If page gaps occur, consider running the REORG utility.

The LEAFNEAR and LEAFFAR columns of SYSIBM.SYSINDEXPART store information about the organization of physical leaf pages by indicating the number of pages that are not in an optimal position.

I/O activity
You can determine when I/O activity on a table space might be increasing. A large number (relative to previous values that you received) for the NEARINDREF or the FARINDREF option indicates an increase in I/O activity. Consider a reorganization when the sum of NEARINDREF and FARINDREF values exceeds 10%.

The NEARINDREF and FARINDREF values in the SYSIBM.SYSTABLEPART and SYSIBM.SYSTABLEPART_HIST catalog tables identify the number of reallocated rows.

Recommendation: When increased I/O activity occurs, use a non-zero value for the PCTFREE clause of the table space definition. The PCTFREE clause specifies what percentage of each page in a table space or index is left free when data is loaded or reorganized. PCTFREE is a better choice than FREEPAGE.
Clustering
You can determine if clustering is becoming degraded. Clustering becomes degraded when the rows of a table are not stored in the same order as the entries of its clustering index. A large value for the FAROFFPOSF option might indicate poor clustering. Reorganizing the table space can improve performance. Although less critical, a large value for the NEAROFFPOSF option can also indicate that reorganization might improve performance. The FAROFFPOSF and NEAROFFPOSF values in the SYSIBM.SYSINDEXPART and SYSIBM.SYSINDEXPART_HIST catalog tables identify the number of rows that are far from and near to optimal position.
REORG thresholds
You can use the RUNSTATS, REORG, REBUILD INDEX, and LOAD utilities to collect statistics that describe the fragmentation of table spaces and indexes. These statistics can help you decide when to run the REORG utility to improve performance or reclaim space.

You can set up your REORG job in accordance with threshold limits that you set for relevant statistics from the catalog. The OFFPOSLIMIT and INDREFLIMIT options specify when to run REORG on table spaces. When a REORG job runs with these options, it queries the catalog for relevant statistics. The REORG job does not occur unless one of the thresholds that you specify is exceeded. You can also specify the REPORTONLY option to produce a report that tells you whether a REORG job is recommended.