Managing space and page size for hash-organized tables Start of change(deprecated)End of change

By managing the table space size of your hash-organized tables you can reduce the cost of accessing the index for rows that overflow the fixed hash space. Hash-organized table spaces are deprecated and likely to be unsupported in the future.

About this task

Deprecated function:

FL 504 Hash-organized tables are deprecated. Beginning in Db2 12, packages that are bound with APPLCOMPAT(V12R1M504) or higher cannot create hash-organized tables or alter existing tables to use hash-organization. Existing hash-organized tables remain supported, but they are likely to be unsupported in the future.

Begin general-use programming interface information.When a table is organized for hash access, Db2 uses a hash calculation to determine which data page to place each row of data into. It also uses a separate calculation to determine where in the page to place the row. That process is somewhat, but not completely, random. Statistical variation means that sometimes too many rows are placed on the same page by the hash calculation. Rows that do not fit on the page are stored outside of the hash space. Entries are added to a hash overflow index so that the rows can be located.

Procedure

To minimize the number of hash overflows create the table space for the hash-organized table explicitly, and use the following approaches:

  • Define the fixed hash space to be larger than the data that it contains to minimize the number of overflows.
    You can specify the AUTOESTSPACE(YES) option when you run the REORG TABLESPACE utility for the hash-organized table. When you specify that option, Db2 determines a size that results in a percentage of overflows of approximately five percent, if the amount of extra space that is needed does not exceed 50% of the size of the data. When you specify AUTOESTSPACE(YES) Db2 also uses the specified PCTFREE value for the table space when it determines how much space to allocate.
  • Choose a page size the enables enough rows to fit on a page.
    When few rows fit on a page, hash access requires more extra space to perform well. Generally, 20 rows per page yields an acceptable amount of overflows.
  • Specify the following other options for table spaces that are organized by hash:
    • Specify DEFINE YES to ensure that the fixed table space is allocated successfully before the first access.
    • Specify NUMPARTS for partition-by-range (UTS) table spaces. Partition-by-growth table spaces automatically calculate NUMPARTS based on the specified HASH SPACE and DSSIZE.
    • Specify a DSSIZE value large enough to fit the hash space for each partition. Partition-by-growth table spaces use DSSIZE to validate the hash space for each partition.
    • Specify -1 for the value of PRIQTY. Db2 uses the default value for primary space allocation.
    • Do not use the MAXROWS option with tables that are organized for hash access. It is not used in the fixed hash space. However, the value when specified applies normally in the hash overflow area.

    The following values cannot be specified for tables that are organized for hash access:

    • APPEND YES
    • MEMBER CLUSTER
    • FREEPAGE

What to do next

To learn more about how you might achieve fewer overflows, or reduce the amount of extra space that is needed, see Fine-tuning hash space and page size.End general-use programming interface information.

Fine-tuning hash space and page size

You can fine-tune the amount of overhead from hash overflow by adjusting two attributes of the hash-organized table space: The amount of additional space, and the number of rows per data page.

Before you begin

Begin general-use programming interface information.Before trying to manually tune the size of your hash space, determine whether automatic space estimation yields acceptable performance:
  1. Reorganize the hash-organized table space and specify the AUTOESTSPACE(YES) option.
  2. Monitor the hash access to determine whether the performance is acceptable.

About this task

The process that Db2 uses to determine the placement of a row in a hash-organized table space is somewhat, but not completely, random. Consequently, statistical variation means that sometimes too many rows are placed on the same page When that happens, rows that do not fit on the page are stored outside of the hash space, and an entry is added to a hash overflow index so that the row can be located. When that happens an overflow occurs. Rows that do not fit on the determined page are stored outside of the hash space, and an entry is added to a hash overflow index so that the row can be located.

By allocating additional space for table space, you can reduce the number of these overflow situations. The number of rows that fit on a page also has a significant impact on the amount of overflows.

The following table shows estimated multipliers for determining the amount of additional space to allocate, according to the number of rows that fit on page, to achieve a certain estimated percentage of overflows.

Table 1. Estimated size multipliers for fixed-size hash-organized table spaces by percentage of overflows.
Rows per page 20% overflow 10% overflow 5% overflow 2.5% overflow 1.0% overflow 0.5% overflow 0.1% overflow 0.01% overflow 0.0001% overflow
1 2.18 4.69 9.69 19.7 49.7 99.7 499.7 4999.7 49999.7
2 1.3 2.08 3.16 4.68 7.69 11.07 25.35 81.18 257.73
3 1.1 1.6 2.2 2.95 4.23 5.49 9.82 21.83 47.7
4 1.02 1.4 1.83 2.33 3.13 3.87 6.13 11.43 20.85
5 0.97 1.29 1.64 2.02 2.61 3.12 4.61 7.75 12.71
7 0.92 1.18 1.44 1.71 2.1 2.43 3.31 4.94 7.19
10 0.88 1.1 1.3 1.5 1.77 1.99 2.54 3.47 4.63
15 0.86 1.04 1.19 1.34 1.54 1.69 2.04 2.59 3.22
20 0.85 1.01 1.14 1.27 1.42 1.54 1.81 2.21 2.64
30 0.84 0.98 1.09 1.19 1.31 1.39 1.59 1.86 2.14
40 0.83 0.96 1.06 1.15 1.25 1.32 1.47 1.69 1.9
50 0.83 0.95 1.04 1.12 1.21 1.27 1.41 1.58 1.76
100 0.83 0.94 1.01 1.07 1.13 1.17 1.26 1.37 1.47
150 0.83 0.93 1 1.05 1.1 1.13 1.2 1.29 1.36
200 0.83 0.93 0.99 1.04 1.08 1.11 1.17 1.24 1.31
250 0.83 0.93 0.99 1.03 1.07 1.1 1.15 1.22 1.27

Procedure

To fine tune the size of your hash space:

  • Ensure that a sufficient amount of additional space is provided.
    For example, if 20 rows fit on a data page, and you want approximately 5% of the rows to be hash overflows, you would multiply the size of your data by 1.14 (14% overhead) to determine the size of your hash space. Note that additional space is actually used in that case because 5% of the rows overflow, meaning a total space overhead of 19%
  • Ensure that an appropriate number of rows fit on each page.
    For example, a table that contains 1500-byte rows in 4KB pages has two rows on each page. To achieve a overflow level of 5% would require you to specify a fixed hash space size that is 3.16 times larger than the size of the data in the table. However, if you convert the table to use 32KB pages, each page would contain 16 rows, meaning that a fixed size for the hash space between 1.14 and 1.19 times larger would be sufficient to achieve 5% overflows.End general-use programming interface information.