Controlling log size for SQL operations

The amount of logging performed for applications depends on how much data is changed.

About this task

Certain SQL statements are quite powerful, and a single statement can sometimes modify a large amount of data. Such statements include:

INSERT with a fullselect
A large amount of data can be inserted into a table or view, depending on the result of the query.

Mass deletes and mass updates (except for deleting all rows of a table in a segmented (non-UTS) table space or universal table space)
For non-segmented table spaces, each of these statements results in the logging of all database data that changes. For example, if a table contains 200 million rows of data, that data and control information are logged if all of the rows of the table are deleted with the SQL DELETE statement. No intermediate commit points are taken during this operation.

For segmented (non-UTS) table spaces and universal table spaces, a mass delete results in the logging of the data of the deleted records when any of the following conditions are true:

  • The table is the parent table of a referential constraint.
  • The table is defined as DATA CAPTURE(CHANGES), which causes additional information to be logged for certain SQL operations.
  • A delete trigger is defined on the table.

TRUNCATE TABLE
Essentially a mass delete that does not activate delete triggers.

Data definition statements
These statements cause logging of the entire database descriptor (DBD) in which the change was made. For very large DBDs, this can be a significant amount of logging.

Modification to rows that contain LOB or XML data
Data in tables that contain LOB or XML data is logged unless the LOB or XML table space is defined with the NOT LOGGED attribute.
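For example, a single INSERT with a fullselect can insert, and therefore log, millions of rows in one statement. The table names here are hypothetical:

```sql
-- A single statement can copy an arbitrarily large result set;
-- every inserted row is logged.
INSERT INTO APP1.ORDER_HISTORY
  SELECT * FROM APP1.ORDERS
  WHERE ORDER_DATE < '2020-01-01';
```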

Procedure

To control the use of log space by powerful SQL statements:

  • For mass delete operations, consider using segmented (non-UTS) table spaces or universal table spaces. If those table space types are not an option, and no triggers exist on the table or your application can safely ignore any triggers on the table, create one table per table space, and use TRUNCATE.
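As a sketch, with a hypothetical table name, the two low-logging delete paths look like this:

```sql
-- In a segmented (non-UTS) or universal table space, a DELETE with no
-- WHERE clause is a mass delete and can avoid logging each deleted row
-- (provided none of the conditions listed above apply):
DELETE FROM APP1.EVENT_LOG;

-- TRUNCATE is essentially a mass delete that does not activate
-- delete triggers:
TRUNCATE TABLE APP1.EVENT_LOG;
```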
  • For inserting a large amount of data, instead of using an SQL INSERT statement, use the LOAD utility with LOG(NO) and take an inline copy.
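A minimal sketch of such a LOAD job follows; LOAD is a Db2 utility control statement, not SQL, and the table and DD names are hypothetical. The inline copy (COPYDDN) is what keeps the table space out of COPY-pending status when LOG NO is used:

```sql
-- Load a large volume of rows without logging them; the inline image
-- copy written to SYSCOPY preserves recoverability.
LOAD DATA INDDN SYSREC
  RESUME YES
  LOG NO
  COPYDDN(SYSCOPY)
  INTO TABLE APP1.EVENT_LOG
```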
  • For updates, consider your workload when defining a table's columns.
    The amount of data that is logged for update depends on whether the row contains all fixed-length columns or not. For fixed-length non-compressed rows, changes are logged only from the beginning of the first updated column to the end of the last updated column. Consequently, you should keep frequently updated columns close to each other to reduce log quantities.

    For varying-length rows (rows that contain one or more varying-length columns), data is logged from the first changed byte to the end of the last updated column if the length does not change. However, in the more common case where the length does change, data is logged from the first changed byte to the end of the row.

    To determine whether a workload is read-intensive or update-intensive, check the log data rate. You can find the rate in the LOG RATE FOR 1 LOG (MB/SEC) field in the log statistics. Determine the average log size and divide that by 60 to get the average number of log bytes written per second.

    • If you log less than 2 MB per second, the workload is read-intensive.
    • If you log more than 20 MB per second, the workload is update-intensive.
    • From 2 to 20 MB per second, the workload is neither read-intensive nor update-intensive.
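As an illustration of the column-placement advice above, the following hypothetical table declares its frequently updated columns next to each other:

```sql
-- For fixed-length, non-compressed rows, an update that changes both
-- QTY_ON_HAND and QTY_ORDERED logs only the short span from the first
-- updated column to the last, not the wide DESCRIPTION column before them.
CREATE TABLE APP1.INVENTORY
  (ITEM_ID      INTEGER    NOT NULL,
   DESCRIPTION  CHAR(100)  NOT NULL,   -- rarely updated
   QTY_ON_HAND  INTEGER    NOT NULL,   -- frequently updated
   QTY_ORDERED  INTEGER    NOT NULL,   -- frequently updated
   PRIMARY KEY (ITEM_ID));
```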
  • If you have many data definition statements (CREATE, ALTER, DROP) for a single database, issue them within a single unit of work to avoid logging the changed DBD for each data definition statement.
    However, be aware that the DBD is locked until the COMMIT is issued.
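For example, with hypothetical object names, grouping the data definition statements in one unit of work logs the changed DBD once rather than once per statement:

```sql
-- All three changes affect database DB1; the DBD is logged at COMMIT.
CREATE TABLE DB1ADM.T1 (C1 INTEGER) IN DB1.TS1;
ALTER TABLE DB1ADM.T2 ADD COLUMN C9 INTEGER;
DROP TABLE DB1ADM.T3;
COMMIT;   -- the DBD lock on DB1 is held until this point
```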
  • Use the NOT LOGGED option for any LOB or XML data that requires frequent updating and for which the trade-off of non-recoverability of LOB or XML data from the log is acceptable. (You can still use the RECOVER utility on LOB or XML table spaces to recover control information that ensures physical consistency of the LOB or XML table space.)
    Because LOB and XML table spaces defined as NOT LOGGED are not recoverable from the Db2 log, you should make a recovery plan for that data. For example, if you run batch updates, be sure to take an image copy after the updates are complete.
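With a hypothetical LOB table space name, switching off logging and protecting the data with an image copy might look like this:

```sql
-- Stop logging changes to the LOB data:
ALTER TABLESPACE DB1.LOBTS1 NOT LOGGED;

-- After a batch of LOB updates completes, take an image copy so the
-- data remains recoverable from the copy (COPY utility control
-- statement, sketched):
--   COPY TABLESPACE DB1.LOBTS1 FULL YES ...
```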
  • For data that is usually modified infrequently but that undergoes frequent or large changes during certain periods, such as year-end processing:
    1. Make an image copy of the data.
    2. Alter the table space to NOT LOGGED.
    3. Make the massive changes.
    4. Stop other activities that update the data.
    5. Make an image copy of the data.
    6. Alter the table space to LOGGED.
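    Assuming a hypothetical table space DB1.TS1, the sequence can be sketched as:

```sql
-- 1-2. Image copy (COPY utility), then switch off logging:
--   COPY TABLESPACE DB1.TS1 FULL YES ...
ALTER TABLESPACE DB1.TS1 NOT LOGGED;

-- 3-4. Run the large update jobs, then quiesce other updaters, for
--      example with application controls or:
--   -STOP DATABASE(DB1) SPACENAM(TS1)

-- 5-6. Image copy again, then restore logging:
--   COPY TABLESPACE DB1.TS1 FULL YES ...
ALTER TABLESPACE DB1.TS1 LOGGED;
```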
  • For changes to tables, such as materialized query tables, that contain propagated data, use the NOT LOGGED option because the data exists elsewhere.
    If the data becomes damaged, you can refresh the entire table from the original source.
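For a materialized query table (hypothetical name shown), the repopulation is a single statement:

```sql
-- Recompute the materialized query table from its source tables;
-- no log-based recovery of the table itself is needed.
REFRESH TABLE APP1.SALES_SUMMARY;
```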