DB2 supports efficient data roll-out with the table partitioning feature
ALTER TABLE DETACH PARTITION, which can
separate a large chunk of data from a source table in the unit of
Figure 1 shows roll-out when both global (nonpartitioned) indexes and local (partitioned) indexes exist on the source table. The detached source table partition becomes a separate nonpartitioned table. For local indexes, the index object associated with the detached partition is carried along and the indexes within become indexes of the new target table. Global indexes, on the other hand, are not inherited by the target table. Instead, the index entries for the detached partitions are orphaned by the detach command. DB2 then automatically runs a separate background task called Asynchronous Index Cleanup (AIC) to remove orphaned keys from the global index. Meanwhile, such keys are ignored by queries that access the source table.
Figure 1. Roll-out from a table with global and local indexes
Prior to V9.7.1, roll-out with table partitioning is an offline operation. Detach waits for all existing access to the partitioned table to complete, while blocking new access from starting. However, detach may need to be performed in the same timeframe as long-running reporting queries, potentially creating contention between administrative operations performed by the DBA and query access to the table as part of the application workload.
Starting with DB2 V9.7.1, the
DETACH operation is made online for dynamic SQL
by allowing a partition to be detached while queries are concurrently
accessing the remaining partitions of the table. In fact, a reporting
workload using the Uncommitted Read (UR) isolation level could be reading
the partition being detached, and even that would not stop the
detach from proceeding concurrently.
This is accomplished by making
DETACH a two-phase operation. The first
phase, performed by the
ALTER TABLE DETACH PARTITION command, logically
detaches the partition from the table. The second phase, managed by the
system in the background without the need for user intervention, completes
the detach asynchronously after ensuring that all other existing activities
that access the partitioned table have completed. The background process
is referred to as the Asynchronous Partition Detach (APD) process. When there
are no dependent materialized query tables (MQTs), the system initiates the
second phase after the first commits. If dependent tables, such as
staging tables or refresh immediate MQTs, exist on the partitioned table,
the second phase starts automatically after these dependent tables are
From a DBA perspective, the roll-out operation progresses in the presence
of a long-running reporting workload. From an application perspective,
the long-running reporting workload is not interrupted by detach and
completes concurrently. For any access to the partitioned table that
happens after the successful completion of the
DETACH command, the
detached partition is no longer visible, and the asynchronous completion of
detach will not interrupt this new access. Of course, the new online
roll-out behavior respects the isolation level in effect. For instance, if
a Repeatable Read (RR) scanner has read the partition, but not committed
yet, detach will wait for the RR scanner to complete before it can make
the partition invisible. After the asynchronous second phase of the detach
operation has completed, the target table of detach is available to be
archived or dropped or attached to another table.
The asynchronous detach behavior is new and is the only choice in DB2 V9.7.1. It can be monitored using the following methods:
- Issue the
LIST UTILITIES SHOW DETAILcommand.
- Query the
STATUScolumn of the SYSCAT.DATAPARTITIONS catalog view for the partition in question. See the Appendix for a stored procedure that implements this approach.
The following table illustrates the sequence of activities associated with a detach operation.
Table 1. Sequence of
|1|| User-initiated, Phase 1 of |
|2|| User-initiated |
If the source table has any REFRESH IMMEDIATE MQTs defined, run
|3|| System-initiated, Phase 2 of |
Asynchronous partition detach; no user intervention needed.
|4|| System-initiated |
If there are any global indexes, asynchronous index cleanup (AIC) will be started by DB2 automatically to remove index entries for the detached partition.
Figure 2 illustrates the two-phase detach behavior and query activities allowed on the table during the entire period.
In phase 1, the partition is logically detached from the source table. Catalogs and other metadata are updated to reflect this, but the table is not physically removed from the source table. Dynamic queries using UR isolation may continue to access all partitions of the source table, including the detached partition, while dynamic queries at other isolation levels may access all remaining partitions of the source table. After detach commits, new queries will no longer see the detached partition.
Phase 2 begins after all queries that started before the detach have completed and any dependent MQTs have been refreshed by the user. In this phase, the detached partition is physically de-linked from the source table by the APD task and becomes its own separate nonpartitioned table.
Figure 2. The two-phase detach and query activities during the period
While V9.7.1 improves the availability of the partitioned table during the
roll-out operation, the target table of detach is not available until the
APD for the detached partition completes. The
sample SQL PL procedure (
waitForDetach), found in the Appendix at the end
of this article, provides a convenient way to script the post-detach
processing for the target table, as described below. The procedure queries
the partition state in the STATUS column of SYSCAT.DATAPARTITIONS catalog
view and does not return until the APD for the
partition has completed.
The interface to
waitForDetach looks like the following.
procedure waitForDetach (OUT returnCode integer, OUT msg varchar(128), IN schemaName varchar(128), IN tableName varchar(128), IN dataPartID int DEFAULT -1) -- Output: -- returnCode - 0 indicates detach completed successfully; -- -1 indicates an error -- msg - Error details -- Input: -- schemaName - Table schema -- tableName - Table name -- dataPartID - (optional) dpid of partition to wait for; -- by default, wait for all detach operations on the table -- to complete.
The remainder of this section discusses scenarios where
waitForDetach may be used.
Access to target table of detach
Before the target table can be accessed after a detach, the APD operation must complete; otherwise, error SQL20285N with reason code 2 will be encountered.
Scenario 1: Target table of detach will be dropped or archived
Since the target table of detach is not available until the detach
operation is complete, one may wish to automate waiting for detach to
complete before they drop or archive the target table of detach. The
ALTER TABLE DETACH PARTITION command initiates the roll-out process.
After that command commits, a subsequent call to
waitForDetach will return
once the detach operation has fully completed. Thus, it would wait for the
completion of any existing queries on the range partitioned table at the
time of the detach, and for the asynchronous partition detach phase. Then
the target table of the detach can be dropped or archived.
Example: In the following example, the DBA for the warehouse detaches a partition every month, and the availability of the partitioned table is the most important to the warehouse. The target table of detach is archived and subsequently dropped.
Listing 2. DBA for warehouse detaches a partition
ALTER TABLE sales DETACH PARTITION jan2009 INTO jan2009_old; COMMIT; CALL waitForDetach (?, ?, 'DWSCHEMA', 'SALES'); BACKUP DB DWDB TABLESPACE (jan2009_tbsp) to /archpath; DROP TABLE jan2009_old; COMMIT;
Scenario 2: Target table of detach will be reattached
To perform certain operations at a partition level (for example,
TRUNCATE), the DBA may detach a partition, perform the operation on the
target table of detach, then reattach the partition. In this scenario,
TRUNCATE script will need to call
waitForDetach for the detach to
complete before it can invoke
TRUNCATE on the target table of detach. The
second phase of detach will wait for any existing queries as usual, though
it is likely that this kind of maintenance occurs in a window where there
are no such queries and the APD can complete quickly.
Example: In the following example, the DBA for the warehouse wishes
to truncate a partition. Since the
TRUNCATE command operates at a
table level, the DBA achieves partition-level
TRUNCATE by detaching the partition, truncating the target table, and reattaching the partition back
to the original partitioned table. In this example, the DBA first queries
the catalog to determine partition identifier for the partition to be
truncated, then uses that in the call to
waitForDetach to wait for
just that specific partition. This use of
waitForDetach could be
advantageous in case other partitions were also recently detached.
Listing 3. DBA queries catalog to determine partition identifier for partition to be truncated
select DATAPARTITIONID from syscat.datapartitions where tabschema = 'DWSCHEMA' and tabname = 'CLAIMS' and DATAPARTITIONNAME = 'YEAR1'; -- assume partition id "3" was returned ALTER TABLE claims DETACH PARTITION year1 INTO year1_table; COMMIT; -- wait for the above detach to complete CALL waitForDetach (?, ?, 'DWSCHEMA', 'CLAIMS', 3); TRUNCATE TABLE year1_table IMMEDIATE; ALTER TABLE claims ATTACH PARTITION year1 STARTING FROM 200101 ENDING AT 200112 FROM year1_table; SET INTEGRITY FOR claims IMMEDIATE CHECKED;
Access to source table of detach
Scenario: DDL access to the range partitioned table
Any DDL (including subsequent
ATTACH operations) that accesses a
range partitioned table in a separate transaction immediately after a
detach operation may potentially interfere with the completion of
asynchronous detach, which in itself is akin to a DDL operation. In these
cases, it may be beneficial to wait for the detach operation to fully
complete before attempting the next DDL on the range partitioned table.
Example: In the following example, the DBA for the warehouse detaches an old partition and then adds a new empty partition to the same table.
Listing 4. DBA for warehouse detaches old partition and adds an empty partition
ALTER TABLE claims DETACH PARTITION year1 INTO year1_table; COMMIT; -- wait for the above detach to complete CALL waitForDetach (?, ?, 'DWSCHEMA', 'CLAIMS'); ALTER TABLE claims ADD PARTITION year12 STARTING 201201 ENDING 201212;
Frequently asked questions
This section discusses some frequently asked questions related to using
ALTER TABLE DETACH PARTITION to roll out data from the database.
Where did the partition and its indexes go after being detached?
Detach is a no-data-movement operation. The table data and local indexes of the partition being detached will not physically move but will continue to reside in the exact same table spaces. Only the identity information and the logical relationship with the source partitioned table will be modified, resulting in a few changes in the system catalog and other object metadata. Global indexes remain with the source table, and any keys associated with the detached partition will be removed from them by the AIC task.
Which table space will the indexes on the target table use?
The index table space specified for the partition when it was part of the source table will continue being used as the index table space for indexes on the target table. If no index table space was specified for that partition, the index table space will be the same as that for the partition data.
What indexes will be inherited by the target table?
All the partitioned (local) indexes defined on the source table will be inherited by the target table. These include those system generated indexes being used for primary or unique key constraints, for XML support, and for MDC table support. Among nonpartitioned (global) indexes, only MDC block indexes (which are nonpartitioned if the MDC table was created before V97 FP1) and XML column path indexes are inherited by the target table. No other nonpartitioned indexes will be inherited.
Will any constraints on the source table be inherited by the target
No. Constraints on the source table will not be inherited by the target table after
DETACH. One should be able to add similar
constraints back using the
ALTER TABLE command when necessary. If any of
the constraints were enforced with partitioned indexes before the detach,
ALTER TABLE command will reuse the inherited indexes and avoid
spending time to create new ones.
Do I have to run
SET INTEGRITY on the source or target after
No, you do not need to run
SET INTEGRITY on either the source
table or the target table after
DETACH. It's a common misunderstanding.
When there are refresh immediate MQTs defined on the table, these MQTs
will be in set integrity pending state after detach. You need to use
INTEGRITY to refresh these MQTs and bring them to full access state.
How do I know if a partition has been detached successfully?
There are several options:
- Query the catalog view SYSCAT.DATAPARTITIONS to check if the partition is still part of the source table.
LIST UTILITIES SHOW DETAILto see if the asynchronous partition detach task activity has completed.
- Use the stored procedure code provided in this article, which is essentially based on the first method.
Can I detach multiple partitions from the same table in the same
Yes, but the best practice is to detach one partition at a time and commit right away, then wait for the APD task to complete before issuing the next
DETACH. This is
to avoid the potential conflicts caused by multiple APDs, one per
DETACH, which may compete for system catalog locks and other resources. The result
of such conflicts is that only one winner can proceed, where others will be
put back on the waiting queue until next retry cycle, which, ultimately,
takes longer than following the best practice.
Are the stats on the partition inherited after
Some table and index statistics from the source table will be inherited by the target table during a detach operation, but not all. We recommend that users perform
RUNSTATS for the source table after a detach operation.
If the target table of a detach will be retained and queried in the
future, we recommend running
RUNSTATS for that table as well.
Are there any restrictions for
DETACH when there are referential
constraint relationships existing on the tables involved?
Yes. Detaching a partition from a parent table in the RI relationship is not allowed. One can drop the specific constraints or convert them from enforced to informational constraints to get around this restriction. However, caution must be exercised when performing such a workaround so that data integrity is not compromised.
Will delete triggers be fired during
Is there an option to force the old
DETACH behavior (i.e., the
completely synchronous detach behavior as in versions prior to DB2 V9.7
No. DB2 doesn't support such an option. However, one can use the stored procedure (
waitForDetach) provided in this article or come up
with one's own script to serialize
DETACH and successive operations that
involve the target table.
How soon can I reuse the partition name after the partition is
The partition name is immediately available for reuse in the same unit of work following the return of the
DETACH command, which
will internally rename the partition to a system-generated name based on
the current time (SQLyymmddhhmmssxxx). If the partition name needs to be
reused by a SQL command from a different unit of work, that command will
have to wait for the commit of the unit of work in which
DETACH is issued.
I am using partitioned MQTs in my database. Can I detach a partition
from the MQT?
Yes. The target table will be a regular table. The
will otherwise be the same as for a non-MQT table.
This article has described concurrency improvement for the
DETACH PARTITION statement. As of DB2 V9.7.1,
DETACH allows reporting
queries to run concurrently with roll-out using table partitioning,
thereby eliminating the need for downtime when business data retention
policies are being implemented. We also discussed example scenarios in
which asynchronous detach may need to be synchronized and provided
example code for handling such synchronization.
Below is example code for a stored procedure and a related helper
function that can be used to determine when a
DETACH operation has fully
completed and the target table is accessible. Please note that this code
is intended for illustrative purposes only.
In the following sample code, the CLP statement termination character is
assumed to be
Listing 5. Sample code for
-- Command to run as a script, assuming code is in a file named 'waitForDetach.clp' -- would be: db2 -td% -vf waitForDetach.clp -- Function to check if a table exists in the database. -- -- Input: -- schemaName - Table schema -- tableName - Table name -- -- Returns: 1 if table is found, 0 if table not found -- create function tableExists (schemaName varchar(128), tableName varchar(128)) RETURNS INT RETURN case when exists ( select DATAPARTITIONID from syscat.datapartitions where tabschema = schemaName and tabname = tableName WITH UR ) then 1 else 0 END% -- Procedure to wait for detach of a single partition or all -- partitions to complete -- -- Input: -- schemaName - Table schema -- tableName - Table name -- dataPartID - (optional) dpid of partition to wait for; -- by default, wait for all detach operations on the table -- to complete. -- -- Output: -- returnCode - 0 indicates detach completed successfully; -- -1 indicates an error -- msg - Error details -- create procedure waitForDetach (OUT returnCode integer, OUT msg varchar(128), IN schemaName varchar(128), IN tableName varchar(128), IN dataPartID int DEFAULT -1) specific waitForDetach language SQL BEGIN declare dpid int; declare dpstate char; declare done boolean default FALSE; declare tabNotFound boolean default FALSE; declare curClosed boolean default FALSE; declare oldLockTimeout INTEGER; declare allDetachCheck cursor for select DATAPARTITIONID, STATUS from syscat.datapartitions where tabschema = schemaName and tabname = tableName and (status = 'L' OR status = 'D') WITH CS; declare oneDetachCheck cursor for select DATAPARTITIONID, STATUS from syscat.datapartitions where tabschema = schemaName and tabname = tableName and DATAPARTITIONID = dataPartID WITH CS; -- Add signal handlers for errors that you consider non-fatal here -- example for no row found and lock timeout shown below declare continue handler for NOT FOUND BEGIN -- If no record was found there is nothing to wait for, so we are done set done = TRUE; END; declare continue handler for SQLSTATE '40001' BEGIN -- on a lock timeout we are not done, we want to loop again set done = FALSE; -- the cursor will be closed as part of the rollback during the lock timeout set curClosed = TRUE; END; -- initialize return code to 0 set returnCode = 0; -- save current lock timeout value values current lock timeout into oldLockTimeout; set current lock timeout -1; -- if table does not exist in syscat.datapartitions, return error IF tableExists (schemaName, tableName) = 0 THEN set msg = 'Table ' || schemaName || '.' || tableName || ' not found'; set returnCode = -1; GOTO exit; END IF; WHILE done = FALSE DO set curClosed = FALSE; IF dataPartID <> -1 THEN open oneDetachCheck; fetch oneDetachCheck into dpid, dpstate; -- two cases here: -- (i) detach has already completed hence partition entry not -- found in catalogs (indicated by done = TRUE, handled later) -- (ii) detach in progress, partition state should not be visible IF done <> TRUE AND (dpstate = '' OR dpstate = 'A') THEN set msg = 'Cannot waitForDetach if DETACH was not issued on ' || 'DATAPARTITIONID ' || dataPartID; set returnCode = -1; GOTO exit; END IF; IF curClosed <> TRUE THEN close oneDetachCheck; END IF; ELSE open allDetachCheck; fetch allDetachCheck into dpid, dpstate; IF curClosed <> TRUE THEN close allDetachCheck; END IF; END IF; END WHILE; IF dataPartID <> -1 THEN set msg = 'DETACH completed on DATAPARTITIONID ' || dataPartID; ELSE set msg = 'All detach operations completed on table ' || schemaName || '.' || tableName; END IF; exit: -- restore lock timeout value set current lock timeout oldLockTimeout; RETURN returnCode; END%
- "Unleash the power of table partitioning in your DB2 warehouse" discusses the application of the DB2 for Linux, UNIX, and Windows Table Partitioning feature in a data warehouse environment.
- "Best practices for DB2 for Linux, UNIX, and Windows" comprises the regularly updated set of best-practice papers for DB2 for Linux, UNIX, and Windows.
- Learn more about Information Management at the developerWorks Information Management zone. Find technical documentation, how-to articles, education, downloads, product information, and more.
- Stay current with developerWorks technical events and webcasts.
- Follow developerWorks on Twitter.
Get products and technologies
- Build your next development project with IBM trial software, available for download directly from developerWorks.
- Participate in the discussion forum.
- Check out the developerWorks blogs and get involved in the developerWorks community.
Dig deeper into Information management on developerWorks
Experiment with new directions in software development.
Read and subscribe for the best and latest technical info to help you deal with your development challenges.
Software development in the cloud. Register today and get free private projects through 2014.
Evaluate IBM software and solutions, and transform challenges into opportunities.