DB2 supports efficient data roll-out with the table partitioning feature
using ALTER TABLE DETACH PARTITION, which can
separate a large chunk of data from a source table in the unit of
partition.
Figure 1 shows roll-out when both global (nonpartitioned) indexes and local (partitioned) indexes exist on the source table. The detached source table partition becomes a separate nonpartitioned table. For local indexes, the index object associated with the detached partition is carried along and the indexes within become indexes of the new target table. Global indexes, on the other hand, are not inherited by the target table. Instead, the index entries for the detached partitions are orphaned by the detach command. DB2 then automatically runs a separate background task called Asynchronous Index Cleanup (AIC) to remove orphaned keys from the global index. Meanwhile, such keys are ignored by queries that access the source table.
Figure 1. Roll-out from a table with global and local indexes
Prior to V9.7.1, roll-out with table partitioning is an offline operation. Detach waits for all existing access to the partitioned table to complete, while blocking new access from starting. However, detach may need to be performed in the same timeframe as long-running reporting queries, potentially creating contention between administrative operations performed by the DBA and query access to the table as part of the application workload.
Starting with DB2 V9.7.1, the DETACH operation is made online for dynamic SQL
by allowing a partition to be detached while queries are concurrently
accessing the remaining partitions of the table. In fact, a reporting
workload using the Uncommitted Read (UR) isolation level could be reading
the partition being detached, and even that would not stop the
detach from proceeding concurrently.
This is accomplished by making DETACH a two-phase operation. The first
phase, performed by the ALTER TABLE DETACH PARTITION command, logically
detaches the partition from the table. The second phase, managed by the
system in the background without the need for user intervention, completes
the detach asynchronously after ensuring that all other existing activities
that access the partitioned table have completed. The background process
is referred to as the Asynchronous Partition Detach (APD) process. When there
are no dependent materialized query tables (MQTs), the system initiates the
second phase after the first commits. If dependent tables, such as
staging tables or refresh immediate MQTs, exist on the partitioned table,
the second phase starts automatically after these dependent tables are
refreshed via SET INTEGRITY.
From a DBA perspective, the roll-out operation progresses in the presence
of a long-running reporting workload. From an application perspective,
the long-running reporting workload is not interrupted by detach and
completes concurrently. For any access to the partitioned table that
happens after the successful completion of the DETACH command, the
detached partition is no longer visible, and the asynchronous completion of
detach will not interrupt this new access. Of course, the new online
roll-out behavior respects the isolation level in effect. For instance, if
a Repeatable Read (RR) scanner has read the partition, but not committed
yet, detach will wait for the RR scanner to complete before it can make
the partition invisible. After the asynchronous second phase of the detach
operation has completed, the target table of detach is available to be
archived or dropped or attached to another table.
The asynchronous detach behavior is new and is the only choice in DB2 V9.7.1. It can be monitored using the following methods:
- Issue the
LIST UTILITIES SHOW DETAILcommand. - Query the
STATUScolumn of the SYSCAT.DATAPARTITIONS catalog view for the partition in question. See the Appendix for a stored procedure that implements this approach.
The following table illustrates the sequence of activities associated with a detach operation.
Table 1. Sequence of
DETACH-related
activities| Sequence | Description | Example |
|---|---|---|
| 1 | User-initiated, Phase 1 of DETACH DETACH partition. Note
that this command needs to commit in order for the second phase to
start. In general, it is a good practice to commit immediately
after any DDL. | ALTER TABLE purchaseOrders DETACH PARTITION jan2010 INTO
obsolete_data; COMMIT; |
| 2 | User-initiated If the source table has any REFRESH IMMEDIATE MQTs defined, run SET INTEGRITY to refresh them. | SET INTEGRITY FOR q1_sales_mqt IMMEDIATE CHECKED; COMMIT; |
| 3 | System-initiated, Phase 2 of DETACH Asynchronous partition detach; no user intervention needed. | |
| 4 | System-initiated If there are any global indexes, asynchronous index cleanup (AIC) will be started by DB2 automatically to remove index entries for the detached partition. |
Figure 2 illustrates the two-phase detach behavior and query activities allowed on the table during the entire period.
In phase 1, the partition is logically detached from the source table. Catalogs and other metadata are updated to reflect this, but the table is not physically removed from the source table. Dynamic queries using UR isolation may continue to access all partitions of the source table, including the detached partition, while dynamic queries at other isolation levels may access all remaining partitions of the source table. After detach commits, new queries will no longer see the detached partition.
Phase 2 begins after all queries that started before the detach have completed and any dependent MQTs have been refreshed by the user. In this phase, the detached partition is physically de-linked from the source table by the APD task and becomes its own separate nonpartitioned table.
Figure 2. The two-phase detach and query activities during the period
While V9.7.1 improves the availability of the partitioned table during the
roll-out operation, the target table of detach is not available until the
APD for the detached partition completes. The
sample SQL PL procedure (waitForDetach), found in the Appendix at the end
of this article, provides a convenient way to script the post-detach
processing for the target table, as described below. The procedure queries
the partition state in the STATUS column of SYSCAT.DATAPARTITIONS catalog
view and does not return until the APD for the
partition has completed.
The interface to waitForDetach looks like the following.
Listing 1.
waitForDetach
procedure waitForDetach (OUT returnCode integer,
OUT msg varchar(128),
IN schemaName varchar(128),
IN tableName varchar(128),
IN dataPartID int DEFAULT -1)
-- Output:
-- returnCode - 0 indicates detach completed successfully;
-- -1 indicates an error
-- msg - Error details
-- Input:
-- schemaName - Table schema
-- tableName - Table name
-- dataPartID - (optional) dpid of partition to wait for;
-- by default, wait for all detach operations on the table
-- to complete.
|
The remainder of this section discusses scenarios where waitForDetach may be used.
Access to target table of detach
Before the target table can be accessed after a detach, the APD operation must complete; otherwise, error SQL20285N with reason code 2 will be encountered.
Scenario 1: Target table of detach will be dropped or archived
Since the target table of detach is not available until the detach
operation is complete, one may wish to automate waiting for detach to
complete before they drop or archive the target table of detach. The
ALTER TABLE DETACH PARTITION command initiates the roll-out process.
After that command commits, a subsequent call to waitForDetach will return
once the detach operation has fully completed. Thus, it would wait for the
completion of any existing queries on the range partitioned table at the
time of the detach, and for the asynchronous partition detach phase. Then
the target table of the detach can be dropped or archived.
Example: In the following example, the DBA for the warehouse detaches a partition every month, and the availability of the partitioned table is the most important to the warehouse. The target table of detach is archived and subsequently dropped.
Listing 2. DBA for warehouse detaches a partition
ALTER TABLE sales DETACH PARTITION jan2009 INTO jan2009_old; COMMIT; CALL waitForDetach (?, ?, 'DWSCHEMA', 'SALES'); BACKUP DB DWDB TABLESPACE (jan2009_tbsp) to /archpath; DROP TABLE jan2009_old; COMMIT; |
Scenario 2: Target table of detach will be reattached
To perform certain operations at a partition level (for example,
TRUNCATE), the DBA may detach a partition, perform the operation on the
target table of detach, then reattach the partition. In this scenario,
the TRUNCATE script will need to call waitForDetach for the detach to
complete before it can invoke TRUNCATE on the target table of detach. The
second phase of detach will wait for any existing queries as usual, though
it is likely that this kind of maintenance occurs in a window where there
are no such queries and the APD can complete quickly.
Example: In the following example, the DBA for the warehouse wishes
to truncate a partition. Since the TRUNCATE command operates at a
table level, the DBA achieves partition-level TRUNCATE by detaching the partition, truncating the target table, and reattaching the partition back
to the original partitioned table. In this example, the DBA first queries
the catalog to determine partition identifier for the partition to be
truncated, then uses that in the call to waitForDetach to wait for
just that specific partition. This use of waitForDetach could be
advantageous in case other partitions were also recently detached.
Listing 3. DBA queries catalog to determine partition identifier for partition to be truncated
select DATAPARTITIONID from syscat.datapartitions where tabschema = 'DWSCHEMA' and tabname = 'CLAIMS' and DATAPARTITIONNAME = 'YEAR1'; -- assume partition id "3" was returned ALTER TABLE claims DETACH PARTITION year1 INTO year1_table; COMMIT; -- wait for the above detach to complete CALL waitForDetach (?, ?, 'DWSCHEMA', 'CLAIMS', 3); TRUNCATE TABLE year1_table IMMEDIATE; ALTER TABLE claims ATTACH PARTITION year1 STARTING FROM 200101 ENDING AT 200112 FROM year1_table; SET INTEGRITY FOR claims IMMEDIATE CHECKED; |
Access to source table of detach
Scenario: DDL access to the range partitioned table
Any DDL (including subsequent ADD or ATTACH operations) that accesses a
range partitioned table in a separate transaction immediately after a
detach operation may potentially interfere with the completion of
asynchronous detach, which in itself is akin to a DDL operation. In these
cases, it may be beneficial to wait for the detach operation to fully
complete before attempting the next DDL on the range partitioned table.
Example: In the following example, the DBA for the warehouse detaches an old partition and then adds a new empty partition to the same table.
Listing 4. DBA for warehouse detaches old partition and adds an empty partition
ALTER TABLE claims DETACH PARTITION year1 INTO year1_table; COMMIT; -- wait for the above detach to complete CALL waitForDetach (?, ?, 'DWSCHEMA', 'CLAIMS'); ALTER TABLE claims ADD PARTITION year12 STARTING 201201 ENDING 201212; |
This section discusses some frequently asked questions related to using
ALTER TABLE DETACH PARTITION to roll out data from the database.
Where did the partition and its indexes go after being detached?
Detach is a no-data-movement operation. The table data and local
indexes of the partition being detached will not physically move but will
continue to reside in the exact same table spaces. Only the identity
information and the logical relationship with the source partitioned table
will be modified, resulting in a few changes in the system catalog and
other object metadata. Global indexes remain with the source table, and
any keys associated with the detached partition will be removed from them
by the AIC task.
Which table space will the indexes on the target table use?
The index table space specified for the partition when it was part of the
source table will continue being used as the index table space for indexes
on the target table. If no index table space was specified for that
partition, the index table space will be the same as that for the
partition data.
What indexes will be inherited by the target table?
All the
partitioned (local) indexes defined on the source table will be inherited
by the target table. These include those system generated indexes being
used for primary or unique key constraints, for XML support, and for MDC
table support. Among nonpartitioned (global) indexes, only MDC block
indexes (which are nonpartitioned if the MDC table was created before V97
FP1) and XML column path indexes are inherited by the target table. No
other nonpartitioned indexes will be inherited.
Will any constraints on the source table be inherited by the target
table?
No. Constraints on the source table will not be inherited
by the target table after DETACH. One should be able to add similar
constraints back using the ALTER TABLE command when necessary. If any of
the constraints were enforced with partitioned indexes before the detach,
the ALTER TABLE command will reuse the inherited indexes and avoid
spending time to create new ones.
Do I have to run SET INTEGRITY on the source or target after
DETACH?
No, you do not need to run SET INTEGRITY on either the source
table or the target table after DETACH. It's a common misunderstanding.
When there are refresh immediate MQTs defined on the table, these MQTs
will be in set integrity pending state after detach. You need to use SET
INTEGRITY to refresh these MQTs and bring them to full access state.
How do I know if a partition has been detached successfully?
There are several options:
- Query the catalog view SYSCAT.DATAPARTITIONS to check if the partition is still part of the source table.
- Use
LIST UTILITIES SHOW DETAILto see if the asynchronous partition detach task activity has completed. - Use the stored procedure code provided in this article, which is essentially based on the first method.
Can I detach multiple partitions from the same table in the same
transaction?
Yes, but the best practice is to detach one
partition at a time and commit right away, then wait for the APD task to
complete before issuing the next DETACH. This is
to avoid the potential conflicts caused by multiple APDs, one per DETACH, which may compete for system catalog locks and other resources. The result
of such conflicts is that only one winner can proceed, where others will be
put back on the waiting queue until next retry cycle, which, ultimately,
takes longer than following the best practice.
Are the stats on the partition inherited after DETACH?
Some table and index statistics from the source table will be inherited by the
target table during a detach operation, but not all. We recommend
that users perform RUNSTATS for the source table after a detach operation.
If the target table of a detach will be retained and queried in the
future, we recommend running RUNSTATS for that table as well.
Are there any restrictions for DETACH when there are referential
constraint relationships existing on the tables involved?
Yes. Detaching a partition from a parent table in the RI relationship is not
allowed. One can drop the specific constraints or convert them from
enforced to informational constraints to get around this restriction.
However, caution must be exercised when performing such a workaround so
that data integrity is not compromised.
Will delete triggers be fired during DETACH?
No.
Is there an option to force the old DETACH behavior (i.e., the
completely synchronous detach behavior as in versions prior to DB2 V9.7
FP1)?
No. DB2 doesn't support such an option. However, one can use
the stored procedure (waitForDetach) provided in this article or come up
with one's own script to serialize DETACH and successive operations that
involve the target table.
How soon can I reuse the partition name after the partition is
detached?
The partition name is immediately available for reuse
in the same unit of work following the return of the DETACH command, which
will internally rename the partition to a system-generated name based on
the current time (SQLyymmddhhmmssxxx). If the partition name needs to be
reused by a SQL command from a different unit of work, that command will
have to wait for the commit of the unit of work in which DETACH is issued.
I am using partitioned MQTs in my database. Can I detach a partition
from the MQT?
Yes. The target table will be a regular table. The DETACH behavior
will otherwise be the same as for a non-MQT table.
This article has described concurrency improvement for the ALTER TABLE
DETACH PARTITION statement. As of DB2 V9.7.1, DETACH allows reporting
queries to run concurrently with roll-out using table partitioning,
thereby eliminating the need for downtime when business data retention
policies are being implemented. We also discussed example scenarios in
which asynchronous detach may need to be synchronized and provided
example code for handling such synchronization.
Appendix: waitForDetach procedure
Below is example code for a stored procedure and a related helper
function that can be used to determine when a DETACH operation has fully
completed and the target table is accessible. Please note that this code
is intended for illustrative purposes only.
In the following sample code, the CLP statement termination character is
assumed to be %.
Listing 5. Sample code for
waitForDetach procedure
-- Command to run as a script, assuming code is in a file named 'waitForDetach.clp'
-- would be: db2 -td% -vf waitForDetach.clp
-- Function to check if a table exists in the database.
--
-- Input:
-- schemaName - Table schema
-- tableName - Table name
--
-- Returns: 1 if table is found, 0 if table not found
--
create function tableExists (schemaName varchar(128),
tableName varchar(128))
RETURNS INT
RETURN
case
when exists (
select DATAPARTITIONID from syscat.datapartitions
where tabschema = schemaName and tabname = tableName
WITH UR
)
then 1
else 0
END%
-- Procedure to wait for detach of a single partition or all
-- partitions to complete
--
-- Input:
-- schemaName - Table schema
-- tableName - Table name
-- dataPartID - (optional) dpid of partition to wait for;
-- by default, wait for all detach operations on the table
-- to complete.
--
-- Output:
-- returnCode - 0 indicates detach completed successfully;
-- -1 indicates an error
-- msg - Error details
--
create procedure waitForDetach (OUT returnCode integer,
OUT msg varchar(128),
IN schemaName varchar(128),
IN tableName varchar(128),
IN dataPartID int DEFAULT -1)
specific waitForDetach
language SQL
BEGIN
declare dpid int;
declare dpstate char;
declare done boolean default FALSE;
declare tabNotFound boolean default FALSE;
declare curClosed boolean default FALSE;
declare oldLockTimeout INTEGER;
declare allDetachCheck cursor for
select DATAPARTITIONID, STATUS from syscat.datapartitions
where tabschema = schemaName and tabname = tableName and
(status = 'L' OR status = 'D') WITH CS;
declare oneDetachCheck cursor for
select DATAPARTITIONID, STATUS from syscat.datapartitions
where tabschema = schemaName and tabname = tableName and
DATAPARTITIONID = dataPartID WITH CS;
-- Add signal handlers for errors that you consider non-fatal here
-- example for no row found and lock timeout shown below
declare continue handler for NOT FOUND
BEGIN
-- If no record was found there is nothing to wait for, so we are done
set done = TRUE;
END;
declare continue handler for SQLSTATE '40001'
BEGIN
-- on a lock timeout we are not done, we want to loop again
set done = FALSE;
-- the cursor will be closed as part of the rollback during the lock timeout
set curClosed = TRUE;
END;
-- initialize return code to 0
set returnCode = 0;
-- save current lock timeout value
values current lock timeout into oldLockTimeout;
set current lock timeout -1;
-- if table does not exist in syscat.datapartitions, return error
IF tableExists (schemaName, tableName) = 0
THEN
set msg = 'Table ' || schemaName || '.' || tableName || ' not found';
set returnCode = -1;
GOTO exit;
END IF;
WHILE done = FALSE DO
set curClosed = FALSE;
IF dataPartID <> -1
THEN
open oneDetachCheck;
fetch oneDetachCheck into dpid, dpstate;
-- two cases here:
-- (i) detach has already completed hence partition entry not
-- found in catalogs (indicated by done = TRUE, handled later)
-- (ii) detach in progress, partition state should not be visible
IF done <> TRUE AND (dpstate = '' OR dpstate = 'A')
THEN
set msg = 'Cannot waitForDetach if DETACH was not issued on ' ||
'DATAPARTITIONID ' || dataPartID;
set returnCode = -1;
GOTO exit;
END IF;
IF curClosed <> TRUE
THEN
close oneDetachCheck;
END IF;
ELSE
open allDetachCheck;
fetch allDetachCheck into dpid, dpstate;
IF curClosed <> TRUE
THEN
close allDetachCheck;
END IF;
END IF;
END WHILE;
IF dataPartID <> -1
THEN
set msg = 'DETACH completed on DATAPARTITIONID ' || dataPartID;
ELSE
set msg = 'All detach operations completed on table ' ||
schemaName || '.' || tableName;
END IF;
exit:
-- restore lock timeout value
set current lock timeout oldLockTimeout;
RETURN returnCode;
END%
|
Learn
- "Unleash
the power of table partitioning in your DB2 warehouse" discusses the application of the DB2 for Linux,
UNIX, and Windows Table Partitioning feature in a data warehouse
environment.
- "Best
practices for DB2 for Linux, UNIX, and Windows"
comprises the regularly updated set of
best-practice papers for DB2 for Linux, UNIX, and Windows.
- Learn more about Information Management at the developerWorks Information Management
zone. Find technical documentation,
how-to articles, education, downloads, product information, and
more.
- Stay current with
developerWorks technical events and webcasts.
- Follow developerWorks on
Twitter.
Get products and technologies
- Build your next
development project with
IBM trial software,
available for download directly from developerWorks.
Discuss
- Participate in the discussion forum.
- Check out the
developerWorks
blogs and get involved in the
developerWorks community.

Naresh Chainani is a software developer and has worked in DB2 for Linux, UNIX, and Windows development in IBM Beaverton for the past 10 years. His expertise spans several areas of DB2 with a particular emphasis on table partitioning and decimal floating-point. Recently, he led the development of several enhancements to the table partitioning feature in DB2. Prior to that, he led the development of the IEEE-compliant DECFLOAT data type. He frequently interacts with customers either to present table partitioning or provide consulting assistance related to table partitioning, as well as to seek valuable stakeholder feedback.

Liping Zhang is a Senior Software Engineer based at the IBM Beaverton Lab. He has been working on the table partitioning in DB2 for Linux, UNIX, and Windows since version 9.1 and has led the feature design since version 9.7. He has also been working with customers to help them understand and adopt the table partitioning feature. Liping has authored IDUG presentations and frequently contributed to DB2 papers, articles and books on table partitioning.






