In my last column, I explained how you can use table partitioning to organize data
across multiple storage objects based on values in one or more columns. I also
described two ways to use the
PARTITION BY clause of the
TABLE statement to create partitioned tables.
One advantage to using partitioned tables is that only relevant data partitions are accessed during query execution; because the DB2 Optimizer is data-partition aware, only relevant data partitions are scanned to resolve a query, resulting in fewer I/Os and higher query performance. Another advantage is that new partitions can easily be added to increase the table range or existing tables containing data can easily be attached (rolled in), while partitions containing old or obsolete data can be removed (rolled out) and archived, for example, to meet company policies or federal laws that mandate record retention.
In this column, I'll walk you through the process of adding and removing data partitions, and I'll show you how enhancements made in DB2 9.7 can make the process easier and faster.
Adding new partitions to a partitioned table
There are two ways to add new partitions to a partitioned table: you can add new ranges and increase the capacity of a table by adding one or more empty partitions, or you can add existing, populated tables by converting them into new partitions. As you might imagine, the process used depends on which approach you choose.
A new, empty partition can be added to a partitioned table by executing the
ALTER TABLE statement with the
ADD PARTITION option
specified; the syntax used looks like this:
ALTER TABLE [TableName] ADD PARTITION <PartitionName> <STARTING <FROM> [Start | MINVALUE | MAXVALUE] | STARTING <FROM> ([Start | MINVALUE | MAXVALUE] ,...) <INCLUSIVE | EXCLUSIVE>> ENDING <AT> [End | MINVALUE | MAXVALUE] | ENDING <AT> ([End | MINVALUE | MAXVALUE] ,...) <INCLUSIVE | EXCLUSIVE> <IN [TSName]> <INDEX IN [IndexTSName]> =<LONG IN [LongTSName]>
TableName:Identifies the partitioned table, by name, that a new partition is to be added to
PartitionName:Identifies the unique name, if any, to be assigned to the partition to be added
Start:Specifies the low end of the range for the data partition
End:Specifies the high end of the range for the data partition
TSName:Identifies the table space in which the new partition is to be stored
IndexTSName:Identifies the table space in which any partitioned indexes for the new partition are to be stored
LongTSName:Identifies the table space in which the values of any long columns are to be stored
Note: Parameters shown in angle brackets (< >) are optional; parameters or options shown in normal brackets ([ ]) are required and must be provided. A comma followed by ellipses (...) indicates that the preceding parameter can be repeated multiple times.
Thus, if you wanted to add an empty partition to a partitioned table named SALES, you
could do so by executing an
ALTER TABLE statement similar to this:
ALTER TABLE sales ADD PARTITION q4_sales STARTING '10/1/2010' ENDING '12/31/2010' IN tbsp3;
When such a statement is executed, a new table is created and logically made a part
of the partitioned table. The system catalog table
SYSCAT.DATAPARTITIONS is then updated to reflect the change.
Attaching a populated table to a partitioned table
Tables that already contain data can be added to a partitioned table as new
partitions by executing the
ALTER TABLE statement with the
PARTITION option specified. The syntax used for this type of operation
looks like this:
ALTER TABLE [TableName] ATTACH PARTITION <PartitionName> <STARTING <FROM> [Start | MINVALUE | MAXVALUE] | STARTING <FROM> ([Start | MINVALUE | MAXVALUE] ,...) <INCLUSIVE | EXCLUSIVE>> ENDING <AT> [End | MINVALUE | MAXVALUE] | ENDING <AT> ([End | MINVALUE | MAXVALUE] ,...) <INCLUSIVE | EXCLUSIVE> FROM [SourceTable] <BUILD MISSING INDEXES | REQUIRE MATCHING INDEXES>
End are the same as before and
the base table that is to be used as the source of data for the new partition.
So, if you wanted to attach a base table named Q4_2010_SALES to an existing
partitioned table named SALES as a new partition, you could do so by executing an
ALTER TABLE statement that looks like this:
ALTER TABLE sales ATTACH PARTITION q4_sales STARTING '10/1/2010' ENDING '12/31/2010' FROM q4_2010_sales;
After a base table is successfully attached to a partitioned table, the system catalog is updated to reflect the newly attached partition while information about the original base table is removed from the catalog. No data movement is involved. This means that the process of attaching a new partition takes place very quickly—typically within a few seconds. (It's important to note that to attach an existing base table to a partitioned table, the base table's design must be similar to that of the partitioned table; see sidebar, "Conditions for attaching a table as a new partition.")
Anytime a base table is attached to a partitioned table, the partitioned table is
automatically placed in "Set Integrity Pending" state. This means that integrity
checking must be performed immediately after a new partition is attached. To check
integrity after adding a base table named Q4_2010_SALES to an existing partitioned
table named SALES (as a new partition), you would need to execute a
INTEGRITY statement that looks similar to this:
SET INTEGRITY FOR sales ALLOW WRITE ACCESS IMMEDIATE CHECKED;
Conditions for attaching a table as a new partition
Before an existing base table can be added (attached) to a partitioned table, the following conditions must be met:
- The source table must be a nonpartitioned table or a partitioned table that has only one partition and no attached or detached partitions.
- The source table cannot be a typed table or a range-clustered table.
- The source table must be droppable.
- The source and target table definitions must match:
- The number of columns must be the same.
- The data types of each column must be the same. (For columns with
INLINE LENGTHvalues must be the same.)
- The nullability characteristics of each column must be the same.
- Any default constraints defined for columns must be the same.
- Source and target table compression specifications must match.
- Characteristics (table space type, page size, extent size) of the table spaces used by the source and target tables must match.
- If the target table is distributed across database partitions, the source table must also be distributed using the same distribution method.
- If the target table is a multidimensional clustering (MDC) table, the source table must also be an MDC table.
- XML data formats used by the source and target tables must be the same.
Just as you can add or attach new partitions to a partitioned table, you can also
remove existing partitions. (Removed partitions become regular, stand-alone base
tables.) Partitions can be removed by executing the
statement with the
DETACH PARTITION option specified; the syntax for
this statement looks like this:
ALTER TABLE [TableName] DETACH PARTITION <PartitionName> INTO [TargetTable]
TableName:Identifies the partitioned table, by name, that a partition is to be removed from
PartitionName:Identifies the partition, by name, that is to be removed
TargetTable:Identifies the base table where the partition's data is to be stored
Thus, if you wanted to remove a partition named Q1_SALES from a partitioned table
named SALES and store its data in a table named Q1_2010_SALES, you could do so by
ALTER TABLE statement that looks like this:
ALTER TABLE sales DETACH PARTITION q1_sales INTO q1_2010_sales;
Once the table Q1_2010_SALES becomes available, you can either drop it (to delete the data) or attach it to another partitioned table (for example, to archive it).
DB2 9.7 table partitioning enhancements
Prior to DB2 9.7, indexes for partitioned tables were nonpartitioned. This meant that whenever partitions were added or removed, corresponding records had to be added to or removed from any indexes that existed for the table.
With DB2 9.7, partitioned indexes are now supported, allowing each data partition to
be indexed separately. This provides several benefits, including improved
performance when adding or removing partitions and the ability to perform select
maintenance operations at the partition level. Starting with FixPack1, you can
REORG operations on a single data partition rather than on the
entire partitioned table, resulting in faster
REORG operations while
increasing availability to other data partitions.
Other table partitioning enhancements introduced with DB2 9.7 include XML support
(available with FixPack1 or greater) and higher availability during
TABLE ... DETACH PARTITION operations—the removal of a partition no
longer requires that all current transactions to the table be completed, allowing
long-running queries to continue without impeding the partition removal process.
Partitioning for performance
Data partitioning provides many useful benefits, particularly when used in data warehouse and decision support environments. By taking advantage of these features, you can manage data more effectively, while delivering optimum database performance.
- Unleash the power of table partitioning in your DB2 warehouse
- DB2 9 table partitioning
- IBM Redbooks: Database partitioning, table partitioning, and MDC for DB2 9
- DB2 for Linux, UNIX, and Windows area on developerWorks
- IBM DB2 9.7 for Linux, UNIX, and Windows Information Center