Much of the power of the relational model comes from the fact that the relational table is the single data structure that represents homogeneous collections of data. Tables are a convenient way to represent both persistent data and query results.
However, there are circumstances where using tables to represent collections results in bulky, and possibly inefficient, code. Think, for example, of an application that calls a stored procedure with the list of items that a customer has just purchased. Without a collection type such as arrays, the application would need to store the list of items on a table prior to calling the stored procedure, have the stored procedure read the list by selecting from the table and then delete the list.
To avoid the overhead associated with using a table, the application could encode the list of items in some string format and have the procedure decode the string to extract the items. Compared to simply passing on the list represented as an array, both workarounds are cumbersome and inefficient.
DB2 9.5 introduces support for array data types in SQL PL to help with application development issues like those described above. Given that arrays are a staple in all major programming languages, the concept of array doesn't need much explanation. Instead, take a look at a simple example that illustrates arrays in SQL PL. Listing 1 shows an array data type and a procedure that uses it.
Listing 1. An array data type and a procedure that uses it
1. CREATE TYPE phonenumbers AS VARCHAR(12) ARRAY[1000] 2. -- Procedure find_customers searches for numbers in 3. -- numbers_in that begin with the given area_code, 4. -- and reports them in numbers_out. 5. -- Phone numbers are strings of the form 416-413-9394 6. CREATE PROCEDURE find_customers( 7. IN numbers_in phonenumbers, 8. IN area_code CHAR(3), 9. OUT numbers_out phonenumbers) 10. BEGIN 11. DECLARE i, j, max INTEGER; 12. SET i = 1; 13. SET j = 1; 14. SET numbers_out = NULL; 15. SET max = CARDINALITY(numbers_in); 16. WHILE i <= max DO 17. IF substr(numbers_in[i], 1, 3) = area_code THEN 18. SET numbers_out[j] = numbers_in[i]; 19. SET j = j + 1; 20. END IF; 21. SET i = i + 1; 22. END WHILE; 23. END |
Line 1 in Listing 1 shows a create type statement that creates a
type named phonenumbers. This type has a
maximum cardinality of 1000 -- meaning that values of this type
can have between 0 and 1000 elements, and its element type is
VARCHAR(12). Procedure find_customers, takes two parameters of type
phonenumbers (one IN and one OUT), and
populates the OUT parameter with the input numbers that begin with a given
area code. As seen in the example, array sub-indexing is performed using
the familiar square bracket syntax, and the sub-index of the first element
is 1. Function CARDINALITY, used in line 15,
returns the number of elements in the input array. The
WHILE loop scans the input array; when an
element begins with the required area code, it is added to the output
array.
The maximum cardinality of an array type is used by DB2 at runtime to
ensure that subscripts are within bounds. In this example, DB2 alerts you
to a runtime error when a sub-index used on a variable of type
phoneList is less than 1 or greater than 1000.
If, when defining an array type, there is no clear value for its maximum
cardinality, it can be omitted. This step is shown in the listing
below:
CREATE TYPE unboundedPhoneList AS VARCHAR(12) ARRAY[]; |
When the maximum cardinality in a type definition is omitted, the sub-indexes on values of that type can range from 1 to the maximum positive value of integer type. However, the memory allocated for an array is based on its cardinality, and not on the maximum cardinality of its type. Therefore, omitting the maximum cardinality does not have any impact on the amount of allocated memory. Memory management is discussed in greater detail in a later section of this article.
In the example of Listing 1, new elements are
always appended at the end, and therefore
numbers_out's cardinality increases by 1 each
time an element is added as shown in line 18. In general, however, the
elements of an array can be set in any order.
When an array variable is NULL or when the
element being assigned is in a position beyond the current cardinality,
all the elements between the old last element and the new one are
implicitly assigned the NULL value. For
example, assume that variable phoneList, of
type phonenumbers is currently
NULL. After executing the statement
SET phoneList[10] = '416-413-9394'; |
the cardinality of phoneList is 10, and elements
in positions 1 to 9 are NULL. Elements in
positions 11 and up do not exist (note that this is not the same as being
NULL). Therefore, a statement such as
SET phone = phoneList[11]; |
creates a runtime error, as it refers to a non-existing element.
If you now executed the statement
SET phoneList[12] = '416-413-7727'; |
the cardinality of phoneList would be 12, and
the statement
SET phone = phoneList[11]; |
would set variable phone to NULL.
It is not necessary to initialize array elements one at a time, as in Listing 1. When the values of all the elements in the array are known, an array can be created in a single operation with the help of array constructors. There are two types of array constructors:
- Constructor by enumeration
- Constructor by query
Constructors by enumeration allow you to specify the values of all the elements as comma-separated scalar values, as shown in the following two listings. The values in the following listing are all literals, which is the most common case when initializing arrays. In general, each value in the enumeration can be an arbitrary scalar SQL expression.
SET phoneList = ARRAY['416-413-9394', '416-413-7727', '416-413-6254']; |
Listing 2 shows an array constructor with four elements with values given
by the NULL constant, a variable, a literal and
a scalar sub-query, respectively.
Listing 2. Array constructor with four elements
SET phoneList = ARRAY[NULL, phone, '416-413-7727',
(SELECT phone
FROM authors
WHERE author_id = 100)];
|
An array constructor with no elements represents an empty array (that is,
an array with cardinality equal to 0). Empty arrays should not be confused
with NULL arrays or with arrays containing just
NULL values. Each of the statements in Listing 3 sets variable phoneList to a different value.
Please pay careful attention to the comment beside each statement.
Listing 3. NULL and empty array values
SET phoneList = NULL; /* NULL value. */ SET phoneList = ARRAY[]; /* empty array */ SET phoneList = ARRAY[NULL]; /* array with one NULL element */ |
Another way of building an array in a single operation is by using a constructor by query, which creates an array with the values returned by a single-column sub-select, as shown in the listing below:
SET phoneList = ARRAY[SELECT phone FROM authors WHERE city = 'Toronto']; |
If the sub-select returns an empty table, the result of the constructor is the empty array. And, as expected, when the number of elements specified in an array constructor exceeds the array's maximum cardinality, DB2 notifies you of an error.
Arrays in SQL PL are allocated and reallocated dynamically. When the value
of an array variable is NULL, there is no
memory allocated for it. Contrast this to arrays in C where space for the
whole array is allocated up front.
When an array variable is assigned an initial value, DB2 allocates enough memory to hold that value. Later, If the variable is assigned a new value that doesn't fit in the allocated memory, DB2 implicitly performs a reallocation.
The amount of memory allocated for a variable depends on a few factors. If the maximum possible size of the array (given its type) is not deemed too big, DB2 allocates the space for maximum size the first time a value is assigned to a variable. This guarantees that no reallocation is necessary regardless of how many elements are added to the array later on and regardless of the size of those elements.
How big is too big is something that DB2 decides based on the availability of free memory. If the maximum possible size of the array is deemed too big, DB2 only allocates enough space to hold the initial value, and possibly extra space to allow for some growth.
In the examples so far, you have seen different ways of putting arrays
together. SQL PL also provides the TRIM_ARRAY
function, to remove elements from arrays. The following listing shows
TRIM_ARRAY being used to remove the last two
elements from phoneList:
SET phoneList = TRIM_ARRAY(phoneList, 2); |
The value of the second argument to TRIM_ARRAY
must be a value between 1 and the cardinality of the first argument.
At the beginning of this article, I argued that the main motivation behind
array support in SQL PL is to provide a simpler alternative to relational
tables for representing relatively small collections. It is often the
case, however, that the initial value for an array is obtained by
extracting values from a table. Similarly, very often, the contents of an
array needs to be turned into rows in a table, either to be stored or to
be joined with other tables. To make such conversions between arrays and
tables possible, DB2 provides the new ARRAY_AGG
function and UNNEST operator.
The ARRAY_AGG function is a column function that
aggregates values into an array.
Suppose you have this table:
Table 1. Customer table
| CUST_ID | NAME | PHONE | LOCATION |
|---|---|---|---|
| 17 | Joe Cat | 416-305-3745 | Toronto |
| 113 | Tom Dog | 905-305-3747 | Ajax |
| 716 | Sam Bear | 416-305-3746 | Toronto |
| 5 | Jill Bird | 905-723-1662 | Markham |
| 221 | Kim Frog | 416-478-9683 | Toronto |
Assuming you have the customers table shown in Table 1, the statement below sets variable phones to an array with all the phone numbers of customers in Toronto:
SET phones = (SELECT ARRAY_AGG(phone) FROM customers WHERE location = 'Toronto'); |
The above statement generates the phones in an arbitrary order. If the
order is relevant, an ORDER BY clause can be
specified in ARRAY_AGG, as illustrated in the
listing below:
SET phones = (SELECT ARRAY_AGG(phone ORDER BY name) FROM customers WHERE location = 'Toronto'); |
Note that an ORDER BY clause in the
SELECT statement would not have any effect on
the order of the array elements, as it would apply to the rows returned by
the statement. In this example, there is only one such row.
Simple uses of ARRAY_AGG can be expressed
equivalently using array constructors by query. However, the
ARRAY_AGG function can also be used in
statements that return more than one row and/or more than one column.
For example, the statement in Listing 4 defines a cursor that creates arrays of phones and IDs for customers in each city. The result of the statement is shown in Table 2.
Listing 4. Statement to create arrays of phones and IDs for customers in each city
DECLARE myCursor CURSOR FOR
SELECT location,
ARRAY_AGG(phone ORDER BY name) AS PHONES,
ARRAY_AGG(cust_id) AS IDS
FROM customers
GROUP BY location
|
Table 2 shows the result of the statement in Listing 4.
Table 2. Statement results
| LOCATION | PHONE | IDs |
|---|---|---|
| Toronto | ||
| 1: 416-305-3745 | 1: 17 | |
| 2: 416-305-3746 | 2: 716 | |
| 3: 416-305-3747 | 3: 221 | |
| Ajax | 1: 905-305-3747 | 1: 113 |
| Markham | 1: 905-723-1662 | 1: 5 |
Once a collection of data has been computed, it is often necessary to use
this collection in SQL statements, for example to insert the data into a
table or to join it with data stored in tables. The
UNNEST operator in SQL PL makes such an
operation possible.
In its simplest form, UNNEST takes an array and
produces a one-column table. The statement in Listing 5 selects the
customer IDs for customers with a phone number in phoneList.
Listing 5. Selecting customer IDs
SELECT cust_id FROM customers, UNNEST(phoneList) as T(phone) WHERE customers.phone = T.phone |
In its most general form, UNNEST can take
several arrays as arguments, and it can produce an additional column to
reflect the order imposed by the sub-indexes. For the query in Listing 6,
assume that productIds and
productQs are two arrays containing the ids and
quantities for the line items in an order. The line item number is
reflected in the array sub-index. The UNNEST in
Listing 6 creates a three-column table which is inserted into the
order_items table. The
WITH ORDINALITY clause causes
UNNEST to generate an additional column of type
integer that contains the position associated with each element.
Listing 6. WITH ORDINALITY clause
INSERT INTO order_items
(SELECT orderId, T.index, T.pid, T.pq
FROM UNNEST(productIds, productQs)
WITH ORDINALITY as T(pid, pq, index)
|
To wrap up the coverage of conversion between arrays and tables, revisit
procedure phonenumbers from the first example.
The version presented in Listing 1 uses a loop to
process the input array and populate the output array. However, it turns
out that all the logic in the procedure can be expressed as a single SQL
statement that uses UNNEST and
ARRAY_AGG, as shown in Listing 7:
Listing 7. Procedure find_customers revisited
1. CREATE TYPE phonenumbers AS VARCHAR(12) ARRAY[1000] 2. -- Procedure find_customers searches for numbers in 3. -- numbers_in that begin with the given area_code, 4. -- and reports them in numbers_out. 5. -- Phone numbers are strings of the form 416-413-9394 6. 7. CREATE PROCEDURE find_customers( 8. IN numbers_in phonenumbers, 9. IN area_code CHAR(3), 10. OUT numbers_out phonenumbers) 11. BEGIN 12. SET numbers_out = 13. (SELECT ARRAY_AGG(T.num) 14. FROM UNNEST(numbers_in) AS T(num) 15. WHERE substr(T.num, 1, 3) = area_code); 16. END |
The last example proves that while arrays allowed us to pass a collection of values into a stored procedure efficiently and concisely, once inside the procedure, it may still be beneficial to turn the array into a table to exploit the powerful set-at-a-time semantics of SQL to process the array elements.
Using arrays on the client side
The examples so far demonstrate arrays in SQL PL. On the client side,
arrays are supported in Java Database Connectivity (JDBC), command line
interface (CLI) and in the Command Line Processor. Procedures with array
parameters can be called from any of these interfaces. Listing 8 shows a
CALL statement issued from the CLP to the
procedure find_customers defined earlier, and
the output generated by the CLP.
Listing 8. Calling procedures with array parameters from the CLP
db2 CALL find_customers(ARRAY['416-305-3745', '905-414-4565', '416-305-3746'], '416', ?)
Value of output parameters
--------------------------
Parameter Name : NUMBERS_OUT
Parameter Value : [416-305-3745,
416-305-3746]
Return Status = 0
|
Listing 9 shows a Java snippet that illustrates how to call procedure
find_customers from a JDBC application.
Listing 9. Calling procedure find_customers from a JDBC application
import java.sql.*;
public class array_sample
{
public static void main(String argv[])
{
com.ibm.db2.jcc.DB2Connection con = null;
try
{
// Connect to the db
Class.forName("com.ibm.db2.jcc.DB2Driver").newInstance();
con = (com.ibm.db2.jcc.DB2Connection)
DriverManager.getConnection("jdbc:db2:test");
// Create an array of strings
String[] phones=new String[10];
phones[0]="416-305-3745";
phones[1]="905-414-4565";
phones[2]="416-305-3746";
// Create an SQL ARRAY value of type "ARRAY of VARCHAR"
java.sql.Array phonesArrayIn =con.createArrayOf("VARCHAR", phones);
// Prepare the call statement
CallableStatement callStmt =
con.prepareCall("CALL find_customers(?, ?, ?)");
// Set IN parameters
callStmt.setArray(1, phonesArrayIn);
callStmt.setString(2, "416");
// Register OUT parameter
callStmt.registerOutParameter(3, java.sql.Types.ARRAY);
// Call the procedure
callStmt.execute();
// Get value of OUT parameter
java.sql.Array phonesArrayOut = callStmt.getArray(3);
String[] phonesOut = (String [])phonesArrayOut.getArray();
// Print result
System.out.println("Result:");
for(int i = 0; i < phonesOut.length; i++)
{
System.out.println(phonesOut[i]);
}
con.close();
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
|
Listing 10 shows a DB2 CLI snippet that illustrates how to call procedure find_customers from a DB2 CLI application.
Listing 10. Calling procedure find_customers from a CLI application
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <sqlcli1.h>
#include <sql.h>
#include <sqlenv.h>
int rc = 0;
int main(int argc, char *argv[])
{
SQLHANDLE henv; /* environment handle */
SQLHANDLE hdbc; /* connection handle */
SQLHSTMT stmt; /* statement handle*/
char dbAlias[SQL_MAX_DSN_LENGTH + 1];
char user[10];
char pswd[10];
if (argc == 1)
{
strcpy(dbAlias ,"TEST");
strcpy(user, "");
strcpy(pswd, "");
}
else
{
printf("Incorrect usage \n");
return -1;
}
printf("Connecting to %s\n",dbAlias);
SQLRETURN cliRC = SQL_SUCCESS;
/* allocate an environment handle */
cliRC = SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &henv);
if (cliRC != SQL_SUCCESS)
{
printf("\n--ERROR while allocating the environment handle.\n");
return 1;
}
/* allocate a database connection handle */
cliRC = SQLAllocHandle(SQL_HANDLE_DBC, henv, &hdbc);
if (cliRC != SQL_SUCCESS)
{
printf("\n--ERROR while allocating the database handle.\n");
return 1;
}
printf("\n Connecting to %s...", dbAlias);
/* connect to the database */
cliRC = SQLConnect(hdbc,
(SQLCHAR *)dbAlias,
SQL_NTS,
(SQLCHAR *)user,
SQL_NTS,
(SQLCHAR *)pswd,
SQL_NTS);
printf(" Connected to %s.\n", dbAlias);
/* allocate a statement handle */
cliRC = SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &stmt);
if (cliRC != SQL_SUCCESS)
{
printf("\n--ERROR while allocating the statement handle.\n");
return 1;
}
int i=0;
// Declare local C variables to store the input and output to the stored procedures
char numbers_in[1000][13]={0};
char numbers_out[1000][13];
char area_code[3];
//declare IPD/APD attributes
SQLHANDLE hIPD;
SQLHANDLE hAPD;
// Define the size of the arrays
SQLLEN size = 1000;
sprintf(area_code,"416");
printf("%s\n",area_code);
sprintf(numbers_in[0],"416-305-3745");
sprintf(numbers_in[1],"905-414-4565");
sprintf(numbers_in[2],"416-305-3746");
//Bind input/output parameters
if (SQLBindParameter(stmt, ++i, SQL_PARAM_INPUT, SQL_C_CHAR,
SQL_VARCHAR, 12, 0, &numbers_in, 13, NULL)!=SQL_SUCCESS ||
SQLBindParameter(stmt, ++i, SQL_PARAM_INPUT, SQL_C_CHAR,
SQL_CHAR, 3, 0, area_code, 3, NULL)!=SQL_SUCCESS ||
SQLBindParameter(stmt, ++i, SQL_PARAM_OUTPUT, SQL_C_CHAR,
SQL_VARCHAR, 12, 0, &numbers_out, 13, NULL)!=SQL_SUCCESS)
{
printf ("Binding unsuccessful \n");
return 1;
}
//GetStmtAttr for IPD/APDs
if(SQLGetStmtAttr(stmt,SQL_ATTR_IMP_PARAM_DESC,&hIPD,SQL_IS_INTEGER,NULL)!=SQL_SUCCESS ||
SQLGetStmtAttr(stmt,SQL_ATTR_APP_PARAM_DESC,&hAPD,SQL_IS_INTEGER,NULL)!=SQL_SUCCESS)
{
printf ("GetStmtAttr unsuccessful\n");
return 1;
}
// For input array parameters, define the SQL_DESC_CARDINALITY attribute of IPD
if( SQLSetDescField(hIPD,1, SQL_DESC_CARDINALITY,(SQLPOINTER)1000,SQL_IS_INTEGER)
!=SQL_SUCCESS ||
// For output array parameters, define the SQL_DESC_CARDINALITY field for APD
SQLSetDescField(hAPD,3, SQL_DESC_CARDINALITY,(SQLPOINTER)1000, SQL_IS_INTEGER)
!=SQL_SUCCESS ||
// For all array parameters, define the SQL_DESC_CARDINALITY_PTR field for APD
SQLSetDescField(hAPD,1,SQL_DESC_CARDINALITY_PTR,&size,SQL_IS_INTEGER)
!=SQL_SUCCESS ||
SQLSetDescField(hAPD,3, SQL_DESC_CARDINALITY_PTR, &size, SQL_IS_INTEGER)
!= SQL_SUCCESS)
{
printf ("SQLSetDescField unsuccessful\n");
return 1;
}
if (SQLPrepare(stmt, (SQLCHAR*)"CALL find_customers(?, ?, ?)", SQL_NTS)!= SQL_SUCCESS)
{
printf ("SQLPrepare unsuccessful\n");
return 1;
}
rc = SQLExecute(stmt);
printf ("SQLExec: rc %d %d \n",rc,SQL_SUCCESS );
cliRC = SQLFreeHandle(SQL_HANDLE_STMT,stmt);
/* free the statement handle */
cliRC = SQLFreeStmt(stmt, SQL_CLOSE);
return 0;
}
|
Comparing DB2's arrays with Oracle's VARRAYs
Developers migrating applications from Oracle to DB2 will find the new array support in DB2 particularly useful, as arrays are commonly used in Oracle PL/SQL. This appendix outlines the main differences between the ARRAY data type in DB2 and Oracle's VARRAY, and how the most common operations on VARRAYs can be mapped to ARRAYs.
In DB2, variables of type array are like variables of any other SQL data
types. And, like with other data types, DB2 provides some functions that
take arrays as parameters and/or return arrays as results, such as
CARDINALITY and
TRIM_ARRAY.
In Oracle, VARRAY is an object type. VARRAYs are manipulated using methods,
instead of functions. Listing 11 illustrates this difference between DB2
and Oracle with a statement that trims the last two elements from
myArray and a statement that saves
myArray's cardinality into
myCard.
Listing 11. Trimming array elements in DB2 and in Oracle
DB2: myArray = TRIM_ARRAY(myArray, 2); myCard = CARDINALITY(myArray); Oracle: myArray.TRIM(2); myCard = myArray. COUNT; |
As you saw in Section 1, DB2 implicitly allocates memory for arrays when
they are initialized with a non-NULL value or
when they grow past their current size. On the other hand, Oracle requires
explicit allocation operations prior to adding elements to an array.
Listing 12 illustrates this difference.
Listing 12. Appending elements in DB2 and in Oracle
DB2: myArray[CARDINALITY(myArray) + 1] = 1000; Oracle myArray.EXTEND; myArray(myArray,COUNT + 1) = 1000; |
Listing 12 also illustrates the different syntax for sub-indexing in DB2 (square brackets) and in Oracle (parentheses).
Oracle applications involving arrays often make use of bulk bind
operations, which execute SQL DML statements repeatedly, for each of the
elements in an array. In Oracle, these operations are
FORALL and
BULK COLLECT. DB2 doesn't provide direct
equivalents to these operations, but the most common uses of
FORALL and
BULK COLLECT can be efficiently mapped to
statements involving UNNEST and
ARRAY_AGG, respectively.
The FORALL statement repeats another statement
for each of the elements in an array. Listing 13 shows an example of a
FORALL statement that repeats an INSERT for
each of the elements in arrays departments.
Listing 13. FORALL statement
FORALL i IN depts.FIRST..depts.LAST INSERT INTO emp(deptno) values(depts(i)); |
Listing 14 shows the DB2 equivalent using
UNNEST.
Listing 14. Mapping FORALL to UNNEST
INSERT INTO emp (deptno)
(SELECT T.deptno FROM UNNEST(depts))
|
Let's now see how Oracle's BULK COLLECT can be
mapped to ARRAY_AGG.
BULK COLLECT is not a statement on its own, but
an optional clause that can be applied to
SELECT INTO and
FETCH statements.
An Oracle SELECT INTO statement that includes a
BULK COLLECT clause is similar to a
SELECT with
ARRAY_AGG in DB2. Regular
SELECT INTO statements can return only one row.
When the BULK COLLECT clause is included, the
statement can return multiple rows; the variable in the
INTO clause must be an array, and each row is
stored in an array element.
Listing 15 illustrates the use of
BULK COLLECT.
Listing 15. BULK COLLECT clause
DECLARE names NameList; BEGIN SELECT ename BULK COLLECT INTO names FROM emp WHERE sal > 1000; END; |
Listing 16 shows the DB2 equivalent.
Listing 16. Mapping BULK COLLECT to ARRAY_AGG
BEGIN DECLARE names NameList; SET names = (SELECT ARRAY_AGG(ename) FROM emp WHERE sal > 1000); END; |
Similarly, when a FETCH statement includes the
BULK COLLECT clause, the variable used in the
FETCH must be an array. ORACLE implicitly
repeats the FETCH until all the rows from the
cursor are consumed and stores each row in an element of the array.
In addition to the differences mentioned so far, Oracle supports storing VARRAYs as a column type, and supports the creation of arrays of records and arrays of arrays. These features are not yet supported in DB2.
To measure the performance impact of arrays, a 900GB database was used and an OLTP (Online Transactional Processing) workload was exercised. The workload consisted of approximately 60% select statements and 40% insert, update, and delete statements. Stored procedures and a DB2 CLI application was used to drive the workload. Two distinct sets of stored procedures were implemented.
For the first set of stored procedures, DB2 arrays were used when possible. In the second set of stored procedures, strings were used in place of arrays, and cursors were heavily used to fetch rows from multi-row result sets. For each row retrieved from the cursor call, the element was appended to the string along with a delimiter. A user-defined function was used to break up the string into individual elements when necessary. Listing 17 shows an example of a CALL statement issued from the CLP to procedure alternate_find_customers and the output generated by the CLP. A DB2 CLI application was implemented for each version of the stored procedure.
Listing 17. Calling procedures defined using varchars instead of arrays from the CLP
db2 CALL find_customers(ARRAY['416-305-3745;905-414-4565;416-305-3746', '416', ?) Value of output parameters -------------------------- Parameter Name : NUMBERS_OUT Parameter Value : 416-305-3745;416-305-3746 Return Status = 0 |
DB2 array data types yielded significantly higher performance compared to varchar representations of arrays. We measured a ~70% improvement when taking both throughput and user CPU utilization into consideration (normalized throughput in the chart in Figure 1 below). Most of the performance improvement stems from the simplification of the stored procedures and DB2 CLI application code, namely the removal of cursor definitions and accesses, as well as removal of extra local variables and user-defined functions that are not needed when using DB2 arrays. This results in a reduction in user CPU utilization, which translates into throughput gains.
Figure 1. Normalized throughput
This article presents an overview of the recently introduced support for array data type in DB2 V9.5. Like arrays in other programming languages, DB2 arrays can be used as an auxiliary data structure when coding algorithms in SQL PL. But more importantly, arrays are a very convenient representation for collections of data that flow between an application and stored procedures or between stored procedures.
In addition to the basic operations on arrays (such as sub-indexing, cardinality and trimming), DB2 has integrated arrays into the relational model in such a way that a result set can be turned into an array (for example, when it needs to be passed as a parameter to another stored procedure), and an array can be turned into a table. The latter conversion allows the data in the array to be joined with data in other tables, and allows the set-at-a-time semantics of SQL to be applied to collections represented as arrays.
JDBC and DB2 CLI array support is also available, and there is evidence to suggest that there can be performance gains if stored procedures and applications are updated to use the new DB2 array data types.
Thanks to Serge Rielau for reviewing an early draft of this article.
Learn
- developerWorks Information
Management zone: Learn more about Information Management. Find
technical documentation, how-to articles, education, downloads, product
information, and more.
- Stay current with developerWorks technical events
and webcasts.
- Technology bookstore: Browse for books on these and other
technical topics.
Get products and technologies
- Download a trial
version of DB2 for Linux, UNIX, and Windows to test it for yourself.
- Build your next
development project with IBM trial software, available
for download directly from developerWorks.
Discuss
- Participate in developerWorks blogs and get
involved in the developerWorks community.

Gustavo Arocena is a Senior Developer in the DB2 SQL Compiler area. He joined the IBM Toronto Lab in 1998 and is currently responsible for SQL Procedures and SQL Parser development. Gustavo holds a Master's degree in Computer Science from the University of Toronto in the area of database query languages.

Nailah Ogeer Bissoon, M.Sc. is currently a member of the DB2 Performance Quality Assurance team. Nailah joined the IBM Toronto Lab in 2004 and has worked on several DB2 performance benchmark publications since then. Her tasks also include assisting customers in performance-sensitive areas and ensuring that DB2 continues to be a performance leader among other Database Management Systems.




