- Setting up the example
- A simple recursive query
- LEVEL pseudo column
- The CONNECT_BY_ROOT expression
- The SYS_CONNECT_BY_PATH() procedure
- ORDER SIBLINGS BY expression
- The NOCYCLE keyword
- Downloadable resources
- Related topics
Port CONNECT BY to DB2
A recipe to map recursive queries from Oracle to DB2 using recursive common table expressions
When porting or migrating an application from Oracle to DB2 UDB for Linux, UNIX, and Windows, recursive queries present a significant obstacle. A developer who understands and uses Oracle-style recursion often does not naturally understand DB2 recursion and the same is true the other way around.
On first blush, the reason appears to be that DB2 implements recursion as it is defined in the SQL standard using common table expressions (CTE) and UNION ALL. Oracle, on the other hand, uses a CONNECT BY clause and a set of so-called pseudo columns and system procedures to define recursion. The difference, however, goes beyond syntax. The underlying problem is that DB2 processes recursion by essentially producing one level of recursion at a time (breadth first), while Oracle uses depth first. The later approach naturally produces an output that matches an org chart.
In this article, I use a running example that matches CONNECT BY and the related pseudo columns to DB2 recursion, one step at a time.
To perform some of the queries, you must download and expand the CONNECT_BY.zip file available in the Download section at the end of this article.
If you are working on a Windows 32-bit, a Linux on Intel 32-bit, or an AIX® 64-bit system, all you need to do is copy the connect_by (connect_by.dll on Windows) executable found in the appropriate subdirectory of the CONNECT_BY.zip file into your sqllib/function directory.
If you are working on any other operating system or hardware platform with an installed C compiler, you can copy the files from the Connect_By/FunctionSource directory into sqllib/samples/c. After changing to that directory, simply run bldrtn connect_by. This command will compile the provided C-source code. bldrtn will also copy the produced executable into sqllib/function and you are ready to go.
For convenience, running the following files in CONNECT_BY.zip when connected to a test database will reproduce all the DB2 SQL in this article:
- db2 -tvf connect_by_functions.sql
- db2 -tvf connect_by_sample.sql
Setting up the example
To start, we will define and fill an employee table. The table has an employee ID, which is the primary key, a manager ID, which links the employee to his manager, and name and salary columns:
1 CREATE TABLE emp(empid INTEGER NOT NULL PRIMARY KEY, 2 name VARCHAR(10), 3 salary DECIMAL(9, 2), 4 mgrid INTEGER); 5 INSERT INTO emp 6 VALUES ( 1, 'Jones', 30000, 10), 7 ( 2, 'Hall', 35000, 10), 8 ( 3, 'Kim', 40000, 10), 9 ( 4, 'Lindsay', 38000, 10), 10 ( 5, 'McKeough', 42000, 11), 11 ( 6, 'Barnes', 41000, 11), 12 ( 7, 'O''Neil', 36000, 12), 13 ( 8, 'Smith', 34000, 12), 14 ( 9, 'Shoeman', 33000, 12), 15 (10, 'Monroe', 50000, 15), 16 (11, 'Zander', 52000, 16), 17 (12, 'Henry', 51000, 16), 18 (13, 'Aaron', 54000, 15), 19 (14, 'Scott', 53000, 16), 20 (15, 'Mills', 70000, 17), 21 (16, 'Goyal', 80000, 17), 22 (17, 'Urbassek', 95000, NULL);
Looking at the inserted rows, it is apparent that employee 'Urbassek' is the top manager for whom 'Goyal' and 'Mills' work. 'Scott' in turn works for 'Goyal' and has no employees of his own. 'Monroe,' on the other hand, manages 'Hall,' 'Kim,' and 'Lindsay' and is an employee of 'Mills.'
A reasonable question to ask is: "Who works directly or indirectly for 'Goyal'?"
A simple recursive query
To answer the question of who works for 'Goyal,' an Oracle developer might write the following query:
1 SELECT name 2 FROM emp 3 START WITH name = 'Goyal' 4 CONNECT BY PRIOR empid = mgrid
START WITH denotes the seed of the recursion while CONNECT BY describes the recursive step. That is how to get from step n to step (n + 1). Since it is important to distinguigh between the nth and the (n + 1)th step during name resolution, PRIOR is used to show that empid belongs to the nth step while mgrid belongs to step (n + 1)th. So with empid being 16 for step 1, mgrid must be 16 as well and hence step 2 produces 'Scott,' 'Henry,' and 'Zander.' Their empids will now serve as PRIOR to step 3, and so on and so forth.
The Oracle syntax is very concise. The SQL standard and DB2 syntax does not use any recursion-specific keywords and uses regular SQL to describe the exact same relationships. As you will see, it is more verbose, but equally straightforward:
1 WITH n(empid, name) AS 2 (SELECT empid, name 3 FROM emp 4 WHERE name = 'Goyal' 5 UNION ALL 6 SELECT nplus1.empid, nplus1.name 7 FROM emp as nplus1, n 8 WHERE n.empid = nplus1.mgrid) 9 SELECT name FROM n;
The WITH clause allows for the definition of a named query within a statement, which can be referred to at a later point in the same statement. Its technical term is a common table expression (CTE). CTEs are by now supported in most major SQL-based DBMS, including Oracle 9i, SQL Server 2005, and, of course, in DB2.
What makes this CTE special is that it is referred to within its very own defintion. This is what distinguishes a regular CTE from a recursive CTE. Here I named the CTE n to correlate to the recursive steps. A recursive CTE consists of two parts combined with a UNION ALL:
- The seed or step 1 of the recursion. This is what is described in Oracle using START WITH. In a recursive CTE, it is simply any query providing a set of rows. In this case we query the emp table and filter for 'Goyal.' We select the name of course and also the empid, because we need it for the recursive step.
- The recursive step going from n to (n + 1). Here we refer to step n (the CTE n) and join in step (n + 1) using the same predicate used in CONNECT BY. Instead of PRIOR, regular correlation names are used to distinguish n from (n + 1).
It is noteworthy to look at the output of this query:
NAME ---------- SQL0347W The recursive common table expression "SRIELAU.N" may contain an infinite loop. SQLSTATE=01605 Goyal Zander Henry Scott McKeough Barnes O'Neil Smith Shoeman 9 record(s) selected with 1 warning messages printed.
DB2 does not perform any automatic cycle checking. That is why it will return this warning when it determines that cycles are possible. Those DB2 developers who care usually add a level column that increments and is limited by a fixed value. DB2 recognises this pattern and will suppress the warning. For the purpose of mapping CONNECT BY, we will ignore the warning from now on since we will add a cycle check later on.
Further, it is interesting to note that the order in which DB2 has returned the results exposes how the recursion is processed one level at a time, instead of using a tree-walk.
Before moving on to pseudo columns and more complex examples, I want to briefly explain where the WHERE predicate of an Oracle recursion needs to be placed. We will modify the example to return all employees, including their salaries, working for 'Goyal' who earn more than 40,000.
1 SELECT name, salary 2 FROM emp 3 WHERE salary > 40000 4 START WITH name = 'Goyal' 5 CONNECT BY PRIOR empid = mgrid
It is interesting to note how the WHERE clause preceded the recursive specification. For a DB2 developer, it might seem that the predicate belongs to the set of rows being considered to begin with. This, however, is not correct. Instead, the WHERE filters the final result and belongs at the end of the matching DB2 query:
1 WITH n(empid, name, salary) AS 2 (SELECT empid, name, salary 3 FROM emp 4 WHERE name = 'Goyal' 5 UNION ALL 6 SELECT nplus1.empid, nplus1.name, nplus1.salary 7 FROM emp as nplus1, n 8 WHERE n.empid = nplus1.mgrid) 9 SELECT name FROM n WHERE salary > 40000; NAME SALARY ---------- ----------- Goyal 80000.00 Zander 52000.00 Henry 51000.00 Scott 53000.00 McKeough 42000.00 Barnes 41000.00 6 record(s) selected
Be aware that what is true for the WHERE clause is not true for the ON clause. I will get back to this issue in a later example. First we will dive into pseudo columns.
LEVEL pseudo column
The most trivial of the pseudo columns is LEVEL. The purpose of this column is to show the number of the recursive step n that produced the row. In our running example, it indicates the levels of management between 'Goyal' and the employee plus 1 (because LEVEL starts with 1). Here is the original Oracle example enhanced with LEVEL:
1 SELECT LEVEL, name 2 FROM emp 3 START WITH name = 'Goyal' 4 CONNECT BY PRIOR empid = mgrid
The SQL standard saw no need to add syntax for this feature because it can be expressed using regular SQL:
1 WITH n(level, empid, name) AS 2 (SELECT 1, empid, name 3 FROM emp 4 WHERE name = 'Goyal' 5 UNION ALL 6 SELECT n.level + 1, nplus1.empid, nplus1.name 7 FROM emp as nplus1, n 8 WHERE n.empid = nplus1.mgrid) 9 SELECT level, name FROM n; LEVEL NAME ----------- ---------- 1 Goyal 2 Zander 2 Henry 2 Scott 3 McKeough 3 Barnes 3 O'Neil 3 Smith 3 Shoeman 9 record(s) selected
All I have done here is to introduce a level column, which starts with 1 and increments by 1. Of course, any semantics is possible, but this one happens to provide the same semantics as LEVEL. It is up to the developer to choose which information to pass along and how as the next example will show.
The CONNECT_BY_ROOT expression
CONNECT_BY_ROOT operates on a column and returns the value of the root ancestor of the row. To show what it does and how it gets translated, I change the example to use 'Goyal's' direct reports as root:
1 SELECT CONNECT_BY_ROOT name AS root, name 2 FROM emp 3 START WITH name IN ('Zander', 'Henry', 'Scott') 4 CONNECT BY PRIOR empid = mgrid
To map to DB2, all that is required is to pass along the root unmodified.
1 WITH n(empid, root, name) AS 2 (SELECT empid, name, name 3 FROM emp 4 WHERE name IN ('Zander', 'Henry', 'Scott') 5 UNION ALL 6 SELECT nplus1.empid, n.root, 7 nplus1.name 8 FROM emp as nplus1, n 9 WHERE n.empid = nplus1.mgrid) 10 SELECT root, name FROM n; ROOT NAME ---------- ---------- Zander Zander Henry Henry Scott Scott Zander McKeough Zander Barnes Henry O'Neil Henry Smith Henry Shoeman 8 record(s) selected
Next, I will show how to present the whole management chain and not just the root.
The SYS_CONNECT_BY_PATH() procedure
A common question when running queries is: "How does this element relate to the beginning of the recursion?" or, in other words, "What are the ancestors of this row?" In Oracle, SYS_CONNECT_BY_PATH() can be used to concatenate values from each recursive step and thus build an ancestor path. Find below the by now well-known example showing for each employee of 'Goyal' the "reports to" chain:
1 SELECT SYS_CONNECT_BY_PATH(name, ':') AS chain 2 FROM emp 3 START WITH name = 'Goyal' 4 CONNECT BY PRIOR empid = mgrid
In this example, each value is separated by a ':'. To answer this question in DB2, the same technique is used as for LEVEL. The only difference is that concatenation is used instead of addition.
1 WITH n(empid, chain) AS 2 (SELECT empid, CAST(name AS VARCHAR(30)) 3 FROM emp 4 WHERE name = 'Goyal' 5 UNION ALL 6 SELECT nplus1.empid, 7 n.chain || ':' || nplus1.name 8 FROM emp as nplus1, n 9 WHERE n.empid = nplus1.mgrid) 10 SELECT chain FROM n; CHAIN ------------------------------ Goyal Goyal:Zander Goyal:Henry Goyal:Scott Goyal:Zander:McKeough Goyal:Zander:Barnes Goyal:Henry:O'Neil Goyal:Henry:Smith Goyal:Henry:Shoeman 9 record(s) selected
There is complete freedom on how to build the chain. For instance, the concatenation could be reversed to put 'Goyal' at the end of the chain. Also note the CAST that has been performed. It is needed for string data types so DB2 knows how long the chain can grow. The rules for recursive CTE specify that the length of the data types shall be defined by the seed, or the first half of the UNION ALL query.
So far mapping from CONNECT BY to UNION ALL has been a breeze. But now that you are comfortable with recursive CTE, it is time to crank up the volume and attack the harder problems. One such problem is ordering. If all that is required is to order the result in the same way as the elements in the path that is selected back anyway, then an ORDER BY on the path in the final SELECT will suffice:
1 WITH n(empid, chain) AS 2 (SELECT empid, CAST(name AS VARCHAR(30)) 3 FROM emp 4 WHERE name = 'Goyal' 5 UNION ALL 6 SELECT nplus1.empid, 7 n.chain || ':' || nplus1.name 8 FROM emp as nplus1, n 9 WHERE n.empid = nplus1.mgrid) 10 SELECT chain FROM n 11 ORDER BY chain; CHAIN ------------------------------ Goyal Goyal:Henry Goyal:Henry:O'Neil Goyal:Henry:Shoeman Goyal:Henry:Smith Goyal:Scott Goyal:Zander Goyal:Zander:Barnes Goyal:Zander:McKeough 9 record(s) selected
This is one form of the common org-chart. But what if the required order is on some other column and it is not a string?
ORDER SIBLINGS BY expression
In Oracle ORDER SIBLINGS BY defines the order in which siblings under the same parent shall be returned. To order all employees recursively by their salary using CONNECT BY, the query is formulated like this:
1 SELECT name, level, salary 2 FROM emp 3 START WITH name = 'Goyal' 4 CONNECT BY PRIOR empid = mgrid 5 ORDER SIBLINGS BY salary
The SQL Standard has specific syntax for this requirement. However, DB2 UDB for LUW has not (yet) implemented this part of the SQL standard. To map this feature in DB2, it is time to get inventive. On a case-by-case basis, it is possible to build a path again and then sort the string, but this can get a bit ugly since numeric values need to be converted to strings, which increases the length of the path. Also subsequent mappings will show that it is beneficial to provide a generic solution.
At this point, I therefore introduce a user defined function (UDF). A UDF is an extension to DB2's capabilities. In this case the UDF (and those following below) is implemented in C and the executable is available to DB2, if you have followed the Installation chapter above.
1 CREATE FUNCTION CONNECT_BY_POS 2 (POS VARCHAR(256) FOR BIT DATA, 3 N INTEGER, 4 POSPLUS1 BIGINT) 5 RETURNS VARCHAR(256) FOR BIT DATA 6 SPECIFIC CONNECT_BY_POS 7 EXTERNAL NAME 'connect_by!connect_by_pos' 8 NOT FENCED RETURNS NULL ON NULL INPUT 9 DETERMINISTIC NO SQL NO EXTERNAL ACTION 10 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL;
This function has two purposes:
- The function will take in the BIGINT value of the global position of the row posplus1 and translate this, given the pattern pos of the parent into the pattern for the next row. The level of the parent n is used to speed this process up. In principle the function does nothing other than building a binary-encoded path. The provided implementation uses integers internally. So, the arbitrarily chosen length of 256 for the binary string allows for 64 nesting levels, which can be trivially expanded.
- As an additional feature, the function also tests whether the global position of the new row already appears in the parent path. If that is the case, a runtime error is raised. Oracle does not allow cycles unless explicitly specified. So this feature makes the DB2 query compatible.
The usage is best described by showing the example matching the CONNECT BY version above:
1 WITH source(empid, name, salary, mgrid, rownum) AS 2 (SELECT empid, name, salary, mgrid, 3 ROW_NUMBER() OVER(ORDER BY salary) 4 FROM emp), 5 n(empid, name, salary, level, pos) AS 6 (SELECT empid, name, salary, 1, 7 CONNECT_BY_POS('', 0, rownum) 8 FROM source 9 WHERE name = 'Goyal' 10 UNION ALL 11 SELECT nplus1.empid, nplus1.name, 12 nplus1.salary, level + 1, 13 CONNECT_BY_POS(n.pos, level, rownum) 14 FROM source AS nplus1, n 15 WHERE n.empid = nplus1.mgrid) 16 SELECT name, level, salary 17 FROM n 18 ORDER BY pos; NAME LEVEL SALARY ---------- ----------- ----------- Goyal 1 80000.00 Henry 2 51000.00 Shoeman 3 33000.00 Smith 3 34000.00 O'Neil 3 36000.00 Zander 2 52000.00 Barnes 3 41000.00 McKeough 3 42000.00 Scott 2 53000.00 9 record(s) selected
What just happened? Adding another CTE source allows us to issue a global order for all the rows according to the one requested by ORDER SIBLINGS BY. Once this is done, the recursive CTE n refers to source to pick up the rownum. The rest of the query has already been explained. In essence, CONNECT_BY_POS() produces a PATH as a binary string not unlike what SYS_CONNECT_BY_PATH() does as a regular string. In the end, the query gets ordered by the path that provides the position within the hierarchy pos.
At this point, I want to come back to the possibility of an ON-clause in the CONNECT BY query. Any joins, or even more complex queries, can easily be expressed in the same way by simply pushing the composition of the source for the recursion into a CTE.
The NOCYCLE keyword
Now that I have introduced the capability to record the path of a recursion and to detect recursion at runtime, it is possible to map the NOCYCLE keyword of Oracle's CONNECT BY query syntax. In short, NOCYCLE prevents the recursion from entering a cycle. Any candidate row for step (n + 1) that already exists in the ancestor path is simply ignored. In Oracle, the usage looks like this:
1 SELECT name, level, salary 2 FROM emp 3 START WITH name = 'Goyal' 4 CONNECT BY NOCYCLE PRIOR empid = mgrid
Given the existence of the ancestor path, mapping the semantics is pretty simple. A second UDF will take care of it. The UDF will return 0 if a cycle is detected. If none is detected, the value 1 is returned. With the registration of the UDF I will also introduce a cycle into the management chain. 'Urbassek' shall now report to 'O'Neil.'
1 CREATE FUNCTION CONNECT_BY_NOCYCLE 2 (N VARCHAR(256) FOR BIT DATA, 3 LEVEL INTEGER, 4 NPLUS1 BIGINT) 5 RETURNS SMALLINT 6 SPECIFIC CONNECT_BY_NOCYCLE 7 EXTERNAL NAME 'connect_by!connect_by_nocycle' 8 NOT FENCED RETURNS NULL ON NULL INPUT 9 DETERMINISTIC NO SQL NO EXTERNAL ACTION 10 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL; 11 UPDATE emp SET mgrid = 7 WHERE name = 'Urbassek'; 12 WITH source(empid, name, salary, mgrid, rownum) AS 13 (SELECT empid, name, salary, mgrid, 14 ROW_NUMBER() OVER(ORDER BY salary) 15 FROM emp), 16 n(empid, name, salary, level, pos) AS 17 (SELECT empid, name, salary, 1, 18 CONNECT_BY_POS('', 0, rownum) 19 FROM source 20 WHERE name = 'Goyal' 21 UNION ALL 22 SELECT nplus1.empid, nplus1.name, 23 nplus1.salary, level + 1, 24 CONNECT_BY_POS(n.pos, level, rownum) 25 FROM source AS nplus1, n 26 WHERE n.empid = nplus1.mgrid 27 AND CONNECT_BY_NOCYCLE(n.pos, level, rownum) = 0 27 SELECT name, level, salary 28 FROM n 29 ORDER BY pos; NAME LEVEL SALARY ---------- ----------- ----------- Goyal 1 80000.00 Henry 2 51000.00 Shoeman 3 33000.00 Smith 3 34000.00 O'Neil 3 36000.00 Urbassek 4 95000.00 Mills 5 70000.00 Monroe 6 50000.00 Jones 7 30000.00 Hall 7 35000.00 Lindsay 7 38000.00 Kim 7 40000.00 Aaron 6 54000.00 Zander 2 52000.00 Barnes 3 41000.00 McKeough 3 42000.00 Scott 2 53000.00 17 record(s) selected
As you can see, 'Urbassek' has been added to the output, including all of 'Mills's' employees. 'Goyal' has not been added again, however. This is pretty good, but how do you find out where the cycle is?
For the purpose of "coloring" a cycle in the recursion, Oracle provides yet another pseudo column, CONNECT_BY_ISCYCLE. This column returns 1 if the row is part of a cycle and 0 if it is not.
1 SELECT name, level, 2 CONNECT_BY_ISCYCLE AS cycle 3 FROM emp 4 START WITH name = 'Goyal' 5 CONNECT BY NOCYCLE PRIOR empid = mgrid
To map this feature to DB2, I will once more introduce a UDF. CONNECT_BY_ISCYCLE() detects a cycle, just like CONNECT_BY_NOCYCLE() did. But when CONNECT_BY_ISCYCLE() detects a cycle, it also compares the given path to that of another path. If the two paths overlap within the cycle, a 1 is returned; otherwise a 0 is returned. Here is the declaration of the UDF:
1 CREATE FUNCTION CONNECT_BY_ISCYCLE 2 (POS VARCHAR(256) FOR BIT DATA, 3 LEVELPOS INTEGER, 4 NPLUS1 BIGINT, 5 POSANCESTOR VARCHAR(256) FOR BIT DATA, 6 LEVELANCESTOR INTEGER) 7 RETURNS SMALLINT 8 SPECIFIC CONNECT_BY_ISCYCLE 9 EXTERNAL NAME 'connect_by!connect_by_iscycle' 10 NOT FENCED RETURNS NULL ON NULL INPUT 11 DETERMINISTIC NO SQL NO EXTERNAL ACTION 12 LANGUAGE C PARAMETER STYLE SQL ALLOW PARALLEL;
To achieve the goal, the new query has to:
- Return all rows from the recursion
- Join in child rows of each row for comparions
- Drive one recursive step for each child row to detect whether that step leads to a cycle
In the end, what is needed is a three-way join: one self join on the output of the recursion, and a reference to the numbered source, similar to what was done in the original recursion. There is one advantage, though: It is sufficient to find any child row that introduces recursion. There is no need to find them all, and indeed, it would be harmful, because we want to return each row only once. Note that with a small change, one could count in how many cycles a given row participates. I'll leave that riddle for the interested reader to figure out on his own.
1 WITH source(empid, name, salary, mgrid, rownum) AS 2 (SELECT empid, name, salary, mgrid, 3 ROW_NUMBER() OVER(ORDER BY salary) 4 FROM emp), 5 n(empid, name, mgrid, level, pos) AS 6 (SELECT empid, name, mgrid, 1, 7 CONNECT_BY_POS('', 0, rownum) 8 FROM source 9 WHERE name = 'Goyal' 10 UNION ALL 11 SELECT nplus1.empid, 12 nplus1.name, 13 nplus1.mgrid, 14 level + 1, 15 CONNECT_BY_POS(n.pos, level, rownum) 16 FROM source AS nplus1, n 17 WHERE n.empid = nplus1.mgrid 18 AND CONNECT_BY_NOCYCLE(n.pos, level, rownum) = 0), 19 cycles(name, level, pos, iscycle) AS 20 (SELECT name, level, pos, 21 COALESCE((SELECT 1 AS iscycle 22 FROM n 23 LEFT OUTER JOIN source 24 ON n.mgrid = source.empid 25 WHERE CONNECT_BY_ISCYCLE(n.pos, 26 n.level, 27 source.rownum, 28 ancestor.pos, 29 ancestor.level) = 1 30 FETCH FIRST ROW ONLY), 0) AS iscycle 31 FROM n AS ancestor) 32 SELECT name, level, iscycle 33 FROM cycles 34 ORDER BY pos; NAME LEVEL ISCYCLE ---------- ----------- ----------- Goyal 1 1 Henry 2 1 Shoeman 3 0 Smith 3 0 O'Neil 3 1 Urbassek 4 1 Mills 5 1 Monroe 6 0 Jones 7 0 Hall 7 0 Lindsay 7 0 Kim 7 0 Aaron 6 0 Zander 2 0 Barnes 3 0 McKeough 3 0 Scott 2 0 17 record(s) selected
OK: This one was a piece of work, but at least now you know why you read that far. As in any good story, it is now time to add the Hollywood ending, where everything is nice and easy.
Compared to CONNECT_BY_ISCYCLE, CONNECT_BY_ISLEAF is indeed lighter fare. All this pseudo column does is to return 1 if a given row is a leaf in recursion. In other words: The row did not produce any further rows. In this example, a leaf is an employee who is not a manager.
1 SELECT name, level, 2 CONNECT_BY_ISLEAF AS isleaf 3 FROM emp 4 START WITH name = 'Goyal' 5 CONNECT BY PRIOR empid = mgrid
Indeed, all that is required here is to take out most of the previous additions. For the purpose of readability, we start fresh and 'Urbassek' becomes her own boss again. Also note that I took out the NOCYCLE option above to concentrate on the matter at hand. After all, there is no need to dwell on the past, especially when it was successfully mastered. By now the query practically writes itself.
1 UPDATE emp SET mgrid = NULL WHERE name = 'Urbassek'; 2 WITH n(empid, name) AS 3 (SELECT empid, name 4 FROM emp 5 WHERE name = 'Goyal' 6 UNION ALL 7 SELECT nplus1.empid, nplus1.name 8 FROM emp as nplus1, n 9 WHERE n.empid = nplus1.mgrid) 10 SELECT name, 11 COALESCE((SELECT 0 FROM emp 12 WHERE n.empid = emp.mgrid 13 FETCH FIRST ROW ONLY), 1) AS isleaf 11 FROM n; NAME ISLEAF ---------- ----------- Goyal 0 Zander 0 Henry 0 Scott 1 McKeough 1 Barnes 1 O'Neil 1 Smith 1 Shoeman 1 9 record(s) selected
This was indeed quite painless. Unless Oracle has introduced yet another pseudo column or keywords that I'm not aware of, all of Oracle's recursive features have now been mapped to DB2 UDB for Linux, UNIX, and Windows.
In this article I provided generic mappings from the Oracle style CONNECT BY recursive query syntax to DB2's standard compliant recursive common table expressions using UNION ALL. All the individual mappings work well together and I have shown that DB2's recursion feature is as least as expressive.
While the Oracle syntax is less verbose because it provides keywords for various common semantics, this also means that it is less expressive since little new semantics can be added without changes to the DBMS.
As an ISV or end-customer trying to port an application from Oracle to DB2, you probably do not care too much about that at this point. All you want and need to know now is how to translate from Oracle to DB2. Showing this has been the main purpose of this article. Once you get comfortable with the SQL standard syntax, I'm confident you will find value in DB2's expressiveness, even if it means typing a few more lines.
Thanks to the countless developers, both the IBMers and customers, who share their SQL challenges with me. I learn more from you than you from me.
- Source code for this article: Download the source code and try it out.
- IBM DB2 Universal Database Enterprise Server Edition V8.2 for Linux, UNIX, and Windows: Download a trial version from developerWorks.
- SQL PL Profiler: Learn how to tune your SQL procedures.
- SQL Procedure Tracing: Learn about a new API, including example SQL wrappers that you can use to investigate SQL Procedure logic errors.
- SQL Procedures Performance: Hints and Tips: This is an article written by Gustavo Arocena on tuning SQL procedures prior to DB2 V8.2.
- DB2 SQL PL Essential Guide by Paul Yip, et al.: This is the definitive book on SQL Procedures Language.
- DB2 Information Center: The DB2 documentation is online.
- Wish you could Google DB2 docs? You can: CTE site:boulder.ibm.com