A hierarchical query is a form of recursive query that retrieves a hierarchy, such as a bill of materials, from relational data by using a CONNECT BY clause.
db2set DB2_COMPATIBILITY_VECTOR=08
db2stop
db2start
To take full advantage of the DB2® compatibility features for Oracle applications,
the recommended setting for the DB2_COMPATIBILITY_VECTOR is ORA, which
sets all of the compatibility bits.You can then use CONNECT BY syntax, including pseudocolumns, unary operators, and the SYS_CONNECT_BY_PATH scalar function.
A hierarchical query contains a CONNECT BY clause that defines the join conditions between parent and child elements. Connect-by recursion uses the same subquery for the seed (START WITH clause) and the recursive step (CONNECT BY clause). This combination provides a concise method of representing recursions such as bills-of-material, reports-to-chains, or email threads.
Connect-by recursion returns an error if a cycle occurs. A cycle occurs when a row produces itself, either directly or indirectly. By using the optional CONNECT BY NOCYCLE clause, you can direct the recursion to ignore the duplicated row, thus avoiding both the cycle and the error. Hierarchical queries or connect-by recursion differs from DB2 recursion. For more information about the differences, see Port CONNECT BY to DB2.
A subselect that includes a hierarchical-query-clause is called a hierarchical query.
>>-●--+-----------------------+--●--| connect-by-clause |--●--->< '-| start-with-clause |-' start-with-clause |--START WITH--search-condition---------------------------------| connect-by-clause |--CONNECT BY--+---------+--search-condition--------------------| '-NOCYCLE-'
After a first intermediate result table H1 is established, subsequent intermediate result tables H2, H3, and so forth are generated. The subsequently created intermediate result tables are generated by joining Hn with table R using the connect-by-clause as a join condition to produce Hn+1. R is the result of the FROM clause of the subselect and any join predicates in the WHERE clause. The process stops when Hn+1 yields an empty result table. The result table H of the hierarchical-query-clause is the result as if UNION ALL were applied for every intermediate result table..
CONNECT BY MGRID = PRIOR EMPID
MGRID
is resolved with R, and EMPID is resolved within the columns of the
previous intermediate result table Hn.If you write a hierarchical query involving joins, use explicit joined tables with an ON clause to avoid confusion about the application of WHERE clause predicates.
LEVEL is a pseudocolumn for use in hierarchical queries. The LEVEL pseudocolumn returns the recursive step in the hierarchy at which a row was produced. All rows that are produced by the START WITH clause return the value 1. Rows that are produced by applying the first iteration of the CONNECT BY clause return 2, and so on. The data type of the column is INTEGER NOT NULL.
You must specify LEVEL in the context of a hierarchical query. You cannot specify LEVEL in the START WITH clause, as an argument of the CONNECT_BY_ROOT operator, or as an argument of the SYS_CONNECT_BY_PATH function (SQLSTATE 428H4).
CREATE TABLE MY_EMP(
EMPID INTEGER NOT NULL PRIMARY KEY,
NAME VARCHAR(10),
SALARY DECIMAL(9, 2),
MGRID INTEGER);
INSERT INTO MY_EMP VALUES ( 1, 'Jones', 30000, 10);
INSERT INTO MY_EMP VALUES ( 2, 'Hall', 35000, 10);
INSERT INTO MY_EMP VALUES ( 3, 'Kim', 40000, 10);
INSERT INTO MY_EMP VALUES ( 4, 'Lindsay', 38000, 10);
INSERT INTO MY_EMP VALUES ( 5, 'McKeough', 42000, 11);
INSERT INTO MY_EMP VALUES ( 6, 'Barnes', 41000, 11);
INSERT INTO MY_EMP VALUES ( 7, 'O''Neil', 36000, 12);
INSERT INTO MY_EMP VALUES ( 8, 'Smith', 34000, 12);
INSERT INTO MY_EMP VALUES ( 9, 'Shoeman', 33000, 12);
INSERT INTO MY_EMP VALUES (10, 'Monroe', 50000, 15);
INSERT INTO MY_EMP VALUES (11, 'Zander', 52000, 16);
INSERT INTO MY_EMP VALUES (12, 'Henry', 51000, 16);
INSERT INTO MY_EMP VALUES (13, 'Aaron', 54000, 15);
INSERT INTO MY_EMP VALUES (14, 'Scott', 53000, 16);
INSERT INTO MY_EMP VALUES (15, 'Mills', 70000, 17);
INSERT INTO MY_EMP VALUES (16, 'Goyal', 80000, 17);
INSERT INTO MY_EMP VALUES (17, 'Urbassek', 95000, NULL);
1 SELECT NAME,
2 LEVEL,
3 SALARY,
4 CONNECT_BY_ROOT NAME AS ROOT,
5 SUBSTR(SYS_CONNECT_BY_PATH(NAME, ':'), 1, 25) AS CHAIN
6 FROM MY_EMP
7 START WITH NAME = 'Goyal'
8 CONNECT BY PRIOR EMPID = MGRID
9 ORDER SIBLINGS BY SALARY;
NAME LEVEL SALARY ROOT CHAIN
---------- ----------- ----------- ----- ---------------
Goyal 1 80000.00 Goyal :Goyal
Henry 2 51000.00 Goyal :Goyal:Henry
Shoeman 3 33000.00 Goyal :Goyal:Henry:Shoeman
Smith 3 34000.00 Goyal :Goyal:Henry:Smith
O'Neil 3 36000.00 Goyal :Goyal:Henry:O'Neil
Zander 2 52000.00 Goyal :Goyal:Zander
Barnes 3 41000.00 Goyal :Goyal:Zander:Barnes
McKeough 3 42000.00 Goyal :Goyal:Zander:McKeough
Scott 2 53000.00 Goyal :Goyal:Scott
Lines 7 and 8 comprise the core of the recursion: The optional START WITH clause describes the WHERE clause that is to be used on the source table to seed the recursion. In this case, only the row for employee Goyal is selected. If the START WITH clause is omitted, the entire source table is used to seed the recursion. The CONNECT BY clause describes how, given the existing rows, the next set of rows is to be found. The unary operator PRIOR is used to distinguish values in the previous step from those in the current step. PRIOR identifies EMPID as the employee ID of the previous recursive step, and MGRID as originating from the current recursive step.
The LEVEL pseudocolumn in line 2 indicates the current level of recursion.
CONNECT_BY_ROOT is a unary operator that always returns the value of its argument as it was during the first recursive step; that is, the values that are returned by an explicit or implicit START WITH clause.
SYS_CONNECT_BY_PATH() is a binary function that prepends the second argument to the first and then appends the result to the value that it produced in the previous recursive step. The arguments must be character types.
Unless explicitly overridden, connect-by recursion returns a result set in a partial order; that is, the rows that are produced by a recursive step always follow the row that produced them. Siblings at the same level of recursion have no specific order. The ORDER SIBLINGS BY clause in line 9 defines an order for these siblings, which further refines the partial order, potentially into a total order.
SELECT LEVEL, CAST(SPACE((LEVEL - 1) * 4) || '/' || DEPTNAME
AS VARCHAR(40)) AS DEPTNAME
FROM DEPARTMENT
START WITH DEPTNO = 'A00'
CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT
The
query returns:LEVEL DEPTNAME
----------- ----------------------------------------
1 /SPIFFY COMPUTER SERVICE DIV.
2 /PLANNING
2 /INFORMATION CENTER
2 /DEVELOPMENT CENTER
3 /MANUFACTURING SYSTEMS
3 /ADMINISTRATION SYSTEMS
2 /SUPPORT SERVICES
3 /OPERATIONS
3 /SOFTWARE SUPPORT
3 /BRANCH OFFICE F2
3 /BRANCH OFFICE G2
3 /BRANCH OFFICE H2
3 /BRANCH OFFICE I2
3 /BRANCH OFFICE J2