Db2 for i SQL: Using OLAP specifications

Example: Ranking and row numbering

Suppose that you want a list of the top 10 salaries along with their ranking. The following query generates the ranking number for you:

 SELECT EMPNO, SALARY, 
        RANK() OVER(ORDER BY SALARY DESC),
        DENSE_RANK() OVER(ORDER BY SALARY DESC),
        ROW_NUMBER() OVER(ORDER BY SALARY DESC)  
 FROM EMPLOYEE
 FETCH FIRST 10 ROWS ONLY

This query returns the following information.

Table 1. Results of the previous query
EMPNO	SALARY	RANK	DENSE_RANK	ROW_NUMBER
000010	52,750.00	1	1	1
000110	46,500.00	2	2	2
200010	46,500.00	2	2	3
000020	41,250.00	4	3	4
000050	40,175.00	5	4	5
000030	38,250.00	6	5	6
000070	36,170.00	7	6	7
000060	32,250.00	8	7	8
000220	29,840.00	9	8	9
200220	29,840.00	9	8	10

In this example, the SALARY descending order with the top 10 returned. The RANK column shows the relative ranking of each salary. Notice that there are two rows with the same salary at position 2. Each of those rows is assigned the same rank value. The following row is assigned the value of 4. RANK returns a value for a row that is one more than the total number of rows that precede that row. There are gaps in the numbering sequence whenever there are duplicates.

In contrast, the DENSE_RANK column shows a value of 3 for the row directly after the duplicate rows. DENSE_RANK returns a value for a row that is one more than the number of distinct row values that precede it. There will never be gaps in the numbering sequence.

ROW_NUMBER returns a unique number for each row. For rows that contain duplicate values according to the specified ordering, the assignment of a row number is arbitrary; the row numbers could be assigned in a different order for the duplicate rows when the query is run another time.

Example: Ranking groups

Suppose that you want to find out which department has the highest average salary Start of change

and the quantile of the average salary for each department. End of change

The following query groups the data by department, determines the average salary for each department, ranks the resulting averages, Start of change

and shows a quantile for the average salary End of change

.

SELECT WORKDEPT, INT(AVG(SALARY)) AS AVERAGE, 
         RANK() OVER(ORDER BY AVG(SALARY) DESC) AS AVG_SALARY, 
         NTILE(3) OVER(ORDER BY AVG(SALARY) DESC) AS QUANTILE 
   FROM EMPLOYEE 
   GROUP BY  WORKDEPT

This query returns the following information.

Table 2. Results of previous query
WORKDEPT	AVERAGE	AVG_SALARY	QUANTILE
B01	41,250	1	1
A00	40,850	2	1
E01	40,175	3	1
C01	29,722	4	2
D21	25,668	5	2
D11	25,147	6	2
E21	24,086	7	3
E11	21,020	8	3

In this example, the NTILE function has an argument of 3, meaning that the results are to be grouped into 3 equal-sized sets. Since the result set is not evenly divisible by the number of quantiles, an additional row is included in each of the two lowest number quantiles.

Example: Ranking within a department

Suppose that you want a list of employees along with how their bonus ranks within their department. Using the PARTITION BY clause, you can specify groups that are to be numbered separately.

SELECT LASTNAME, WORKDEPT, BONUS, 
        DENSE_RANK() OVER(PARTITION BY WORKDEPT ORDER BY BONUS DESC) 
          AS BONUS_RANK_IN_DEPT 
   FROM EMPLOYEE
   WHERE WORKDEPT LIKE 'E%'

This query returns the following information.

Table 3. Results of the previous query
LASTNAME	WORKDEPT	BONUS	BONUS_RANK_ IN_DEPT
GEYER	E01	800.00	1
HENDERSON	E11	600.00	1
SCHNEIDER	E11	500.00	2
SCHWARTZ	E11	500.00	2
SMITH	E11	400.00	3
PARKER	E11	300.00	4
SETRIGHT	E11	300.00	4
SPRINGER	E11	300.00	4
SPENSER	E21	500.00	1
LEE	E21	500.00	1
GOUNOT	E21	500.00	1
WONG	E21	500.00	1
ALONZO	E21	500.00	1
MENTA	E21	400®.00	2

Example: Ranking and ordering by table expression results

Suppose that you want to find the top five employees whose salaries are the highest along with their department names. The department name is in the department table, so a join operation is needed. Because ordering is already being done in the nested table expression, that ordering can also be used for determining the ROW_NUMBER value. The ORDER BY ORDER OF table clause is used to do this.

SELECT ROW_NUMBER() OVER(ORDER BY ORDER OF EMP),
             EMPNO, SALARY, DEPTNO, DEPTNAME      
   FROM (SELECT EMPNO, WORKDEPT, SALARY 
             FROM EMPLOYEE        
             ORDER BY SALARY DESC     
             FETCH FIRST 5 ROWS ONLY) EMP,
        DEPARTMENT 
   WHERE DEPTNO = WORKDEPT

This query returns the following information.

Table 4. Results of the previous query
ROW_NUMBER	EMPNO	SALARY	DEPTNO	DEPTNAME
1	000010	52,750.00	A00	SPIFFY COMPUTER SERVICE DIV.
2	000110	46,500.00	A00	SPIFFY COMPUTER SERVICE DIV.
3	200010	46,500.00	A00	SPIFFY COMPUTER SERVICE DIV.
4	000020	41,250.00	B01	PLANNING
5	000050	40,175.00	E01	SUPPORT SERVICES

Example: Using OLAP Aggregates and CUME_DIST

Suppose that you want to find the rolling sum of the salaries for employees in department D11 and also the distribution of the salary.

SELECT ROW_NUMBER() OVER() AS ROW, LASTNAME, SALARY,
     SUM(SALARY) OVER(ORDER BY SALARY 
         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ROLLING_TOTAL_RANGE,
     SUM(SALARY) OVER(ORDER BY SALARY 
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ROLLING_TOTAL_ROWS,
     DECIMAL(CUME_DIST() OVER (ORDER BY SALARY),4,3) AS DISTRIBUTION
  FROM EMPLOYEE
  WHERE WORKDEPT = 'D11'
  ORDER BY SALARY

This query returns the following information.

Table 5. Results of the previous query
ROW	LASTNAME	SALARY	ROLLING_ TOTAL RANGE	ROLLING_ TOTAL_ ROWS	DISTRIBUTION
1	JONES	18,270.00	18,270.00	18,270.00	.091
2	WALKER	20,450.00	38,720.00	38,720.00	.182
3	SCOUTTEN	21,340.00	60,060.00	60,060.00	.273
4	PIANKA	22,250.00	82,310.00	82,310.00	.364
5	YOSHIMURA	24,680.00	131,670.00	106,990.00	.545
6	YAMAMOTO	24,680.00	131,670.00	131,670.00	.545
7	ADAMSON	25,280.00	156,950.00	156,950.00	.636
8	BROWN	27,740.00	184,690.00	184,690.00	.727
9	LUTZ	29,840.00	244,370.00	214,530.00	.909
10	JOHN	29,840.00	244,370.00	244,370.00	.909
11	STERN	32,250.00	276,620.00	276,620.00	1.000

This example shows two ways of defining the window to be used for calculating the value of a group.

The first way to define the window is with RANGE, which defines a group for all rows that have the same order by value. Row numbers 5 and 6 have the same salary value, so they are treated as a group. Their salaries are summed together and added to the previous total to generate the same value for each of the rows as seen in the ROLLING_TOTAL_RANGE column.

The second way to define the window is with ROWS, which treats each row as a group. In the ROLLING_TOTAL_ROWS column each row shows the sum calculated up to and including the current row. For rows that have the same salary value, such as rows 5 and 6, the order in which they are returned is not defined.

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is the default aggregation group and could be omitted from the ROLLING_TOTAL_RANGE specification.

An ORDER BY is specified for the entire query to guarantee the rows are returned ordered.

Example: Using a RANGE windowing specification

Suppose you want to further analyze the salaries for DEPT D11.

This example groups salaries together by windowing, using a range to work with salary values 1000 less than or greater than each row's salary. It also returns which salary is the first value and the last value for the group used to calculate the sum.

SELECT LASTNAME, SALARY,
       SUM(SALARY) OVER(ORDER BY SALARY) AS ROLLING_TOTAL, 
       SUM(SALARY) OVER(ORDER BY SALARY 
           RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING) AS WINDOWED_TOTAL,
       FIRST_VALUE(SALARY) OVER(ORDER BY SALARY 
           RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING),
       LAST_VALUE(SALARY) OVER(ORDER BY SALARY 
           RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING)
  FROM EMPLOYEE
  WHERE WORKDEPT = 'D11'
  ORDER BY SALARY

This query returns the following information.

Table 6. Results of the previous query
LASTNAME	SALARY	ROLLING_ TOTAL	WINDOWED_ TOTAL	FIRST_VALUE	LAST_VALUE
JONES	18,270.00	18,270.00	18,270.00	18,270.00	18,270.00
WALKER	20,450.00	38,720.00	41,790.00	20,450.00	21,340.00
SCOUTTEN	21,340.00	60,060.00	64,040.00	20,450.00	22,250.00
PIANKA	22,250.00	82,310.00	43,590.00	21,340.00	25,280.00
YOSHIMURA	24,680.00	131,670.00	74,640.00	24,680.00	25,280.00
YAMAMOTO	24,680.00	131,670.00	74,640.00	24,680.00	25,280.00
ADAMSON	25,280.00	156,950.00	74,640.00	24,680.00	25,280.00
BROWN	27,740.00	184,690.00	27,740.00	27,740.00	27,740.00
LUTZ	29,840.00	244,370.00	59,680.00	29,840.00	29,840.00
JOHN	29,840.00	244,370.00	59,680.00	29,840.00	29,840.00
STERN	32,250.00	276,620.00	32,250.00	32,250.00	32,250.00

For each employee, a group is defined that contains all other employees in department D11 with salaries that fall within a range of 1000 below (PRECEDING) or 1000 above (FOLLOWING) that employee's salary. The WINDOWED_TOTAL column returns the sum of all the salaries in that group. The FIRST_VALUE column returns the lowest salary value that is part of the group. The LAST_VALUE column returns the highest salary value that is part of the group. Any employee that has a salary that is more than 1000 from the closest other salary is its own group.

An ORDER BY is specified for the entire query to guarantee the rows are returned ordered.

Example: Using a ROWS windowing specification

This example groups salaries together by windowing, using ROWS to work with salary values that are one row before and after each employee salary. The average of the 3 rows is returned.

SELECT LASTNAME, SALARY,
       DECIMAL(AVG(SALARY) OVER(ORDER BY SALARY 
           ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)7,2) AS AVG_SALARY
  FROM EMPLOYEE
  WHERE WORKDEPT = 'D11'
  ORDER BY SALARY

This query returns the following information.

Table 7. Results of the previous query
LASTNAME	SALARY	AVG_SALARY
JONES	18,270.00	19,360.00
WALKER	20,450.00	20,020.00
SCOUTTEN	21,340.00	21,346.66
PIANKA	22,250.00	22,756.66
YOSHIMURA	24,680.00	23,870.00
YAMAMOTO	24,680.00	24,880.00
ADAMSON	25,280.00	25,900.00
BROWN	27,740.00	27,620.00
LUTZ	29,840.00	29,140.00
JOHN	29,840.00	30,643.33
STERN	32,250.00	31,045.00