Using OLAP specifications
Online analytical processing (OLAP) specifications provide the ability to return ranking, row numbering, and other aggregate function information as a scalar value in a query result.
Example: Ranking and row numbering
SELECT EMPNO, SALARY,
RANK() OVER(ORDER BY SALARY DESC),
DENSE_RANK() OVER(ORDER BY SALARY DESC),
ROW_NUMBER() OVER(ORDER BY SALARY DESC)
FROM EMPLOYEE
FETCH FIRST 10 ROWS ONLY
This query returns the following information.
EMPNO | SALARY | RANK | DENSE_RANK | ROW_NUMBER |
---|---|---|---|---|
000010 | 52,750.00 | 1 | 1 | 1 |
000110 | 46,500.00 | 2 | 2 | 2 |
200010 | 46,500.00 | 2 | 2 | 3 |
000020 | 41,250.00 | 4 | 3 | 4 |
000050 | 40,175.00 | 5 | 4 | 5 |
000030 | 38,250.00 | 6 | 5 | 6 |
000070 | 36,170.00 | 7 | 6 | 7 |
000060 | 32,250.00 | 8 | 7 | 8 |
000220 | 29,840.00 | 9 | 8 | 9 |
200220 | 29,840.00 | 9 | 8 | 10 |
In this example, the SALARY descending order with the top 10 returned. The RANK column shows the relative ranking of each salary. Notice that there are two rows with the same salary at position 2. Each of those rows is assigned the same rank value. The following row is assigned the value of 4. RANK returns a value for a row that is one more than the total number of rows that precede that row. There are gaps in the numbering sequence whenever there are duplicates.
In contrast, the DENSE_RANK column shows a value of 3 for the row directly after the duplicate rows. DENSE_RANK returns a value for a row that is one more than the number of distinct row values that precede it. There will never be gaps in the numbering sequence.
ROW_NUMBER returns a unique number for each row. For rows that contain duplicate values according to the specified ordering, the assignment of a row number is arbitrary; the row numbers could be assigned in a different order for the duplicate rows when the query is run another time.
Example: Ranking groups
SELECT WORKDEPT, INT(AVG(SALARY)) AS AVERAGE,
RANK() OVER(ORDER BY AVG(SALARY) DESC) AS AVG_SALARY,
NTILE(3) OVER(ORDER BY AVG(SALARY) DESC) AS QUANTILE
FROM EMPLOYEE
GROUP BY WORKDEPT
This query returns the following information.
WORKDEPT | AVERAGE | AVG_SALARY | QUANTILE |
---|---|---|---|
B01 | 41,250 | 1 | 1 |
A00 | 40,850 | 2 | 1 |
E01 | 40,175 | 3 | 1 |
C01 | 29,722 | 4 | 2 |
D21 | 25,668 | 5 | 2 |
D11 | 25,147 | 6 | 2 |
E21 | 24,086 | 7 | 3 |
E11 | 21,020 | 8 | 3 |
In this example, the NTILE function has an argument of 3, meaning that the results are to be grouped into 3 equal-sized sets. Since the result set is not evenly divisible by the number of quantiles, an additional row is included in each of the two lowest number quantiles.
Example: Ranking within a department
SELECT LASTNAME, WORKDEPT, BONUS,
DENSE_RANK() OVER(PARTITION BY WORKDEPT ORDER BY BONUS DESC)
AS BONUS_RANK_IN_DEPT
FROM EMPLOYEE
WHERE WORKDEPT LIKE 'E%'
This query returns the following information.
LASTNAME | WORKDEPT | BONUS | BONUS_RANK_
IN_DEPT |
---|---|---|---|
GEYER | E01 | 800.00 | 1 |
HENDERSON | E11 | 600.00 | 1 |
SCHNEIDER | E11 | 500.00 | 2 |
SCHWARTZ | E11 | 500.00 | 2 |
SMITH | E11 | 400.00 | 3 |
PARKER | E11 | 300.00 | 4 |
SETRIGHT | E11 | 300.00 | 4 |
SPRINGER | E11 | 300.00 | 4 |
SPENSER | E21 | 500.00 | 1 |
LEE | E21 | 500.00 | 1 |
GOUNOT | E21 | 500.00 | 1 |
WONG | E21 | 500.00 | 1 |
ALONZO | E21 | 500.00 | 1 |
MENTA | E21 | 400®.00 | 2 |
Example: Ranking and ordering by table expression results
Suppose that you want to find the top five employees whose salaries are the highest along with their department names. The department name is in the department table, so a join operation is needed. Because ordering is already being done in the nested table expression, that ordering can also be used for determining the ROW_NUMBER value. The ORDER BY ORDER OF table clause is used to do this.
SELECT ROW_NUMBER() OVER(ORDER BY ORDER OF EMP),
EMPNO, SALARY, DEPTNO, DEPTNAME
FROM (SELECT EMPNO, WORKDEPT, SALARY
FROM EMPLOYEE
ORDER BY SALARY DESC
FETCH FIRST 5 ROWS ONLY) EMP,
DEPARTMENT
WHERE DEPTNO = WORKDEPT
This query returns the following information.
ROW_NUMBER | EMPNO | SALARY | DEPTNO | DEPTNAME |
---|---|---|---|---|
1 | 000010 | 52,750.00 | A00 | SPIFFY COMPUTER SERVICE DIV. |
2 | 000110 | 46,500.00 | A00 | SPIFFY COMPUTER SERVICE DIV. |
3 | 200010 | 46,500.00 | A00 | SPIFFY COMPUTER SERVICE DIV. |
4 | 000020 | 41,250.00 | B01 | PLANNING |
5 | 000050 | 40,175.00 | E01 | SUPPORT SERVICES |
Example: Using OLAP Aggregates and CUME_DIST
Suppose that you want to find the rolling sum of the salaries for employees in department D11 and also the distribution of the salary.
SELECT ROW_NUMBER() OVER() AS ROW, LASTNAME, SALARY,
SUM(SALARY) OVER(ORDER BY SALARY
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ROLLING_TOTAL_RANGE,
SUM(SALARY) OVER(ORDER BY SALARY
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ROLLING_TOTAL_ROWS,
DECIMAL(CUME_DIST() OVER (ORDER BY SALARY),4,3) AS DISTRIBUTION
FROM EMPLOYEE
WHERE WORKDEPT = 'D11'
ORDER BY SALARY
This query returns the following information.
ROW | LASTNAME | SALARY | ROLLING_
TOTAL RANGE |
ROLLING_
TOTAL_ ROWS |
DISTRIBUTION |
---|---|---|---|---|---|
1 | JONES | 18,270.00 | 18,270.00 | 18,270.00 | .091 |
2 | WALKER | 20,450.00 | 38,720.00 | 38,720.00 | .182 |
3 | SCOUTTEN | 21,340.00 | 60,060.00 | 60,060.00 | .273 |
4 | PIANKA | 22,250.00 | 82,310.00 | 82,310.00 | .364 |
5 | YOSHIMURA | 24,680.00 | 131,670.00 | 106,990.00 | .545 |
6 | YAMAMOTO | 24,680.00 | 131,670.00 | 131,670.00 | .545 |
7 | ADAMSON | 25,280.00 | 156,950.00 | 156,950.00 | .636 |
8 | BROWN | 27,740.00 | 184,690.00 | 184,690.00 | .727 |
9 | LUTZ | 29,840.00 | 244,370.00 | 214,530.00 | .909 |
10 | JOHN | 29,840.00 | 244,370.00 | 244,370.00 | .909 |
11 | STERN | 32,250.00 | 276,620.00 | 276,620.00 | 1.000 |
This example shows two ways of defining the window to be used for calculating the value of a group.
The first way to define the window is with RANGE, which defines a group for all rows that have the same order by value. Row numbers 5 and 6 have the same salary value, so they are treated as a group. Their salaries are summed together and added to the previous total to generate the same value for each of the rows as seen in the ROLLING_TOTAL_RANGE column.
The second way to define the window is with ROWS, which treats each row as a group. In the ROLLING_TOTAL_ROWS column each row shows the sum calculated up to and including the current row. For rows that have the same salary value, such as rows 5 and 6, the order in which they are returned is not defined.
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is the default aggregation group and could be omitted from the ROLLING_TOTAL_RANGE specification.
An ORDER BY is specified for the entire query to guarantee the rows are returned ordered.
Example: Using a RANGE windowing specification
Suppose you want to further analyze the salaries for DEPT D11.
This example groups salaries together by windowing, using a range to work with salary values 1000 less than or greater than each row's salary. It also returns which salary is the first value and the last value for the group used to calculate the sum.
SELECT LASTNAME, SALARY,
SUM(SALARY) OVER(ORDER BY SALARY) AS ROLLING_TOTAL,
SUM(SALARY) OVER(ORDER BY SALARY
RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING) AS WINDOWED_TOTAL,
FIRST_VALUE(SALARY) OVER(ORDER BY SALARY
RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING),
LAST_VALUE(SALARY) OVER(ORDER BY SALARY
RANGE BETWEEN 1000 PRECEDING AND 1000 FOLLOWING)
FROM EMPLOYEE
WHERE WORKDEPT = 'D11'
ORDER BY SALARY
This query returns the following information.
LASTNAME | SALARY | ROLLING_
TOTAL |
WINDOWED_
TOTAL |
FIRST_VALUE | LAST_VALUE |
---|---|---|---|---|---|
JONES | 18,270.00 | 18,270.00 | 18,270.00 | 18,270.00 | 18,270.00 |
WALKER | 20,450.00 | 38,720.00 | 41,790.00 | 20,450.00 | 21,340.00 |
SCOUTTEN | 21,340.00 | 60,060.00 | 64,040.00 | 20,450.00 | 22,250.00 |
PIANKA | 22,250.00 | 82,310.00 | 43,590.00 | 21,340.00 | 25,280.00 |
YOSHIMURA | 24,680.00 | 131,670.00 | 74,640.00 | 24,680.00 | 25,280.00 |
YAMAMOTO | 24,680.00 | 131,670.00 | 74,640.00 | 24,680.00 | 25,280.00 |
ADAMSON | 25,280.00 | 156,950.00 | 74,640.00 | 24,680.00 | 25,280.00 |
BROWN | 27,740.00 | 184,690.00 | 27,740.00 | 27,740.00 | 27,740.00 |
LUTZ | 29,840.00 | 244,370.00 | 59,680.00 | 29,840.00 | 29,840.00 |
JOHN | 29,840.00 | 244,370.00 | 59,680.00 | 29,840.00 | 29,840.00 |
STERN | 32,250.00 | 276,620.00 | 32,250.00 | 32,250.00 | 32,250.00 |
For each employee, a group is defined that contains all other employees in department D11 with salaries that fall within a range of 1000 below (PRECEDING) or 1000 above (FOLLOWING) that employee's salary. The WINDOWED_TOTAL column returns the sum of all the salaries in that group. The FIRST_VALUE column returns the lowest salary value that is part of the group. The LAST_VALUE column returns the highest salary value that is part of the group. Any employee that has a salary that is more than 1000 from the closest other salary is its own group.
An ORDER BY is specified for the entire query to guarantee the rows are returned ordered.
Example: Using a ROWS windowing specification
This example groups salaries together by windowing, using ROWS to work with salary values that are one row before and after each employee salary. The average of the 3 rows is returned.
SELECT LASTNAME, SALARY,
DECIMAL(AVG(SALARY) OVER(ORDER BY SALARY
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)7,2) AS AVG_SALARY
FROM EMPLOYEE
WHERE WORKDEPT = 'D11'
ORDER BY SALARY
This query returns the following information.
LASTNAME | SALARY | AVG_SALARY |
---|---|---|
JONES | 18,270.00 | 19,360.00 |
WALKER | 20,450.00 | 20,020.00 |
SCOUTTEN | 21,340.00 | 21,346.66 |
PIANKA | 22,250.00 | 22,756.66 |
YOSHIMURA | 24,680.00 | 23,870.00 |
YAMAMOTO | 24,680.00 | 24,880.00 |
ADAMSON | 25,280.00 | 25,900.00 |
BROWN | 27,740.00 | 27,620.00 |
LUTZ | 29,840.00 | 29,140.00 |
JOHN | 29,840.00 | 30,643.33 |
STERN | 32,250.00 | 31,045.00 |