LISTAGG.. the real deal
Comments (12) Visits (9704)
One of the most popular blog entries on SQLTips4DB2 is the one on Aggregating Strings (October, 29 2010).
In this post I describe a means to sum strings into a a single comma separated text using XML.
When I composed this post I was already working on a built-in function called LISTAGG.
But somehow I forgot to blog about it assuming that first blog entry already contained its description.
I must be getting old.
So, without further delay here is a quick overview of the LISTAGG function introduced in DB2 9.7.4.
We use the same sample data as last time around:
Let's find out about the average salary by department:CREATE TABLE emp(
I promised I would switch to CLPPlus and to introduce features of it as I come across them.select dept, avg(salary) as avgsal from emp group by dept;
Let's clean up all those zeros in AVGSAL and make the name nicer:
I have told CLPPLus that I want a leading $ and groups of thousands separated by commas followed by two digits after the decimal point.COLUMN AVGSAL FORMAT $999,999.00 JUSTIFY LEFT HEADING "AVG(SALARY)"
Now we finally want that comma separated list of the employees per department.
select dept, avg(salary) as avgsal, listagg(name, ',') within group(order by name) as emps from emp group by dept;
I don't like the wrapping. Let's just print the first 20 bytes and truncate the rest
That works great for this simple example, but how big can the text get?COLUMN EMPS FORMAT A20 TRUNCATED
By default DB2 will make the result of LISTAGG a VARCHAR(4000) (or a VARGRAPHIC(2000)).
If the first argument is bigger however DB2 will take it's derived size to derive the size of the result.
So if you expect really big departments you could do:
This would tell DB2 to hold space for up to 10000 bytes.LISTAGG(CAST(name AS VARCHAR(10000)), ',')
Note that at this point there is no support for CLOBs, so if you need to aggregate large objects the previously discussed XML solution may still have its merits.
As for the other parameters of LISTAGG the separator can be freely chosen. If you omit it the comma (",") is being used.
WITHIN GROUP is optional and is used to force an order within the result string.
Here is an example with a slightly different setting:
Truncation is kicking in. We need to increase the width of the row and relax the format of the column.select dept, avg(salary) as avgsal, listagg(name, ' earns more than ') within group(order by salary desc) as emps from emp group by dept;
SET LINESIZE 120