LISTAGG aggregate function

The LISTAGG function aggregates a set of strings into one string by concatenating the strings. Optionally, a separator string can be provided which is inserted between contiguous input strings.

Read syntax diagramSkip visual syntax diagramLISTAGG(ALLDISTINCT string-expression,separator)WITHIN GROUP(ORDER BY,sort-keyASCDESC)

The schema is SYSIBM.

The LISTAGG function aggregates a set of string values for a group into one string by appending the string-expression values based on the order that is specified in the WITHIN GROUP clause.

The function is applied to the set of values that are derived from the first argument by the elimination of null values. If a separator argument is specified that is not the null value, the separator value is inserted between each pair of non-null string-expression values.

string-expression
An expression that specifies the string values to aggregate. The expression must return a value that is a built-in character string, graphic string, or a binary string data type that is not a LOB. If the value is a CLOB, it is implicitly cast to VARCHAR before the function is evaluated up to a maximum length of 32704. If the value is a DBCLOB, it is implicitly cast to VARGRAPHIC before the function is evaluated up to a maximum length of 16352. If the value is a BLOB, it is implicitly cast to VARBINARY before the function is evaluated up to a maximum length of 32704.

The argument can also be a numeric data type. The numeric argument is implicitly cast to a VARCHAR data type.

If string-expression is a column encoded by a field procedure, the decoded value is used in the result.

separator
A constant expression that defines the string that is to be used between non-null string-expression values. The expression must return a value that is a built-in character string, graphic string, or a binary string data type.

LOBs are supported through implicit casting. If the value is a CLOB, it is implicitly cast to VARCHAR before the function is evaluated up to a maximum length of 32704. If the value is a DBCLOB, it is implicitly cast to VARGRAPHIC before the function is evaluated up to a maximum length of 16352. If the value is a BLOB, it is implicitly cast to VARBINARY before the function is evaluated up to a maximum length of 32704.

The separator can be a literal, special register, variable, or an expression that is based on literals, special registers, or variables, provided that the expression does not include a non-deterministic function or a function that takes an external action.

WITHIN GROUP
Indicates that the aggregation follows the specified ordering within the grouping set.

If WITHIN GROUP is not specified, the ordering of strings within the result is not deterministic.

ORDER BY
Start of changeSpecifies the order of the rows from the same grouping set that are processed in the aggregation. If the ORDER BY clause cannot distinguish the order of the column data, the rows in the same grouping set are arbitrarily ordered.
sort-key
The sort key can be a column name or a sort-key-expression. If the sort key is a constant, it does not refer to the position of the output column (as in the ORDER BY clause of a query); It is a constant, which implies no sort key.

If the sort key value is a constant, the constant does not refer to the position of the output column, but is simply a constant, which implies that there is no sort key.

sort-key must not include a scalar fullselect, or any function that is non-deterministic or has an external action.

sort-key must not reference a column for which a column mask is defined. Start of changeFL 506 This restriction is removed in application compatibility level V13R1M506 or higher, and the masked values are used to evaluate the expression.End of change

If sort-key is encoded by a field procedure, the encoded value determines the order.

ASC
Processes the sort-key in ascending order. This is the default option.
DESC
Processes the sort-key in descending order.
End of change

Results for LISTAGG

The result data type of LISTAGG is based on the data type of string-expression, as shown in the following table:

Data type of string-expression Result data type and length
CHAR(n) or VARCHAR(n) VARCHAR(MIN(MAX(4000,n), 32704)
GRAPHIC(n) or VARGRAPHIC(n) VARGRAPHIC(MIN(MAX(2000,n), 16352)
BINARY(n) or VARBINARY(n) VARBINARY(MIN(MAX(4000,n), 32704)

The result data type can exceed VARCHAR(4000), VARBINARY(4000), or VARGRAPHIC(2000), if a derived size is used to determine the size of the result. The maximum possible value is the maximum for the result data type. The following example yields a return data type of VARCHAR(10000):

LISTAGG(CAST(NAME AS VARCHAR(10000)),',')

If the actual length of the aggregated result string exceeds the maximum for the result data type, an error is returned.

The result can be null. If the function is applied to an empty set, or all of the string-expression values in the set are null values, the result is a null value.

The following rules also apply:

  • If the string-expression and separator are both bit data, the result is bit data. Otherwise, neither string-expression nor separator can be bit data.
  • If string-expression and separator are both SBCS Unicode data, the CCSID of the result is the CCSID for SBCS Unicode data.
  • If string-expression is SBCS Unicode data, and separator is not SBCS Unicode data, the CCSID of the result is the mixed CCSID for Unicode data.
  • Otherwise, the CCSID of the result is the mixed CCSID that corresponds to the CCSID of string-expression. However, if the input is EBCDIC or ASCII and there is no corresponding system CCSID for mixed, the CCSID of the result is the CCSID of string-expression.
Results for accelerators: When multiple tables are joined, the result can differ when LISTAGG runs on an accelerator instead of Db2 for z/OS®. To guarantee the same result from both environments, you can use an ORDER BY clause in a unique sort-key expression. This approach ensures that the result set is the always the same. However, the order of rows in the result can still differ, unless the statement also specifies a unique GROUP BY clause for the sort-key expression. Start of changeFL 504 The ORDER BY clause can be used to specify the order.End of change

Rules for LISTAGG

The following restrictions apply to LISTAGG:

  • LISTAGG cannot be used as part of an OLAP specification.
  • If DISTINCT is specified for LISTAGG, the following rules apply:
    • The ORDER BY clause must be specified.
    • The first sort-key of the ORDER BY specification must exactly match string-expression.
    • Start of changeFL 506 The string-expression must not reference a column that has a column mask.End of change
  • Start of changeFL 504 A fullselect that contains an invocation of LISTAGG can contain an ORDER BY clause.End of change
  • A fullselect that contains an invocation of LISTAGG cannot contain a DISTINCT keyword in its SELECT list.
  • A SELECT clause that includes an invocation of the LISTAGG function must not also include an invocation of the ARRAY_AGG function or the XMLAGG function.

Examples for LISTAGG

  • The following example produces an alphabetical list of comma-separated names, grouped by department, from the sample employee table.
    SELECT WORKDEPT,
     LISTAGG(LASTNAME, ', ') WITHIN GROUP(ORDER BY LASTNAME)
     AS EMPLOYEES
     FROM EMP
     GROUP BY WORKDEPT;

    The following result is returned.

    ---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
    WORKDEPT  EMPLOYEES                                                             
    ---------+---------+---------+---------+---------+---------+---------+---------+---------+-------
    A00       HAAS, HEMMINGER, LUCCHESI, O'CONNELL, ORLANDO                         
    B01       THOMPSON                                                              
    C01       KWAN, NATZ, NICHOLLS, QUINTANA                                        
    D11       ADAMSON, BROWN, JOHN, JONES, LUTZ, PIANKA, SCOUTTEN, STERN, WALKER, YAMAMOTO, YOSHIMURA
    D21       JEFFERSON, JOHNSON, MARINO, MONTEVERDE, PEREZ, PULASKI, SMITH         
    E01       GEYER                                                                 
    E11       HENDERSON, PARKER, SCHNEIDER, SCHWARTZ, SETRIGHT, SMITH, SPRINGER     
    E21       ALONZO, GOUNOT, LEE, MEHTA, SPENSER, WONG                             
    
  • Start of changeFL 506 The following example shows how the result returned by the LISTAGG function can change if the sort-key expression references a column for which a column mask is defined. For this example, assume that an EMPLOYEE table is created with the following statement.
    CREATE TABLE EMPLOYEE ( STATE CHAR(2) NOT NULL,
                            SALARY INTEGER,             
                            BONUS  INTEGER,             
                            NAME  VARCHAR(20) NOT NULL); 
    

    Also assume that a column mask is defined on the SALARY column with the following CREATE MASK statement.

    CREATE MASK SALARY_MASK ON EMPLOYEE
        FOR COLUMN SALARY
            RETURN
                CASE
                    WHEN (VERIFY_GROUP_FOR_USER(SESSION_USER, 'MANAGER') = 1) THEN
                        SALARY
                    ELSE  0
                END
        ENABLE;
    

    Also assume that the following query returns the following result when a manager issues it.

    SELECT STATE,                            
                 CASE WHEN(SALARY >50000) THEN NAME
                          ELSE 'XXXX'                  
                      END AS NAME,                      
                  SALARY                            
          FROM EMPLOYEE; 
    
    
           +---------------------------------------------------------+ 
           | STATE |     NAME    |         SALARY                    | 
           +---------------------------------------------------------+ 
           | CA    |     Jerry   |         80000                     | 
           | CA    |     Tom     |         70000                     | 
           | LU    |     XXXX    |         40000                     | 
           | LU    |     XXXX    |         10000                     | 
           | LU    |     XXXX    |         50000                     | 
           | MA    |     Sarah   |         90000                     | 
           + --------------------------------------------------------+ 
    

    Now consider the following LISTAGG statement.

    SELECT STATE,
               LISTAGG (NAME,',') WITHIN GROUP (ORDER BY SALARY ASC)
        FROM EMPLOYEE
        GROUP BY STATE;
    

    If a manager issues the example SELECT statement, it returns the following result. The column mask did not apply, so the names in each row are ordered based on the unmasked SALARY values.

    
        +----------------------------------------+ 
        | STATE |                                | 
        +----------------------------------------| 
        | CA    | Tom,Jerry                      | 
        | LU    | Ben,Linda,Mary                 | 
        | MA    | Sarah                          |
        +----------------------------------------+  
    

    However, if a non-manager employee issues the example SELECT statement, the column mask applies before the sort-key expression is evaluated. The result is that the masked SALARY value 0 is used for every employee when the ORDER BY is evaluated, so the names in each row are returned in a different order.

    
        +----------------------------------------+ 
        | STATE    |                             | 
        +----------------------------------------| 
        | CA       | Jerry,Tom                   | 
        | LU       | Linda,Ben,Mary              | 
        | MA       | Sarah                       | 
        +----------------------------------------+  
    
    End of change