Aggregating Data

You can use a UniVerse stage to aggregate data at the source instead of using an intermediate Aggregator stage if you are using a generated query.

By aggregating data you can add values in a particular column for all data records in a table. This summed total is then output from the stage.

You can aggregate data in two ways:

  • Using an Aggregator stage.
  • Using a UniVerse stage.

If you aggregate data using a UniVerse stage, the columns to group by and sum together are also specified by the SQL SELECT statement. To specify the columns to group by and summarize, you must edit the column definitions in the Columns grid on the Columns tab.

For example, if you have a sales database (Sales1) it might contain the following columns: Product, SaleDate, and QtySold. If this database is updated daily, you have a record of how many of each product are sold each day. However, if you want to know how many of each product were sold since 01/01/96 you need to specify a WHERE clause for the SaleDate and group (and summarize) the data.

Because you want the total for each product, you need to group all the occurrences of the same value in the Product column and sum the value in the QtySold column.

To group by a column, click in the Group cell for the column definition you want to group by and choose Yes from the list. In the example, you would choose the Product column to edit.

To summarize a column, click in the Derivation cell for the column you want to aggregate (using SUM or COUNT). By default the Derivation cell contains the name of the table and column in the format tablename.columnname. You can edit this cell to add SUM or COUNT. In the example, you would edit the Derivation cell for the QtySold column. The resulting expression would be SUM(Sales1.QtySold).

You can use the Expression Substitution dialog box to edit multiple Derivation cells at the same time. Select the columns and choose Derivation Substitution from the shortcut menu.

When you group by or summarize columns, the SQL statement is automatically updated to include the GROUP BY clause and the aggregation expression. To view the SQL statement, click the View SQL tab on the Outputs page.

For example, the SQL statement for the example would be:

SELECT Product, SUM(QtySold) FROM Sales1
   WHERE Saledate>=01/01/96
   GROUP BY Product;

See Using a WHERE Clause for details of how to use a WHERE clause.