Overview (AGGREGATE command)
AGGREGATE aggregates groups of cases in the active dataset into single cases and either creates a new aggregated file or creates new variables in the active dataset that contain aggregated data. The values of one or more variables in the active dataset define the case groups. These variables are called break variables. A set of cases with identical values for each break variable is called a break group. If no break variables are specified, the entire dataset is a single break group. Aggregate functions are applied to source variables in the active dataset to create new aggregated variables that have one value for each break group.
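The grouping logic described above can be sketched outside SPSS. The following Python illustration is not SPSS syntax and is not how AGGREGATE is implemented. The data, the variable names (region, sales, mean_sales), and the helper aggregate_mean are all hypothetical. It shows how cases with identical break-variable values collapse into one aggregated case, and how an empty list of break variables treats the whole dataset as one break group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: "region" plays the role of the break variable,
# "sales" the role of the source variable.
cases = [
    {"region": "East", "sales": 10.0},
    {"region": "West", "sales": 30.0},
    {"region": "East", "sales": 20.0},
    {"region": "West", "sales": 50.0},
]

def aggregate_mean(cases, break_vars, source_var, target_var):
    """Return one aggregated case per break group, holding the group mean."""
    groups = defaultdict(list)
    for case in cases:
        # Cases with identical values on every break variable share a key.
        key = tuple(case[v] for v in break_vars)
        groups[key].append(case[source_var])
    result = []
    for key in sorted(groups):
        row = dict(zip(break_vars, key))
        row[target_var] = mean(groups[key])
        result.append(row)
    return result

# One aggregated case per region.
by_region = aggregate_mean(cases, ["region"], "sales", "mean_sales")

# No break variables: the entire dataset forms a single break group.
overall = aggregate_mean(cases, [], "sales", "mean_sales")
```

Here by_region holds two aggregated cases (one per region) and overall holds a single case covering all four input cases.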
Options
Data. You can create new variables in the active dataset that contain aggregated data, replace the active dataset with aggregated results, or create a new data file that contains the aggregated results.
Documentary Text. You can copy documentary text from the original file into the aggregated file using the DOCUMENT subcommand. By default, documentary text is dropped.
Aggregated Variables. You can create aggregated variables using any of 19 aggregate functions. The functions SUM, MEAN, and SD can aggregate only numeric variables. All other functions can use both numeric and string variables.
Labels and Formats. You can specify variable labels for the aggregated variables. Variables created with the functions MAX, MIN, FIRST, and LAST assume the formats and value labels of their source variables. All other variables assume the default formats described under Aggregate Functions.
Basic Specification
The basic specification is at least one aggregate function and source variable. The aggregate function creates a new aggregated variable in the active dataset.
Subcommand Order
- If specified, OUTFILE must be specified first.
- If specified, DOCUMENT and PRESORTED must precede BREAK. No other subcommand can be specified between these two subcommands.
- MISSING, if specified, must immediately follow OUTFILE.
- The aggregate functions must be specified last.
Operations
- When replacing the active dataset or creating a new data file, the aggregated file contains the break variables plus the variables created by the aggregate functions.
- AGGREGATE excludes cases with missing values from all aggregate calculations except those involving the functions N, NU, NMISS, and NUMISS.
- Unless otherwise specified, AGGREGATE sorts cases in the aggregated file in ascending order of the values of the break variables.
- PRESORTED uses a faster, less memory-intensive algorithm that assumes the data are already sorted into the desired groups.
- AGGREGATE ignores split-file processing. To achieve the same effect, name the variable or variables used to split the file as break variables before any other break variables. AGGREGATE produces one file, but the aggregated cases will then be in the same order as the split files.
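Two of the behaviors listed above, the exclusion of missing values from aggregate calculations and the ascending sort of the aggregated cases, can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not SPSS syntax or the AGGREGATE implementation. None stands in for a missing value, the helper summarize and all variable names are hypothetical, and returning None for a group with no valid values is an assumption made for this sketch. The n_missing column is only loosely analogous to a missing-value count such as NMISS.

```python
from statistics import mean

# Hypothetical cases; None stands in for a missing value.
cases = [
    {"region": "West", "sales": 50.0},
    {"region": "East", "sales": None},
    {"region": "East", "sales": 20.0},
    {"region": "West", "sales": 30.0},
    {"region": "East", "sales": 10.0},
]

def summarize(cases, break_var, source_var):
    """One row per break group: mean of valid values plus a missing count."""
    groups = {}
    for case in cases:
        groups.setdefault(case[break_var], []).append(case[source_var])
    rows = []
    # Emit groups in ascending order of the break variable.
    for key in sorted(groups):
        values = groups[key]
        valid = [v for v in values if v is not None]
        rows.append({
            break_var: key,
            # Missing values are left out of the mean entirely.
            # Assumption for this sketch: no valid values yields None.
            "mean_sales": mean(valid) if valid else None,
            # Missing values are counted here rather than dropped.
            "n_missing": len(values) - len(valid),
        })
    return rows

rows = summarize(cases, "region", "sales")
```

The East group's mean is computed from its two valid values only, while its missing case still contributes to n_missing. The rows come out ordered East, then West, regardless of input order.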