Creating a new aggregated data file versus appending aggregated variables (AGGREGATE command)
When you create a new aggregated data file with OUTFILE='file specification' or OUTFILE=* MODE=REPLACE, the new file contains:
- The break variables from the original data file and the new aggregate variables defined by the aggregate functions. Original variables other than the break variables are not retained.
- One case for each group defined by the break variables. If there is one break variable with two values, the new data file will contain only two cases.
When you append aggregate variables to the active dataset with OUTFILE=* MODE=ADDVARIABLES, the modified data file contains:
- All of the original variables plus all of the new variables defined by the aggregate functions, with the aggregate variables appended to the end of each case.
- The same number of cases as the original data file. The data file itself is not aggregated. Each case with the same values of the break variables receives the same values for the new aggregate variables. For example, if gender is the only break variable, all males receive the same value for a new aggregate variable that represents the average age of males.
Example
DATA LIST FREE /age (F2) gender (F2).
BEGIN DATA
25 1
35 1
20 2
30 2
60 2
END DATA.
*create new file with aggregated results.
AGGREGATE
/OUTFILE='/temp/temp.sav'
/BREAK=gender
/age_mean=MEAN(age)
/groupSize=N.
*append aggregated variables to active dataset (all five cases are kept).
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=gender
/age_mean=MEAN(age)
/groupSize=N.
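The example above does not show the third form mentioned earlier, OUTFILE=* MODE=REPLACE. A minimal sketch of it follows, assuming the active dataset still holds the five example cases. With the sample data, the resulting active dataset should contain two cases: gender 1 with a mean age of 30 and a group size of 2, and gender 2 with a mean age of about 36.67 (110/3) and a group size of 3.
*replace active dataset with aggregated results (sketch, assumes the example data above).
AGGREGATE
/OUTFILE=* MODE=REPLACE
/BREAK=gender
/age_mean=MEAN(age)
/groupSize=N.
Because this form overwrites the active dataset, the original case-level values of age are no longer available afterward unless the data were saved beforehand.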

