Including Missing Values (AGGREGATE command)

You can force a function to include user-missing values in its calculations by specifying a period after the function name.

  • AGGREGATE ignores periods used with the functions N, NU, NMISS, and NUMISS if these functions have no arguments.
  • User-missing values are treated as valid when these four functions are followed by a period and have a variable as an argument. For example, NMISS.(AGE) treats user-missing values as valid and thus counts only the cases for which AGE is system-missing.

The effect of specifying a period with N, NU, NMISS, and NUMISS is illustrated by the following:

N = N. = N(AGE) + NMISS(AGE) = N.(AGE) + NMISS.(AGE)

NU = NU. = NU(AGE) + NUMISS(AGE) = NU.(AGE) + NUMISS.(AGE)

  • The function N (the same as N. with no argument) yields a value for each break group that equals the number of cases with valid values (N(AGE)) plus the number of cases with user- or system-missing values (NMISS(AGE)).
  • This in turn equals the number of cases with either valid or user-missing values (N.(AGE)) plus the number with system-missing values (NMISS.(AGE)).
  • The same identities hold for the unweighted counterparts NU and NUMISS, as the second line above shows.
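
The identities above can be checked with a small model of the counting rules. The following Python sketch is not SPSS syntax. It assumes unweighted cases, represents the system-missing value as None, and uses a hypothetical user-missing code of 99 for AGE; the helper names n and nmiss are illustrative only.

```python
# Model of the N and NMISS counts for one break group (unweighted cases).
SYSMIS = None          # stand-in for the system-missing value
USER_MISSING = {99}    # hypothetical user-missing code for AGE


def n(values, include_user_missing=False):
    """N(AGE), or N.(AGE) when include_user_missing is True."""
    return sum(
        1
        for v in values
        if v is not SYSMIS
        and (include_user_missing or v not in USER_MISSING)
    )


def nmiss(values, include_user_missing=False):
    """NMISS(AGE), or NMISS.(AGE) when include_user_missing is True."""
    return len(values) - n(values, include_user_missing)


age = [25, 99, SYSMIS, 40]   # one break group of four cases
total = len(age)             # N with no argument: every case in the group

print(n(age), nmiss(age))               # N(AGE) and NMISS(AGE)
print(n(age, True), nmiss(age, True))   # N.(AGE) and NMISS.(AGE)
```

In this group, both sums equal the total of four cases: two valid plus two missing without the period, and three valid plus one system-missing with it.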

Default Treatment of Missing Values

AGGREGATE OUTFILE='AGGEMP.SAV' /MISSING=COLUMNWISE /BREAK=LOCATN
 /AVGSAL = MEAN(SALARY).
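  • Because MISSING=COLUMNWISE is specified, AVGSAL is missing for an aggregated case if SALARY is missing for any case in the break group. Without this subcommand, the mean is computed from the cases with nonmissing values of SALARY.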
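
The difference between the two treatments can be sketched as follows. This Python model is not SPSS syntax. It assumes that a missing SALARY (user- or system-missing) is represented as None, and the function names mean_default and mean_columnwise are hypothetical.

```python
# Model of MEAN(SALARY) for one break group under the two missing-value
# treatments. None stands for any missing value of SALARY.
MISSING = None


def mean_default(values):
    """Mean of the nonmissing values; missing only if none are valid."""
    valid = [v for v in values if v is not MISSING]
    return sum(valid) / len(valid) if valid else MISSING


def mean_columnwise(values):
    """Mean under MISSING=COLUMNWISE: missing if any case is missing."""
    if any(v is MISSING for v in values):
        return MISSING
    return sum(values) / len(values)


salary = [30000, MISSING, 50000]   # one break group with one missing case

print(mean_default(salary))      # based on the two nonmissing cases
print(mean_columnwise(salary))   # missing for the whole break group
```

A break group with no missing cases gives the same result under both treatments.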

Including User-Missing Values

AGGREGATE OUTFILE=* /BREAK=DEPT
 /LOVAC = PLT.(VACDAY,10).
  • LOVAC is the percentage of cases within each break group with values less than 10 for VACDAY, even if some of those values are defined as user missing.
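
The effect of the period on PLT can be modeled as shown below. This Python sketch is not SPSS syntax. It assumes that the percentage is based on the cases the function treats as valid, that system-missing cases are always excluded, and that 0 is a hypothetical user-missing code for VACDAY; the function name plt is illustrative.

```python
# Model of PLT(VACDAY,10) and PLT.(VACDAY,10) for one break group.
SYSMIS = None          # stand-in for the system-missing value
USER_MISSING = {0}     # hypothetical user-missing code for VACDAY


def plt(values, threshold, include_user_missing=False):
    """Percentage of usable cases with a value below threshold."""
    # Assumption: the denominator is the number of usable cases.
    usable = [
        v
        for v in values
        if v is not SYSMIS
        and (include_user_missing or v not in USER_MISSING)
    ]
    if not usable:
        return SYSMIS
    below = sum(1 for v in usable if v < threshold)
    return 100.0 * below / len(usable)


vacday = [0, 5, 12, 20, SYSMIS]

print(plt(vacday, 10))         # user-missing 0 is excluded
print(plt(vacday, 10, True))   # user-missing 0 counts as a value below 10
```

With the period, the user-missing value 0 enters both the count of values below 10 and the base of the percentage, so the result changes from one case in three to two cases in four.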

Aggregated Values that Retain Missing-Value Status

AGGREGATE OUTFILE='CLASS.SAV' /BREAK=GRADE
  /FIRSTAGE = FIRST.(AGE).
  • The first value of AGE in each break group is assigned to the variable FIRSTAGE.
  • If the first value of AGE in a break group is user missing, that value will be assigned to FIRSTAGE. However, the value will retain its missing-value status, since variables created with FIRST take dictionary information from their source variables.
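
The behavior of FIRST with and without the period can be modeled as follows. This Python sketch is not SPSS syntax. It assumes that system-missing cases are skipped in both forms and that 99 is a hypothetical user-missing code for AGE; the function name first is illustrative.

```python
# Model of FIRST(AGE) and FIRST.(AGE) for one break group.
SYSMIS = None          # stand-in for the system-missing value
USER_MISSING = {99}    # hypothetical user-missing code for AGE


def first(values, include_user_missing=False):
    """First usable value in case order, or SYSMIS if there is none."""
    for v in values:
        if v is SYSMIS:
            continue   # assumption: system-missing is never returned
        if include_user_missing or v not in USER_MISSING:
            return v
    return SYSMIS


age = [99, 14, 15]     # the first case holds the user-missing code

firstage = first(age, True)           # FIRST.(AGE)
print(first(age))                     # FIRST(AGE) skips the user-missing 99
print(firstage)                       # FIRST.(AGE) returns 99
print(firstage in USER_MISSING)       # 99 is still a user-missing code
```

The last line reflects the point made above: the value carried into FIRSTAGE is the same code that the source variable defines as user missing, so it remains missing in the aggregated file.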