Overview (SUMMARIZE command)

SUMMARIZE produces univariate statistics for specified variables. You can compute the statistics within groups defined by one or more control (independent) variables. Another procedure that displays univariate statistics is FREQUENCIES.

Options

Cell Contents. By default, SUMMARIZE displays means, standard deviations, and cell counts. You can also use the CELLS subcommand to display aggregated statistics, including sums, variances, medians, ranges, kurtosis, and skewness, along with the standard errors of kurtosis and skewness.
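
As an illustrative sketch of the CELLS subcommand, the following command requests several statistics beyond the defaults. The variable names salary and jobcat are hypothetical; they stand in for a numeric dependent variable and a control variable in your own data.

```spss
* Hypothetical variables: salary (dependent) and jobcat (control).
SUMMARIZE
  /TABLES=salary BY jobcat
  /CELLS=COUNT MEAN MEDIAN SUM VARIANCE RANGE KURT SEKURT SKEW SESKEW.
```

Here SEKURT and SESKEW request the standard errors of kurtosis and skewness.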

Statistics. In addition to the statistics that are displayed for each cell of the table, you can use the STATISTICS subcommand to obtain a one-way analysis of variance and test of linearity.
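
A minimal sketch of a STATISTICS request follows. The variable names are hypothetical, and the example assumes that jobcat is a numeric control variable.

```spss
* Hypothetical variables: salary (dependent) and jobcat (numeric control).
SUMMARIZE
  /TABLES=salary BY jobcat
  /STATISTICS=ANOVA LINEARITY.
```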

Format. By default, SUMMARIZE produces a Summary Report with a total category for each group that is defined by the control variables. You can use the FORMAT subcommand to request a Listing Report with or without case numbers. You can also remove the total category from each group. You can use the TITLE and FOOTNOTE subcommands to specify a title and a caption for the Summary or Listing Report.
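
For illustration, the following sketch requests a Listing Report with case numbers, removes the total category, and adds a title and a footnote. The variable names and the quoted strings are hypothetical.

```spss
* Hypothetical variables: salary (dependent) and jobcat (control).
SUMMARIZE
  /TABLES=salary BY jobcat
  /FORMAT=LIST CASENUM NOTOTAL
  /TITLE='Salary Listing'
  /FOOTNOTE='Grouped by job category'.
```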

Basic Specification

The basic specification is TABLES with a variable list. Each variable creates a category. The keyword TABLES itself can be omitted.

  • The minimum specification is a dependent variable.
  • By default, SUMMARIZE displays a Case Processing Summary table, showing the number and percentage of cases that are included and excluded, and the total. SUMMARIZE also displays a Summary Report, showing means, standard deviations, and number of cases for each category.
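
The specifications above might be written as follows. This is a sketch only; salary and jobcat are hypothetical variable names.

```spss
* Minimum specification: one dependent variable, with TABLES omitted.
SUMMARIZE salary.

* The same dependent variable with the TABLES keyword and one control variable.
SUMMARIZE
  /TABLES=salary BY jobcat.
```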

Syntax Rules

  • Both numeric and string variables can be specified. String variables can be short or long.
  • If there is more than one TABLES subcommand, FORMAT=LIST or VALIDLIST results in an error.
  • String specifications for TITLE and FOOTNOTE must be enclosed in quotes and cannot exceed 255 bytes. When the specification spans multiple lines, enclose each line in quotes, and separate the specifications for each line by at least one blank.
  • Each subcommand (except TABLES) can be specified only once. If a subcommand is specified multiple times, a warning results, and the last specification is used.
  • Multiple TABLES subcommands are allowed, but multiple specifications can consume considerable computing resources and time.
  • There is no limit on the number of variables that you can specify on each TABLES subcommand.
  • When a variable is specified more than once, only the first occurrence is honored. If the same variables are specified after different BY keywords, an error results.
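
As a hedged illustration of these rules, the sketch below uses two BY keywords with a different control variable after each, and a title that is specified as two quoted strings separated by a blank. The variable names salary, jobcat, and gender and the title text are hypothetical.

```spss
* Hypothetical variables: salary, jobcat, and gender.
* Each control variable appears after only one BY keyword.
SUMMARIZE
  /TABLES=salary BY jobcat BY gender
  /TITLE='Salary Summary' 'by Job Category and Gender'.
```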

Limitations

  • Only 10 BY keywords can be specified.

Operations

  • The data are processed sequentially. It is not necessary to sort the cases before processing. If a BY keyword is used, the output is always sorted.
  • A Case Processing Summary table is always generated, showing the number and percentage of cases that are included and excluded, and the total.
  • For each combination of control variables specified after different BY keywords, SUMMARIZE produces a group in the Summary Report (depending on the specification on the FORMAT subcommand). By default, mean, standard deviation, and number of cases are displayed for each group and for the total.
  • An ANOVA table and a Measure of Association table are produced if additional statistics are requested on the STATISTICS subcommand.