Overview (SUMMARIZE command)
SUMMARIZE produces univariate statistics for specified variables. You can break the variables into groups defined by one or more control (independent) variables. Another procedure that displays univariate statistics is FREQUENCIES.
Options
Cell Contents. By default, SUMMARIZE displays means, standard deviations, and cell counts. You can also use the CELLS subcommand to display aggregated statistics, including sums, variances, medians, ranges, kurtosis, skewness, and their standard errors.
Statistics. In addition to the statistics that are displayed for each cell of the table, you can use the STATISTICS subcommand to obtain a one-way analysis of variance and a test of linearity.
Format. By default, SUMMARIZE produces a Summary Report with a total category for each group that is defined by the control variables. You can use the FORMAT subcommand to request a Listing Report with or without case numbers. You can also remove the total category from each group. You can use the TITLE and FOOTNOTE subcommands to specify a title and a caption for the Summary or Listing Report.
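As an illustration of these options, the following sketch combines them in a single command. The variable names salary and jobcat are hypothetical, and the title and footnote strings are placeholders. The keywords shown are a small selection, so check the full keyword list for each subcommand before relying on it.

```spss
* Hypothetical variables: salary (dependent), jobcat (control).
SUMMARIZE
  /TABLES=salary BY jobcat
  /CELLS=COUNT MEAN MEDIAN STDDEV
  /STATISTICS=ANOVA LINEARITY
  /FORMAT=NOLIST NOTOTAL
  /TITLE='Salary by job category'
  /FOOTNOTE='Illustrative example only'.
```

Here CELLS selects the cell statistics, STATISTICS requests the analysis of variance and the test of linearity, and FORMAT keeps the Summary Report while removing the total category.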
Basic Specification
The basic specification is TABLES with a variable list. Each variable creates a category. The actual keyword TABLES can be omitted.
- The minimum specification is a dependent variable.
- By default, SUMMARIZE displays a Case Processing Summary table, showing the number and percentage of cases included, excluded, and their total. SUMMARIZE also displays a Summary Report, showing means, standard deviations, and number of cases for each category.
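A minimal sketch of the basic specification, assuming a hypothetical numeric variable named salary, might look like this. Both forms should request the same default output, since the TABLES keyword is optional.

```spss
* Hypothetical variable: salary.
* Minimal form with the TABLES keyword written out.
SUMMARIZE /TABLES=salary.

* The same request with the TABLES keyword omitted.
SUMMARIZE salary.
```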
Syntax Rules
- Both numeric and string variables can be specified. String variables can be short or long.
- If there is more than one TABLES subcommand, FORMAT=LIST or VALIDLIST results in an error.
- String specifications for TITLE and FOOTNOTE must be enclosed in quotes and cannot exceed 255 bytes. When the specification breaks on multiple lines, enclose each line in quotes, and separate the specifications for each line by at least one blank.
- Each subcommand (except TABLES) can be specified only once. If a subcommand is specified multiple times, a warning results, and the last specification is used.
- Multiple TABLES subcommands are allowed, but multiple specifications use a lot of computer resources and time.
- There is no limit on the number of variables that you can specify on each TABLES subcommand.
- When a variable is specified more than once, only the first occurrence is honored. If the same variables are specified after different BY keywords, an error results.
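The quoting rule for TITLE and FOOTNOTE strings that span several lines can be sketched as follows. The variable names and the strings are hypothetical. Each line of the title is quoted separately, and the line break supplies the separating blank.

```spss
* Hypothetical variables: salary (dependent), jobcat (control).
* Each line of the title is enclosed in its own pair of quotes.
SUMMARIZE
  /TABLES=salary BY jobcat
  /TITLE='Salary summary'
         'by job category'.
```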
Limitations
- Only 10 BY keywords can be specified.
Operations
- The data are processed sequentially. It is not necessary to sort the cases before processing. If a BY keyword is used, the output is always sorted.
- A Case Processing Summary table is always generated, showing the number and percentage of the cases included, excluded, and the total.
- For each combination of control variables specified after different BY keywords, SUMMARIZE produces a group in the Summary Report (depending on the specification on the FORMAT subcommand). By default, mean, standard deviation, and number of cases are displayed for each group and for the total.
- An ANOVA table and a Measure of Association table are produced if additional statistics are requested.
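To make the grouping behavior concrete, here is a small sketch with two BY keywords, assuming hypothetical variables salary, jobcat, and gender. Following the description above, each combination of a jobcat value and a gender value would form one group in the Summary Report.

```spss
* Hypothetical variables: salary (dependent), jobcat and gender (controls).
* Control variables after different BY keywords define the group combinations.
SUMMARIZE
  /TABLES=salary BY jobcat BY gender.
```

The data do not need to be sorted by jobcat or gender beforehand, because the output is sorted whenever a BY keyword is used.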