Overview (ADD FILES command)
ADD FILES combines cases from 2 to 50 open datasets or external IBM® SPSS® Statistics data files by concatenating or interleaving cases. When cases are concatenated, all cases from one file are added to the end of all cases from another file. When cases are interleaved, cases in the resulting file are ordered according to the values of one or more key variables.
The files specified on ADD FILES can be external IBM SPSS Statistics data files, currently open datasets, or both. The combined file becomes the new active dataset.
In general, ADD FILES is used to combine files containing the same variables but different cases. To combine files containing the same cases but different variables, use MATCH FILES. To update existing IBM SPSS Statistics data files, use UPDATE.
Options
Variable Selection. You can specify which variables from each input file are included in the new active dataset using the DROP and KEEP subcommands.
Variable Names. You can rename variables in each input file before combining the files using the RENAME subcommand. This permits you to combine variables that are the same but whose names differ in different input files or to separate variables that are different but have the same name.
Variable Flag. You can create a variable that indicates whether a case came from a particular input file using the IN subcommand. When interleaving cases, you can use the FIRST or LAST subcommands to create a variable that flags the first or last case of a group of cases with the same value for the key variable.
Variable Map. You can request a map showing all variables in the new active dataset, their order, and the input files from which they came using the MAP subcommand.
Basic Specification
- The basic specification is two or more FILE subcommands, each of which specifies a file to be combined. If cases are to be interleaved, the BY subcommand specifying the key variables is also required.
- All variables from all input files are included in the new active dataset unless DROP or KEEP is specified.
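The two forms of the basic specification might look like the following minimal sketch. The file paths and the key variable grade are assumptions made for illustration, and the interleaving form assumes both files are already sorted by grade.

```spss
* Concatenate: all cases from the second file follow all cases from the first.
ADD FILES FILE='/data/school1.sav'
  /FILE='/data/school2.sav'.
EXECUTE.

* Interleave: both files are assumed to be sorted by grade already.
ADD FILES FILE='/data/school1.sav'
  /FILE='/data/school2.sav'
  /BY grade.
EXECUTE.
```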
Subcommand Order
- RENAME and IN must immediately follow the FILE subcommand to which they apply.
- BY, FIRST, and LAST must follow all FILE subcommands and their associated RENAME and IN subcommands.
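The ordering rules above can be sketched as follows. The file paths and variable names (teacher_id, teacher, s1, s2, grade, firstingrade) are hypothetical, and both files are assumed to be sorted by grade.

```spss
* Hypothetical files and variables, for illustration only.
ADD FILES FILE='/data/school1.sav'
  /IN=s1
  /FILE='/data/school2.sav'
  /RENAME=(teacher_id=teacher)
  /IN=s2
  /BY grade
  /FIRST=firstingrade.
EXECUTE.
```

Each RENAME and IN sits directly after the FILE subcommand it modifies, while BY and FIRST appear only after the last FILE and its associated subcommands.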
Syntax Rules
- RENAME can be repeated after each FILE subcommand. RENAME applies only to variables in the file named on the FILE subcommand immediately preceding it.
- BY can be specified only once. However, multiple key variables can be specified on BY. When BY is used, all files must be sorted in ascending order by the key variables (see SORT CASES).
- FIRST and LAST can be used only when files are interleaved (when BY is used).
- MAP can be repeated as often as desired.
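One way to satisfy the sorting requirement is to open each file, sort it by the key variables, and give it a dataset name before interleaving. The sketch below assumes two hypothetical files that both contain the key variables grade and student. The dataset names and the flag variable lastingroup are likewise invented for this example.

```spss
* Hypothetical files; each is sorted by the key variables before interleaving.
GET FILE='/data/school1.sav'.
SORT CASES BY grade student.
DATASET NAME school1.

GET FILE='/data/school2.sav'.
SORT CASES BY grade student.
DATASET NAME school2.

ADD FILES FILE=school1
  /FILE=school2
  /BY grade student
  /LAST=lastingroup.
EXECUTE.
```

BY appears once with two key variables, and LAST is allowed here only because BY is present.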
Operations
- ADD FILES reads all input files named on FILE and builds a new active dataset. ADD FILES is executed when the data are read by one of the procedure commands or the EXECUTE, SAVE, or SORT CASES commands.
- If the current active dataset is included and is specified with an asterisk (FILE=*), the new merged dataset replaces the active dataset. If that dataset is a named dataset, the merged dataset retains that name. If the current active dataset is not included or is specified by name (for example, FILE=Dataset1), a new unnamed, merged dataset is created, and it becomes the active dataset. For information on naming datasets, see DATASET NAME.
- The resulting file contains complete dictionary information from the input files, including variable names, labels, print and write formats, and missing-value indicators. It also contains the documents from each input file. See DROP DOCUMENTS for information on deleting documents.
- For each variable, dictionary information is taken from the first file containing value labels, missing values, or a variable label for the common variable. If the first file has no such information, ADD FILES checks the second file, and so on, seeking dictionary information.
- Variables are copied in order from the first file specified, then from the second file specified, and so on. Variables that are not contained in all files receive the system-missing value for cases that do not have values for those variables.
- If the same variable name exists in more than one file but the format type (numeric or string) does not match, the command is not executed.
- If a numeric variable has the same name but different formats (for example, F8.0 and F8.2) in different input files, the format of the variable in the first-named file is used.
- If a string variable has the same name but different formats (for example, A24 and A16) in different input files, the command is not executed.
- If the active dataset is named as an input file, any N and SAMPLE commands that have been specified are applied to the active dataset before the files are combined.
- If only one of the files is weighted, the program turns weighting off when combining cases from the two files. To weight the cases, use the WEIGHT command again.
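A minimal sketch of two of these points, under stated assumptions: an active dataset is already open and weighted by a variable called sampwgt, and a second, unweighted file wave2.sav also contains sampwgt. Both names are hypothetical.

```spss
* Assumes an active dataset is already open and was weighted by sampwgt.
* The file wave2.sav and the variable sampwgt are hypothetical.
ADD FILES FILE=*
  /FILE='/data/wave2.sav'.
EXECUTE.
WEIGHT BY sampwgt.
```

Because the active dataset is referenced with an asterisk, the merged result replaces it. Because only one input is weighted, weighting is turned off by the merge, so WEIGHT is issued again afterward.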
Limitations
- A maximum of 50 files can be combined on one ADD FILES command.
- The TEMPORARY command cannot be in effect if the active dataset is used as an input file.