Overview (SPLIT FILE command)
SPLIT FILE splits the active dataset into subgroups that can be analyzed
separately. These subgroups are sets of adjacent cases in the file that
have the same values for the specified split variables. Each value of each
split variable is considered a break group, and cases within a break
group must be grouped together in the active dataset. If they are
not grouped together, the SORT CASES command must be used before
SPLIT FILE to sort cases in the proper order.
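As a minimal sketch of the sort-then-split sequence described above, assuming a hypothetical active dataset with a grouping variable named region and a numeric variable named sales (both names are illustrative only):

```spss
* Group cases with the same region value together.
SORT CASES BY region.
* Treat each run of adjacent cases with the same region as one subgroup.
SPLIT FILE BY region.
* This procedure now reports statistics for each region.
DESCRIPTIVES VARIABLES=sales.
```

Without the SORT CASES step, cases that share a region value but are not adjacent in the file would fall into separate break groups.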
Basic Specification
The basic specification is keyword BY followed by the variable or variables that define the split-file groups.
- By default, the split-file groups are compared within the same table(s).
- You can turn off split-file processing by using keyword OFF.
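The following sketch shows the BY and OFF keywords together. It assumes hypothetical variables region, store, and product; none of these names come from the documentation itself.

```spss
SORT CASES BY region store.
* Define split-file groups from two variables.
SPLIT FILE BY region store.
* Reported for each combination of region and store.
FREQUENCIES VARIABLES=product.
* Turn off split-file processing.
SPLIT FILE OFF.
* Reported once, for the whole active dataset.
FREQUENCIES VARIABLES=product.
```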
Syntax Rules
- SPLIT FILE can specify both numeric and string split variables, including variables that are created by temporary transformations. SPLIT FILE cannot specify scratch or system variables.
- SPLIT FILE is in effect for all procedures in a session unless you limit it with a TEMPORARY command, turn it off, or override it with a new SPLIT FILE or SORT CASES command.
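One way to limit SPLIT FILE with TEMPORARY is sketched below, so that the split applies only to the next procedure. The variable names region, product, and sales are hypothetical.

```spss
SORT CASES BY region.
TEMPORARY.
SPLIT FILE BY region.
* The split applies to this procedure only.
FREQUENCIES VARIABLES=product.
* The split is no longer in effect for this procedure.
DESCRIPTIVES VARIABLES=sales.
```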
Operations
- SPLIT FILE takes effect as soon as it is encountered in the command sequence. Therefore, pay special attention to the position of SPLIT FILE among commands. See the topic Command Order for more information.
- The file is processed sequentially. A change or break in values on any one of the split variables signals the end of one break group and the beginning of the next break group.
- AGGREGATE ignores the SPLIT FILE command. To split files by using AGGREGATE, name the variables that are used to split the file as break variables ahead of any other break variables on AGGREGATE. AGGREGATE still produces one file, but the aggregated cases are in the same order as the split-file groups.
- If SPLIT FILE is in effect when a procedure writes matrix materials, the program writes one set of matrix materials for every split group. If a procedure reads a file that contains multiple sets of matrix materials, the procedure automatically detects the presence of multiple sets.
- If SPLIT FILE names any variable that was defined by the NUMERIC command, the program prints page headings that indicate the split-file grouping.
Limitations
- SPLIT FILE can specify or imply up to eight variables.
- Each eight bytes of a string variable counts as a variable toward the limit of eight variables. Therefore, a string variable with a defined width greater than 64 bytes (8 variables of 8 bytes each) cannot be used as a split-file variable.