Overview (SPLIT FILE command)
SPLIT FILE splits the active dataset into subgroups that can be analyzed separately.
These subgroups are sets of adjacent cases in the file that have the
same values for the specified split variables. Each value of each
split variable is considered a break group, and cases within a break
group must be grouped together in the active dataset. If they are
not grouped together, the SORT CASES command must be used before SPLIT FILE to sort cases in the proper order.
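The sort-then-split sequence described above can be sketched as follows. This is a minimal illustration, not taken from the original text; the variable names `region` and `salary` are hypothetical and are assumed to exist in the active dataset.

```spss
* Bring cases with the same region value together before splitting.
SORT CASES BY region.
SPLIT FILE BY region.
* DESCRIPTIVES now summarizes salary for each region group.
DESCRIPTIVES VARIABLES=salary.
```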
Basic Specification
The basic specification is keyword BY followed by the variable or variables that define the split-file groups.
- By default, the split-file groups are compared within the same table(s).
- You can turn off split-file processing by using keyword OFF.
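As a sketch of the basic specification, the following turns split-file processing on with BY and then off again with OFF. The variables `region`, `dept`, and `jobcat` are hypothetical and are assumed to exist in the active dataset.

```spss
SORT CASES BY region dept.
SPLIT FILE BY region dept.
* This procedure is run for each combination of region and dept.
FREQUENCIES VARIABLES=jobcat.
* Turn split-file processing off so later procedures use all cases together.
SPLIT FILE OFF.
FREQUENCIES VARIABLES=jobcat.
```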
Syntax Rules
- SPLIT FILE can specify both numeric and string split variables, including variables that are created by temporary transformations. SPLIT FILE cannot specify scratch or system variables.
- SPLIT FILE is in effect for all procedures in a session unless you limit it with a TEMPORARY command, turn it off, or override it with a new SPLIT FILE or SORT CASES command.
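One way to limit SPLIT FILE to a single procedure is to place it after TEMPORARY. The sketch below assumes hypothetical variables `region`, `jobcat`, and `salary` in the active dataset.

```spss
SORT CASES BY region.
TEMPORARY.
SPLIT FILE BY region.
* Split-file processing applies only to this procedure.
FREQUENCIES VARIABLES=jobcat.
* This procedure uses the whole file because the temporary split has ended.
DESCRIPTIVES VARIABLES=salary.
```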
Operations
- SPLIT FILE takes effect as soon as it is encountered in the command sequence. Therefore, pay special attention to the position of SPLIT FILE among commands. See the topic Command Order for more information.
- The file is processed sequentially. A change or break in values on any one of the split variables signals the end of one break group and the beginning of the next break group.
- AGGREGATE ignores the SPLIT FILE command. To split files by using AGGREGATE, name the variables that are used to split the file as break variables ahead of any other break variables on AGGREGATE. AGGREGATE still produces one file, but the aggregated cases are in the same order as the split-file groups.
- If SPLIT FILE is in effect when a procedure writes matrix materials, the program writes one set of matrix materials for every split group. If a procedure reads a file that contains multiple sets of matrix materials, the procedure automatically detects the presence of multiple sets.
- If SPLIT FILE names any variable that was defined by the NUMERIC command, the program prints page headings that indicate the split-file grouping.
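The AGGREGATE behavior noted above can be illustrated with a short sketch. It assumes that `region` is the variable that would otherwise be used on SPLIT FILE, that `dept` is an additional break variable, and that `salary` is the variable being summarized. All three names, the target variable `mean_salary`, and the output path are hypothetical.

```spss
* region would be the split variable, so it is listed first on BREAK.
AGGREGATE OUTFILE='/data/salary_by_group.sav'
  /BREAK=region dept
  /mean_salary=MEAN(salary).
```

The result is a single file with one case per combination of `region` and `dept`, ordered by `region` first.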
Limitations
- SPLIT FILE can specify or imply up to eight variables.
- Each eight bytes of a string variable counts as a variable toward the limit of eight variables. So a string variable with a defined width of greater than 64 bytes cannot be used as a split file variable.
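The counting rule can be worked through in a small sketch. The variable names and widths are assumptions made for illustration: `sitecode` is taken to be a string variable with a defined width of 24 bytes, and `region` and `dept` are taken to be numeric.

```spss
* Assumed: sitecode is a string variable with a defined width of 24 bytes.
* It counts as 24 / 8 = 3 variables, and region and dept count as 1 each.
* Total toward the limit of eight is 3 + 1 + 1 = 5.
SORT CASES BY sitecode region dept.
SPLIT FILE BY sitecode region dept.
```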