Overview (FILE TYPE-END FILE TYPE command)
The FILE TYPE-END FILE TYPE structure defines data for any one of the three types of complex raw data files: mixed files, which contain several types of records that define different types of cases; hierarchical or nested files, which contain several types of records with a defined relationship among the record types; or grouped files, which contain several records for each case with some records missing or duplicated. A fourth type of complex file, files with repeating groups of information, can be defined with the REPEATING DATA command.
FILE TYPE must be followed by at least one RECORD TYPE command and one DATA LIST command. Each pair of RECORD TYPE and DATA LIST commands defines one type of record in the data. END FILE TYPE signals the end of file definition.
Within the FILE TYPE structure, the lowest-level record in a nested file can be read with a REPEATING DATA command rather than a DATA LIST command. In addition, any record in a mixed file can be read with REPEATING DATA.
Basic Specification
The basic specification on FILE TYPE is one of the three file type keywords (MIXED, GROUPED, or NESTED) and the RECORD subcommand. RECORD names the record identification variable and specifies its column location. If keyword GROUPED is specified, the CASE subcommand is also required. CASE names the case identification variable and specifies its column location.
The FILE TYPE-END FILE TYPE structure must enclose at least one RECORD TYPE and one DATA LIST command. END FILE TYPE is required to signal the end of file definition.
- RECORD TYPE specifies the values of the record type identifier (see RECORD TYPE).
- DATA LIST defines variables for the record type specified on the preceding RECORD TYPE command (see DATA LIST).
- Separate pairs of RECORD TYPE and DATA LIST commands must be used to define each different record type.
The resulting active dataset is always a rectangular file, regardless of the structure of the original data file.
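The basic specification for a grouped file can be sketched as follows. This is a minimal illustration, not a definitive setup: the variable names (#TEST, STUDENT, ENGLISH, READING, MATH), the column locations, and the inline data are all assumptions made for the example.

```
* Hypothetical grouped file: one record per test, grouped by student.
FILE TYPE GROUPED RECORD=#TEST 6 CASE=STUDENT 1-4.
RECORD TYPE 1.
DATA LIST /ENGLISH 8-9 (A).
RECORD TYPE 2.
DATA LIST /READING 8-10.
RECORD TYPE 3.
DATA LIST /MATH 8-10.
END FILE TYPE.

BEGIN DATA
0001 1 B+
0001 2  74
0001 3  83
0002 1 A
0002 3  71
END DATA.
LIST.
```

Under these assumptions, the five input records should be combined into two cases, one per value of STUDENT, each holding ENGLISH, READING, and MATH. Student 0002 has no type 2 record, so READING is expected to be system-missing for that case, and the missing record may produce a warning.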
Syntax Rules
- For mixed files, if the record types have different variables or if they have the same variables recorded in different locations, separate RECORD TYPE and DATA LIST commands are required for each record type.
- For mixed files, the same variable name can be used on different DATA LIST commands, since each record type defines a separate case.
- For mixed files, if the same variable is defined for more than one record type, the format type and length of the variable should be the same on all DATA LIST commands. The program refers to the first DATA LIST command that defines a variable for the print and write formats to include in the dictionary of the active dataset.
- For grouped and nested files, the variable names on each DATA LIST must be unique, since a case is built by combining all record types together into a single record.
- For nested files, the order of the RECORD TYPE commands defines the hierarchical structure of the file. The first RECORD TYPE defines the highest-level record type, the next RECORD TYPE defines the next highest-level record, and so forth. The last RECORD TYPE command defines a case in the active dataset. By default, variables from higher-level records are spread to the lowest-level record.
- For nested files, the SPREAD subcommand on RECORD TYPE can be used to spread the values in a record type only to the first case built from each record of that type. All other cases associated with that record are assigned the system-missing value for the variables defined on that type. See RECORD TYPE for more information.
- String values specified on the RECORD TYPE command must be enclosed in quotes.
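The ordering rule for nested files can be illustrated with a two-level hierarchy. This is a sketch under stated assumptions: the variable names (#RECID, DEPT, EMPID, SALARY), the column locations, and the inline data are hypothetical.

```
* Hypothetical nested file: department records (type 1) own employee records (type 2).
FILE TYPE NESTED RECORD=#RECID 1.
RECORD TYPE 1.
DATA LIST /DEPT 3-5 (A).
RECORD TYPE 2.
DATA LIST /EMPID 3-5 SALARY 7-11.
END FILE TYPE.

BEGIN DATA
1 ENG
2 101 52000
2 102 61000
1 OPS
2 201 48000
END DATA.
LIST.
```

Because RECORD TYPE 2 is specified last, each type 2 record should build one case, giving three cases in this example, with the DEPT value from the preceding type 1 record spread to each of them. Variable names are unique across the two DATA LIST commands, as the rules above require for nested files. Specifying SPREAD=NO on the first RECORD TYPE would be expected to leave DEPT system-missing on all but the first case built from each department record; see RECORD TYPE for the exact specification.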
Operations
- For mixed file types, the program skips all records that are not specified on one of the RECORD TYPE commands.
- If different variables are defined for different record types in mixed files, the variables are assigned the system-missing value for those record types on which they are not defined.
- For nested files, the first record in the file should be the type specified on the first RECORD TYPE command (the highest level of the hierarchy). If the first record in the file is not the highest-level type, the program skips all records until it encounters a record of the highest-level type. If MISSING or DUPLICATE has been specified, these records may produce warning messages but will not be used to build a case in the active dataset.
- When defining complex files, you are effectively building an input program and can use only commands that are allowed in the input state.