CASE Subcommand (FILE TYPE-END FILE TYPE command)

CASE specifies a name and column location for the case identification variable. CASE is required for grouped files and optional for nested files. It cannot be used with mixed files.

  • For grouped files, each unique value for the case identification variable defines a case in the active dataset.
  • For nested files, the case identification variable identifies the highest-level record of the hierarchy. The program issues a warning message for each record whose case identification number differs from the one on the most recent highest-level record. However, the record with the mismatched case number is still used to build the case.
  • The column location of the case identifier is required. The variable name is optional.
  • If you do not want to save the case identification variable, you can assign a scratch variable name by using the # character as the first character of the name. If a variable name is not specified on CASE, the case identifier is defined as the scratch variable ####CASE.
  • A column-style format can be specified for the case identifier. For example, the following two specifications are valid:
CASE=V1 1-2(N)
CASE=V1 1-2(F,1)

FORTRAN-like formats cannot be used because the column location must be specified explicitly.

  • Specify A in parentheses after the column location to define the case identification variable as a string variable.
  • If the case identification number is not in the same columns on all record types, use the CASE subcommand on the RECORD TYPE commands as well as on the FILE TYPE command (see RECORD TYPE).
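
The naming and format options in the list above can be sketched as follows. These are illustrative fragments only, assuming a grouped file whose record identifier is in column 6. The names STUDENT, #STUDENT, and #REC are placeholders, and each FILE TYPE line is an alternative specification, not a sequence to run together.

* Case identifier saved as the numeric variable STUDENT.
FILE TYPE GROUPED RECORD=#REC 6 CASE=STUDENT 1-4.
* Case identifier held in a scratch variable and not saved.
FILE TYPE GROUPED RECORD=#REC 6 CASE=#STUDENT 1-4.
* No name given, so the identifier becomes the scratch variable ####CASE.
FILE TYPE GROUPED RECORD=#REC 6 CASE=1-4.
* Case identifier read as a string variable.
FILE TYPE GROUPED RECORD=#REC 6 CASE=STUDENT 1-4 (A).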

Example

* A grouped file of student test scores.
  
FILE TYPE GROUPED RECORD=#TEST 6 CASE=STUDENT 1-4.
RECORD TYPE 1.
DATA LIST  /ENGLISH 8-9 (A).
RECORD TYPE 2.
DATA LIST /READING 8-10.
RECORD TYPE 3.
DATA LIST /MATH 8-10.
END FILE TYPE.
 
BEGIN DATA
0001 1 B+
0001 2  74
0001 3  83
0002 1 A
0002 2 100
0002 3  71
0003 1 B-
0003 2  88
0003 3  81
0004 1 C
0004 2  94
0004 3  91
END DATA.
  • CASE is required for grouped files. CASE specifies variable STUDENT, located in columns 1–4, as the case identification variable.
  • The data contain four different values for STUDENT. The active dataset therefore has four cases, one for each value of STUDENT. In a grouped file, each unique value for the case identification variable defines a case in the active dataset.
  • Each case includes the case identification variable plus the variables defined for each record type. The values for #TEST are not saved in the active dataset. Thus, each case in the active dataset has four variables: STUDENT, ENGLISH, READING, and MATH.
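
As a rough check of the points above, the active dataset built from these data would be expected to hold the values shown below. The LIST command is not part of the original example; it is added here only as an assumed way to inspect the result, placed after END DATA. The table is derived by hand from the sample records and shows the values, not the exact layout of LIST output.

LIST VARIABLES=STUDENT ENGLISH READING MATH.

STUDENT  ENGLISH  READING  MATH
      1  B+            74    83
      2  A            100    71
      3  B-            88    81
      4  C             94    91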

Example

* A nested file of accident records.
 
FILE TYPE NESTED RECORD=6 CASE=ACCID 1-4.
RECORD TYPE 1.
DATA LIST   /ACC_ID 9-11 WEATHER 12-13 STATE 15-16 (A) DATE 18-24 (A).
RECORD TYPE 2.
DATA LIST /STYLE 11 MAKE 13 OLD 14 LICENSE 15-16 (A) INSURNCE 18-21 (A).
RECORD TYPE 3.
DATA LIST /PSNGR_NO 11 AGE 13-14 SEX 16 (A) INJURY 18 SEAT 20-21 (A)
           COST 23-24.
END FILE TYPE.
 
BEGIN DATA
0001 1  322 1 IL 3/13/88   /* Type 1:  accident record
0001 2    1 44MI 134M      /* Type 2:    vehicle record
0001 3    1 34 M 1 FR  3   /* Type 3:       person record
0001 2    2 16IL 322F      /*            vehicle record
0001 3    1 22 F 1 FR 11   /*               person record
0001 3    2 35 M 1 FR  5   /*               person record
0001 3    3 59 M 1 BK  7   /*               person record
0001 2    3 21IN 146M      /*            vehicle record
0001 3    1 46 M 0 FR  0   /*               person record
END DATA.
  • CASE specifies variable ACCID, located in columns 1–4, as the case identification variable. ACCID identifies the highest level of the hierarchy: the level for the accident records.
  • As each case is built, the value of ACCID on each record is checked against the value of ACCID on the most recent highest-level record (record type 1). If the values do not match, a warning message is issued. However, the record is still used to build the case.
  • The data in this example contain only one value for ACCID, and that value is spread to every case. In a nested file, the lowest-level record type determines the number of cases in the active dataset. In this example, the active dataset has five cases because there are five person records.
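
To make the case count concrete, the five cases built from these data can be laid out as below. This is a hand-derived sketch from the sample records that shows only a few of the variables; it is not program output, and the LIST command (placed after END DATA) is an addition to the original example. Values from the accident and vehicle records are repeated on each person-level case.

LIST VARIABLES=ACCID ACC_ID STYLE PSNGR_NO AGE COST.

ACCID  ACC_ID  STYLE  PSNGR_NO  AGE  COST
    1     322      1         1   34     3
    1     322      2         1   22    11
    1     322      2         2   35     5
    1     322      2         3   59     7
    1     322      3         1   46     0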

Example

* Specifying CASE on the RECORD TYPE command.
 
FILE TYPE GROUPED FILE=HUBDATA RECORD=#RECID 80 CASE=ID 1-5.
RECORD TYPE 1.
DATA LIST   /MOHIRED YRHIRED 12-15 DEPT79 TO DEPT82 SEX 16-20.
RECORD TYPE 2.
DATA LIST   /SALARY79 TO SALARY82 6-25 HOURLY81 HOURLY82 40-53 (2)
            PROMO81 72  AGE 54-55 RAISE82 66-70.
RECORD TYPE 3  CASE=75-79.
DATA LIST   /JOBCAT 6 NAME 25-48 (A).
END FILE TYPE.
  • The CASE subcommand on FILE TYPE indicates that the case identification number is located in columns 1–5. However, for type 3 records, the case identification number is located in columns 75–79. The CASE subcommand is therefore specified on the third RECORD TYPE command to override the case setting for type 3 records.
  • The format of the case identification variable must be the same on all records. If the case identification variable is defined as a string on the FILE TYPE command, it cannot be defined as a numeric variable on the RECORD TYPE command, and vice versa.
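
The format rule can be sketched with the two commands below, shown as fragments of the HUBDATA example. Writing (A) after the column location on RECORD TYPE is an assumption that the format is specified there in the same way as on FILE TYPE; check the RECORD TYPE documentation before relying on it.

* Case identifier declared as a string in both places.
FILE TYPE GROUPED FILE=HUBDATA RECORD=#RECID 80 CASE=ID 1-5 (A).
RECORD TYPE 3 CASE=75-79 (A).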