Overview (CASESTOVARS command)

A variable contains information that you want to analyze, such as a measurement or a test score. A case is an observation, such as an individual or an institution.

In a simple data file, each variable is a single column in your data, and each case is a single row in your data. So, if you were recording the score on a test for all students in a class, the scores would appear in only one column and there would be only one row for each student.

Complex data files store data in more than one column or row. For example, in a complex data file, information about a case could be stored in more than one row. So, if you were recording monthly test scores for all students in a class, there would be multiple rows for each student, one for each month.

CASESTOVARS restructures complex data that has multiple rows for a case. You can use it to restructure data in which repeated measurements of a single case were recorded in multiple rows (row groups) into a new data file in which each case occupies a single row and the repeated measurements appear as separate variables (variable groups). The restructured data replace the active dataset.
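The restructuring described above can be pictured with a small plain-Python sketch. This is not SPSS syntax and not how the procedure is implemented; it only models the idea of collapsing each row group into one row. The data, the function name cases_to_vars, and the "score.1" style of column name are assumptions made for illustration.

```python
# Hypothetical long-format data: one row per student per month.
long_rows = [
    {"student": "A", "month": 1, "score": 70},
    {"student": "A", "month": 2, "score": 75},
    {"student": "B", "month": 1, "score": 82},
    {"student": "B", "month": 2, "score": 88},
]


def cases_to_vars(rows, id_var, index_var, value_var, sep="."):
    """Collapse each row group (rows sharing id_var) into one wide row.

    Illustrative only: column names are built as value_var + sep + index
    value, which is an assumption of this sketch.
    """
    wide = {}
    for row in rows:
        # Start a new output row the first time an ID value is seen.
        out = wide.setdefault(row[id_var], {id_var: row[id_var]})
        out[f"{value_var}{sep}{row[index_var]}"] = row[value_var]
    return list(wide.values())


wide_rows = cases_to_vars(long_rows, "student", "month", "score")
for row in wide_rows:
    print(row)
```

Each student ends up in a single row, with one column per month taking the place of the original score column.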

Options

Automatic classification of fixed variables. The values of fixed variables do not vary within a row group. You can use the AUTOFIX subcommand to let the procedure determine which variables are fixed and which variables are to become variable groups in the new data file.

Naming new variables. You can use the RENAME, SEPARATOR, and INDEX subcommands to control the names for the new variables.

Ordering new variables. You can use the GROUPBY subcommand to specify how to order the new variables in the new data file.

Creating indicator variables. You can use the VIND subcommand to create indicator variables. An indicator variable indicates the presence or absence of a value for a case. An indicator variable has the value of 1 if the case has a value; otherwise, it is 0.

Creating a count variable. You can use the COUNT subcommand to create a count variable that contains the number of rows in the original data that were used to create a row in the new data file.
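As a rough model of the indicator and count variables described above, the following plain-Python sketch builds both for a small hypothetical dataset. The names n_rows and ind are invented for the example, and the sketch assumes that "absence of a value" simply means that no row exists for that index value; it is not a statement of how VIND or COUNT treat missing values.

```python
# Hypothetical long-format data; student B has no row for month 2.
long_rows = [
    {"student": "A", "month": 1, "score": 70},
    {"student": "A", "month": 2, "score": 75},
    {"student": "B", "month": 1, "score": 82},
]


def add_count_and_indicators(rows, id_var, index_var,
                             count_name="n_rows", ind_root="ind"):
    """Return one row per ID with a row count and 0/1 indicator columns."""
    index_values = sorted({row[index_var] for row in rows})
    result = {}
    for row in rows:
        # New output rows start with a zero count and all indicators at 0.
        out = result.setdefault(
            row[id_var],
            {id_var: row[id_var], count_name: 0,
             **{f"{ind_root}{v}": 0 for v in index_values}},
        )
        out[count_name] += 1                      # rows consumed for this case
        out[f"{ind_root}{row[index_var]}"] = 1    # a value exists for this index
    return list(result.values())


summary = add_count_and_indicators(long_rows, "student", "month")
for row in summary:
    print(row)
```

In this sketch student B gets a count of 1 and an indicator of 0 for month 2, because only one original row contributed to the new row.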

Variable selection. You can use the DROP subcommand to specify which variables from the original data file are dropped from the new data file.

Basic specification

The basic specification is simply the command keyword.

  • If split-file processing is in effect, the basic specification creates a row in the new data file for each combination of values of the SPLIT FILE variables. If split-file processing is not in effect, the basic specification results in a new data file with one row.
  • Because the basic specification can create a large number of new columns in the new data file, the use of an ID subcommand to identify groups of cases is recommended.
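To see why an ID subcommand is recommended, consider this plain-Python sketch, which contrasts collapsing all rows into one row group with collapsing by an identifier. The function collapse, the sample data, and the sequence-numbered column names are assumptions for illustration only and are not taken from the procedure itself.

```python
def collapse(rows, id_vars=()):
    """Collapse rows into one output row per combination of id_vars.

    With no id_vars, every input row belongs to the same row group, so
    every variable is repeated once per input row (illustrative naming).
    """
    wide, seen = {}, {}
    for row in rows:
        key = tuple(row[v] for v in id_vars)
        out = wide.setdefault(key, {v: row[v] for v in id_vars})
        seen[key] = seen.get(key, 0) + 1          # position within the row group
        for name, value in row.items():
            if name not in id_vars:
                out[f"{name}.{seen[key]}"] = value
    return list(wide.values())


rows = [
    {"student": "A", "score": 70},
    {"student": "A", "score": 75},
    {"student": "B", "score": 82},
    {"student": "B", "score": 88},
]

no_id = collapse(rows)                # one row; columns grow with the row count
with_id = collapse(rows, ["student"])  # one row per student

print(len(no_id), len(no_id[0]))
print(len(with_id), len(with_id[0]))
```

Without an identifier, the four input rows become a single row with eight columns; with the identifier, they become two rows of three columns each.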

Subcommand order

Subcommands can be specified in any order.

Syntax rules

Each subcommand can be specified only once.

Operations

  • Original row order. CASESTOVARS assumes that the original data are sorted by the SPLIT FILE and ID variables.
  • Identifying row groups in the original file. A row group consists of rows in the original data that share the same values of variables listed on the ID subcommand. Row groups are consolidated into a single row in the new data file. Each time a new combination of ID values is encountered, a new row is created.
  • Split-file processing and row groups. If split-file processing is in effect, the split variables are automatically used to identify row groups (they are treated as though they appeared first on the ID subcommand). Split-file processing remains in effect in the new data file unless a variable that is used to split the file is named on the DROP subcommand.
  • New variable groups. A variable group is a group of related columns in the new data file that is created from a variable in the original data. Each variable group contains a variable for each index value or combination of index values encountered.
  • Candidate variables. A variable in the original data is a candidate to become a variable group in the new data file if it is not used on the SPLIT FILE command or on the ID, FIXED, or DROP subcommands and its values vary within the row group. Variables named on the SPLIT FILE command and on the ID and FIXED subcommands are assumed not to vary within the row group and are simply copied into the new data file.
  • New variable names. The names of the variables in a new group are constructed by the procedure. For numeric variables, you can override the default naming convention using the RENAME and SEPARATOR subcommands. If there is a single index variable and it is a string, the string values are used as the new variable names. For string values that do not form valid variable names, names of the general form Vn are used, where n is a sequential integer.
  • New variable formats. With the exception of names and labels, the dictionary information for all of the new variables in a group (for example, value labels and format) is taken from the variable in the original data.
  • New variable order. New variables are created in the order specified by the GROUPBY subcommand.
  • Weighted files. The WEIGHT command does not affect the results of CASESTOVARS. If the original data are weighted, the new data file will be weighted unless the variable that is used as the weight is dropped from the new data file.
  • Selected cases. The FILTER and USE commands do not affect the results of CASESTOVARS, which processes all cases.
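The distinction between fixed variables and candidate variables in the list above can be sketched in plain Python. This is only a model of the rule "values do not vary within a row group"; the function classify_variables and the sample data are hypothetical, and the sketch makes no claim about how the procedure performs this check.

```python
def classify_variables(rows, id_vars, index_vars=()):
    """Split the remaining variables into (fixed, candidates).

    A variable is treated as fixed if it has a single value within every
    row group, and as a candidate for a variable group otherwise.
    """
    skip = set(id_vars) | set(index_vars)
    other_vars = [name for name in rows[0] if name not in skip]
    first_seen = {}   # (row-group key, variable) -> first value encountered
    varying = set()
    for row in rows:
        key = tuple(row[v] for v in id_vars)
        for name in other_vars:
            first = first_seen.setdefault((key, name), row[name])
            if first != row[name]:
                varying.add(name)
    fixed = [n for n in other_vars if n not in varying]
    candidates = [n for n in other_vars if n in varying]
    return fixed, candidates


# Hypothetical data: teacher is constant per student, score changes monthly.
rows = [
    {"student": "A", "month": 1, "teacher": "Lee", "score": 70},
    {"student": "A", "month": 2, "teacher": "Lee", "score": 75},
    {"student": "B", "month": 1, "teacher": "Kim", "score": 82},
    {"student": "B", "month": 2, "teacher": "Kim", "score": 88},
]

fixed, candidates = classify_variables(rows, ["student"], ["month"])
print(fixed, candidates)
```

Here teacher would be copied into the new row once, while score would expand into a variable group.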

Limitations

The TEMPORARY command cannot be in effect when CASESTOVARS is executed.