Overview (CASESTOVARS command)
A variable contains information that you want to analyze, such as a measurement or a test score. A case is an observation, such as an individual or an institution.
In a simple data file, each variable is a single column in your data, and each case is a single row in your data. So, if you were recording the score on a test for all students in a class, the scores would appear in only one column and there would be only one row for each student.
Complex data files store data in more than one column or row. For example, in a complex data file, information about a case could be stored in more than one row. So, if you were recording monthly test scores for all students in a class, there would be multiple rows for each student—one for each month.
CASESTOVARS restructures complex data that has multiple rows for a case. You can use it to restructure data in which repeated measurements of a single case were recorded in multiple rows (row groups) into a new data file in which each case appears as separate variables (variable groups) in a single row. It replaces the active dataset.
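The restructuring described above can be pictured with a short conceptual sketch. It is plain Python, not SPSS syntax, and it does not reproduce the procedure itself. The function name `cases_to_vars`, the sample data, and the new column names (root name, a period, then the index value) are assumptions made for illustration.

```python
# Conceptual sketch only: plain Python, not SPSS command syntax.
# Each dict is one row of the original "complex" file:
# one row per student per month.
long_rows = [
    {"student": "A", "month": 1, "score": 70},
    {"student": "A", "month": 2, "score": 75},
    {"student": "B", "month": 1, "score": 82},
    {"student": "B", "month": 2, "score": 88},
]

def cases_to_vars(rows, id_key, index_key, value_key, sep="."):
    """Collapse each row group into one row (hypothetical helper).

    A row group is the set of rows sharing the same id_key value.
    The value_key column becomes a variable group, with one new
    column per index value encountered.
    """
    wide = {}
    for row in rows:
        # Create the consolidated row the first time an ID value is seen.
        new_row = wide.setdefault(row[id_key], {id_key: row[id_key]})
        # Illustrative naming: root name + separator + index value.
        new_row[f"{value_key}{sep}{row[index_key]}"] = row[value_key]
    return list(wide.values())

wide_rows = cases_to_vars(long_rows, "student", "month", "score")
for row in wide_rows:
    print(row)
```

Here the four original rows become two rows, one per student, and the single `score` column becomes a variable group with one column per month.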
Options
Automatic classification of fixed variables. The values of fixed variables do not vary within a row group. You can use the AUTOFIX subcommand to let the procedure determine which variables are fixed and which variables are to become variable groups in the new data file.
Naming new variables. You can use the RENAME, SEPARATOR, and INDEX subcommands to control the names for the new variables.
Ordering new variables. You can use the GROUPBY subcommand to specify how to order the new variables in the new data file.
Creating indicator variables. You can use the VIND subcommand to create indicator variables. An indicator variable indicates the presence or absence of a value for a case. An indicator variable has the value of 1 if the case has a value; otherwise, it is 0.
Creating a count variable. You can use the COUNT subcommand to create a count variable that contains the number of rows in the original data that were used to create a row in the new data file.
Variable selection. You can use the DROP subcommand to specify which variables from the original data file are dropped from the new data file.
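To make the indicator and count options concrete, the following plain Python sketch (not SPSS syntax) computes both for a small data set. The column names `n_rows`, `ind1`, and `ind2` are hypothetical, and treating "the case has a value" as "the row group contains a row for that index value" is an assumption of this sketch.

```python
# Conceptual sketch only: plain Python, not SPSS command syntax.
rows = [
    {"student": "A", "month": 1, "score": 70},
    {"student": "A", "month": 2, "score": 75},
    {"student": "B", "month": 2, "score": 88},  # student B has no month 1 row
]

# Every index value that occurs anywhere in the original data.
index_values = sorted({r["month"] for r in rows})

summary = {}
for r in rows:
    out = summary.setdefault(
        r["student"],
        # Start each new row with a zero count and all indicators at 0.
        {"student": r["student"], "n_rows": 0,
         **{f"ind{v}": 0 for v in index_values}},
    )
    out["n_rows"] += 1            # count variable: original rows used
    out[f"ind{r['month']}"] = 1   # indicator: a value exists for this index

for out in summary.values():
    print(out)
```

Student B ends up with a count of 1 and an indicator of 0 for month 1, which marks the missing measurement explicitly in the restructured row.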
Basic specification
The basic specification is simply the command keyword.
- If split-file processing is in effect, the basic specification creates a row in the new data file for each combination of values of the SPLIT FILE variables. If split-file processing is not in effect, the basic specification results in a new data file with one row.
- Because the basic specification can create quite a few new columns in the new data file, the use of an ID subcommand to identify groups of cases is recommended.
Subcommand order
Subcommands can be specified in any order.
Syntax rules
Each subcommand can be specified only once.
Operations
- Original row order. CASESTOVARS assumes that the original data are sorted by the SPLIT FILE and ID variables.
- Identifying row groups in the original file. A row group consists of rows in the original data that share the same values of variables listed on the ID subcommand. Row groups are consolidated into a single row in the new data file. Each time a new combination of ID values is encountered, a new row is created.
- Split-file processing and row groups. If split-file processing is in effect, the split variables are automatically used to identify row groups (they are treated as though they appeared first on the ID subcommand). Split-file processing remains in effect in the new data file unless a variable that is used to split the file is named on the DROP subcommand.
- New variable groups. A variable group is a group of related columns in the new data file that is created from a variable in the original data. Each variable group contains a variable for each index value or combination of index values encountered.
- Candidate variables. A variable in the original data is a candidate to become a variable group in the new data file if it is not used on the SPLIT FILE command or the ID, FIXED, or DROP subcommands and its values vary within the row group. Variables named on the SPLIT FILE command or on the ID and FIXED subcommands are assumed not to vary within the row group and are simply copied into the new data file.
- New variable names. The names of the variables in a new group are constructed by the procedure. For numeric variables, you can override the default naming convention using the RENAME and SEPARATOR subcommands. If there is a single index variable and it is a string, the string values are used as the new variable names. For string values that do not form valid variable names, names of the general form Vn are used, where n is a sequential integer.
- New variable formats. With the exception of names and labels, the dictionary information for all of the new variables in a group (for example, value labels and format) is taken from the variable in the original data.
- New variable order. New variables are created in the order specified by the GROUPBY subcommand.
- Weighted files. The WEIGHT command does not affect the results of CASESTOVARS. If the original data are weighted, the new data file will be weighted unless the variable that is used as the weight is dropped from the new data file.
- Selected cases. The FILTER and USE commands do not affect the results of CASESTOVARS. It processes all cases.
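The sort-order assumption and the rule that a new row is created each time a new combination of ID values is encountered can be illustrated with a small plain Python sketch (not SPSS syntax). It models row-group detection as grouping of consecutive rows. That model follows the wording above and is an assumption; this section does not state how the procedure actually treats unsorted data. The helper name `count_new_rows` is hypothetical.

```python
# Conceptual sketch only: plain Python, not SPSS command syntax.
from itertools import groupby

def count_new_rows(rows, id_keys):
    """Count the rows produced when a new row is started every time
    the combination of ID values changes from one row to the next."""
    def id_combination(row):
        return tuple(row[k] for k in id_keys)
    # groupby only merges *consecutive* rows with equal keys.
    return sum(1 for _ in groupby(rows, key=id_combination))

unsorted_rows = [{"id": 1}, {"id": 2}, {"id": 1}]
sorted_rows = sorted(unsorted_rows, key=lambda r: r["id"])

print(count_new_rows(unsorted_rows, ["id"]))  # id 1 is split across two groups
print(count_new_rows(sorted_rows, ["id"]))    # one group per distinct id
```

Under this model, sorting the data by the ID variables first is what guarantees one consolidated row per distinct combination of ID values.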
Limitations
The TEMPORARY command cannot be in effect when CASESTOVARS is executed.