Overview (AUTORECODE command)
AUTORECODE recodes the values of string and numeric variables to consecutive
integers and puts the recoded values into a new variable called a target variable. The value labels
or values of the original variable are used as value labels for the
target variable. AUTORECODE is
useful for creating numeric independent (grouping) variables from
string variables for procedures such as ONEWAY and DISCRIMINANT. AUTORECODE can also recode the values of
factor variables to consecutive integers, which may be required by
some procedures and which reduces the amount of workspace needed by
some statistical procedures.
Basic Specification
The basic specification is VARIABLES and INTO. VARIABLES specifies the variables to be recoded. INTO provides names for the target variables
that store the new values. VARIABLES and INTO must name or imply
the same number of variables.
Subcommand Order
- VARIABLES must be specified first.
- INTO must immediately follow VARIABLES.
- All other subcommands can be specified in any order.
Syntax Rules
- A variable cannot be recoded into itself. More generally, target variable names cannot duplicate any variable names already in the working file.
- If the GROUP or APPLY TEMPLATE subcommand is specified, all variables on the VARIABLES subcommand must be the same type (numeric or string).
- If APPLY TEMPLATE is specified, all variables on the VARIABLES subcommand must be the same type (numeric or string) as the type defined in the template.
- File specifications on the APPLY TEMPLATE and SAVE TEMPLATE subcommands follow the normal conventions for file specifications. Enclosing file specifications in quotation marks is recommended.
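The naming and type rules above can be sketched as a small pre-check. This is an illustrative Python sketch, not part of the AUTORECODE command: the function name check_autorecode_spec, its parameters, and the "numeric"/"string" type tags are all hypothetical, and it uses exact name matching for simplicity.

```python
def check_autorecode_spec(source_vars, target_vars, existing_types, grouped=False):
    """Return a list of rule violations for a hypothetical AUTORECODE request.

    source_vars / target_vars: names listed on VARIABLES and INTO.
    existing_types: dict mapping each variable already in the file to
        "numeric" or "string" (an assumed representation).
    grouped: True if GROUP or APPLY TEMPLATE is specified.
    """
    problems = []

    # VARIABLES and INTO must name the same number of variables.
    if len(source_vars) != len(target_vars):
        problems.append("VARIABLES and INTO must name the same number of variables")

    # Target names may not duplicate any existing variable, which also
    # rules out recoding a variable into itself.
    for name in target_vars:
        if name in existing_types:
            problems.append(f"target {name} duplicates an existing variable")

    # With GROUP or APPLY TEMPLATE, all source variables share one type.
    if grouped:
        types = {existing_types[v] for v in source_vars if v in existing_types}
        if len(types) > 1:
            problems.append("all source variables must be the same type")

    return problems


existing = {"company": "string", "region": "numeric"}
print(check_autorecode_spec(["company"], ["rcompany"], existing))  # prints []
```

The sketch does not cover the check against the type stored in a template file, because the template format is not described here.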
Operations
- The values of each variable to be recoded are sorted and then assigned numeric values. By default, the values are assigned in ascending order: 1 is assigned to the lowest nonmissing value of the original variable; 2, to the second-lowest nonmissing value; and so on, for each value of the original variable.
- Values of the original variables are unchanged.
- Missing values are recoded into values higher than any nonmissing values, with their order preserved. For example, if the original variable has 10 nonmissing values, the first missing value is recoded as 11 and retains its user-missing status. System-missing values remain system-missing. (See the GROUP, APPLY TEMPLATE, and SAVE TEMPLATE subcommands for additional rules for user-missing values.)
- AUTORECODE does not sort the cases in the working file. As a result, the consecutive numbers assigned to the target variables may not be in order in the file.
- Target variables are assigned the same variable labels as the original source variables. To change the variable labels, use the VARIABLE LABELS command after AUTORECODE.
- Value labels are automatically generated for each value of the target variables. If the original value had a label, that label is used for the corresponding new value. If the original value did not have a label, the old value itself is used as the value label for the new value. The defined print format of the old value is used to create the new value label.
- AUTORECODE ignores SPLIT FILE specifications. However, any SELECT IF specifications are in effect for AUTORECODE.
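The default recoding rules above can be illustrated with a minimal Python sketch for a single variable. It is an approximation under stated assumptions, not the command's implementation: the function name autorecode and its parameters are hypothetical, None stands in for system-missing, Python's default sort order stands in for the command's ordering, and str() stands in for the print format used to build labels.

```python
def autorecode(values, value_labels=None, user_missing=()):
    """Sketch of the default (ascending) recoding for one variable.

    values: original values, with None standing in for system-missing.
    value_labels: optional dict of original value -> label.
    user_missing: original values declared user-missing.
    Returns (recoded values, new value labels, new user-missing codes).
    """
    value_labels = value_labels or {}
    distinct = {v for v in values if v is not None}
    valid = sorted(v for v in distinct if v not in user_missing)
    missing = sorted(v for v in distinct if v in user_missing)

    # Nonmissing values get 1..n; user-missing values follow, order preserved.
    codes = {v: i for i, v in enumerate(valid + missing, start=1)}

    # Reuse the original label if there is one, else the old value itself.
    new_labels = {codes[v]: value_labels.get(v, str(v)) for v in codes}
    new_missing = {codes[v] for v in missing}

    # Case order is untouched; system-missing (None) stays system-missing.
    recoded = [None if v is None else codes[v] for v in values]
    return recoded, new_labels, new_missing


recoded, labels, missing_codes = autorecode(
    ["Beta", "Acme", None, "Zed", "Acme", "n/a"],
    value_labels={"Acme": "Acme Corp"},
    user_missing={"n/a"},
)
print(recoded)        # [2, 1, None, 3, 1, 4]
print(labels)         # {1: 'Acme Corp', 2: 'Beta', 3: 'Zed', 4: 'n/a'}
print(missing_codes)  # {4}
```

The recoded list keeps the original case order, which is why the codes are not in ascending order in the output.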