Overview (AUTORECODE command)

AUTORECODE recodes the values of string and numeric variables to consecutive integers and puts the recoded values into a new variable called a target variable. The value labels or values of the original variable are used as value labels for the target variable. AUTORECODE is useful for creating numeric independent (grouping) variables from string variables for procedures such as ONEWAY and DISCRIMINANT. AUTORECODE can also recode the values of factor variables to consecutive integers, which may be required by some procedures and which reduces the amount of workspace needed by some statistical procedures.

Basic Specification

The basic specification is VARIABLES and INTO. VARIABLES specifies the variables to be recoded. INTO provides names for the target variables that store the new values. VARIABLES and INTO must name or imply the same number of variables.

Subcommand Order

  • VARIABLES must be specified first.
  • INTO must immediately follow VARIABLES.
  • All other subcommands can be specified in any order.

Syntax Rules

  • A variable cannot be recoded into itself. More generally, target variable names cannot duplicate any variable names already in the working file.
  • If the GROUP or APPLY TEMPLATE subcommand is specified, all variables on the VARIABLES subcommand must be the same type (numeric or string).
  • If APPLY TEMPLATE is specified, all variables on the VARIABLES subcommand must be the same type (numeric or string) as the type defined in the template.
  • File specifications on the APPLY TEMPLATE and SAVE TEMPLATE subcommands follow the normal conventions for file specifications. Enclosing file specifications in quotation marks is recommended.

Operations

  • The values of each variable to be recoded are sorted and then assigned numeric values. By default, the values are assigned in ascending order: 1 is assigned to the lowest nonmissing value of the original variable; 2, to the second-lowest nonmissing value; and so on, for each value of the original variable.
  • Values of the original variables are unchanged.
  • Missing values are recoded into values higher than any nonmissing values, with their order preserved. For example, if the original variable has 10 nonmissing values, the first missing value is recoded as 11 and retains its user-missing status. System-missing values remain system-missing. (See the GROUP, APPLY TEMPLATE, and SAVE TEMPLATE subcommands for additional rules for user-missing values.)
  • AUTORECODE does not sort the cases in the working file. As a result, the consecutive numbers assigned to the target variables may not be in order in the file.
  • Target variables are assigned the same variable labels as the original source variables. To change the variable labels, use the VARIABLE LABELS command after AUTORECODE.
  • Value labels are automatically generated for each value of the target variables. If the original value had a label, that label is used for the corresponding new value. If the original value did not have a label, the old value itself is used as the value label for the new value. The defined print format of the old value is used to create the new value label.
  • AUTORECODE ignores SPLIT FILE specifications. However, any SELECT IF specifications are in effect for AUTORECODE.