DEPCATEGORIES Subcommand (TREE command)

For categorical dependent variables, the DEPCATEGORIES subcommand controls the categories of the dependent variable included in the model and/or specifies target categories.

  • By default, all valid categories are used in the analysis and user-missing values are excluded.
  • There is no default target category.
  • DEPCATEGORIES applies only to categorical dependent variables. If the dependent variable is scale, the subcommand is ignored and a warning is issued.

Example

TREE risk [n] BY income [o] age [s] creditscore [s] employment [n]
  /DEPCATEGORIES USEVALUES=[VALID 99] TARGET=[1].
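
Because VALID already covers every valid value, listing 99 has an effect only if 99 is defined as a user-missing value of risk. The following sketch shows that setup. It assumes that 99 codes a missing response, which the example above does not state.

* Assumption: 99 is declared as a user-missing value of risk.
MISSING VALUES risk (99).
* All valid categories plus the user-missing category 99 are used, and category 1 is the target.
TREE risk [n] BY income [o] age [s] creditscore [s] employment [n]
  /DEPCATEGORIES USEVALUES=[VALID 99] TARGET=[1].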

USEVALUES Keyword

USEVALUES controls the categories of the dependent variable included in the model.

  • The keyword is followed by an equals sign (=) and a list of values enclosed in square brackets, as in: USEVALUES=[1 2 3].
  • At least two values must be specified.
  • Any cases with dependent variable values not included in the list are excluded from the analysis.
  • Values can be string or numeric but must be consistent with the data type of the dependent variable. String and date values must be quoted. Date values must be consistent with the variable’s print format.
  • If the dependent variable is nominal, the list can include user-missing values. If the dependent variable is ordinal, any user-missing values in the list are excluded and a warning is issued.
  • The keywords VALID and MISSING can be used to specify all valid values and all user-missing values, respectively. For example, USEVALUES=[VALID MISSING] includes all cases with valid or user-missing values for the dependent variable.
  • A warning is issued if a specified category does not exist in the data or, when split-sample validation is in effect, in the training sample. See the topic VALIDATION Subcommand (TREE command) for more information.
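
Example

The rules above can be sketched as follows. This is an illustration only: the category codes 1, 2, and 3 and the string values are assumed, and creditrating is a hypothetical string variable that does not appear in the original example.

* Assumed coding: only categories 1, 2, and 3 of risk are used and all other cases are excluded.
TREE risk [n] BY income [o] age [s] creditscore [s] employment [n]
  /DEPCATEGORIES USEVALUES=[1 2 3].

* Hypothetical string dependent variable, so the values are quoted.
TREE creditrating [n] BY income [o] age [s]
  /DEPCATEGORIES USEVALUES=['high' 'medium' 'low'].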

TARGET Keyword

The TARGET keyword specifies categories of the dependent variable that are of primary interest in the analysis. For example, if you are trying to predict mortality, "death" would be defined as the target category. If you are trying to identify individuals with high or medium credit risk, you would define both "high credit risk" and "medium credit risk" as target categories.

  • The keyword is followed by an equals sign (=) and a list of values enclosed in square brackets, as in: TARGET=[1].
  • There is no default target category. If no target category is specified, some classification rule options and gains-related output are not available.
  • Values can be string or numeric but must be consistent with the data type of the dependent variable. String and date values must be quoted. Date values must be consistent with the variable’s print format.
  • If USEVALUES is also specified, the target categories must also be included (either implicitly or explicitly) in the USEVALUES list.
  • If a target category is user-missing and the dependent variable is not nominal, the target category is ignored and a warning is issued.
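
Example

The following sketch combines both keywords for the credit risk scenario described above. The coding is assumed for illustration (1 = high, 2 = medium, 3 = low). The original example does not define the category codes.

* Assumed coding: 1 = high risk, 2 = medium risk, 3 = low risk.
* Both target categories also appear in the USEVALUES list.
TREE risk [n] BY income [o] age [s] creditscore [s] employment [n]
  /DEPCATEGORIES USEVALUES=[1 2 3] TARGET=[1 2].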