TRAININGSAMPLE Subcommand (NAIVEBAYES command)

The TRAININGSAMPLE subcommand indicates the method of partitioning the active dataset into training and test samples. Specify either a percentage of cases to assign to the training sample or a variable that indicates whether a case is assigned to the training sample.

  • If TRAININGSAMPLE is not specified, all cases in the active dataset are treated as training data records.
  • The variable named on the VARIABLE keyword of TRAININGSAMPLE is automatically excluded from the factor and covariate variable lists on the NAIVEBAYES command line.

PERCENT Keyword

The PERCENT keyword specifies the percentage of cases in the active dataset to randomly assign to the training sample. All other cases are assigned to the test sample. The percentage must be a number that is greater than 0 and less than 100. There is no default percentage.

If a weight variable is defined, the PERCENT keyword may not be used.
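
The PERCENT rules can be illustrated outside the procedure. The following Python sketch is not the NAIVEBAYES implementation. The helper name assign_by_percent, the seed parameter, and the independent per-case random draw are assumptions made for illustration, because this section does not say how the random assignment is carried out. The sketch shows only the stated constraints: the percentage must lie strictly between 0 and 100, and PERCENT is rejected when a weight variable is defined.

```python
import random


def assign_by_percent(n_cases, percent, weight_defined=False, seed=None):
    """Label each case 'training' or 'test' (hypothetical helper, not SPSS code)."""
    # PERCENT may not be combined with a weight variable.
    if weight_defined:
        raise ValueError("PERCENT may not be used when a weight variable is defined")
    # The percentage must be strictly between 0 and 100; there is no default.
    if not (0 < percent < 100):
        raise ValueError("percentage must be greater than 0 and less than 100")

    rng = random.Random(seed)
    # Assumption: each case is drawn independently with probability percent/100,
    # so the realized training share is close to, but not exactly, the request.
    return [
        "training" if rng.random() < percent / 100 else "test"
        for _ in range(n_cases)
    ]
```

For example, assign_by_percent(1000, 70, seed=1) returns 1000 labels, roughly 70 percent of them "training", while a percent value of 0 or 100 raises an error.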

VARIABLE Keyword

The VARIABLE keyword specifies a variable that indicates which cases in the active dataset are assigned to the training sample. Cases with a value of 1 on the variable are assigned to the training sample. All other cases are assigned to the test sample.

  • The specified variable may not be the dependent variable, the weight variable, any variable that is specified in the factor or covariate lists of the command line, or any variable that is specified in the factor or covariate lists of the FORCE subcommand.
  • The variable must be numeric.
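
The VARIABLE assignment rule can be sketched the same way. This Python fragment is an illustration under stated assumptions, not the procedure's own code. The helper name split_by_variable and the representation of cases as dictionaries are hypothetical, and sending a missing value (None) to the test sample is a literal reading of "all other cases" rather than documented behavior.

```python
def split_by_variable(cases, indicator):
    """Split cases on a numeric indicator variable (hypothetical helper, not SPSS code)."""
    training, test = [], []
    for case in cases:
        value = case.get(indicator)
        # The indicator variable must be numeric, so string values are rejected.
        if isinstance(value, str):
            raise TypeError("the indicator variable must be numeric")
        # Only a value of 1 selects the training sample. Any other value,
        # such as 0 or 2, goes to the test sample. Assumption: a missing
        # value (None) is treated the same way.
        if value == 1:
            training.append(case)
        else:
            test.append(case)
    return training, test
```

Given four cases whose indicator values are 1, 0, 2, and None, only the first case lands in the training sample and the remaining three land in the test sample.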