CRITERIA Subcommand (GENLIN command)

The CRITERIA subcommand controls statistical criteria for the generalized linear model and specifies numerical tolerance for checking singularity.

Note that if the REPEATED subcommand is used, then the GENLIN procedure fits generalized estimating equations, which comprise a generalized linear model and a working correlation matrix that models within-subject correlations. In this case, the GENLIN procedure first fits a generalized linear model assuming independence and uses the final parameter estimates as the initial values for the linear model part of the generalized estimating equations. (See the topic REPEATED Subcommand (GENLIN command) for more information.) The description of each CRITERIA subcommand keyword below is followed by a statement indicating how the keyword is affected by specification of the REPEATED subcommand.

ANALYSISTYPE = 3 | 1 | ALL (WALD | LR). Type of analysis for each model effect. Specify 1 for a type I analysis, 3 for type III analysis, or ALL for both. Each of these specifications computes chi-square statistics for each model effect. The default value is 3(WALD).

  • Optionally, 1, 3, or ALL may be followed by WALD or LR in parentheses to specify the type of chi-square statistics to compute. WALD computes Wald statistics, LR computes likelihood-ratio statistics.
  • If likelihood-ratio statistics are computed, then the log-likelihood convergence criterion is used in all reduced models if type I analysis is in effect, or in all constrained models if type III analysis is in effect, irrespective of the convergence criteria used for parameter estimation in the full model. That is, for reduced or constrained models, any HCONVERGE and PCONVERGE specifications are not used, but all LCONVERGE specifications are used. (See the discussions of the HCONVERGE, PCONVERGE, and LCONVERGE keywords below.) If the log-likelihood convergence criterion is not in effect for the full model, then the reduced or constrained models use the log-likelihood convergence criterion with tolerance level 1E-4 and absolute change. The maximum number of iterations (MAXITERATIONS), maximum number of step-halvings (MAXSTEPHALVING), and starting iteration for checking complete and quasi-complete separation (CHECKSEP) are the same for reduced or constrained models as for the full model.
  • If the REPEATED subcommand is specified, then the option on the ANALYSISTYPE keyword is used for the generalized estimating equations. In this case, the WALD option computes Wald statistics, but the LR option computes generalized score statistics instead of likelihood-ratio statistics. For generalized score statistics, the convergence criteria for reduced or constrained models are the same as for the full model; that is, HCONVERGE or PCONVERGE as specified on the REPEATED subcommand.
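
As an illustrative sketch (depvar, a, and x are the placeholder variable names used in the examples in this topic, and the Poisson distribution with log link is an assumption made only for illustration), the following requests type I likelihood-ratio statistics. Because LCONVERGE is specified, the reduced models would use this log-likelihood criterion rather than the 1E-4 absolute fallback described above.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = POISSON LINK = LOG
  /CRITERIA ANALYSISTYPE = 1(LR) LCONVERGE = 1E-6 (ABSOLUTE).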

CHECKSEP = integer. Starting iteration for checking complete and quasi-complete separation. Specify an integer greater than or equal to zero. This criterion is not used if the value is 0. The default value is 20. This criterion is used only for the binomial or multinomial probability distributions (that is, if DISTRIBUTION = BINOMIAL or MULTINOMIAL is specified on the MODEL subcommand). For all other probability distributions, it is silently ignored.

  • If the CHECKSEP value is greater than 0 and the binomial or multinomial probability distribution is being used, then separation is always checked following the final iteration.
  • If the REPEATED subcommand is specified, then the CHECKSEP keyword is applicable only to the initial generalized linear model.
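
A minimal sketch, assuming depvar is a binary response (the variable names are placeholders): the following starts the separation check at iteration 10 instead of the default 20. With a distribution other than binomial or multinomial, the CHECKSEP specification would be silently ignored.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = BINOMIAL LINK = LOGIT
  /CRITERIA CHECKSEP = 10.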

CILEVEL = number. Confidence interval level for coefficient estimates and estimated marginal means. Specify a number greater than or equal to 0, and less than 100. The default value is 95.

  • If the REPEATED subcommand is specified, then the CILEVEL keyword is applicable to any parameter that is fit in the process of computing the generalized estimating equations.

CITYPE = WALD | PROFILE(number). Confidence interval type. Specify WALD for Wald confidence intervals, or PROFILE for profile likelihood confidence intervals. The default value is WALD.

  • PROFILE may be followed optionally by parentheses containing the tolerance level used by the two convergence criteria. The default value is 1E-4.
  • If the REPEATED subcommand is specified, then the CITYPE keyword is applicable only to the initial generalized linear model. For the linear model part of the generalized estimating equations, Wald confidence intervals are always used.
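
For illustration only (placeholder variable names; the gamma distribution with log link is an assumed model), the following requests 90% profile likelihood confidence intervals with a tolerance of 1E-5 in place of the default 1E-4.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = GAMMA LINK = LOG
  /CRITERIA CILEVEL = 90 CITYPE = PROFILE(1E-5).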

COVB = MODEL | ROBUST. Parameter estimate covariance matrix. Specify MODEL to use the model-based estimator of the parameter estimate covariance matrix, or ROBUST to use the robust estimator. The default value is MODEL.

  • If the REPEATED subcommand is specified, then the CRITERIA subcommand COVB keyword is silently ignored. The REPEATED subcommand COVB keyword is applicable to the linear model part of the generalized estimating equations.
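
As a brief sketch with placeholder variable names and an assumed Poisson model, the following requests the robust estimator of the parameter estimate covariance matrix for a model without the REPEATED subcommand.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = POISSON LINK = LOG
  /CRITERIA COVB = ROBUST.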

HCONVERGE = number (ABSOLUTE | RELATIVE). Hessian convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. If HCONVERGE = 0, the absolute Hessian convergence criterion will be checked with value 1E-4 after any specified convergence criteria have been satisfied. If it is not met, a warning is displayed. The default value is 0 (ABSOLUTE).

  • At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
  • For a model with a normal distribution and identity link function, an iterative process is not used for parameter estimation. Thus, if DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the HCONVERGE keyword is silently ignored.
  • If the REPEATED subcommand is specified, then the CRITERIA subcommand HCONVERGE keyword is applicable only to the initial generalized linear model. The REPEATED subcommand HCONVERGE keyword is applicable to the linear model part of the generalized estimating equations.
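
A minimal sketch (placeholder variable names; the binomial model is an assumption): the following adds a relative Hessian convergence criterion of 1E-5. PCONVERGE is left unspecified, so its default remains in effect alongside it.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = BINOMIAL LINK = LOGIT
  /CRITERIA HCONVERGE = 1E-5 (RELATIVE).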

INITIAL = number-list | 'savfile' | 'dataset'. Initial values for parameter estimates. Specify a list of numbers, an external IBM® SPSS® Statistics data file, or a currently open dataset. If a list of numbers is specified, then each number must be separated by a space character or a comma. If an external data file is specified, then the full path and filename must be given in quotes.

  • If the INITIAL keyword is specified, then initial values must be supplied for all parameters (including redundant parameters) in the generalized linear model. The ordering of the initial values should correspond to the ordering of the model parameters used by the GENLIN procedure. One way to determine how parameters are ordered for a given model is to run the GENLIN procedure for the model – without the INITIAL keyword – and examine the PRINT subcommand SOLUTION output.
  • If INITIAL is not specified, then the GENLIN procedure automatically determines the initial values.
  • If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the INITIAL keyword is ignored with a warning.
  • If the REPEATED subcommand is specified, then the CRITERIA subcommand INITIAL keyword is applicable only to the initial generalized linear model. See the REPEATED subcommand below for a detailed discussion of initial values and generalized estimating equations.

Initial Values Specified using a List of Numbers

For all distributions except multinomial, if MODEL INTERCEPT = YES, then the initial values must begin with the initial value for the intercept parameter. If MODEL INTERCEPT = NO, then the initial values must begin with the initial value for the first regression parameter.

If SCALE = MLE, then the initial values must continue with the initial value for the scale parameter. If SCALE = DEVIANCE, PEARSON, or a fixed number, then a value may be given for the scale parameter but it is optional and always silently ignored.

Finally, if DISTRIBUTION = NEGBIN(MLE), then the initial values may end with an initial value for the negative binomial distribution’s ancillary parameter. The initial value for this parameter must be specified as NEGBIN(number), where number is a number greater than or equal to zero. The default value is 1. If DISTRIBUTION = NEGBIN(MLE) is not in effect, then NEGBIN(number) is silently ignored.

For the multinomial distribution, the ordering of initial values is: threshold parameters, regression parameters.

Any additional unused numbers at the end of the list, that is, any numbers beyond those that are mapped to parameters, are silently ignored.

If the SPLIT FILE command is in effect, then the same list is applied to every split, so each split must have the same set of parameters. If the list contains too few or too many numbers for any split, then an error message is displayed.
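
The ordering rules above can be sketched as follows. This is an illustration under stated assumptions, not output from an actual run: factor a is assumed to have two levels (so the model has an intercept, two parameters for a of which the last is redundant, and one parameter for x), and the variable names are placeholders. The list gives the intercept (0.5), the two parameters for a (0.3 and 0), and the parameter for x (0.1). No scale value is given, because the scale parameter is fixed by default for the negative binomial distribution and any value would be ignored. The list ends with an initial value of 2 for the ancillary parameter.

GENLIN depvar BY a WITH x
  /MODEL a x INTERCEPT = YES DISTRIBUTION = NEGBIN(MLE) LINK = LOG
  /CRITERIA INITIAL = 0.5 0.3 0 0.1 NEGBIN(2).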

Initial Values Specified using a Dataset or External IBM SPSS Statistics Data File

If a currently open dataset or external IBM SPSS Statistics data file is specified, then the file structure must be the same as that used in the OUTFILE subcommand CORB and COVB files. This structure allows the final values from one run of the GENLIN procedure to be saved in a CORB or COVB file and input as initial values in a subsequent run of the procedure.

In the dataset, the ordering of variables from left to right must be: RowType_, VarName_, P1, P2, …. The variables RowType_ and VarName_ are string variables. P1, P2, … are numeric variables corresponding to an ordered list of the parameters. (Variable names P1, P2, … are not required; the procedure will accept any valid variable names for the parameters. The mapping of variables to parameters is based on variable position, not variable name.) Any variables beyond the last parameter are ignored.

Initial values are supplied on a record with value ‘EST’ for variable RowType_; the actual initial values are given under variables P1, P2, …. The GENLIN procedure ignores all records for which RowType_ has a value other than ‘EST’, as well as any records beyond the first occurrence of RowType_ equal to ‘EST’.

The required order of the intercept (if any) or threshold parameters, and regression parameters, is the same as for the list of numbers specification. However, when initial values are entered via a dataset or external data file, these parameters must always be followed by the scale parameter and then, if DISTRIBUTION = NEGBIN, by the negative binomial parameter.

If SPLIT FILE is in effect, then the variables must begin with the split-file variable or variables in the order specified on the SPLIT FILE command, followed by RowType_, VarName_, P1, P2, … as above. Splits must occur in the specified dataset in the same order as in the original dataset.

Examples

GENLIN depvar BY a WITH x
  /MODEL a x
  /CRITERIA INITIAL = 1 1.5 2.5 0 3.

The next example outputs the final estimates from one run of the GENLIN procedure and inputs these estimates as the initial values in the second run.

GENLIN depvar BY a WITH x
  /MODEL a x
  /OUTFILE COVB = '/work/estimates.sav'.
GENLIN depvar BY a WITH x
  /MODEL a x
  /CRITERIA INITIAL = '/work/estimates.sav'. 

LCONVERGE = number (ABSOLUTE | RELATIVE). Log-likelihood convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. The log-likelihood convergence criterion is not used if the number is 0. The default value is 0 (ABSOLUTE).

  • At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
  • If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the LCONVERGE keyword is silently ignored.
  • If the REPEATED subcommand is specified, then the LCONVERGE keyword is applicable only to the initial generalized linear model.
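
As a sketch with placeholder variable names and an assumed Poisson model, the following uses only the log-likelihood criterion. The parameter criterion is switched off by setting it to 0, which is allowed because LCONVERGE supplies the required nonzero value.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = POISSON LINK = LOG
  /CRITERIA LCONVERGE = 1E-8 (RELATIVE) PCONVERGE = 0 (ABSOLUTE).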

LIKELIHOOD = FULL | KERNEL. Form of the log-likelihood or log-quasi-likelihood function. Specify FULL for the full function, or KERNEL for the kernel of the function. The default value is FULL.

  • For generalized linear models, the LIKELIHOOD keyword specifies the form of the log likelihood function. If the REPEATED subcommand is specified, then it specifies the form of the log quasi-likelihood function.

MAXITERATIONS = integer. Maximum number of iterations. Specify an integer greater than or equal to 0. The default value is 100. If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the MAXITERATIONS keyword is silently ignored.

  • If the REPEATED subcommand is specified, then the CRITERIA subcommand MAXITERATIONS keyword is applicable only to the initial generalized linear model. The REPEATED subcommand MAXITERATIONS keyword is applicable to the linear model part of the generalized estimating equations.

MAXSTEPHALVING = integer. Maximum number of steps in step-halving method. Specify an integer greater than 0. The default value is 5. If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the MAXSTEPHALVING keyword is silently ignored.

  • If the REPEATED subcommand is specified, then the MAXSTEPHALVING keyword is applicable only to the initial generalized linear model.
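
For illustration (placeholder variable names; the gamma model is an assumption), the following raises both iteration limits above their defaults, which may be worth trying when a model fails to converge within 100 iterations.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = GAMMA LINK = LOG
  /CRITERIA MAXITERATIONS = 200 MAXSTEPHALVING = 10.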

METHOD = FISHER | NEWTON | FISHER(integer). Parameter estimation method. Specify FISHER to use the Fisher scoring method, NEWTON to use the Newton-Raphson method, or FISHER(integer) to use a hybrid method.

  • In the hybrid method option, integer is an integer greater than 0 and specifies the maximum number of Fisher scoring iterations before switching to the Newton-Raphson method. If convergence is achieved during the Fisher scoring phase of the hybrid method, then additional Newton-Raphson steps are performed until convergence is achieved for Newton-Raphson too.
  • The default algorithm for the generalized linear model uses Fisher scoring in the first iteration and Newton-Raphson thereafter; the default value for the METHOD keyword is FISHER(1).
  • If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the METHOD keyword is silently ignored.
  • If the REPEATED subcommand is specified, then the METHOD keyword is applicable only to the initial generalized linear model.
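
A minimal sketch of the hybrid method, with placeholder variable names and an assumed binomial model: the following allows up to five Fisher scoring iterations before switching to the Newton-Raphson method, instead of the single Fisher scoring iteration used by default.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = BINOMIAL LINK = LOGIT
  /CRITERIA METHOD = FISHER(5).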

PCONVERGE = number (ABSOLUTE | RELATIVE). Parameter convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. The parameter convergence criterion is not used if the number is 0. The default value is 1E-6 (ABSOLUTE).

  • At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
  • If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the PCONVERGE keyword is silently ignored.
  • If the REPEATED subcommand is specified, then the CRITERIA subcommand PCONVERGE keyword is applicable only to the initial generalized linear model. The REPEATED subcommand PCONVERGE keyword is applicable to the linear model part of the generalized estimating equations.
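
The split between the two PCONVERGE keywords can be sketched as follows. The subject variable id is hypothetical, the other variable names are placeholders, and the REPEATED specification shown is deliberately minimal; see the REPEATED subcommand for its full syntax. Here the CRITERIA value would apply to the initial generalized linear model, and the REPEATED value to the linear model part of the generalized estimating equations.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = POISSON LINK = LOG
  /CRITERIA PCONVERGE = 1E-6 (ABSOLUTE)
  /REPEATED SUBJECT = id PCONVERGE = 1E-4 (ABSOLUTE).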

SCALE = MLE | DEVIANCE | PEARSON | number. Method of fitting the scale parameter. Specify MLE to compute a maximum likelihood estimate, DEVIANCE to compute the scale parameter using the deviance, PEARSON to compute it using the Pearson chi-square, or a number greater than 0 to fix the scale parameter.

If the MODEL subcommand specifies DISTRIBUTION = NORMAL, IGAUSS, GAMMA, or TWEEDIE then any of the SCALE options may be used. For these distributions, the default value is MLE.

If the MODEL subcommand specifies DISTRIBUTION = NEGBIN, POISSON, BINOMIAL, or MULTINOMIAL, then DEVIANCE, PEARSON, or a fixed number may be used. For these distributions, the default value is the fixed number 1.

If the REPEATED subcommand is specified, then the SCALE keyword is directly applicable only to the initial generalized linear model. For the linear model part of the generalized estimating equations, the scale parameter is treated as follows:

  • If SCALE = MLE, then the scale parameter estimate from the initial generalized linear model is passed to the generalized estimating equations, where it is updated by the Pearson chi-square divided by its degrees of freedom.
  • If SCALE = DEVIANCE or PEARSON, then the scale parameter estimate from the initial generalized linear model is passed to the generalized estimating equations, where it is treated as a fixed number.
  • If SCALE is specified with a fixed number, then the scale parameter is also held fixed at the same number in the generalized estimating equations.
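
As an illustrative sketch (placeholder variable names; the use case is an assumption), the following computes the scale parameter of a Poisson model from the Pearson chi-square instead of fixing it at the default value of 1, as might be done when overdispersion is suspected.

GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION = POISSON LINK = LOG
  /CRITERIA SCALE = PEARSON.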

SINGULAR = number. Tolerance value used to test for singularity. Specify a number greater than 0. The default value is 1E-12.

  • If the REPEATED subcommand is specified, then the SINGULAR keyword is applicable to any linear model that is fit in the process of computing the generalized estimating equations.