CRITERIA Subcommand (GENLIN command)
The CRITERIA subcommand controls statistical criteria for the generalized linear model and specifies numerical tolerance for checking singularity.
Note that if the REPEATED subcommand is used, then the GENLIN procedure fits generalized estimating equations, which comprise a generalized linear model and a working correlation matrix that models within-subject correlations. In this case, the GENLIN procedure first fits a generalized linear model assuming independence and uses the final parameter estimates as the initial values for the linear model part of the generalized estimating equations. (See the topic REPEATED Subcommand (GENLIN command) for more information.) The description of each CRITERIA subcommand keyword below is followed by a statement indicating how the keyword is affected by specification of the REPEATED subcommand.
ANALYSISTYPE = 3 | 1 | ALL (WALD | LR). Type of analysis for each model effect. Specify 1 for a type I analysis, 3 for type III analysis, or ALL for both. Each of these specifications computes chi-square statistics for each model effect. The default value is 3(WALD).
- Optionally, 1, 3, or ALL may be followed by WALD or LR in parentheses to specify the type of chi-square statistics to compute. WALD computes Wald statistics; LR computes likelihood-ratio statistics.
- If likelihood-ratio statistics are computed, then the log-likelihood convergence criterion is used in all reduced models if type I analysis is in effect, or in all constrained models if type III analysis is in effect, irrespective of the convergence criteria used for parameter estimation in the full model. That is, for reduced or constrained models, any HCONVERGE and PCONVERGE specifications are not used, but all LCONVERGE specifications are used. (See the discussions of the HCONVERGE, PCONVERGE, and LCONVERGE keywords below.) If the log-likelihood convergence criterion is not in effect for the full model, then the reduced or constrained models use the log-likelihood convergence criterion with tolerance level 1E-4 and absolute change. The maximum number of iterations (MAXITERATIONS), maximum number of step-halvings (MAXSTEPHALVING), and starting iteration for checking complete and quasi-complete separation (CHECKSEP) are the same for reduced or constrained models as for the full model.
- If the REPEATED subcommand is specified, then the option on the ANALYSISTYPE keyword is used for the generalized estimating equations. In this case, the WALD option computes Wald statistics, but the LR option computes generalized score statistics instead of likelihood-ratio statistics. For generalized score statistics, the convergence criteria for reduced or constrained models are the same as for the full model; that is, HCONVERGE or PCONVERGE as specified on the REPEATED subcommand.
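As a minimal sketch of the ANALYSISTYPE rules above, the following command requests both type I and type III likelihood-ratio statistics. The variable names depvar, a, and x are the placeholders used in the Examples for the INITIAL keyword; the Poisson distribution and log link are assumptions chosen only so that an iterative fit takes place.

```spss
* Sketch only: depvar, a, and x are placeholder variable names.
* DISTRIBUTION and LINK are assumed values, not requirements.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA ANALYSISTYPE=ALL(LR) PCONVERGE=1E-6(ABSOLUTE).
```

Because LCONVERGE is not specified in this sketch, the reduced and constrained models would, by the rule above, use the log-likelihood criterion with tolerance level 1E-4 and absolute change, while the full model uses the PCONVERGE specification.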
CHECKSEP = integer. Starting iteration for checking complete and quasi-complete separation. Specify an integer greater than or equal to zero. This criterion is not used if the value is 0. The default value is 20. This criterion is used only for the binomial or multinomial probability distributions (that is, if DISTRIBUTION = BINOMIAL or MULTINOMIAL is specified on the MODEL subcommand). For all other probability distributions, it is silently ignored.
- If the CHECKSEP value is greater than 0 and the binomial or multinomial probability distribution is being used, then separation is always checked following the final iteration.
- If the REPEATED subcommand is specified, then the CHECKSEP keyword is applicable only to the initial generalized linear model.
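The following sketch shows one way CHECKSEP might be specified for a binomial model. The variable names are placeholders, and the logit link is an assumption; any link valid for the binomial distribution could be used instead.

```spss
* Sketch only: start checking for separation at iteration 10 instead of 20.
* depvar, a, and x are placeholder variable names.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA CHECKSEP=10.
```

Specifying CHECKSEP=0 in the same command would turn the check off entirely.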
CILEVEL = number. Confidence interval level for coefficient estimates and estimated marginal means. Specify a number greater than or equal to 0, and less than 100. The default value is 95.
- If the REPEATED subcommand is specified, then the CILEVEL keyword is applicable to any parameter that is fit in the process of computing the generalized estimating equations.
CITYPE = WALD | PROFILE(number). Confidence interval type. Specify WALD for Wald confidence intervals, or PROFILE for profile likelihood confidence intervals. The default value is WALD.
- PROFILE may be followed optionally by parentheses containing the tolerance level used by the two convergence criteria. The default value is 1E-4.
- If the REPEATED subcommand is specified, then the CITYPE keyword is applicable only to the initial generalized linear model. For the linear model part of the generalized estimating equations, Wald confidence intervals are always used.
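The CILEVEL and CITYPE keywords can be combined, as in the sketch below, which requests 90% profile likelihood intervals with a tighter tolerance than the default. The variable names are placeholders, and the gamma distribution with a log link is an assumed model chosen for illustration.

```spss
* Sketch only: 90% profile likelihood confidence intervals, tolerance 1E-5.
* depvar, a, and x are placeholder variable names.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=GAMMA LINK=LOG
  /CRITERIA CILEVEL=90 CITYPE=PROFILE(1E-5)
  /PRINT SOLUTION.
```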
COVB = MODEL | ROBUST. Parameter estimate covariance matrix. Specify MODEL to use the model-based estimator of the parameter estimate covariance matrix, or ROBUST to use the robust estimator. The default value is MODEL.
- If the REPEATED subcommand is specified, then the CRITERIA subcommand COVB keyword is silently ignored. The REPEATED subcommand COVB keyword is applicable to the linear model part of the generalized estimating equations.
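A minimal sketch of requesting the robust estimator for a model without the REPEATED subcommand follows. The variable names are placeholders, and the distribution and link are assumptions.

```spss
* Sketch only: robust estimator of the parameter estimate covariance matrix.
* No REPEATED subcommand is present, so the CRITERIA COVB keyword is used.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA COVB=ROBUST.
```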
HCONVERGE = number (ABSOLUTE | RELATIVE). Hessian convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. If HCONVERGE = 0, the absolute Hessian convergence criterion will be checked with value 1E-4 after any specified convergence criteria have been satisfied. If it is not met, a warning is displayed. The default value is 0 (ABSOLUTE).
- At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
- For a model with a normal distribution and identity link function, an iterative process is not used for parameter estimation. Thus, if DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the HCONVERGE keyword is silently ignored.
- If the REPEATED subcommand is specified, then the CRITERIA subcommand HCONVERGE keyword is applicable only to the initial generalized linear model. The REPEATED subcommand HCONVERGE keyword is applicable to the linear model part of the generalized estimating equations.
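To illustrate the rule that at least one convergence keyword must be nonzero, the sketch below makes the Hessian criterion the only one in effect. The variable names are placeholders, and the distribution and link are assumptions.

```spss
* Sketch only: use a relative Hessian criterion and switch off PCONVERGE.
* depvar, a, and x are placeholder variable names.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA HCONVERGE=1E-5(RELATIVE) PCONVERGE=0(ABSOLUTE).
```

PCONVERGE is set to 0 explicitly because its default value is nonzero; LCONVERGE is left at its default of 0.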
INITIAL = number-list | 'savfile' | 'dataset'. Initial values for parameter estimates. Specify a list of numbers, an external IBM® SPSS® Statistics data file, or a currently open dataset. If a list of numbers is specified, then each number must be separated by a space character or a comma. If an external data file is specified, then the full path and filename must be given in quotes.
- If the INITIAL keyword is specified, then initial values must be supplied for all parameters (including redundant parameters) in the generalized linear model. The ordering of the initial values should correspond to the ordering of the model parameters used by the GENLIN procedure. One way to determine how parameters are ordered for a given model is to run the GENLIN procedure for the model (without the INITIAL keyword) and examine the PRINT subcommand SOLUTION output.
- If INITIAL is not specified, then the GENLIN procedure automatically determines the initial values.
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the INITIAL keyword is ignored with a warning.
- If the REPEATED subcommand is specified, then the CRITERIA subcommand INITIAL keyword is applicable only to the initial generalized linear model. See the REPEATED subcommand below for a detailed discussion of initial values and generalized estimating equations.
Initial Values Specified using a List of Numbers
For all distributions except multinomial, if MODEL INTERCEPT = YES, then the initial values must begin with the initial value for the intercept parameter. If MODEL INTERCEPT = NO, then the initial values must begin with the initial value for the first regression parameter.
If SCALE = MLE, then the initial values must continue with the initial value for the scale parameter. If SCALE = DEVIANCE, PEARSON, or a fixed number, then a value may be given for the scale parameter but it is optional and always silently ignored.
Finally, if DISTRIBUTION = NEGBIN(MLE), then the initial values may end with an initial value for the negative binomial distribution’s ancillary parameter. The initial value for this parameter must be specified as NEGBIN(number), where number is a number greater than or equal to zero. The default value is 1. If DISTRIBUTION = NEGBIN(MLE) is not in effect, then NEGBIN(number) is silently ignored.
For the multinomial distribution, the ordering of initial values is: threshold parameters, regression parameters.
Any additional unused numbers at the end of the list (that is, any numbers beyond those that are mapped to parameters) are silently ignored.
If the SPLIT FILE command is in effect, then the exact same list is applied to all splits. That is, each split must have the same set of parameters, and the same list is applied to each split. If the list contains too few or too many numbers for any split, then an error message is displayed.
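The ordering rules for a number list can be sketched for a negative binomial model as follows. This is an illustration under stated assumptions, not a verified parameter layout: it assumes that factor a has three levels (so three parameters, the last of them redundant), that the parameters are ordered intercept, a, x, and that the log link is in use. The actual ordering should be confirmed from the PRINT subcommand SOLUTION output of a run without INITIAL.

```spss
* Sketch only. Assumed mapping of the list to parameters:
*   1          intercept
*   1.5 2.5 0  the three levels of factor a (last one redundant)
*   3          covariate x
*   NEGBIN(2)  ancillary parameter of the negative binomial distribution
GENLIN depvar BY a WITH x
  /MODEL a x INTERCEPT=YES DISTRIBUTION=NEGBIN(MLE) LINK=LOG
  /CRITERIA INITIAL = 1 1.5 2.5 0 3 NEGBIN(2).
```

No scale value appears in the list because SCALE = MLE is not available for the negative binomial distribution, and a scale value given under the other SCALE settings is optional and ignored.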
Initial Values Specified using a Dataset or External IBM SPSS Statistics Data File
If a currently open dataset or external IBM SPSS Statistics data file is specified, then the file structure must be the same as that used in the OUTFILE subcommand CORB and COVB files. This structure allows the final values from one run of the GENLIN procedure to be saved in a CORB or COVB file and input as initial values in a subsequent run of the procedure.
In the dataset, the ordering of variables from left to right must be: RowType_, VarName_, P1, P2, …. The variables RowType_ and VarName_ are string variables. P1, P2, … are numeric variables corresponding to an ordered list of the parameters. (Variable names P1, P2, … are not required; the procedure will accept any valid variable names for the parameters. The mapping of variables to parameters is based on variable position, not variable name.) Any variables beyond the last parameter are ignored.
Initial values are supplied on a record with value ‘EST’ for variable RowType_; the actual initial values are given under variables P1, P2, …. The GENLIN procedure ignores all records for which RowType_ has a value other than ‘EST’, as well as any records beyond the first occurrence of RowType_ equal to ‘EST’.
The required order of the intercept (if any) or threshold parameters, and regression parameters, is the same as for the list of numbers specification. However, when initial values are entered via a dataset or external data file, these parameters must always be followed by the scale parameter and then, if DISTRIBUTION = NEGBIN, by the negative binomial parameter.
If SPLIT FILE is in effect, then the variables must begin with the split-file variable or variables in the order specified on the SPLIT FILE command, followed by RowType_, VarName_, P1, P2, … as above. Splits must occur in the specified dataset in the same order as in the original dataset.
Examples
GENLIN depvar BY a WITH x
/MODEL a x
/CRITERIA INITIAL = 1 1.5 2.5 0 3.
The next example outputs the final estimates from one run of the GENLIN procedure and inputs these estimates as the initial values in the second run.
GENLIN depvar BY a WITH x
/MODEL a x
/OUTFILE COVB = '/work/estimates.sav'.
GENLIN depvar BY a WITH x
/MODEL a x
/CRITERIA INITIAL = '/work/estimates.sav'.
LCONVERGE = number (ABSOLUTE | RELATIVE). Log-likelihood convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. The log-likelihood convergence criterion is not used if the number is 0. The default value is 0 (ABSOLUTE).
- At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the LCONVERGE keyword is silently ignored.
- If the REPEATED subcommand is specified, then the LCONVERGE keyword is applicable only to the initial generalized linear model.
LIKELIHOOD = FULL | KERNEL. Form of the log-likelihood or log-quasi-likelihood function. Specify FULL for the full function, or KERNEL for the kernel of the function. The default value is FULL.
- For generalized linear models, the LIKELIHOOD keyword specifies the form of the log likelihood function. If the REPEATED subcommand is specified, then it specifies the form of the log quasi-likelihood function.
MAXITERATIONS = integer. Maximum number of iterations. Specify an integer greater than or equal to 0. The default value is 100.
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the MAXITERATIONS keyword is silently ignored.
- If the REPEATED subcommand is specified, then the CRITERIA subcommand MAXITERATIONS keyword is applicable only to the initial generalized linear model. The REPEATED subcommand MAXITERATIONS keyword is applicable to the linear model part of the generalized estimating equations.
MAXSTEPHALVING = integer. Maximum number of steps in step-halving method. Specify an integer greater than 0. The default value is 5.
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the MAXSTEPHALVING keyword is silently ignored.
- If the REPEATED subcommand is specified, then the MAXSTEPHALVING keyword is applicable only to the initial generalized linear model.
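The two iteration limits are often adjusted together when a model is slow to converge. The sketch below raises both above their defaults of 100 and 5; the variable names are placeholders, and the binomial distribution with a logit link is an assumption.

```spss
* Sketch only: allow up to 200 iterations and up to 10 step-halvings.
* depvar, a, and x are placeholder variable names.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=BINOMIAL LINK=LOGIT
  /CRITERIA MAXITERATIONS=200 MAXSTEPHALVING=10.
```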
METHOD = FISHER | NEWTON | FISHER(integer). Model parameters estimation method. Specify FISHER to use the Fisher scoring method, NEWTON to use the Newton-Raphson method, or FISHER(integer) to use a hybrid method.
- In the hybrid method option, integer is an integer greater than 0 and specifies the maximum number of Fisher scoring iterations before switching to the Newton-Raphson method. If convergence is achieved during the Fisher scoring phase of the hybrid method, then additional Newton-Raphson steps are performed until convergence is achieved for Newton-Raphson too.
- The default algorithm for the generalized linear model uses Fisher scoring in the first iteration and Newton-Raphson thereafter; the default value for the METHOD keyword is FISHER(1).
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the METHOD keyword is silently ignored.
- If the REPEATED subcommand is specified, then the METHOD keyword is applicable only to the initial generalized linear model.
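A sketch of the hybrid method follows, allowing up to five Fisher scoring iterations before the switch to Newton-Raphson. The variable names are placeholders, and the distribution and link are assumptions.

```spss
* Sketch only: hybrid estimation with at most 5 Fisher scoring iterations.
* depvar, a, and x are placeholder variable names.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA METHOD=FISHER(5).
```

Replacing FISHER(5) with FISHER or NEWTON would select pure Fisher scoring or pure Newton-Raphson, respectively.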
PCONVERGE = number (ABSOLUTE | RELATIVE). Parameter convergence criterion. Specify a number greater than or equal to 0, and the ABSOLUTE or RELATIVE keyword in parentheses to define the type of convergence. The number and keyword may be separated by a space character or a comma. The parameter convergence criterion is not used if the number is 0. The default value is 1E-6 (ABSOLUTE).
- At least one of the CRITERIA subcommand keywords HCONVERGE, LCONVERGE, PCONVERGE must specify a nonzero number.
- If DISTRIBUTION = NORMAL and LINK = IDENTITY on the MODEL subcommand, then the PCONVERGE keyword is silently ignored.
- If the REPEATED subcommand is specified, then the CRITERIA subcommand PCONVERGE keyword is applicable only to the initial generalized linear model. The REPEATED subcommand PCONVERGE keyword is applicable to the linear model part of the generalized estimating equations.
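The split between the two PCONVERGE keywords can be sketched as below. The variable id is a hypothetical subject identifier, and SUBJECT and CORRTYPE are REPEATED subcommand keywords that are not described in this section; their use here is an assumption that should be checked against the REPEATED subcommand topic.

```spss
* Sketch only: id is a hypothetical subject identifier variable.
* CRITERIA PCONVERGE governs the initial generalized linear model.
* REPEATED PCONVERGE governs the generalized estimating equations.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA PCONVERGE=1E-6(ABSOLUTE)
  /REPEATED SUBJECT=id CORRTYPE=EXCHANGEABLE PCONVERGE=1E-4(RELATIVE).
```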
SCALE = MLE | DEVIANCE | PEARSON | number. Method of fitting the scale parameter. Specify MLE to compute a maximum likelihood estimate, DEVIANCE to compute the scale parameter using the deviance, PEARSON to compute it using the Pearson chi-square, or a number greater than 0 to fix the scale parameter.
If the MODEL subcommand specifies DISTRIBUTION = NORMAL, IGAUSS, GAMMA, or TWEEDIE, then any of the SCALE options may be used. For these distributions, the default value is MLE.
If the MODEL subcommand specifies DISTRIBUTION = NEGBIN, POISSON, BINOMIAL, or MULTINOMIAL, then DEVIANCE, PEARSON, or a fixed number may be used. For these distributions, the default value is the fixed number 1.
If the REPEATED subcommand is specified, then the SCALE keyword is directly applicable only to the initial generalized linear model. For the linear model part of the generalized estimating equations, the scale parameter is treated as follows:
- If SCALE = MLE, then the scale parameter estimate from the initial generalized linear model is passed to the generalized estimating equations, where it is updated by the Pearson chi-square divided by its degrees of freedom.
- If SCALE = DEVIANCE or PEARSON, then the scale parameter estimate from the initial generalized linear model is passed to the generalized estimating equations, where it is treated as a fixed number.
- If SCALE is specified with a fixed number, then the scale parameter is also held fixed at the same number in the generalized estimating equations.
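As a sketch of a non-default SCALE setting, the command below estimates the scale parameter of a Poisson model from the Pearson chi-square instead of fixing it at 1. The variable names are placeholders, and the log link is an assumption.

```spss
* Sketch only: Pearson chi-square estimate of the scale parameter.
* For the Poisson distribution the default would be the fixed number 1.
GENLIN depvar BY a WITH x
  /MODEL a x DISTRIBUTION=POISSON LINK=LOG
  /CRITERIA SCALE=PEARSON.
```

SCALE=MLE would not be accepted with this distribution, because it is available only for the normal, inverse Gaussian, gamma, and Tweedie distributions.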
SINGULAR = number. Tolerance value used to test for singularity. Specify a number greater than 0. The default value is 1E-12.
- If the REPEATED subcommand is specified, then the SINGULAR keyword is applicable to any linear model that is fit in the process of computing the generalized estimating equations.