Overview (GENLIN command)
The GENLIN procedure fits the generalized linear model and generalized estimating equations.
The generalized linear model includes one dependent variable and usually one or more independent effects. Subjects are assumed to be independent. The generalized linear model covers not only widely used statistical models such as linear regression for normally distributed responses, logistic models for binary data, and loglinear models for count data, but also many other statistical models via its very general model formulation. However, the independence assumption prohibits the model from being applied to correlated data.
Generalized estimating equations extend the generalized linear model to correlated longitudinal data and clustered data. More particularly, generalized estimating equations model correlations within subjects. Data across subjects are still assumed independent.
Options
Independence Assumption. The GENLIN procedure fits either the generalized linear model assuming independence across subjects, or generalized estimating equations assuming correlated measurements within subjects but independence across subjects.
Events/Trials Specification for Binomial Distribution. The typical dependent variable specification will be a single variable, but for the binomial distribution the dependent variable may be specified using a number-of-events variable and a number-of-trials variable. Alternatively, if the number of trials is the same across all subjects, then trials may be specified using a fixed number instead of a variable.
Probability Distribution of Dependent Variable. The probability distribution of the dependent variable may be specified as normal, binomial, gamma, inverse Gaussian, multinomial, negative binomial, Poisson, or Tweedie.
Link Function. GENLIN offers the following link functions: identity, complementary log-log, log, log-complement, logit, negative binomial, negative log-log, odds power, power, and probit. For the multinomial distribution, the following link functions are available: cumulative Cauchit, cumulative complementary log-log, cumulative logit, cumulative negative log-log, and cumulative probit.
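As a rough illustration of what several of the named link functions compute, the following Python sketch writes them in their standard textbook form g(mu). The function names are hypothetical and are not GENLIN keywords. The formulas are the usual definitions; this section does not state them, so treating them as the forms the procedure uses is an assumption.

```python
import math

# Illustrative link functions g(mu); names are hypothetical, not GENLIN keywords.

def identity(mu):
    return mu

def log_link(mu):
    # Requires mu > 0.
    return math.log(mu)

def logit(mu):
    # Requires 0 < mu < 1.
    return math.log(mu / (1.0 - mu))

def cloglog(mu):
    # Complementary log-log; requires 0 < mu < 1.
    return math.log(-math.log(1.0 - mu))

def neg_loglog(mu):
    # Negative log-log; requires 0 < mu < 1.
    return -math.log(-math.log(mu))

def log_complement(mu):
    # Requires mu < 1.
    return math.log(1.0 - mu)

def power(mu, alpha):
    # Power link with a user-chosen exponent alpha.
    return mu ** alpha
```

Each function maps the mean of the response onto the scale of the linear predictor. For example, `logit(0.5)` is 0 because the odds at a probability of 0.5 are 1.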
Correlation Structure for Generalized Estimating Equations. When measurements within subjects are assumed correlated, the correlation structure may be specified as independent, AR(1), exchangeable, fixed, m-dependent, or unstructured.
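To make three of these structures concrete, the sketch below builds the working correlation matrix for a subject with a given number of repeated measurements. It is a minimal illustration under the standard definitions, not a description of how GENLIN stores or estimates the matrix. The function name, the `kind` labels, and the single parameter `rho` are assumptions of this sketch.

```python
def working_correlation(kind, size, rho=0.0):
    """Return a size x size working correlation matrix as nested lists.

    kind: "independent", "exchangeable", or "ar1" (labels are hypothetical).
    rho:  the single correlation parameter used by the last two structures.
    """
    if kind not in ("independent", "exchangeable", "ar1"):
        raise ValueError("unsupported structure in this sketch: " + kind)
    matrix = []
    for i in range(size):
        row = []
        for j in range(size):
            if i == j:
                row.append(1.0)                   # unit diagonal
            elif kind == "independent":
                row.append(0.0)                   # no within-subject correlation
            elif kind == "exchangeable":
                row.append(rho)                   # same correlation for every pair
            else:
                row.append(rho ** abs(i - j))     # AR(1): decays with the lag
        matrix.append(row)
    return matrix
```

The fixed, m-dependent, and unstructured options need more than one parameter and are left out to keep the sketch short.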
Estimated Marginal Means. Estimated marginal means may be computed for one or more crossed factors and may be based on either the response or the linear predictor.
Basic Specification
- The basic specification is a MODEL subcommand with one or more model effects and a variable list identifying the dependent variable, the factors (if any), and the covariates (if any).
- If the MODEL subcommand is not specified, or is specified with no model effects, then the default model is the intercept-only model using the normal distribution and identity link.
- If the REPEATED subcommand is not specified, then subjects are assumed to be independent.
- If the REPEATED subcommand is specified, then generalized estimating equations, which model correlations within subjects, are fit. By default, generalized estimating equations use the independent correlation structure.
- The basic specification displays default output, including a case processing summary table, variable information, model information, goodness of fit statistics, model summary statistics, and parameter estimates and related statistics.
Syntax Rules
- Either a dependent variable or an events/trials specification is required. All other variables and subcommands are optional.
- It is invalid to specify a dependent variable and an events/trials specification in the same GENLIN command.
- Multiple EMMEANS subcommands may be specified; each is treated independently. All other subcommands may be specified only once.
- The EMMEANS subcommand may be specified without options. All other subcommands must be specified with options.
- Each keyword may be specified only once within a subcommand.
- The command name, all subcommand names, and all keywords must be spelled in full.
- Subcommands may be specified in any order.
- Within subcommands, keyword settings may be specified in any order.
- The following variables, if specified, must be numeric: events and trials variables, covariates, OFFSET variable, and SCALEWEIGHT variable. The following, if specified, may be numeric or string variables: the dependent variable, factors, SUBJECT variables, and WITHINSUBJECT variables.
- All variables must be unique within and across the following variables or variable lists: the dependent variable, events variable, trials variable, factor list, covariate list, OFFSET variable, and SCALEWEIGHT variable.
- The dependent variable, events variable, trials variable, and covariates may not be specified as SUBJECT or WITHINSUBJECT variables.
- SUBJECT variables may not be specified as WITHINSUBJECT variables.
- The minimum syntax is a dependent variable. This specification fits an intercept-only model.
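The variable rules in this list can be restated as a small checker. The Python sketch below is only an illustration of those rules and is not part of the GENLIN procedure. The function name, its parameters, and the case-insensitive name comparison are assumptions made for the example.

```python
def check_genlin_variables(dependent=None, events=None, trials=None,
                           factors=(), covariates=(), offset=None,
                           scaleweight=None, subject=(), withinsubject=()):
    """Return a list of rule violations (an empty list if none were found)."""
    problems = []

    # A dependent variable or an events/trials specification, but not both.
    if dependent is None and events is None:
        problems.append("a dependent variable or an events/trials "
                        "specification is required")
    if dependent is not None and events is not None:
        problems.append("a dependent variable and an events/trials "
                        "specification cannot be combined")

    # Trials may be a fixed number, so only string entries count as names.
    singles = [v for v in (dependent, events, trials, offset, scaleweight)
               if isinstance(v, str)]

    # Names must be unique within and across these variables and lists.
    seen = set()
    for name in singles + list(factors) + list(covariates):
        key = name.upper()  # assumption: names compare case-insensitively
        if key in seen:
            problems.append(name + " is specified more than once")
        seen.add(key)

    # These variables may not be reused as SUBJECT or WITHINSUBJECT variables.
    restricted = {v.upper() for v in (dependent, events, trials)
                  if isinstance(v, str)}
    restricted |= {v.upper() for v in covariates}
    for name in list(subject) + list(withinsubject):
        if name.upper() in restricted:
            problems.append(name + " may not be a SUBJECT or "
                            "WITHINSUBJECT variable")

    # SUBJECT variables may not also be WITHINSUBJECT variables.
    subject_keys = {v.upper() for v in subject}
    for name in withinsubject:
        if name.upper() in subject_keys:
            problems.append(name + " is both a SUBJECT and a "
                            "WITHINSUBJECT variable")
    return problems
```

For instance, `check_genlin_variables(dependent="y")` returns an empty list, which matches the minimum syntax of a dependent variable alone.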
Case Frequency
- If a WEIGHT variable is specified, then its values are used as frequency weights by the GENLIN procedure.
- Weight values are rounded to the nearest whole numbers before use. For example, 0.5 is rounded to 1, and 2.4 is rounded to 2.
- The WEIGHT variable may not be specified on any subcommand in the GENLIN procedure.
- Cases with missing weights or weights less than 0.5 are not used in the analyses.
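The weight handling described above can be sketched in a few lines of Python. The function name is hypothetical. The source gives only two rounding examples (0.5 to 1 and 2.4 to 2), so rounding every half upward is an assumption of this sketch. Python's built-in `round` is avoided because it rounds halves to the nearest even number, which would turn 0.5 into 0.

```python
import math

def effective_frequency(weight):
    """Return the whole-number frequency for a case, or None if it is dropped."""
    # Cases with a missing weight or a weight below 0.5 are not used.
    if weight is None or math.isnan(weight) or weight < 0.5:
        return None
    # Round to the nearest whole number, with halves rounded up (assumed).
    return math.floor(weight + 0.5)
```

Under these assumptions a case with weight 2.4 counts as two cases, and a case with weight 0.49 is excluded.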
Limitations
- The TO keyword is not supported on variable lists in this procedure.