Overview (GENLIN command)
The GENLIN procedure fits the generalized linear model and generalized estimating
equations.
The generalized linear model includes one dependent variable and usually one or more independent effects. Subjects are assumed to be independent. The generalized linear model covers not only widely used statistical models such as linear regression for normally distributed responses, logistic models for binary data, and loglinear models for count data, but also many other statistical models via its very general model formulation. However, the independence assumption prohibits the model from being applied to correlated data.
Generalized estimating equations extend the generalized linear model to correlated longitudinal data and clustered data. Specifically, generalized estimating equations model correlations within subjects, while data across subjects are still assumed to be independent.
Options
Independence Assumption. The GENLIN procedure fits either the generalized
linear model assuming independence across subjects, or generalized
estimating equations assuming correlated measurements within subjects
but independence across subjects.
Events/Trials Specification for Binomial Distribution. The typical dependent variable specification will be a single variable, but for the binomial distribution the dependent variable may be specified using a number-of-events variable and a number-of-trials variable. Alternatively, if the number of trials is the same across all subjects, then trials may be specified using a fixed number instead of a variable.
Probability Distribution of Dependent Variable. The probability distribution of the dependent variable may be specified as normal, binomial, gamma, inverse Gaussian, multinomial, negative binomial, Poisson, or Tweedie.
Link Function. GENLIN offers the following link functions: identity,
complementary log-log, log, log-complement, logit, negative binomial,
negative log-log, odds power, power, and probit. For the multinomial
distribution, the following link functions are available: cumulative
Cauchit, cumulative complementary log-log, cumulative logit, cumulative
negative log-log, and cumulative probit.
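As an illustration of what a link function does, the sketch below writes several of the links named above as Python functions of the mean. It uses the common textbook definitions and is not taken from the GENLIN implementation; the function names are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist

# Each function maps a mean mu to the linear predictor eta = g(mu).
# Textbook definitions, assumed here for illustration only.

def identity(mu):
    return mu

def log_link(mu):
    return math.log(mu)

def logit(mu):
    return math.log(mu / (1.0 - mu))

def probit(mu):
    # Inverse of the standard normal cumulative distribution function.
    return NormalDist().inv_cdf(mu)

def cloglog(mu):
    # Complementary log-log.
    return math.log(-math.log(1.0 - mu))

def negative_loglog(mu):
    return -math.log(-math.log(mu))

def log_complement(mu):
    return math.log(1.0 - mu)

def inverse_logit(eta):
    # Maps a linear predictor back to a probability.
    return 1.0 / (1.0 + math.exp(-eta))
```

For example, `logit(0.5)` is 0, and `inverse_logit(logit(0.3))` recovers 0.3 up to floating-point error, which is the sense in which a link connects the mean of the response to the linear predictor.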
Correlation Structure for Generalized Estimating Equations. When measurements within subjects are assumed correlated, the correlation structure may be specified as independent, AR(1), exchangeable, fixed, m-dependent, or unstructured.
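To make the named structures concrete, the following sketch builds the working correlation matrix implied by four of them for n measurements within a subject. It follows the usual textbook forms, assumes equally spaced and ordered measurements, and uses hypothetical function names; it is not a description of how GENLIN estimates these matrices.

```python
def independent(n):
    # No within-subject correlation: the identity matrix.
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exchangeable(n, rho):
    # Every pair of distinct measurements shares the same correlation rho.
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def ar1(n, rho):
    # Correlation decays geometrically with the lag |i - j|.
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def m_dependent(n, rhos):
    # rhos[k - 1] is the correlation at lag k for k = 1..m, where m = len(rhos);
    # measurements more than m apart are treated as uncorrelated.
    m = len(rhos)

    def entry(i, j):
        lag = abs(i - j)
        if lag == 0:
            return 1.0
        return rhos[lag - 1] if lag <= m else 0.0

    return [[entry(i, j) for j in range(n)] for i in range(n)]
```

With three measurements and rho = 0.5, `ar1(3, 0.5)` gives rows `[1.0, 0.5, 0.25]`, `[0.5, 1.0, 0.5]`, and `[0.25, 0.5, 1.0]`. An unstructured matrix has a separate parameter for every pair, and a fixed matrix is supplied by the user, so neither needs a generating rule.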
Estimated Marginal Means. Estimated marginal means may be computed for one or more crossed factors and may be based on either the response or the linear predictor.
Basic Specification
- The basic specification is a MODEL subcommand with one or more model effects and a variable list identifying the dependent variable, the factors (if any), and the covariates (if any).
- If the MODEL subcommand is not specified, or is specified with no model effects, then the default model is the intercept-only model using the normal distribution and identity link.
- If the REPEATED subcommand is not specified, then subjects are assumed to be independent.
- If the REPEATED subcommand is specified, then generalized estimating equations, which model correlations within subjects, are fit. By default, generalized estimating equations use the independent correlation structure.
- The basic specification displays default output, including a case processing summary table, variable information, model information, goodness of fit statistics, model summary statistics, and parameter estimates and related statistics.
Syntax Rules
- A dependent variable or an events/trials specification is required. All other variables and subcommands are optional.
- It is invalid to specify a dependent variable and an events/trials specification in the same GENLIN command.
- Multiple EMMEANS subcommands may be specified; each is treated independently. All other subcommands may be specified only once.
- The EMMEANS subcommand may be specified without options. All other subcommands must be specified with options.
- Each keyword may be specified only once within a subcommand.
- The command name, all subcommand names, and all keywords must be spelled in full.
- Subcommands may be specified in any order.
- Within subcommands, keyword settings may be specified in any order.
- The following variables, if specified, must be numeric: events and trials variables, covariates, OFFSET variable, and SCALEWEIGHT variable. The following, if specified, may be numeric or string variables: the dependent variable, factors, SUBJECT variables, and WITHINSUBJECT variables.
- All variables must be unique within and across the following variables or variable lists: the dependent variable, events variable, trials variable, factor list, covariate list, OFFSET variable, and SCALEWEIGHT variable.
- The dependent variable, events variable, trials variable, and covariates may not be specified as SUBJECT or WITHINSUBJECT variables.
- SUBJECT variables may not be specified as WITHINSUBJECT variables.
- The minimum syntax is a dependent variable. This specification fits an intercept-only model.
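The variable rules above can be read as a small set of checks. The sketch below expresses them in Python as a way of reasoning about a planned specification before writing it; `check_genlin_variables` and its parameters are hypothetical names invented for this example and are not part of SPSS. Names are compared without regard to case, on the assumption that variable names are not case sensitive.

```python
def check_genlin_variables(dependent=None, events=None, trials=None,
                           factors=(), covariates=(), offset=None,
                           scaleweight=None, subject=(), withinsubject=()):
    """Return a list of rule violations (empty if none were found)."""
    errors = []
    has_events_trials = events is not None or trials is not None

    if dependent is None and not has_events_trials:
        errors.append("a dependent variable or an events/trials "
                      "specification is required")
    if dependent is not None and has_events_trials:
        errors.append("a dependent variable and an events/trials "
                      "specification cannot both be specified")

    # A fixed number of trials is not a variable, so only names are collected.
    singles = [v for v in (dependent, events, trials, offset, scaleweight)
               if isinstance(v, str)]
    names = [v.upper() for v in singles + list(factors) + list(covariates)]
    seen = set()
    for name in names:
        if name in seen:
            errors.append(f"{name} is specified more than once")
        seen.add(name)

    subj = {v.upper() for v in subject}
    within = {v.upper() for v in withinsubject}
    restricted = [v.upper() for v in (dependent, events, trials)
                  if isinstance(v, str)]
    restricted += [v.upper() for v in covariates]
    for name in restricted:
        if name in subj or name in within:
            errors.append(f"{name} may not be a SUBJECT or WITHINSUBJECT variable")
    for name in sorted(subj & within):
        errors.append(f"{name} may not be both a SUBJECT and a "
                      "WITHINSUBJECT variable")
    return errors
```

Calling `check_genlin_variables(dependent="y")` returns an empty list, matching the minimum syntax, while passing the same name as both a factor and a covariate reports a duplicate. The sketch does not check variable types (numeric versus string) or keyword rules.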
Case Frequency
- If a WEIGHT variable is specified, then its values are used as frequency weights by the GENLIN procedure.
- Weight values are rounded to the nearest whole numbers before use. For example, 0.5 is rounded to 1, and 2.4 is rounded to 2.
- The WEIGHT variable may not be specified on any subcommand in the GENLIN procedure.
- Cases with missing weights or weights less than 0.5 are not used in the analyses.
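The rounding and exclusion rules above can be sketched in a few lines of Python. The function name `effective_frequency` is hypothetical, and the sketch assumes that halves always round up, which is what the 0.5 to 1 example implies; the source does not state how other halves such as 2.5 are handled.

```python
import math

def effective_frequency(weight):
    """Return the whole-number frequency for a case, or None if the case is dropped."""
    # Missing weights (None or NaN) and weights below 0.5 exclude the case.
    if weight is None or math.isnan(weight) or weight < 0.5:
        return None
    # Round half up (assumed). Python's built-in round() uses round-half-to-even,
    # so round(0.5) is 0 and would not reproduce the documented example.
    return math.floor(weight + 0.5)
```

Here `effective_frequency(0.5)` is 1 and `effective_frequency(2.4)` is 2, as in the examples above, while a weight of 0.49 or a missing weight yields `None`.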
Limitations
- The TO keyword is not supported on variable lists in this procedure.