IBM Data WH Generalized Linear Model Options - General

On the Model Options tab, you can choose whether to specify a name for the model or generate one automatically. You can also configure settings for the model, the link function, and the input field interactions (if any), and set default values for scoring options.

Model name. You can generate the model name automatically based on the target or ID field (or model type in cases where no such field is specified) or specify a custom name.

Field options. You can specify the roles of the input fields for building the model.

General Settings. These settings relate to the stopping criteria for the algorithm.

  • Maximum number of iterations. The maximum number of iterations the algorithm will perform; minimum is 1, default is 20.
  • Maximum error (1e). The maximum error value, entered as a power-of-ten exponent, at which the algorithm stops searching for the best-fit model. Minimum is 0, default is -3, meaning 1E-3, or 0.001.
  • Insignificant error values threshold (1e). The value (in scientific notation) below which errors are treated as having a value of zero. Minimum is -1, default is -7, meaning that error values below 1E-7 (or 0.0000001) are counted as insignificant.
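
The "(1e)" fields above take an exponent rather than the value itself. The following Python sketch illustrates how such exponents translate into tolerances and how the three stopping settings could interact. It is an illustration only, not the product's algorithm: the function names, the `step` callable, and the exact comparison rules are assumptions.

```python
def exponent_to_tolerance(exponent: int) -> float:
    """Convert a '(1e)' field value such as -3 into the number 1E-3."""
    return 10.0 ** exponent


def run_until_converged(step, max_iterations=20, max_error_exp=-3,
                        insignificant_exp=-7):
    """Call step() until the reported error is small enough.

    step is a hypothetical callable that performs one iteration and
    returns the current error as a float. The stopping rule used here
    (error <= maximum error) is an assumption made for illustration.
    """
    max_error = exponent_to_tolerance(max_error_exp)
    insignificant = exponent_to_tolerance(insignificant_exp)
    error = float("inf")
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        error = step()
        if abs(error) < insignificant:
            error = 0.0  # errors below the threshold count as zero
        if error <= max_error:
            break  # error is small enough; stop early
    return iterations, error
```

With the defaults shown, a `step` whose error halves on every call (starting from 0.5) stops after 10 iterations, because 0.5 to the power of 10 is the first value at or below 0.001.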

Distribution Settings. These settings relate to the distribution of the dependent (target) variable.

  • Distribution of response variable. The distribution type; one of Bernoulli (default), Gaussian, Poisson, Binomial, Negative binomial, Wald (Inverse Gaussian), and Gamma.
  • Parameters. (Poisson or binomial distribution only) You must specify one of the following options in the Specify parameter field:
    • To have the parameter estimated automatically from the data, select Default.
    • To allow optimization of the distribution quasi-likelihood, select Quasi.
    • To explicitly specify the parameter value, select Explicit.

    (Binomial distribution only) You must specify the input table column that is to be used as the trials field as required by binomial distribution. This column contains the number of trials for the binomial distribution.

    (Negative binomial distribution only) You can use the default of -1 or specify a different parameter value.
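
The dependencies between the distribution and its parameters can be summarized as a small check. The Python sketch below is hypothetical: the function and argument names are invented for illustration and do not correspond to any product API, and the rule that Explicit needs a value is inferred from the option's description.

```python
def check_distribution_settings(distribution, parameter_mode=None,
                                parameter_value=None, trials_column=None):
    """Return a list of problems with the settings; empty if none found."""
    problems = []
    dist = distribution.lower()
    # Poisson and binomial require a choice in the Specify parameter field.
    if dist in ("poisson", "binomial"):
        if parameter_mode not in ("Default", "Quasi", "Explicit"):
            problems.append(
                "Specify parameter must be Default, Quasi, or Explicit")
        elif parameter_mode == "Explicit" and parameter_value is None:
            problems.append("Explicit requires a parameter value")
    # Binomial additionally needs the column holding the number of trials.
    if dist == "binomial" and not trials_column:
        problems.append("binomial requires a trials column")
    return problems
```

For example, `check_distribution_settings("Binomial", "Quasi")` reports only the missing trials column, while a Bernoulli target needs none of these settings.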

Link Function Settings. These settings relate to the link function, which relates the dependent variable to the predictor variables.

  • Link function. The function to be used; one of Identity, Inverse, Invnegative, Invsquare, Sqrt, Power, Oddspower, Log, Clog, Loglog, Cloglog, Logit (default), Probit, Gaussit, Cauchit, Canbinom, Cangeom, Cannegbinom.
  • Parameters. (Power or Oddspower link functions only) You can either specify a parameter value or use the default of 1.
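
To make the role of the link function concrete, the Python sketch below pairs a few of the listed links with their inverses, using the standard textbook definitions. The product's exact formulas are not given here and may differ, particularly for the less common options such as Oddspower or Gaussit, which are deliberately left out. All names in the code are illustrative.

```python
import math

# Each entry maps a link name to (g, g_inverse), where g turns the mean
# of the target (mu) into the linear predictor (eta). Textbook
# definitions, assumed for illustration.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (lambda mu: math.log(mu / (1 - mu)),
              lambda eta: 1 / (1 + math.exp(-eta))),
    "cloglog": (lambda mu: math.log(-math.log(1 - mu)),
                lambda eta: 1 - math.exp(-math.exp(eta))),
    "inverse": (lambda mu: 1 / mu, lambda eta: 1 / eta),
    "sqrt": (math.sqrt, lambda eta: eta ** 2),
}


def power_link(p=1.0):
    """Power link g(mu) = mu ** p; with p = 1 it reduces to identity."""
    return (lambda mu: mu ** p, lambda eta: eta ** (1 / p))
```

Applying a link and then its inverse returns the original mean, for example `LINKS["logit"][1](LINKS["logit"][0](0.3))` gives approximately 0.3.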