
GLE node

The GLE model is a generalized linear model: the dependent variable is linearly related to the factors and covariates through a specified link function. The model also allows the dependent variable to have a non-normal distribution. It covers widely used statistical models, such as linear regression for normally distributed responses, logistic models for binary data, loglinear models for count data, and complementary log-log models for interval-censored survival data, as well as many other statistical models through its very general model formulation.
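The role of the link function can be illustrated with a short Python sketch. It is not part of the GLE node; the names `LINKS` and `mean_from_linear_predictor` are hypothetical, and the four links shown are simply the standard ones that correspond to the model families listed above.

```python
import math

# Each entry maps a link name to a pair (link g, inverse link g^-1).
# g maps the mean of the target onto the linear predictor scale;
# g^-1 maps a linear predictor value back to a mean.
# (Illustrative only; these names are not part of any product API.)
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "cloglog": (lambda mu: math.log(-math.log(1.0 - mu)),
                lambda eta: 1.0 - math.exp(-math.exp(eta))),
}


def mean_from_linear_predictor(link_name, eta):
    """Apply the inverse of the named link to a linear predictor value."""
    _, inverse = LINKS[link_name]
    return inverse(eta)
```

With the logit link, for example, a linear predictor of 0 corresponds to a probability of 0.5, whereas with the log link the same value corresponds to a mean of 1.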

Examples. A shipping company can use generalized linear models to fit a Poisson regression to damage counts for several types of ships constructed in different time periods, and the resulting model can help determine which ship types are most prone to damage.

A car insurance company can use generalized linear models to fit a gamma regression to damage claims for cars, and the resulting model can help determine the factors that contribute the most to claim size.

Medical researchers can use generalized linear models to fit a complementary log-log regression to interval-censored survival data to predict the time to recurrence for a medical condition.
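As a rough sketch of what fitting a model like the Poisson regression in the shipping example involves, the following pure-Python code fits a log-link Poisson model with one intercept and one 0/1 predictor by Newton's method. The function `fit_poisson` and the damage counts are made up for illustration; they are not taken from any real ship data set, and this is not a description of how the GLE node estimates its models.

```python
import math


def fit_poisson(xs, ys, iterations=25):
    """Fit log(mean) = b0 + b1 * x by Newton's method (illustrative sketch)."""
    b0 = math.log(sum(ys) / len(ys))  # start from the overall mean rate
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Entries of the negative Hessian (Fisher information).
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up damage counts: four ships of a hypothetical type A (x = 0)
# and four ships of a hypothetical type B (x = 1).
ship_is_type_b = [0, 0, 0, 0, 1, 1, 1, 1]
damage_counts = [0, 1, 2, 1, 3, 4, 5, 4]

b0, b1 = fit_poisson(ship_is_type_b, damage_counts)
rate_type_a = math.exp(b0)       # fitted mean damage count for type A
rate_type_b = math.exp(b0 + b1)  # fitted mean damage count for type B
```

Because the only predictor is a two-level factor, the fitted rates reproduce the two group means, and `math.exp(b1)` is the estimated ratio of damage rates between the two ship types.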

GLE models work by building an equation that relates the input field values to the output field values. Once the model is generated, it can be used to estimate values for new data.
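A minimal sketch of how such an equation could be applied to a new record is shown below. The function `predict_mean`, the field names, and the coefficient values are all hypothetical; a log link is assumed purely for the sake of the example.

```python
import math


def predict_mean(record, intercept, coefficients, inverse_link):
    """Evaluate the model equation for one record.

    The linear predictor is the intercept plus the sum of
    coefficient * input value; the inverse link converts it
    to an estimate on the scale of the target.
    """
    eta = intercept + sum(
        coefficients[name] * record[name] for name in coefficients
    )
    return inverse_link(eta)


# Hypothetical coefficients, as if taken from an already generated model.
intercept = 0.5
coefficients = {"age": 0.02, "mileage": -0.1}

# A new record with the same input fields.
new_record = {"age": 10.0, "mileage": 2.0}

# Log link assumed, so the inverse link is exp.
estimate = predict_mean(new_record, intercept, coefficients, math.exp)
```

Here the two input contributions cancel (0.2 and -0.2), so the estimate is simply the exponential of the intercept.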

For a categorical target, a probability of membership is computed for each possible output category of each record. The target category with the highest probability is assigned as the predicted output value for that record.
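The assignment rule can be sketched in a few lines of Python. The function name `predict_category` and the probabilities are hypothetical, and how the node breaks ties is not specified here; this sketch simply keeps the first category encountered.

```python
def predict_category(probabilities):
    """Return the category whose membership probability is highest."""
    return max(probabilities, key=probabilities.get)


# Hypothetical membership probabilities for a single record.
record_probabilities = {"low": 0.2, "medium": 0.5, "high": 0.3}
predicted = predict_category(record_probabilities)
```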

Requirements. You need one or more input fields and exactly one target field. The target can have a measurement level of Continuous, Categorical, or Flag; a categorical target must have two or more categories. Fields used in the model must have their types fully instantiated.

Note: When first creating a flow, you select which runtime to use. By default, flows use the IBM SPSS Modeler runtime. If you want to use native Spark algorithms instead of SPSS algorithms, select the Spark runtime. Properties for this node will vary depending on which runtime option you choose.