Linear Elastic Net Regression

Linear Elastic Net uses the Python sklearn.linear_model.ElasticNet class to estimate regularized linear regression models for a dependent variable on one or more independent variables. Regularization combines L1 (Lasso) and L2 (Ridge) penalties. The extension includes optional modes to display trace plots for different values of alpha for a given L1 ratio, and to select the L1 ratio and alpha hyperparameter values based on crossvalidation. When a single model is fitted or crossvalidation is used to select the penalty ratio and/or alpha, a partition of holdout data can be used to estimate out-of-sample performance.

In addition to fitting a model with specified values of the L1 ratio and the alpha regularization parameter, Linear Elastic Net can display a trace plot of coefficient values over a range of alpha values for a given ratio, or help you choose hyperparameter values via k-fold crossvalidation on specified grids of values. If a single model is fitted, or if the ratio and/or alpha is selected via crossvalidation, the final model can be applied to holdout data created by a partition of the input data to obtain a valid estimate of the model's out-of-sample performance.
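The three modes described above can be sketched directly with scikit-learn. This is a minimal illustration, not the extension's own code: the synthetic data, the alpha grid, the L1 ratio grid, and the number of folds are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV, enet_path

# Synthetic data (assumption): 200 cases, 5 numeric predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.5])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Mode 1: fit a single model with specified alpha and L1 ratio.
single_model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

# Mode 2: coefficient trace over a range of alpha values for one L1 ratio.
# enet_path does not fit an intercept, so the data are centered first.
alpha_grid = np.logspace(-3, 1, 20)
X_centered = X - X.mean(axis=0)
y_centered = y - y.mean()
path_alphas, path_coefs, _ = enet_path(
    X_centered, y_centered, l1_ratio=0.5, alphas=alpha_grid
)
# path_coefs has one row per predictor and one column per alpha value;
# plotting each row against path_alphas gives a trace plot.

# Mode 3: choose the L1 ratio and alpha by k-fold crossvalidation on grids.
l1_ratio_grid = [0.1, 0.5, 0.9]
cv_model = ElasticNetCV(l1_ratio=l1_ratio_grid, alphas=alpha_grid, cv=5).fit(X, y)
selected_l1_ratio = cv_model.l1_ratio_
selected_alpha = cv_model.alpha_
```

In this sketch, `selected_l1_ratio` and `selected_alpha` hold the grid values with the lowest mean crossvalidated squared error, and `cv_model` is already refitted on all supplied data with those values.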

Obtaining a Linear Elastic Net Regression analysis

  1. From the menus choose:

    Analyze > Regression > Linear OLS Alternatives > Elastic Net

    The Variables dialog allows you to specify a variable that assigns each case in the active dataset to the training or holdout sample.

  2. Select a numeric target variable. Only one target variable is required to run an analysis.
  3. Specify a numeric dependent variable.
  4. Specify at least one categorical factor variable or numeric covariate variable.

Optionally, Partition provides a way to create a holdout or test subset of the input data for estimation of out-of-sample performance of the specified or chosen model. All partitioning is performed after listwise deletion of any cases with invalid data for any variable used by the procedure. Note that for crossvalidation, folds or partitions of the training data are created in Python. The holdout data that is created by the partition is not used in estimation, regardless of the mode in effect.

The partition can be defined either by specifying the percentage of cases that are randomly assigned to each sample (under Training and Holdout partitions), or by a variable that assigns each case to the training or holdout sample. You cannot specify both a training percentage and a partition variable. If the partition is not specified, a holdout sample of approximately 30% of the input data is created.

The Training % setting specifies the relative number of cases in the active dataset to randomly assign to the training sample. The default is 70%.
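As a rough illustration of the partitioning logic, the following sketch applies listwise deletion, splits the remaining cases 70/30, fits on the training sample only, and scores the holdout sample. It uses scikit-learn's `train_test_split` as a stand-in for the procedure's own random assignment, which is an assumption; the synthetic data, the injected missing values, and the alpha and L1 ratio values are likewise invented for the example.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): 200 cases, 4 predictors, some missing values.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=200)
X[:10, 0] = np.nan  # make the first 10 cases invalid

# Listwise deletion first: drop any case with invalid data in any variable.
complete = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
X_complete, y_complete = X[complete], y[complete]

# Then partition: 70% training, the remainder holdout.
X_train, X_holdout, y_train, y_holdout = train_test_split(
    X_complete, y_complete, train_size=0.7, random_state=0
)

# Estimation uses the training sample only.
model = ElasticNet(alpha=0.05, l1_ratio=0.5).fit(X_train, y_train)

# The holdout sample gives the out-of-sample performance estimate.
holdout_r2 = model.score(X_holdout, y_holdout)
```

A partition variable would replace the random split with a boolean mask, for example `X_complete[is_training]` and `X_complete[~is_training]`, where `is_training` is a hypothetical array derived from that variable.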

This procedure pastes LINEAR_ELASTIC_NET command syntax.