Examples (LINEAR_ELASTIC_NET extension command)

LINEAR_ELASTIC_NET y WITH x1 TO x5
  /ALPHA=.75.
  • A penalized linear regression of y is fitted on standardized versions of the covariates in the list x1 TO x5, which includes x1, x5, and any variables between them in the active dataset.
  • The regularization strength parameter ALPHA is set to .75, meaning less regularization than the default value of 1.
  • Since no RATIO subcommand is specified, the default penalty mixture of .5 is used.
  • Input data are partitioned using a pseudo-random 70-30 split.
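
As a rough illustration of how the regularization strength and the penalty mixture interact, the following Python sketch evaluates an elastic net penalty for a given set of coefficients. It assumes the parameterization commonly used by scikit-learn, in which alpha scales the whole penalty and ratio is the share given to the L1 (Lasso) term. The function name elastic_net_penalty is hypothetical, and the exact formula applied by the command is not stated in this section.

```python
def elastic_net_penalty(coefs, alpha=1.0, ratio=0.5):
    """Assumed form: alpha * (ratio * L1 + 0.5 * (1 - ratio) * L2 squared)."""
    l1 = sum(abs(b) for b in coefs)      # Lasso part: sum of absolute values
    l2_sq = sum(b * b for b in coefs)    # Ridge part: sum of squares
    return alpha * (ratio * l1 + 0.5 * (1.0 - ratio) * l2_sq)


coefs = [2.0, -1.0, 0.0]
print(elastic_net_penalty(coefs, alpha=0.75))  # 2.0625
print(elastic_net_penalty(coefs))              # 2.75 with the defaults
```

Under this assumed form, lowering alpha from 1 to .75 shrinks the penalty on the same coefficients by the same factor, which is the sense in which .75 means less regularization.
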
LINEAR_ELASTIC_NET y WITH x1 x2 z1 z2
  /PLOT RESIDUALS
  /SAVE PRED RESID.
  • A penalized linear regression of y is fitted on standardized versions of x1, x2, z1, and z2.
  • The penalty mixture parameter is left at the default value of .5.
  • The alpha regularization parameter is left at the default value of 1.
  • A scatterplot of residuals vs. predicted values is displayed.
  • Predicted values and residuals are saved, using default names.
  • Input data are partitioned using a pseudo-random 70-30 split.
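
The pseudo-random partitioning mentioned above can be pictured with a short Python sketch. It is only an illustration: the helper name partition, the seed value, and the assumption that each case is assigned independently with probability training / (training + holdout) are not taken from the command's documentation.

```python
import random


def partition(n_cases, training=70, holdout=30, seed=12345):
    """Label each case 'training' or 'holdout' (hypothetical helper).

    Assumption: cases are assigned independently, so the realized split
    is only approximately training : holdout.
    """
    p_train = training / (training + holdout)
    rng = random.Random(seed)  # fixed seed makes the split repeatable
    return ["training" if rng.random() < p_train else "holdout"
            for _ in range(n_cases)]


labels = partition(1000)
print(labels.count("training") / len(labels))  # close to 0.7, not exactly 0.7
```

Because only the relative sizes matter in this sketch, partition(1000, training=3, holdout=1) would target roughly 75% training cases.
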
LINEAR_ELASTIC_NET y WITH x1 x2 x3
  /MODE = TRACE
  /RATIO = .8
  /ALPHA VALUES = -3 TO 2 BY .25 METRIC = LG10
  /PARTITION TRAINING = 3 HOLDOUT = 1.
  • A series of penalized regression models are fitted, regressing y on standardized versions of x1, x2, and x3.
  • Instead of tabular output, plots of regression coefficients, mean squared error (MSE), and R² vs. alpha for the training data are provided.
  • All models involve a penalty mixture ratio of 80% L1 or Lasso penalty.
  • Alpha begins at 10^-3 and ends at 10^2, with intermediate values at every .25 step in the exponent (i.e., 10^-2.75, 10^-2.5, … , 10^1.5, 10^1.75).
  • The training data contain a pseudo-random selection of approximately 75% of the input data, since TRAINING = 3 and HOLDOUT = 1 give 3/(3 + 1).
  • Holdout data are not used, since no single or final model is fitted.
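
To make the alpha specification concrete, the following Python sketch expands a start TO stop BY step range and, for the LG10 metric, treats the resulting values as exponents of 10. The helper name alpha_grid and its "LINEAR" default label are hypothetical; the sketch assumes both endpoints are included, as the description above indicates.

```python
def alpha_grid(start, stop, step, metric="LINEAR"):
    """Expand start TO stop BY step; with LG10 the values are powers of 10."""
    n_steps = int(round((stop - start) / step))
    values = [start + i * step for i in range(n_steps + 1)]
    if metric == "LG10":
        values = [10.0 ** v for v in values]  # exponent -> alpha value
    return values


alphas = alpha_grid(-3, 2, 0.25, metric="LG10")
print(len(alphas))           # 21 alpha values in the trace
print(alphas[0], alphas[-1])
```

The trace in this example therefore covers 21 alpha values, from one thousandth up to one hundred.
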
LINEAR_ELASTIC_NET y BY group WITH x1 x2
  /MODE = CROSSVALID
  /RATIO = .01 TO .99 BY .02
  /ALPHA = .01 TO 2 BY .01
  /CRITERIA NFOLDS = 10 TIMER = 20
  /PARTITION TRAINING = 70 HOLDOUT = 30
  /PRINT BEST
  /SAVE PRED RESID.
  • A series of penalized regression models are fitted, regressing y on standardized versions of indicators representing observed categories of group, as well as x1 and x2.
  • With 50 values of ratio, 200 values of alpha, and 10 crossvalidation folds, a total of 100,000 cycles of fitting and scoring (50 × 200 × 10) are performed in selecting a value for alpha.
  • The TIMER specification on the CRITERIA subcommand allows 20 minutes for the entire process.
  • Approximately 70% of the input data is used in the alpha selection process, and the remaining 30% is scored after alpha is selected, based on the model fitted to the entire training subset.
  • Each model is estimated ten times and the average crossvalidation R² is used to assess model accuracy.
  • Tabular output includes summary results for the chosen model, including a table of regression coefficients, as well as scoring results for the holdout test data.
  • Predicted values and residuals based on the chosen alpha value are saved for all cases (training and holdout).
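
The count of fitting and scoring cycles quoted above can be checked with a few lines of Python. The grids are built from integers to avoid floating-point drift at the endpoints; the variable names are illustrative only.

```python
# RATIO = .01 TO .99 BY .02 and ALPHA = .01 TO 2 BY .01
ratios = [i / 100 for i in range(1, 100, 2)]   # .01, .03, ..., .99
alphas = [i / 100 for i in range(1, 201)]      # .01, .02, ..., 2.00
n_folds = 10                                   # NFOLDS = 10

n_candidates = len(ratios) * len(alphas)       # ratio/alpha combinations
n_fits = n_candidates * n_folds                # one fit per fold per combination
print(len(ratios), len(alphas), n_candidates, n_fits)
# prints: 50 200 10000 100000
```

A grid of this size is the reason a TIMER limit is useful in this example.
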