Examples (LINEAR_RIDGE extension command)

LINEAR_RIDGE y WITH x1 TO x5
  /ALPHA=.75.
  • A ridge regression of y is fitted on standardized versions of the covariates in the list x1 TO x5, which includes x1, x5, and any variables positioned between them in the active dataset.
  • The regularization strength parameter ALPHA is set to .75, meaning less regularization than the default value of 1.
  • Because no PARTITION subcommand is specified, input data are partitioned using a pseudo-random 70-30 split (approximately 70% training, 30% holdout).
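
The effect of ALPHA described above can be illustrated with a minimal Python sketch. It assumes the usual ridge objective (sum of squared errors plus alpha times the sum of squared coefficients, with an unpenalized intercept) and standardization by the population standard deviation; neither detail is confirmed by this section. The helper names standardize, solve, and ridge_fit are hypothetical and are not part of the LINEAR_RIDGE command.

```python
import statistics

def standardize(column):
    """Center a column at 0 and scale it to unit (population) standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(value - mean) / sd for value in column]

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ridge_fit(columns, y, alpha):
    """Return (intercept, coefficients) for a ridge fit on standardized predictors.

    Assumed objective: sum((y - intercept - Z b)^2) + alpha * sum(b^2).
    """
    z = [standardize(column) for column in columns]
    p = len(z)
    intercept = statistics.fmean(y)
    y_centered = [value - intercept for value in y]
    # Normal equations: (Z'Z + alpha * I) b = Z'y
    gram = [[sum(u * v for u, v in zip(z[i], z[j])) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(u * v for u, v in zip(z[i], y_centered)) for i in range(p)]
    return intercept, solve(gram, rhs)

# Made-up illustration data, not taken from the documentation.
columns = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
_, coef_075 = ridge_fit(columns, y, 0.75)
_, coef_100 = ridge_fit(columns, y, 1.0)
```

Under these assumptions, the coefficients obtained with alpha = .75 are shrunk less toward zero than those obtained with alpha = 1, which is what "less regularization" means here.
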
LINEAR_RIDGE y WITH x1 x2 z1 z2
  /PLOT RESIDUALS
  /SAVE PRED RESID.
  • A ridge regression of y is fitted on standardized versions of x1, x2, z1, and z2.
  • The alpha regularization parameter is left at the default value of 1.
  • A scatterplot of residuals vs. predicted values is displayed.
  • Predicted values and residuals are saved, using default names.
  • Because no PARTITION subcommand is specified, input data are partitioned using a pseudo-random 70-30 split (approximately 70% training, 30% holdout).
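
The pseudo-random 70-30 split mentioned in the examples above can be pictured with the following Python sketch. It assumes that each case is assigned independently, with probability proportional to the two partition weights, so the realized proportions are only approximate. How the command actually draws the partition is not stated in this section, and the function name partition and its parameters are hypothetical.

```python
import random

def partition(n_cases, training=70, holdout=30, seed=None):
    """Label each case 'training' or 'holdout' using relative weights.

    Assumption: cases are assigned independently, so the split is
    approximately (not exactly) training : holdout.
    """
    rng = random.Random(seed)
    p_training = training / (training + holdout)
    return ["training" if rng.random() < p_training else "holdout"
            for _ in range(n_cases)]

labels = partition(10_000, seed=1)
training_share = labels.count("training") / len(labels)
```

Because only the ratio of the two weights matters in this sketch, weights of 3 and 1 would put approximately 75% of the cases in the training subset.
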
LINEAR_RIDGE y WITH x1 x2 x3
  /MODE = TRACE
  /ALPHA VALUES = -3 TO 2 BY .25 METRIC = LG10
  /PARTITION TRAINING = 3 HOLDOUT = 1.
  • A series of ridge regression models are fitted regressing y on standardized versions of x1, x2, and x3.
  • Instead of tabular output, plots of regression coefficients, mean squared error (MSE), and R² vs. alpha for the training data are displayed.
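  • Alpha begins at 10^-3 and ends at 10^2. Because METRIC = LG10, the specified values are base-10 exponents: the exponent increases in steps of .25, so each intermediate value is 10^.25 times the previous one (i.e., 10^-2.75, 10^-2.5, … , 10^1.5, 10^1.75).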
  • The training data contain a pseudo-random selection of approximately 75% of the input data, because the TRAINING and HOLDOUT values of 3 and 1 are relative weights.
  • Holdout data are not used, since no single or final model is fitted.
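
The grid of alpha values in this example can be reproduced with a short Python sketch. It assumes that both endpoints are included, as the description above states, and that METRIC = LG10 means the listed numbers are exponents of 10. The function name alpha_grid is hypothetical.

```python
def alpha_grid(start, stop, step, metric="LINEAR"):
    """Expand a 'start TO stop BY step' specification into a list of alpha values.

    With metric="LG10" the specified numbers are treated as base-10 exponents.
    """
    n_steps = round((stop - start) / step)
    values = [start + k * step for k in range(n_steps + 1)]
    if metric == "LG10":
        return [10.0 ** value for value in values]
    return values

# Mirrors VALUES = -3 TO 2 BY .25 with METRIC = LG10.
trace_alphas = alpha_grid(-3, 2, 0.25, metric="LG10")
```

Here the exponent takes 21 values from -3 to 2, so the trace plots would be based on 21 fitted models under these assumptions.
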
LINEAR_RIDGE y BY group WITH x1 x2
  /MODE = CROSSVALID
  /ALPHA = .01 TO 2 BY .01
  /CRITERIA NFOLDS = 10 TIMER = 15
  /PARTITION TRAINING = 70 HOLDOUT = 30
  /PRINT COMPARE
  /SAVE PRED RESID.
  • A series of ridge regression models are fitted, regressing y on standardized versions of indicators representing observed categories of group, as well as x1 and x2.
  • With 200 values of alpha and 10 crossvalidation folds, a total of 2000 cycles of fitting and scoring are performed in selecting a value for alpha.
  • The TIMER specification on the CRITERIA subcommand allows 15 minutes for the entire process.
  • Approximately 70% of the input data are used in the alpha selection process. After alpha is selected, the remaining 30% are scored using the model fitted to the entire training subset.
  • Each model is estimated ten times and the average crossvalidation R² is used to assess model accuracy.
  • Tabular output includes summary results for all fitted models, sorted in descending order by the average R² over the crossvalidation folds, as well as scoring results for the holdout test data.
  • Predicted values and residuals based on the chosen alpha value are saved for all cases (training and holdout).
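
The count of 2000 fitting and scoring cycles, and the ranking by average R², can be sketched in Python as follows. This is a simplified illustration, not the command's implementation: it uses a single predictor with made-up data, ignores the training-holdout partition and the TIMER limit, and assumes the usual ridge objective with an unpenalized intercept. The names fit_ridge_1d, r_squared, and cross_validate are hypothetical.

```python
import random
import statistics

def fit_ridge_1d(x, y, alpha):
    """Ridge fit with one standardized predictor; returns a prediction function."""
    mean_x, sd_x = statistics.fmean(x), statistics.pstdev(x)
    mean_y = statistics.fmean(y)
    z = [(value - mean_x) / sd_x for value in x]
    slope = (sum(u * (v - mean_y) for u, v in zip(z, y))
             / (sum(u * u for u in z) + alpha))
    return lambda value: mean_y + slope * (value - mean_x) / sd_x

def r_squared(observed, predicted):
    """1 - SS_residual / SS_total, computed on the scored cases."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def cross_validate(x, y, alphas, n_folds, seed=0):
    """Return ({alpha: mean R² over folds}, number of fit-and-score cycles)."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    results, cycles = {}, 0
    for alpha in alphas:
        scores = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in order if i not in held_out]
            predict = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], alpha)
            scores.append(r_squared([y[i] for i in fold],
                                    [predict(x[i]) for i in fold]))
            cycles += 1
        results[alpha] = statistics.fmean(scores)
    return results, cycles

# Made-up data; the alpha grid mirrors .01 TO 2 BY .01 (200 values).
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2.0 + 0.5 * value + rng.gauss(0, 1) for value in x]
alphas = [round(0.01 * k, 2) for k in range(1, 201)]
results, cycles = cross_validate(x, y, alphas, n_folds=10)
ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
best_alpha = ranked[0][0]
```

In this sketch, cycles equals 200 × 10 = 2000, and ranked lists the alpha values in descending order of average R², which parallels the ordering described for the tabular output.
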