Overview (PLS command)

The PLS procedure estimates partial least squares regression models. Partial least squares is a predictive technique that is an alternative to ordinary least squares (OLS) regression, canonical correlation, or structural equation modeling for analysis of systems of independent and response variables. It is particularly useful when predictor variables are highly correlated or when the number of predictors exceeds the number of cases.

PLS combines features of principal components analysis and multiple regression. It first extracts a set of latent factors that explain as much as possible of the covariance between the independent and dependent variables. A regression step then predicts values of the dependent variables using the decomposition of the independent variables.
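The two steps described above can be illustrated with a minimal sketch for a single scale response. It assumes the classic NIPALS-style PLS1 iteration, which is a common textbook formulation; the source does not state which algorithm the PLS procedure uses internally. The function names (standardize, pls1) and the sample data are hypothetical.

```python
import math

def standardize(columns):
    """Center each column to mean 0 and scale it to unit sample standard deviation."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out

def pls1(x_cols, y, n_factors):
    """Textbook PLS1 sketch (an assumption, not the procedure's documented algorithm).

    x_cols is a list of predictor columns and y is one response column.
    Returns per-factor weights, scores, X loadings, and y loadings.
    """
    X = standardize(x_cols)
    y = standardize([y])[0]
    n = len(y)
    W, T, P, Q = [], [], [], []
    for _ in range(n_factors):
        # Extraction step: the weight vector points in the predictor-space
        # direction whose scores have maximal covariance with the response.
        w = [sum(col[i] * y[i] for i in range(n)) for col in X]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to explain
        w = [v / norm for v in w]
        # Latent factor scores: projection of each case onto the weight vector.
        t = [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(n)]
        tt = sum(v * v for v in t)
        # Regression step: regress the predictors and the response on the scores.
        p = [sum(col[i] * t[i] for i in range(n)) / tt for col in X]
        q = sum(y[i] * t[i] for i in range(n)) / tt
        # Deflate, so the next factor explains only what is still unexplained.
        X = [[col[i] - t[i] * p[j] for i in range(n)] for j, col in enumerate(X)]
        y = [y[i] - q * t[i] for i in range(n)]
        W.append(w); T.append(t); P.append(p); Q.append(q)
    return W, T, P, Q

# Hypothetical data: two correlated predictors and one response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
resp = [2.0, 4.1, 6.2, 7.9, 10.1, 12.2]
weights, scores, loadings, y_loadings = pls1([x1, x2], resp, n_factors=2)
```

In this sketch the successive score vectors are mutually orthogonal, which is why the latent factors remain usable as regressors even when the original predictors are highly correlated.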

Partial least squares regression is also known as "projection to latent structures".

Options

Response Variables. PLS estimates univariate and multivariate models. If you specify one or more categorical dependent variables, a classification model is estimated. If you specify one or more scale dependent variables, a regression model is estimated. Models that combine categorical and scale dependent variables (mixed regression and classification models) are also supported.

Predictors. Predictors can be categorical or continuous variables. Both main effects and interaction terms can be estimated.

Method. You can specify the maximum number of latent factors to extract. By default, five latent factors are extracted.

Export. You can save casewise, factorwise, and predictorwise model results to IBM® SPSS® Statistics datasets.

Basic Specification

  • PLS is an extension command and will not be recognized by the system until you use the EXTENSION command to add PLS to the command table. The syntax diagram for PLS is defined in plscommand.xml, which is installed in the \extensions subdirectory of the main installation directory. See the topic EXTENSION for more information.
  • The minimum specification is one or more dependent variables and one or more predictors.
  • The procedure displays the following tables: proportion of variance explained (by latent factor), latent factor weights, latent factor loadings, independent variable importance in projection (VIP), and regression parameter estimates (by dependent variable).

Operations

  • All model variables are centered and standardized, including indicator variables representing categorical variables.
  • If a WEIGHT variable is specified, its values are used as frequency weights. Weight values are rounded to the nearest whole number before use. Cases with missing weights or weights less than 0.5 are not used in the analyses.
  • User- and system-missing values are treated as invalid.
  • Memory allocated via SET WORKSPACE is unavailable to extension commands. When running PLS on large datasets, it may therefore actually help to lower the size of your workspace.
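
The weighting and standardization rules above can be sketched as follows. This is an illustration only, under two stated assumptions that the source does not confirm: weights that fall exactly halfway between two integers round up, and the standard deviation uses the weighted n - 1 denominator. The function names are hypothetical.

```python
import math

def frequency_weight(raw):
    """Return the integer frequency weight for a case, or None if the case is not used."""
    if raw is None or math.isnan(raw):
        return None                  # missing weight: case is excluded
    if raw < 0.5:
        return None                  # weight below 0.5: case is excluded
    return math.floor(raw + 0.5)     # assumption: halves round up

def standardize(values, weights):
    """Center and scale one model variable using integer frequency weights.

    Applies equally to scale variables and to 0/1 indicator variables.
    Assumption: the variance uses the (sum of weights - 1) denominator.
    """
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / (total - 1)
    sd = math.sqrt(var)
    return [(v - mean) / sd for v in values]

# Hypothetical cases: an indicator variable with raw (unrounded) weights.
raw_weights = [1.4, 0.3, 2.5, None, 1.0]
indicator = [1.0, 0.0, 0.0, 1.0, 1.0]
kept = [(v, frequency_weight(w)) for v, w in zip(indicator, raw_weights)
        if frequency_weight(w) is not None]
z = standardize([v for v, _ in kept], [w for _, w in kept])
```

With integer frequency weights, a case with weight 2 contributes exactly as two identical unweighted cases would.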

Syntax Rules

  • The PLS command is required. All subcommands are optional.
  • Only a single instance of each subcommand is allowed.
  • An error occurs if an attribute or keyword is specified more than once within a subcommand.
  • Equals signs and parentheses shown in the syntax chart are required.
  • Subcommand names and keywords must be spelled in full.
  • Empty subcommands are not allowed.