Overview (PROXMAP command)

In PROXMAP, an iterative majorization algorithm ensures monotonic convergence. Proximities may be transformed using monotonic functions. Nonlinear mapping and optimal transformation are used to achieve maximum dimensionality reduction.

Input data may include numeric, ordinal, and nominal (categorical) variables. PROXMAP can display both objects (cases) and variables in a joint space (biplot). Additional attributes can be used to constrain the solution, and supplementary properties can be used to interpret the resulting configuration.

Options

Data Input

PROXMAP operates on one or more square matrices representing proximities between objects. These matrices may either be provided directly in the input data file or generated by PROXMAP from multivariate input data. In both cases, each proximity matrix (source) is defined by its own variable list, so multiple sources require multiple variable lists. For proximity data, the number of variables in each list is equal to the number of cases (objects).

In addition to the proximity or multivariate data, you can specify variables as proximity (cell) weights per source and as object (case) weights (currently for proximity data only).

Furthermore, you can specify variables as initial configuration, attributes, and properties.

Derived Proximities
The computation of proximities from multivariate data depends on the measurement level of each variable, which must be set in the Data Entry window (.sav file).

The proximities are computed as the mean square root of the squared distances between objects per variable. When multiple sources are specified in multivariate data, we also obtain combined proximities. Here the same procedure is applied as for a single source: the mean square root is taken of the squared distances between objects in all sources.

For each variable the measurement level is taken into account. This level can be Scale (for TYPE Numeric), Ordinal, or Nominal (for TYPE both String and Numeric).

The Measurement level can be changed in the Data Entry window (Variable View), or via syntax:
VARIABLE LEVELS varlist {scale    }  varlist {...}… .
                        {ordinal  }
                        {nominal  }

If string variables contain numbers for all cases, their type can be changed to numeric before you run PROXMAP, using the command ALTER TYPE varlist (F8.3).

Alternatively, you can use AUTORECODE to convert string variables into numeric variables.

Numeric variables define a Euclidean distance function, which may be applied to either standardized or raw data. For nominal variables, Chi-squared distances are used.

For ordinal variables, distances between objects are computed as follows:
  • Order the objects based on their ordinal values.
  • Compute Chi-squared distances between each pair of adjacent objects in the ordered list: between the first and second object, between the second and third object, and so on, up to the pair consisting of the (N–1)th and Nth object, where N is the total number of objects.
  • To determine the distance between any two non-adjacent objects, sum the distances between all intermediate adjacent pairs along the ordered path. For example, the distance between the first and third object is the sum of the distances from the first to the second and from the second to the third.
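
The summation along the ordered path can be illustrated with a short Python sketch. It is not PROXMAP code: it assumes the objects are already ordered and that the adjacent-pair distances have already been computed, and the function name ordinal_path_distances and the sample values are hypothetical.

```python
from itertools import accumulate


def ordinal_path_distances(adjacent):
    """Return the full distance matrix for N ordered objects.

    `adjacent[k]` is the (already computed) distance between the k-th and
    (k+1)-th object in the ordered list, so N objects need N-1 entries.
    How those adjacent distances are obtained is outside this sketch.
    """
    # Cumulative position of each object along the ordered path.
    positions = [0.0] + list(accumulate(adjacent))
    n = len(positions)
    # The distance between objects i and j is the sum of the adjacent
    # distances lying between them, i.e. the gap in cumulative positions.
    return [[abs(positions[i] - positions[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical adjacent distances for four ordered objects.
example = ordinal_path_distances([0.5, 1.5, 2.0])
print(example[0][2])  # first to third object: 0.5 + 1.5
```

Working with cumulative positions avoids re-summing the intermediate pairs for every pair of objects.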

Weights

With the PROXIMITIES data type, proximity (cell) weights and case weights can be specified. In addition to these user-provided weights, special weights can be requested that are a function of the proximities, with a choice from two kinds of functions.

The weights for objects i and j used in the analysis and shown in the output (referred to as derived weights) are a product of three components:

wij = vij × uij × √(ci cj)
where,

vij is the proximity weight specified with the SOURCEID subcommand

uij is the special weight based on the proximity value, as defined via the WEIGHTS subcommand

ci cj is the product of the two case weights, for objects i and j, specified with the OBJECTID subcommand.

The derived weights, wij, are normalized such that the sum of squares is equal to the number of weights.
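
As an illustration only, the following Python sketch combines the three components and applies the normalization. It assumes that the weights are counted over the cells below the diagonal (one per pair of objects); the text above does not say which cells are counted, so treat that choice, the function name derived_weights, and the sample values as hypothetical.

```python
import math


def derived_weights(v, u, c):
    """Combine proximity weights v, special weights u, and case weights c.

    v and u are square lists of lists, c is a list of case weights.
    Returns a dict keyed by (i, j) with i > j. Counting only the cells
    below the diagonal is an assumption of this sketch.
    """
    n = len(c)
    raw = {(i, j): v[i][j] * u[i][j] * math.sqrt(c[i] * c[j])
           for i in range(n) for j in range(i)}
    sum_sq = sum(w * w for w in raw.values())
    if sum_sq == 0:
        raise ValueError("all weights are zero; nothing to normalize")
    # Rescale so that the sum of squared weights equals the number of weights.
    scale = math.sqrt(len(raw) / sum_sq)
    return {pair: w * scale for pair, w in raw.items()}


# Hypothetical input: unit proximity and special weights, unequal case weights.
ones = [[1.0] * 3 for _ in range(3)]
weights = derived_weights(ones, ones, [1.0, 4.0, 1.0])
print(sum(w * w for w in weights.values()))  # approximately 3, the number of weights
```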

Transformations
PROXMAP supports transformations of proximities, attribute variables, and property variables. These transformations enable nonlinear modeling and allow the procedure to accommodate different measurement levels.

For proximities, whether provided directly or derived from multivariate data, you can specify a transformation function using the TRANSFORMATION subcommand. Available transformation types include linear functions (with only an intercept, only a slope, or both), step functions for ordinal data (which can preserve ties using the discrete or secondary approach or not preserve ties using the continuous or primary approach), monotone spline functions, and power functions. Both spline and power transformations may include or exclude an intercept, depending on the selected option. You can apply a single transformation function across all proximity matrices (an unconditional transformation) or specify a separate transformation function for each matrix (a matrix-conditional transformation) using the CONDITION subcommand.

For attribute and property variables, transformations are defined using the ATTRIBUTES and PROPERTIES subcommands. The available transformation functions include linear, ordinal (with the same options for handling ties as in proximity transformation), monotonic and non-monotonic spline functions, and nominal transformation. A nominal transformation of a property variable results in a set of centroids that represent the mean positions of the objects associated with each category. In contrast, a nominal transformation of an attribute variable yields an optimal step function.
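
The centroid interpretation of a nominally transformed property variable can be sketched in Python as follows. This is an illustration of "mean position per category" only, not PROXMAP's implementation; the function name category_centroids, the coordinates, and the category labels are hypothetical.

```python
from collections import defaultdict


def category_centroids(coords, categories):
    """Mean position of the objects in each category.

    coords is a list of coordinate tuples (one per object) and
    categories is a list of category labels of the same length.
    """
    groups = defaultdict(list)
    for point, category in zip(coords, categories):
        groups[category].append(point)
    # zip(*points) regroups the coordinates per dimension.
    return {category: tuple(sum(axis) / len(points) for axis in zip(*points))
            for category, points in groups.items()}


# Hypothetical two-dimensional configuration with two categories.
configuration = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0), (3.0, 5.0)]
labels = ["a", "a", "b", "b"]
print(category_centroids(configuration, labels))
# {'a': (1.0, 0.0), 'b': (2.0, 4.0)}
```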

Models
By default, PROXMAP uses the IDENTITY model for both PROXIMITIES and MULTIVARIATE data types. You can specify an alternative model using the MODEL subcommand.

When multiple sources are available (either provided directly or generated from multivariate input) PROXMAP supports several individual differences models in addition to the default identity model. These include the Weighted Identity (Dilation) model, the Weighted Euclidean (Diagonal) model, and the Generalized Euclidean model.

Individual differences models in PROXMAP provide both a Common Space, which is shared across all sources, and separate Individual Spaces, one for each data source. With MODEL=GENERALIZED, each Individual Space can be independently rotated before the dimension weighting is applied. This flexibility allows for more nuanced modeling of source-specific variation.

The Generalized Euclidean model also supports a reduced rank option. When enabled, the Common Space is allowed to have one more dimension than each of the Individual Spaces. This structure is useful for distinguishing shared components from source-specific variation.

A special application of the reduced rank model occurs when proximities are computed for each variable in multivariate data separately, so each variable defines a source. In this case, all one-dimensional variable proximities are jointly fitted as distances between object points in a two-dimensional Common Space. The Space Weights then define direction coordinates for each original variable in the Common Space, meaning that each variable is represented as a linear combination of the space's dimensions.

Attributes
You can specify attribute variables using the ATTRIBUTES subcommand. Attributes allow you to supervise the mapping by constraining the dimensions of the Common Space to be a linear combination of the attributes. This restricted mapping enables interpretability by using meaningful external variables.

A biplot of the restricted solution can be requested, displaying the configuration together with the (transformed) attribute variables. This biplot helps interpret the dimensions of the Common Space in terms of the specified attributes.

Properties
You can specify one or more property variables using the PROPERTIES subcommand. Property variables are used to assist in the interpretation of the solution by fitting them into the final mapping as supplementary variables. These variables do not affect the configuration itself and are not included in the optimization process. To aid interpretation, you can request a biplot that displays the final mapping together with the (transformed) property variables. The inclusion of supplementary variables in the biplot helps clarify the structure of the Common Space in relation to external information.

Output

Tables
You can request tabular output for various stages and components of the PROXMAP analysis. Available tables include, among others:
  • For input data, the (derived) proximities and, for proximity data, their corresponding weights.
  • For the initial stage, the coordinates of the initial space, and diagnostics for the initial configuration. For multivariate data, coordinates for the directions of the standardized variables in the initial space. When properties are specified, coordinates for the directions of the standardized properties.
  • For the final stage, the output includes the summarized history of the iterations, the coordinates of the common object space, the coordinates of the original variables, and, if specified, coordinates of transformed property variables, along with the variance accounted for (VAF).
  • If individual differences models are used, PROXMAP provides the coordinates of the individual spaces and the corresponding space weights.
  • When attribute variables are specified, tables include the original and transformed attribute values, their correlation matrices with eigenvalues, attribute weights, and the coordinates for the directions for attributes in the common object space.
  • Similarly, for property variables, the output includes the original and transformed values, correlation matrices, and the coordinates for the directions of the transformed properties.
Plots
PROXMAP provides several types of plots to support the interpretation of proximities, configurations, and transformations. You can request plots of the initial and final configurations of object points; these may include overlays such as minimum spanning trees, and neighborhood or threshold graphs.

Additional plots include representations of space (dimension) weights, proximity transformations, model fit and residuals, and transformations and residuals for attribute and property variables.

Biplots can be generated for both the initial and final common space. For the initial space, biplots may include all property variables and, in the case of multivariate input data, all original variables. For the final common space, biplots may include all or a selection of the attribute and property variables, and original variables if the input is multivariate. When the reduced rank-1 Generalized Euclidean model is used, a biplot can also include the space weights representing the variable directions in the common space.

Basic Specification

The minimum requirement for running PROXMAP is the specification of a variable list at the SOURCEID subcommand with the keyword DATA, and a source name with the keyword NAME. If no additional options are set, PROXMAP defaults to a two-dimensional nonmetric mapping solution using the identity model. The input data are assumed to be of type MULTIVARIATE, from which one proximity matrix is derived. This matrix is automatically named $SRC001 and labeled Source 001.

The derived proximities are by default ordinally transformed, with ties allowed to be untied. All proximity weights are set to 1 by default.

For the initial configuration, the procedure by default uses classical scaling (principal coordinate analysis) without estimation of an additive constant. The default output includes a summary of the algorithm execution and results, along with a plot of the initial and final configurations.
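
For readers unfamiliar with classical scaling, the following Python sketch shows the textbook procedure without an additive constant: square the distances, double-center them, and use the leading eigenvectors scaled by the square roots of their eigenvalues as coordinates. It is a minimal illustration under stated assumptions, not PROXMAP's implementation: it assumes a symmetric, Euclidean-like distance matrix, and it uses simple power iteration with deflation in place of a proper eigensolver. The function name classical_scaling and the sample distances are hypothetical.

```python
import math


def classical_scaling(dist, ndim=2, iters=200):
    """Textbook classical scaling of a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]

    def matvec(vec):
        return [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]

    coords = [[0.0] * ndim for _ in range(n)]
    for k in range(ndim):
        # Deterministic start vector (an arbitrary choice for this sketch).
        start = [float(i + 1) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in start))
        vec = [x / norm for x in start]
        for _ in range(iters):
            new = matvec(vec)
            norm = math.sqrt(sum(x * x for x in new))
            if norm < 1e-12:
                break  # nothing left to extract
            vec = [x / norm for x in new]
        eigenvalue = sum(x * y for x, y in zip(vec, matvec(vec)))
        if eigenvalue <= 1e-12:
            continue  # leave this dimension at zero
        for i in range(n):
            coords[i][k] = math.sqrt(eigenvalue) * vec[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * vec[i] * vec[j]
    return coords


# Hypothetical distances between the corners of a 3-by-1 rectangle.
r = math.sqrt(10.0)
distances = [[0.0, 3.0, 1.0, r],
             [3.0, 0.0, r, 1.0],
             [1.0, r, 0.0, 3.0],
             [r, 1.0, 3.0, 0.0]]
initial = classical_scaling(distances)
```

For exactly Euclidean input such as this, the distances between the returned points reproduce the input distances, up to rotation and reflection of the configuration.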

Data requirements
  • PROXMAP requires at least three cases (objects) and three variables for both MULTIVARIATE and PROXIMITY data types.
  • For PROXIMITY data, the number of variables must equal the number of cases.
  • If proximity weight variables are specified, the number of weight variables must equal the number of proximity variables.
Missing data
  • For MULTIVARIATE data: Missing values are imputed automatically.
  • For negative and missing PROXIMITY data, the corresponding proximity weights are set to zero.
  • Negative or missing proximity weights are set to zero.
  • If an entire triangle of proximity data and/or weights is missing (either empty or user-defined missing values), the matrix is assumed to be symmetric, and each missing value is replaced with the corresponding value from the opposite triangle.
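
The rules for missing and negative proximity data listed above can be sketched in Python as follows. This is an illustration, not PROXMAP code: it represents missing values as None, and it assumes that a missing triangle is filled in before the weights are zeroed, an order that the list above does not specify. The function name clean_proximities and the sample matrices are hypothetical.

```python
def clean_proximities(prox, weights):
    """Apply the missing-data rules to square proximity and weight matrices.

    None marks a missing value. Returns cleaned copies (prox, weights).
    """
    n = len(prox)

    def fill_missing_triangle(m):
        lower = all(m[i][j] is None for i in range(n) for j in range(i))
        upper = all(m[i][j] is None for i in range(n) for j in range(i + 1, n))
        # Exactly one triangle entirely missing: assume symmetry.
        if lower != upper:
            for i in range(n):
                for j in range(n):
                    if i != j and m[i][j] is None:
                        m[i][j] = m[j][i]

    p = [row[:] for row in prox]
    w = [row[:] for row in weights]
    fill_missing_triangle(p)
    fill_missing_triangle(w)
    for i in range(n):
        for j in range(n):
            bad_prox = p[i][j] is None or p[i][j] < 0
            bad_weight = w[i][j] is None or w[i][j] < 0
            # Negative or missing proximities or weights get weight zero.
            if bad_prox or bad_weight:
                w[i][j] = 0.0
    return p, w


# Hypothetical lower-triangular input with one negative proximity.
proximities = [[0.0, None, None],
               [1.0, 0.0, None],
               [2.0, -1.0, 0.0]]
unit_weights = [[1.0] * 3 for _ in range(3)]
cleaned, cleaned_weights = clean_proximities(proximities, unit_weights)
```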
Measurement level
  • For MULTIVARIATE data, the measurement level of the variables, as set in the SPSS data file (.sav), determines how proximities are derived (see Derived Proximities in Overview). For PROXIMITY data, the measurement level of the variables is of no consequence. Transformation of PROXIMITIES, ATTRIBUTES, and PROPERTIES is guided by the functions given in the associated subcommand.
  • Multivariate, attribute, and property variables are centered at the outset, irrespective of the STANDARDIZATION option chosen at the CRITERIA subcommand.
Syntax Rules
  • All subcommands are optional, except SOURCEID, which is required.
  • Subcommands may be specified in any order.
  • The command name PROXMAP and all subcommands and keywords must be spelled in full.
  • To specify multiple sources, the SOURCEID subcommand must be repeated.
  • If multiple sources are defined and CONDITION=MATRIX is specified, the TRANSFORMATIONS subcommand may be repeated, once per source or group of sources.
  • If CONDITION=UNCONDITIONAL is specified, only one TRANSFORMATIONS subcommand is allowed.
  • The ATTRIBUTES and PROPERTIES subcommands may be repeated to apply different transformation functions to different variables.
  • All other subcommands must appear only once.
  • Equal signs (=) shown in the syntax chart are optional and may be omitted.
  • If an individual differences model is specified when there is only one source, a warning is issued, and the model is set to IDENTITY.
Limitations
  • Proximity weights and case weights cannot be used with MULTIVARIATE data.
  • The TO keyword for specifying ranges is not allowed in variable or source lists.
  • String variables are not permitted. Variables must be numeric. String variables may be converted using ALTER TYPE or (AUTO)RECODE.
  • Active WEIGHT, FILTER, or SPLIT FILE commands are ignored by the PROXMAP procedure, and a warning is issued.