Overview (MATRIX DATA command)

MATRIX DATA reads raw matrix materials and converts them to a matrix data file that can be read by procedures that handle matrix materials. The data can include vector statistics, such as means and standard deviations, as well as matrices.

MATRIX DATA is similar to a DATA LIST command: it defines variable names and their order in a raw data file. However, MATRIX DATA can read only data that conform to the general format of matrix data files.

Matrix Files

Like the matrix data files created by procedures, the file that MATRIX DATA creates contains the following variables in the indicated order. If the variables are in a different order in the raw data file, MATRIX DATA rearranges them in the active dataset.

  • Split-file variables. These optional variables define split files. There can be up to eight split variables, and they must have numeric values. Split-file variables appear in the order in which they are specified on the SPLIT subcommand.
  • ROWTYPE_. This is a string variable with A8 format. Its values define the data type for each record. For example, it might identify a row of values as means, standard deviations, or correlation coefficients. Every matrix data file has a ROWTYPE_ variable.
  • Factor variables. There can be any number of factors. They occur only if the data include within-cells information, such as the within-cells means. Factors have the system-missing value on records that define pooled information. Factor variables appear in the order in which they are specified on the FACTORS subcommand.
  • VARNAME_. This is a string variable with A8 format. MATRIX DATA automatically generates VARNAME_ and its values based on the variables named on VARIABLES. You never enter values for VARNAME_. Values for VARNAME_ are blank for records that define vector information. Every matrix data file has a VARNAME_ variable.
  • Continuous variables. These are the variables that were used to generate the correlation coefficients or other aggregated data. There can be any number of them. Continuous variables appear in the order in which they are specified on VARIABLES.
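
The following is a minimal sketch of a raw matrix with an explicit ROWTYPE_ variable. The variable names V1, V2, and V3 and all data values are illustrative assumptions, not taken from a real study. Listing the active dataset afterward should show ROWTYPE_ first, then the generated VARNAME_, then the continuous variables.

```spss
* Illustrative sketch with hypothetical variables V1, V2, V3 and made-up values.
MATRIX DATA VARIABLES=ROWTYPE_ V1 V2 V3.
BEGIN DATA
MEAN 5 4 3
STDDEV 3 2 1
N 9 9 9
CORR 1
CORR .6 1
CORR .7 .8 1
END DATA.
LIST.
```

Note that VARNAME_ is not named on VARIABLES and no values for it appear in the data; MATRIX DATA generates it.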

Options

Data Files. You can define both inline data and data in an external file.

Important: Inline data are limited to 1024 bytes per line; any data past that limit will be truncated. Consider moving the inline data to an external file and using MATRIX DATA FILE=<path/file> to avoid any possible data loss.
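
As a sketch of the external-file alternative, the same kind of specification can point to a file instead of using inline data. The path and the variable names below are hypothetical; substitute your own. No BEGIN DATA block is used in this case.

```spss
* Hypothetical file path and variable names, shown for illustration only.
MATRIX DATA VARIABLES=ROWTYPE_ V1 V2 V3
  /FILE='/data/corrmat.txt'.
LIST.
```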

Data Format. By default, data are assumed to be entered in freefield format with each vector or row beginning on a new record (the keyword LIST on the FORMAT subcommand). If each vector or row does not begin on a new record, use the keyword FREE. You can also use FORMAT to indicate whether matrices are entered in upper or lower triangular or full square or rectangular format and whether or not they include diagonal values.
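
For instance, a matrix entered as an upper triangle without the diagonal might be read as sketched below. The variable names and values are illustrative assumptions. Because the default is lower triangular with diagonal values, FORMAT must be specified here.

```spss
* Illustrative sketch: upper-triangular correlations, diagonal omitted.
MATRIX DATA VARIABLES=ROWTYPE_ V1 V2 V3
  /FORMAT=UPPER NODIAG.
BEGIN DATA
MEAN 5 4 3
SD 3 2 1
N 9 9 9
CORR .6 .7
CORR .8
END DATA.
LIST.
```

The vector rows (MEAN, SD, N) still carry one value per continuous variable; only the matrix rows are affected by the UPPER and NODIAG keywords.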

Variable Types. You can specify split-file and factor variables using the SPLIT and FACTORS subcommands. You can identify record types by specifying ROWTYPE_ on the VARIABLES subcommand if ROWTYPE_ values are included in the data or by implying ROWTYPE_ values on CONTENTS.
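
When the data contain no ROWTYPE_ values, CONTENTS can imply them from the order of the records. The sketch below assumes three hypothetical variables and made-up values, with the default lower-triangular format including the diagonal.

```spss
* Illustrative sketch: ROWTYPE_ is implied by the order given on CONTENTS.
MATRIX DATA VARIABLES=V1 V2 V3
  /CONTENTS=MEAN SD N CORR.
BEGIN DATA
5 4 3
3 2 1
9 9 9
1
.6 1
.7 .8 1
END DATA.
LIST.
```

Here the first record is read as means, the second as standard deviations, the third as counts, and the remaining records as rows of the correlation matrix.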

Basic Specification

The basic specification is VARIABLES and a list of variables. Additional specifications are required as follows:

  • FILE is required to specify the data file if the data are not inline.
  • If data are in any format other than lower triangular with diagonal values included, FORMAT is required.
  • If the data contain values in addition to matrix coefficients, such as the mean and standard deviation, one of two things is required: either ROWTYPE_ is specified on VARIABLES and ROWTYPE_ values are included in the data, or CONTENTS is used to describe the data.
  • If the data include split-file variables, SPLIT is required. If there are factors, FACTORS is required.
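
The smallest case covered by these rules can be sketched as follows: inline data, a correlation matrix only, entered as a lower triangle with the diagonal. Under those conditions VARIABLES alone is enough, because FILE, FORMAT, and CONTENTS all take their defaults. The variable names and coefficients are illustrative assumptions.

```spss
* Illustrative sketch: basic specification with all other subcommands defaulted.
MATRIX DATA VARIABLES=V1 V2 V3.
BEGIN DATA
1
.6 1
.7 .8 1
END DATA.
LIST.
```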

Specifications on most MATRIX DATA subcommands depend on whether ROWTYPE_ is included in the data and specified on VARIABLES or whether it is implied using CONTENTS.

Table 1. Subcommand requirements in relation to ROWTYPE_
Subcommand   Implicit ROWTYPE_ using CONTENTS   Explicit ROWTYPE_ on VARIABLES
FILE         Defaults to INLINE                 Defaults to INLINE
VARIABLES    Required                           Required
FORMAT       Defaults to LOWER DIAG             Defaults to LOWER DIAG
SPLIT        Required if split files*           Required if split files
FACTORS      Required if factors                Required if factors
CELLS        Required if factors                Inapplicable
CONTENTS     Defaults to CORR                   Optional
N            Optional                           Optional

* If the data do not contain values for the split-file variables, this subcommand can specify a single variable, which is not specified on the VARIABLES subcommand.

Subcommand Order

  • SPLIT and FACTORS, when used, must follow VARIABLES.
  • The remaining subcommands can be specified in any order.

Syntax Rules

  • No commands can be specified between MATRIX DATA and BEGIN DATA, not even a VARIABLE LABELS or FORMAT command. Data transformations cannot be used until after MATRIX DATA is executed.