Setting options for the Transpose node

In the Transpose method drop-down, select the method you want the Transpose node to perform: Both fields and records, Records to fields, or Fields to records. The settings for each of the three methods are described in the following sections.

Restriction: The Records to fields and Fields to records methods are only supported on Windows 64-bit, Linux 64-bit, and Mac.

Both fields and records

New field names can be generated automatically based on a specified prefix, or read from an existing field in the data.

Use prefix. This option generates new field names automatically based on the specified prefix (Field1, Field2, and so on). You can customize the prefix as needed. With this option, you must specify the number of fields to be created, regardless of the number of rows in the original data. For example, if Number of new fields is set to 100, all data beyond the first 100 rows will be discarded. If there are fewer than 100 rows in the original data, some fields will be null. (You can increase the number of fields as needed, but the purpose of this setting is to avoid transposing a million records into a million fields, which would produce an unmanageable result.)

For example, suppose you have data with series in rows and a separate field (column) for each month. You can transpose this so that each series is in a separate field, with a row for each month.

Read from field. Reads field names from an existing field. With this option, the number of new fields is determined by the data, up to the specified maximum. Each value of the selected field becomes a new field in the output data. The selected field can have any storage type (integer, string, date, and so on), but in order to avoid duplicate field names, each value of the selected field must be unique (in other words, the number of values should match the number of rows). If duplicate field names are encountered, a warning is displayed.

Read Values. If the selected field has not been instantiated, select this option to populate the list of new field names. If the field has already been instantiated, then this step is not necessary.
Maximum number of values to read. When reading fields names from the data, an upper limit is specified in order to avoid creating an inordinately large number of fields. (As noted above, transposing one million records into one million fields would produce an unmanageable result.)

For example, if the first column in your data specifies the name for each series, you can use these values as fields names in the transposed data.

Transpose. By default, only continuous (numeric range) fields are transposed (either integer or real storage). Optionally, you can choose a subset of numeric fields or transpose string fields instead. However, all transposed fields must be of the same storage type—either numeric or string but not both—since mixing the input fields would generate mixed values within each output column, which violates the rule that all values of a field must have the same storage. Other storage types (date, time, timestamp) cannot be transposed.

All numeric. Transposes all numeric fields (integer or real storage). The number of rows in the output matches the number of numeric fields in the original data.
All string. Transposes all string fields.
Custom. Allows you to select a subset of numeric fields. The number of rows in the output matches the number of fields selected. This option is only available for numeric fields.

Row ID name. Specifies the name of the row ID field created by the node. The values of this field are determined by the names of the fields in the original data.

Tip: When transposing time series data from rows to columns, if your original data includes a row, such as date, month, or year, that labels the period for each measurement, be sure to read these labels into IBM® SPSS® Modeler as field names (as demonstrated in the above examples, which show the month or date as field names in the original data, respectively) rather than including the label in the first row of data. This will avoid mixing labels and values in each column (which would force numbers to be read as strings, since storage types cannot be mixed within a column).

Records to fields

Fields. The Fields list contains all fields entering the Transpose node.

Index. Use the Index section to select the fields you want to use as index fields.

Fields. Use the Fields section to select the fields you want to use as fields.

Value. Use the Value section to select the fields you want to use as value fields.

Aggregate function. When there are more than one records for an index, you must aggregate the records into one. Use the Aggregate function drop-down to specify how to aggregate the records using one of the following functions. Note that aggregation impacts all fields.

Mean. Returns the mean values for each key field combination. The mean is a measure of central tendency, and is the arithmetic average (the sum divided by the number of cases).
Sum. Returns summed values for each key field combination. The sum is the total of the values, across all cases with nonmissing values.
Min. Returns minimum values for each key field combination.
Max. Returns maximum values for each key field combination.
Median. Returns the median values for each key field combination. The median is a measure of central tendency that is not sensitive to outlying values (unlike the mean, which can be affected by a few extremely high or low values). Also known as the 50th percentile or 2nd quartile.
Count. Returns the count of non-null values for each key field combination.

Fields to records

Fields. The Fields list contains all fields entering the Transpose node.

Index. Use the Index section to select the fields you want to use as index fields.

Value. Use the Value section to select the fields you want to use as value fields. If you don't select any value fields, then all unassigned numeric fields will be used as values. But if not numeric fields are available, then all unassigned string fields will be used.