Add variables

Add variables merges the active data file with another open data file or external IBM® SPSS® Statistics data file that contains the same cases (rows) but different variables (columns). For example, you might want to merge a data file that contains pre-test results with one that contains post-test results.

Merge method

The default merge behavior depends on how variables are structured in the selected datasets.

One-to-one (file order)

Order of cases in the files determines how cases are matched. This option should be selected when no variables exist with the same name and basic type (string or numeric) in both files. This setting generates MATCH FILES command syntax in the following format (where [name] is either a dataset or external file specification enclosed in quotation marks):

MATCH FILES FILE=* /FILE="[name]"

One-to-one (key values)

Cases are matched based on the values of one or more key variables. This is the default setting when one or more variables have the same name and basic type (string or numeric) in both files. This setting generates MATCH FILES command syntax in the following format (where [name] is either a dataset or external file specification (enclosed in quotation marks)):

MATCH FILES FILE=* /FILE="[name]" /BY [key varlist]

One-to-many merge (key values)

One file contains case data, and one file is a lookup table. Cases from the lookup table are merged with cases with matching key values in the case data file. The same key value can occur multiple times in the case data file. One case in the lookup table can be merged with many cases in the case data file.

All cases in the case data file are included in the merged file.
Cases in the table lookup file that do not have cases with matching key values in the case data file not included.
The table lookup file cannot contain duplicate key values. If the files have multiple key variables, the key value is the combination of those values.

This setting generates MATCH FILES syntax with a TABLE subcommand for the lookup table.

Pair variables

Lookup table

A lookup table in an array that replaces runtime computation with a simple array indexing operation.

Note: The lookup table settings are available only when One-to-many merge (key values) is selected as the merge method.

[active dataset name] - The active dataset name.

This is the default selection.
When selected, this setting generates syntax in the following format (where [name] is either a dataset or external file specification (enclosed in quotation marks)):
```
MATCH FILES TABLE=* /FILE=”[name]” /BY [key varlist]
```

[second dataset name or file name] - The name of the second dataset or file.

By default, this option is not selected.
When the file is external, only the file name (not the entire path) is provided (the full path is included in the generated syntax).
When the second file is an external file (and sorting is selected, or it has string keys that require a change in the defined length), the file must first be opened and assigned a unique name.
When the second file is an external file that needs to be opened (based on the previous condition), and the active dataset is unnamed, the active dataset must be named before the second data file is opened.
When the second file is opened for sorting or changing string key lengths, it remains open after the merge action. Because the file was changed, you are prompted to save changes when the file is closed.
When selected, this setting generates syntax in the following format (where [name] is either a dataset or external file specification (enclosed in quotation marks)):
```
MATCH FILES FILE=* /TABLE=”[name]” /BY [key varlist]
```

Select variables

For key value merges, variables with the same name and basic data type (string or numeric) are included as key variables by default (indicated by the Key icon

icon).

Notes:

The selected merge always generates MATCH FILES command syntax (never STAR JOIN syntax).
SORT CASES and ALTER TYPE command syntax takes precedence over MATCH FILES command syntax.
DROP subcommand syntax, and optional RENAME subcommand syntax, is included based on selections that were made.
BY subcommand syntax is included if one of the key values merge options is selected.
When string keys have different defined lengths, ALTER TYPE syntax is automatically generated to ensure equally defined lengths.

Sort files by key values before merging

For key value merges, both files must be sorted in order of the key variable values. The setting is enabled by default

If one of the files is an external file, it is opened and sorted. The sorted file is not saved unless you explicitly save the file.
If the files are already sorted, you can save time by clearing this option.
When selected, this setting generates SORT CASES syntax.

Pairing key variables

You can create key pairs from unpaired key variables and include them in the new, merged dataset. To create new key variable pairings:

From the menus choose:
Data > Merge files...
Select Add variables.
Select a dataset to merge with the active dataset. You can drag an existing file to the Drag-and-drop dataset file (.sav) here area in the user interface, browse for an existing dataset, or select an open dataset.
Select the appropriate merge method (One-to-one (file order), One-to-one (key values), or One-to-many merge (key values)) and click Next.
By default, all key variables that match both the name and the data type (numeric or string) are paired. Only key variables that display in the variable lists are used in the merged dataset.
When more variable pairing are required, click the vertical ellipsis ⋮ control near the top of the Select variables table and select a dataset name from the drop-down list.
Note: In instances where there are no automatic pairings, the vertical ellipsis ⋮ control is the only available option.
Select additional variables to pair with the variables from the other dataset. You can use the Find variables field when the dataset contains many variables. Click OK to add the selected variables to the Select variables list.
The selected variables display with the missing identifier in the adjacent dataset's variable list.
Click Find a key, select an appropriate variable to pair with the adjacent dataset variable, and then click OK. The variables now indicate that they are paired between the two datasets.
Repeat the previous step for all variables you selected in step 6. As you pair key variables, the New dataset preview pane updates to reflect the merged dataset's column and row layout.
Verify that the key variable pairings are correct, both in the variable lists and the New dataset preview pane, and click Create dataset.
The merged dataset opens in a new tab with a name similar to unnamed [DatasetGenerated1]. To save the newly merged dataset, select File > Save or File > Save as....

Unpairing key variables

You can remove key variable pairings when you do not want them to be included in the merged file. To remove key variable pairings:

From the menus choose:
Data > Merge files...
Select a dataset to merge with the active dataset. You can drag an existing file to the Drag-and-drop dataset file (.sav) here area in the user interface, browse for an existing dataset, or select an open dataset.
Select Add variables.
You can remove any variable pairings by hovering the cursor over the key icon between paired key variables. The icon changes to the unpair icon . Click the icon to unpair the key variables. The first dataset still contains the unpaired key variable.
The unpaired variable in the first dataset is included in the merged dataset unless you remove the unpaired key variable.
Click the unpaired key variable in the first dataset and click the Remove icon (x).
Repeat steps 4 and 5 for all key variable pairings that you want to remove.
Verify that the key variable pairings are correct, both in the variable lists and the New dataset preview pane, and click Create dataset.
The merged dataset opens in a new tab with the name unnamed [DatasetGenerated1]. To save the newly merged dataset, select File > Save or File > Save as....