Functions available for missing values

Different methods are available for dealing with missing values in your data. You may choose to use functionality available in Data Refinery or in SPSS Modeler nodes.

Functions available in SPSS Modeler

In SPSS Modeler, there are several functions used to handle missing values. The following functions are often used in Select and Filler nodes to discard or fill missing values:

  • count_nulls(LIST)
  • @BLANK(FIELD)
  • @NULL(FIELD)
  • undef

You can use the @ functions together with the @FIELD function to test for blank or null values in one or more fields. You can then flag the fields where blank or null values are present, fill them with replacement values, or use the result in a variety of other operations.
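As a minimal sketch of this pattern, assume a Filler node with several fields selected and the replacement applied based on a condition. Under that assumption, the condition could be written as:

@BLANK(@FIELD) or @NULL(@FIELD)

Here @FIELD stands for each selected field in turn, so the same test is applied to every field chosen in the node. The value used as the replacement (for example, 0) is a hypothetical choice and depends on your data.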

You can count nulls across a list of fields, as follows:

count_nulls(['cardtenure' 'card2tenure' 'card3tenure'])

When using any of the functions that accept a list of fields as input, you can use the special functions @FIELDS_BETWEEN and @FIELDS_MATCHING, as shown in the following example:

count_nulls(@FIELDS_MATCHING('card*'))
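A similar sketch with @FIELDS_BETWEEN, assuming that cardtenure and card3tenure are the first and last fields of the range in the data's field order:

count_nulls(@FIELDS_BETWEEN(cardtenure, card3tenure))

Because @FIELDS_BETWEEN selects fields by position, every field that lies between the two named fields is counted, so the result depends on how the fields are ordered in your data.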

You can use the undef function to fill fields with the system-missing value, displayed as $null$. For example, to replace any numeric value that falls outside a given range, you can use a conditional statement, such as:

if not(Age > 17) or not(Age < 66) then undef else Age endif

This replaces any Age value that is not both greater than 17 and less than 66 with the system-missing value, displayed as $null$. By using the not() function, you can catch all other numeric values, including any negatives.

Note on discarding records: When you use a Select node to discard records, note that the expression syntax uses three-valued logic and automatically includes null values in select statements. To exclude null (system-missing) values in a select expression, you must explicitly specify this by adding and not(@NULL(FIELD)) to the expression. For example, to select and include all records where the type of prescription drug is Drug C, you would use the following select statement:

Drug = 'drugC' and not(@NULL(Drug))

Functions available in Data Refinery

You can also use Data Refinery to handle missing values. See the following information.