CLEM Functions for Missing Values
There are several functions used to handle missing values. The following functions are often used in Select and Filler nodes to discard or fill missing values:
- count_nulls(LIST)
- @BLANK(FIELD)
- @NULL(FIELD)
- undef
The @ functions can be used in conjunction with the @FIELD function to identify the presence of blank or null values in one or more fields. The fields can simply be flagged when blank or null values are present, or they can be filled with replacement values or used in a variety of other operations.
You can count nulls across a list of fields, as follows:
count_nulls(['cardtenure' 'card2tenure' 'card3tenure'])
When using any of the functions that accept a list of fields as input, the special functions @FIELDS_BETWEEN and @FIELDS_MATCHING can be used, as shown in the following example:
count_nulls(@FIELDS_MATCHING('card*'))
You can use the undef function to fill fields with the system-missing value, displayed as $null$. For example, to replace any numeric value, you could use a conditional statement, such as:
if not(Age > 17) or not(Age < 66) then undef else Age endif
This replaces anything that is not in the range with a system-missing value, displayed as $null$. By using the not() function, you can catch all other numeric values, including any negatives. See the topic Functions Handling Blanks and Null Values for more information.
Note on Discarding Records
When using a Select node to discard records, note that syntax uses three-valued logic and automatically includes null values in select statements. To exclude null values (system-missing) in a select expression, you must explicitly specify this by using and not in the expression. For example, to select and include all records where the type of prescription drug is Drug C, you would use the following select statement:
Drug = 'drugC' and not(@NULL(Drug))
Earlier versions of excluded null values in such situations.