word_diff() function

The word_diff() function finds the number of modifications that are required to change one string into another string.

Adding, deleting, substituting, or changing the case of a single character in the string each counts as one modification. Transposing two adjacent characters counts as two modifications in all but the Damerau-Levenshtein algorithm, which counts transposition as a single modification.

Using the word_diff() function with the Soundex or Double Metaphone algorithms achieves the same result as using the combination of the word_key() function to convert the strings into their phonetic encodings and then using the word_keys_diff() function to compare those encodings. The word_diff() function both converts the strings to their phonetic encodings and compares those encodings.

Syntax

The word_diff() function has the following syntax:
int1 = word_diff(varchar word1, varchar word2 [, int algorithm]);

The word1 value specifies the first word in the comparison.

The word2 value specifies the second word in the comparison.

The algorithm value is one of the following values:
0
Soundex-Miracode.
1
Soundex-Simplified.
2
Soundex-SQLServer.
3
Double Metaphone. This is the default.
10
Levenshtein.
11
Damerau-Levenshtein.

The built-in Netezza Performance Server le_dst() function is equivalent to using the word_diff() function with the Levenshtein algorithm. The built-in Netezza Performance Server dle_dst() function is equivalent to using the word_diff() function with the Damerau-Levenshtein algorithm.

Returns

The function returns an integer that indicates how similar or different the two strings are. A value of 0 indicates that the strings are the same. The results vary depending on the algorithm that you choose.

Example

In the following example, the function returns 0 because the Soundex algorithms consider only the initial vowel, not subsequent vowels. :
select word_diff('anderson','andrsn',0);
 WORD_DIFF
-----------
         0
(1 row)
If the algorithm is changed to Damerau-Levenshtein, as in the following example, the function returns 2 because Damerau-Levenshtein accounts for the missing vowels e and o in the second string:
select word_diff('anderson','andrsn',11);
 WORD_DIFF
-----------
         2
(1 row)