Fuzzy search

Use a fuzzy search to find documents that contain words with similar spelling to the term that you are searching.

A fuzzy search query searches for character sequences that are not only the same but similar to the query term. Use the tilde symbol (~) at the end of a term to do a fuzzy search. For example, the following query finds documents that include the terms analytics, analyze, analysis, and so on.
analytics~
You can add an optional parameter to specify the degree of similarity of the search results to the search term. Specify a value greater than or equal to 0 and less than 1. You must precede the value by a 0 and a decimal point, for example, 0.8. A value closer to 1 matches terms with a higher similarity. If you do not specify the parameter, the default is 0.5.
analytics~0.8
You can specify a fuzzy search on a term but not on a phrase. To apply fuzzy search to multiple words in a query, you must apply a fuzzy search factor for each term. For example, the following query finds documents that include terms that are similar to summer and time.
summer~0.7 time~0.7

Example

Step 1. Create a table called BOOKS:
create table books (
         isbn varchar(18) not null primary key, 
         author varchar(30), 
         story varchar(100), 
         year integer);
Step 2. Create a text search index on the STORY column:
db2ts "create index bookidx for text on books(story) connect to test";
Step 3. Import data into the table:
insert into books values ('0-13-086755-1','John','The Blue Can',2001)
insert into books values ('0-13-086755-2','Mike','Cats and Dogs', 2000)
insert into books values ('0-13-086755-3','Peter','Hats on the Rack',1999)
insert into books values ('0-13-086755-4','Agatha','Cat among the Pigeons',1997)
insert into books values ('0-13-086755-5','Edgar','Cars Unlimited',2010)
insert into books values ('0-13-086755-6','Roy','Carson and Lemon',2008) 
Step 4. Update the text search index:
db2ts "update index bookidx for text connect to test"
Step 5. Issue a fuzzy search with the CONTAINS function:
select author, year, story from books where contains(story, 'cat~0.4') = 1
The following is the sample output:
AUTHOR YEAR STORY
------------------------ ----------- -------------------------
John 2001 The Blue Can
Mike 2000 Cats and Dogs
Agatha 1997 Cat among the Pigeons

3 record(s) selected.
To see the associated score, issue the following query that is modified for increased fuzziness:
select author, year, story, integer(score(story, 'cat~0.3')*1000) as score 
from books where contains(story, 'cat~0.3') = 1 order by score desc 
The following is the sample output:
AUTHOR YEAR STORY SCORE
------------------------------ ----------- -------
Agatha 1997 Cat among the Pigeons 32
John 2001 The Blue Can 17
Mike 2000 Cats and Dogs 17
Peter 1999 Hats on the Rack 1
Edgar 2010 Cars Unlimited 1

5 record(s) selected.

Restrictions

  • Special characters are not supported in fuzzy search queries.
  • Terms in fuzzy search queries do not go through language processing (lemmatization, synonym expansion, and stop word removal). Therefore, fuzzy search queries do not find terms that are similar to synonyms.
  • If you include wildcard characters in the fuzzy search terms, only the wildcard search is done.