Advanced search tips
You can change the way that content is searched by using a fuzzy search, same-sentence search, or stemmed search. Advanced searching is relevant only for text index searches. Text index searches are performed on content that has a full-text index. Ask your eDiscovery administrator whether you can use advanced search techniques on the content that you are searching.
Fuzzy search
A fuzzy search returns words that are spelled in a similar way to the search term. The words might or might not be related to each other. Fuzzy searches are especially useful when you have content that might contain misspelled words.
A fuzzy search takes the following form:
Term~n
where Term is a search word and n is a similarity value that is greater than 0.0 and less than 1.0.
- Lear~0.7: Searches with a similarity value of 0.7
- Lear~0.5: Searches with a similarity value of 0.5
- King AND Lear~0.5: Searches for exact matches to King and fuzzy matches to Lear
- Lear~0.5 NOT lean: Searches for a fuzzy match to Lear but does not return matches for the word lean, which might be a fuzzy match to Lear
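The Term~n form can be illustrated with a small helper that assembles a fuzzy term and rejects similarity values outside the documented range. This is a minimal sketch: the function name fuzzy_term is hypothetical and is not part of the product, and the single-word check is an assumption based on the description of Term as a search word.

```python
def fuzzy_term(term: str, similarity: float) -> str:
    """Format a fuzzy search term as Term~n (hypothetical helper)."""
    # Assumption: Term is a single search word, so whitespace is rejected.
    if not term or any(ch.isspace() for ch in term):
        raise ValueError("term must be a single non-empty word")
    # The similarity value must be greater than 0.0 and less than 1.0.
    if not 0.0 < similarity < 1.0:
        raise ValueError("similarity must be greater than 0.0 and less than 1.0")
    return f"{term}~{similarity}"


print(fuzzy_term("Lear", 0.7))
print("King AND " + fuzzy_term("Lear", 0.5))
```

The helper only builds the search string. How closely a candidate word must match for a given similarity value is decided by the search engine.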
Same-sentence search
Same-sentence searches, also known as proximity searches, are useful when you believe that two words might not always occur in the same order, but usually occur within the same sentence.
Same-sentence searching is not supported in FileNet P8 environments with IBM Content Search Services. Attempts to perform same-sentence searches in this environment return no results.
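As an example of a match, a same-sentence search for King and Lear returns content that contains the following sentence, even though the two words are not adjacent and appear in reverse order:
Lear is the most tragic king of all of Shakespeare's characters.
A same-sentence search takes the following form:
(Term1 Term2) WITHIN SENTENCE
Examples:
("King" "Lear") WITHIN SENTENCE
("Cordelia" "King Lear") WITHIN SENTENCE
(("King" "Louis") WITHIN SENTENCE) NOT nomination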
A same-sentence search is performed on (King Louis) WITHIN SENTENCE and content is returned that contains the words king and louis in the same sentence but does not contain the word nomination.
Support for same-sentence search is provided by DB2® Net Search Extender, where this feature is sometimes also called proximity search. For more information about how DB2 Net Search Extender defines the end of a sentence, see the Paragraphs section of the Tokenization topic.
Same-sentence searching is not supported for content that is archived with IBM FileNet Email Manager and that is stored in an IBM FileNet P8 server.
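The idea behind a same-sentence match can be sketched in a few lines. This is a rough illustration only, not how DB2 Net Search Extender works: the function name same_sentence is hypothetical, sentences are assumed to end at ".", "!", or "?" followed by white space, and matching is assumed to be case-insensitive.

```python
import re


def same_sentence(text: str, *terms: str) -> bool:
    """Return True if one sentence of text contains every term, in any order."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by white space.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Assumption: terms match whole words or phrases, ignoring case.
    patterns = [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    ]
    return any(all(p.search(s) for p in patterns) for s in sentences)


text = "Lear is the most tragic king of all of Shakespeare's characters."
print(same_sentence(text, "King", "Lear"))                          # True
print(same_sentence("The king spoke. Lear left.", "King", "Lear"))  # False
```

The second call returns False because the two words occur in different sentences, which is the case that a same-sentence search is meant to exclude.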
Stemmed search
Stemmed searches are a good way to search for words with the same word stem and regular endings.
Searching for the stemmed form of a term means reducing the term to its word stem and then searching on the word stem (also known as the base word). For example, searching for the word grows as a stemmed search returns content with the words grow, grows, and growing, but not growth, grown, or grew.
Stemmed search is not applied to the following items:
- Terms marked for fuzzy search
- Terms that contain wildcard characters
- Phrases (text surrounded by double quotation marks)
- Same-sentence searches
For example, if you specify election OR nomination OR president~ OR hold* OR (King Lear) WITHIN SENTENCE as the search terms and then elect to perform a stemmed search, the stemmed search will apply only to the terms election and nomination.
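The exclusion rules in the example above can be sketched as a simple filter. This is a minimal illustration, not the product's parser: the function names are hypothetical, the query is assumed to be a flat list of terms joined by OR, and the wildcard characters are assumed to be * and ? (only * appears in this topic).

```python
def is_stemmable(term: str) -> bool:
    """Return True if stemming would apply to this term (illustrative rules)."""
    t = term.strip()
    if "~" in t:                  # marked for fuzzy search
        return False
    if "*" in t or "?" in t:      # wildcard characters (assumed to be * and ?)
        return False
    if '"' in t:                  # phrase in double quotation marks
        return False
    if t.upper().endswith("WITHIN SENTENCE"):  # same-sentence search
        return False
    return True


def stemmable_terms(query: str) -> list:
    """Split a flat OR query and keep only the terms that can be stemmed."""
    return [t.strip() for t in query.split(" OR ") if is_stemmable(t)]


query = "election OR nomination OR president~ OR hold* OR (King Lear) WITHIN SENTENCE"
print(stemmable_terms(query))  # ['election', 'nomination']
```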
Support for stemmed searches is provided by DB2 Net Search Extender. For complete information about stemmed searches, see the Net Search Extender Administration and User's Guide.
The following examples show how stemming differs from lemmatisation, which uses dictionary look-up and context to find the base form (lemma) of a word:
- The word "better" has "good" as its lemma. This link is missed by stemming, because it requires a dictionary look-up.
- The word "walk" is the base form of the word "walking", and so "walk" is matched in both stemming and lemmatisation.
- The word "versioning" can be either the base form of a noun or a form of a verb (meaning to version), depending on the context. Lemmatisation can determine the correct lemma for "versioning" based on context. For example, in the sentence "The versioning support in this product is fantastic," the lemmatisation algorithm would select the noun form of "versioning" and identify the lemma as "versioning", which is the original search token.
Search across a range of values in integer fields
The following examples show searches on the integer field TIEFLAG:
TIEFLAG: 10000
TIEFLAG: =10000
TIEFLAG: <10000
TIEFLAG: <>10000
TIEFLAG: >=10000 AND <=20000
TIEFLAG: >=10000 AND <=20000 OR =15000
TIEFLAG: !=5000 AND (>20000 OR <10000) AND !=25000
A range expression takes the following form:
relational_operator integer [boolean_operator relational_operator integer] [boolean_operator relational_operator integer] ...
where:
- relational_operator can be >, <, >=, <=, =, !=, or <>
- boolean_operator can be either AND or OR
The implicit order of operator precedence is AND, followed by OR. Parentheses can be used to override the implicit order.
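The precedence rule can be made concrete with a small evaluator that checks whether an integer satisfies a range expression. This is a sketch under stated assumptions, not the product's implementation: the function names tokenize and matches are hypothetical, a bare integer (as in TIEFLAG: 10000) is assumed to mean equality, and the operators AND and OR are assumed to be written in uppercase.

```python
import operator
import re

_OPS = {
    ">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le,
    "=": operator.eq, "!=": operator.ne, "<>": operator.ne,
}
# Two-character operators are listed before their one-character prefixes.
_TOKEN = re.compile(r"\s*(>=|<=|<>|!=|=|>|<|\(|\)|AND\b|OR\b|-?\d+)")


def tokenize(expr: str) -> list:
    """Split a range expression into operators, parentheses, and integers."""
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        m = _TOKEN.match(expr, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}: {expr[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def matches(expr: str, value: int) -> bool:
    """Return True if value satisfies expr; AND binds tighter than OR."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        result = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()   # parse first so the token is always consumed
            result = result or rhs
        return result

    def parse_and():
        result = parse_primary()
        while peek() == "AND":
            take()
            rhs = parse_primary()
            result = result and rhs
        return result

    def parse_primary():
        tok = take()
        if tok == "(":
            result = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return result
        if tok in _OPS:
            return _OPS[tok](value, int(take()))
        return value == int(tok)  # assumption: a bare integer means equality

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


print(matches(">=10000 AND <=20000", 15000))                        # True
print(matches("!=5000 AND (>20000 OR <10000) AND !=25000", 25000))  # False
print(matches("=1 OR =2 AND =3", 1))                                # True
```

In the last call, AND is evaluated first, so the expression is read as =1 OR (=2 AND =3) and the value 1 matches. Grouping it as (=1 OR =2) AND =3 would not match.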