LIKE predicate
The LIKE predicate searches for strings that have a certain pattern. The pattern is specified by a string in which the underscore and the percent sign may have special meanings. Trailing blanks in a pattern are part of the pattern.
If the value of any of the arguments is null, the result of the LIKE predicate is unknown.
The values for match-expression, pattern-expression, and escape-expression are compatible string expressions. There are slight differences in the types of string expressions supported for each of the arguments. The valid types of expressions are listed under the description of each argument.
None of the expressions can yield a distinct type. However, it can be a function that casts a distinct type to its source type.
- match-expression
- An expression that specifies the string that is to be examined to see if it conforms to a certain pattern of characters.
- LIKE pattern-expression
- An expression that specifies the string that is to be matched. The expression can be specified in the same way as match-expression with the following restrictions:
- No element in the expression can be of type CLOB or DBCLOB. In addition it cannot be a BLOB file reference variable.
- The actual length of pattern-expression cannot be more than 32 672 bytes.
A simple description of the use of the LIKE predicate is that the pattern is used to specify the conformance criteria for values in the match-expression, where:- The underscore character (_) represents any single character.
- The percent sign (%) represents a string of zero or more characters.
- Any other character represents itself.
If the pattern-expression needs to include either the underscore or the percent character, the escape-expression is used to specify a character to precede either the underscore or the percent character in the pattern.
A rigorous description of the use of the LIKE predicate follows. Note that this description ignores the use of the escape-expression; its use is covered later.- Let m denote the value of match-expression
and let p denote the value of
pattern-expression. The string p is interpreted
as a sequence of the minimum number of substring specifiers so each character of
p is part of exactly one substring specifier. A substring specifier is an
underscore, a percent sign, or any non-empty sequence of characters other than an underscore or a
percent sign. The result of the predicate is unknown if m or p is the null value. Otherwise, the result is either true or false. The result is true if m and p are both empty strings or there exists a partitioning of m into substrings such that:
- A substring of m is a sequence of zero or more contiguous characters and each character of m is part of exactly one substring.
- If the nth substring specifier is an underscore, the nth substring of m is any single character.
- If the nth substring specifier is a percent sign, the nth substring of m is any sequence of zero or more characters.
- If the nth substring specifier is neither an underscore nor a percent sign, the nth substring of m is equal to that substring specifier and has the same length as that substring specifier.
- The number of substrings of m is the same as the number of substring specifiers.
Thus, if p is an empty string and m is not an empty string, the result is false. Similarly, it follows that if m is an empty string and p is not an empty string (except for a string containing only percent signs), the result is false.
The predicate m NOT LIKE p is equivalent to the search condition NOT (m LIKE p).
When the escape-expression is specified, the pattern-expression must not contain the escape character identified by the escape-expression, except when immediately followed by the escape character, the underscore character, or the percent sign character (SQLSTATE 22025).
If the match-expression is a character string in an MBCS database, it can contain mixed data. In this case, the pattern can include both SBCS and non-SBCS characters. For non-Unicode databases, the special characters in the pattern are interpreted as follows:- An SBCS halfwidth underscore refers to one SBCS character.
- A non-SBCS fullwidth underscore refers to one non-SBCS character.
- An SBCS halfwidth or non-SBCS fullwidth percent sign refers to zero or more SBCS or non-SBCS characters.
In a Unicode database, there is really no distinction between
single-byte
andnon-single-byte
characters. Although the UTF-8 format is amixed-byte
encoding of Unicode characters, there is no real distinction between SBCS and non-SBCS characters in UTF-8. Every character is a Unicode character, regardless of the number of bytes in UTF-8 format.In a Unicode graphic column, every non-supplementary character, including the halfwidth underscore character (U&'\005F') and the halfwidth percent sign character (U&'\0025'), is two bytes in width. In a Unicode database, special characters in a pattern are interpreted as follows:- For character strings, a halfwidth underscore character (X'5F') or a fullwidth underscore character (X'EFBCBF') refers to one Unicode character, and a halfwidth percent sign character (X'25') or a fullwidth percent sign character (X'EFBC85') refers to zero or more Unicode characters.
- For graphic strings, a halfwidth underscore character (U&'\005F') or a fullwidth underscore character (U&'\FF3F') refers to one Unicode character, and a halfwidth percent sign character (U&'\0025') or a fullwidth percent sign character (U&'\FF05') refers to zero or more Unicode characters.
- To be recognized as special characters when a locale-sensitive UCA-based collation is in effect, the underscore character and the percent sign character must not be followed by non-spacing combining marks (diacritics). For example, the pattern U&'%\0300' (percent sign character followed by non-spacing combining grave accent) will be interpreted as a search for and not as a search for zero or more Unicode characters followed by a letter with a grave accent.
A Unicode supplementary character is stored as two graphic code points in a Unicode graphic column. To match a Unicode supplementary character in a Unicode graphic column, use one underscore if the database uses locale-sensitive UCA-based collation, and two underscores otherwise. To match a Unicode supplementary character in a Unicode character column, use one underscore for all collations. To match a base character with one or more trailing non-spacing combining characters, use one underscore if the database uses locale-sensitive UCA-based collation. Otherwise, use as many underscore characters as the number of non-spacing combining characters plus the base character.
- escape-expression
- This optional argument is an expression that specifies a character to be used to modify the
special meaning of the underscore (_) and percent (%) characters in the
pattern-expression. This allows the LIKE predicate to be used to match
values that contain the actual percent and underscore characters. The expression can be specified by any one of:
- A constant
- A special register
- A global variable
- A host variable
- A scalar function with any of the previously mentioned operands
- An expression concatenating any of the previously listed items
with the restrictions that:- No element in the expression can be of type CLOB or DBCLOB. In addition, it cannot be a BLOB file reference variable.
- For character columns, the result of the expression must be one character, or a binary string containing exactly one byte (SQLSTATE 22019).
- For graphic columns, the result of the expression must be one character (SQLSTATE 22019).
- The result of the expression must not be a non-spacing combining character sequence (for example, U&'\0301', Combining Acute Accent).
When escape characters are present in the pattern string, an underscore, percent sign, or escape character can represent a literal occurrence of itself. This is true if the character in question is preceded by an odd number of successive escape characters. It is not true otherwise.
In a pattern, a sequence of successive escape characters is treated as follows:- Let S be such a sequence, and suppose that S is not part of a larger sequence of successive
escape characters. Suppose also that S contains a total of n characters. Then the rules governing S
depend on the value of n:
- If n is odd, S must be followed by an underscore or percent sign (SQLSTATE 22025). S and the character that follows it represent (n-1)/2 literal occurrences of the escape character followed by a literal occurrence of the underscore or percent sign.
- If n is even, S represents n/2 literal occurrences of the escape character. Unlike the case where n is odd, S could end the pattern. If it does not end the pattern, it can be followed by any character (except, of course, an escape character, which would violate the assumption that S is not part of a larger sequence of successive escape characters). If S is followed by an underscore or percent sign, that character has its special meaning.
Following is an illustration of the effect of successive occurrences of the escape character which, in this case, is the back slash (\).- Pattern string
- Actual Pattern
- \%
- A percent sign
- \\%
- A back slash followed by zero or more arbitrary characters
- \\\%
- A back slash followed by a percent sign
- The match-expression value is never converted.
- If the code page of pattern-expression is different from the code page of match-expression, the value of pattern-expression is converted to the code page of match-expression, unless either operand is defined as FOR BIT DATA or is a binary type (in which case there is no conversion).
- If the code page of escape-expression is different from the code page of match-expression, the value of escape-expression is converted to the code page of match-expression, unless either operand is defined as FOR BIT DATA or is a binary type (in which case there is no conversion).
Notes
- The number of trailing blanks is significant in both the match-expression and
the pattern-expression. If the strings are
not the same length, the shorter string is not padded with blank spaces.
For example, the expression
'PADDED ' LIKE 'PADDED'
would not result in a match. - If the pattern specified in a LIKE predicate is a parameter marker, and a fixed-length character
host variable is used to replace the parameter marker, the value specified for the host variable
must have the correct length. If the correct length is not specified, the select operation will not
return the intended results. For example, if the host variable is defined as CHAR(10), and the value WYSE% is assigned to that host variable, the host variable is padded with blanks on assignment. The pattern used is:
The database manager searches for all values that start with WYSE and that end with five blank spaces. If you want to search only for values that start with'WYSE% '
WYSE
, assign a value ofWSYE%%%%%%
to the host variable. - The pattern is matched using the collation of the database, unless either operand is defined as FOR BIT DATA, in which case the pattern is matched using a binary comparison.
Examples
- Search for the string
SYSTEMS
appearing anywhere within the PROJNAME column in the PROJECT table.SELECT PROJNAME FROM PROJECT WHERE PROJECT.PROJNAME LIKE '%SYSTEMS%'
- Search for a string with a first character of
J
that is exactly two characters long in the FIRSTNME column of the EMPLOYEE table.SELECT FIRSTNME FROM EMPLOYEE WHERE EMPLOYEE.FIRSTNME LIKE 'J_'
- Search for a string of any length, with a first character of
J
, in the FIRSTNME column of the EMPLOYEE table.SELECT FIRSTNME FROM EMPLOYEE WHERE EMPLOYEE.FIRSTNME LIKE 'J%'
- In the CORP_SERVERS table, search for a string in the LA_SERVERS
column that matches the value in the CURRENT SERVER special register.
SELECT LA_SERVERS FROM CORP_SERVERS WHERE CORP_SERVERS.LA_SERVERS LIKE CURRENT SERVER
- Retrieve all strings
that begin with the character sequence
_\
in column A of table T.SELECT A FROM T WHERE T.A LIKE '\_\\%' ESCAPE '\'
- Use a binary
string constant to specify a one-byte escape character that is compatible with the match and pattern
data types (both BLOBs).
SELECT COLBLOB FROM TABLET WHERE COLBLOB LIKE :pattern_var ESCAPE BX'0E'
- In a Unicode database defined with the case insensitive collation
CLDR181_LEN_S1, find all names that start with
Bill
.
This query returns the namesSELECT NAME FROM CUSTDATA WHERE NAME LIKE 'Bill%'
Bill Smith
,billy simon
, andBILL JONES
.