GitHubContribute in GitHub: Edit online

String operators

KQL offers a variety of query operators for searching string data types. The following article describes how string terms are indexed, lists the string query operators, and gives tips for optimizing performance.

Understanding string terms

KQL indexes all columns, including columns of type string. Multiple indexes are built for such columns, depending on the actual data. These indexes aren't directly exposed, but are used in queries with the string operators that have has as part of their name, such as has, !has, hasprefix, !hasprefix. The semantics of these operators are dictated by the way the column is encoded. Instead of doing a "plain" substring match, these operators match terms.

What is a term?

By default, each string value is broken into maximal sequences of ASCII alphanumeric characters, and each of those sequences is made into a term. For example, in the following string, the terms are KQL, KQLExplorerQueryRun, and the following substrings: ad67d136, c1db, 4f9f, 88ef, d94f3b6b0b5a.

KQL: ad67d136-c1db-4f9f-88ef-d94f3b6b0b5a;KustoExplorerQueryRun

KQL builds a term index consisting of all terms that are three characters or more, and this index is used by string operators such as has, !has, and so on. If the query looks for a term that is smaller than three characters, or uses a contains operator, then the query will revert to scanning the values in the column. Scanning is much slower than looking up the term in the term index.

Operators on strings

The following abbreviations are used in the table below:

  • RHS = right hand side of the expression
  • LHS = left hand side of the expression

Operators with an _cs suffix are case sensitive.

Case-insensitive operators are currently supported only for ASCII-text. For non-ASCII comparison, use the tolower() function.

Operator Description Case-Sensitive Example (yields true)
== Equals Yes "aBc" == "aBc"
!= Not equals Yes "abc" != "ABC"
=~ Equals No "abc" =~ "ABC"
!~ Not equals No "aBc" !~ "xyz"
contains RHS occurs as a subsequence of LHS No "FabriKam" contains "BRik"
!contains RHS doesn't occur in LHS No "Fabrikam" !contains "xyz"
contains_cs RHS occurs as a subsequence of LHS Yes "FabriKam" contains_cs "Kam"
!contains_cs RHS doesn't occur in LHS Yes "Fabrikam" !contains_cs "Kam"
endswith RHS is a closing subsequence of LHS No "Fabrikam" endswith "Kam"
!endswith RHS isn't a closing subsequence of LHS No "Fabrikam" !endswith "brik"
endswith_cs RHS is a closing subsequence of LHS Yes "Fabrikam" endswith_cs "kam"
!endswith_cs RHS isn't a closing subsequence of LHS Yes "Fabrikam" !endswith_cs "brik"
has Right-hand-side (RHS) is a whole term in left-hand-side (LHS) No "North America" has "america"
!has RHS isn't a full term in LHS No "North America" !has "amer"
has_all Same as has but works on all of the elements No "North and South America" has_all("south", "north")
has_any Same as has but works on any of the elements No "North America" has_any("south", "north")
has_cs RHS is a whole term in LHS Yes "North America" has_cs "America"
!has_cs RHS isn't a full term in LHS Yes "North America" !has_cs "amer"
hasprefix RHS is a term prefix in LHS No "North America" hasprefix "ame"
!hasprefix RHS isn't a term prefix in LHS No "North America" !hasprefix "mer"
hasprefix_cs RHS is a term prefix in LHS Yes "North America" hasprefix_cs "Ame"
!hasprefix_cs RHS isn't a term prefix in LHS Yes "North America" !hasprefix_cs "CA"
hassuffix RHS is a term suffix in LHS No "North America" hassuffix "ica"
!hassuffix RHS isn't a term suffix in LHS No "North America" !hassuffix "americ"
hassuffix_cs RHS is a term suffix in LHS Yes "North America" hassuffix_cs "ica"
!hassuffix_cs RHS isn't a term suffix in LHS Yes "North America" !hassuffix_cs "icA"
in Equals to one of the elements Yes "abc" in ("123", "345", "abc")
!in Not equals to any of the elements Yes "bca" !in ("123", "345", "abc")
in~ Equals to any of the elements No "Abc" in~ ("123", "345", "abc")
!in~ Not equals to any of the elements No "bCa" !in~ ("123", "345", "ABC")
matches regex LHS contains a match for RHS Yes "Fabrikam" matches regex "b.*k"
startswith RHS is an initial subsequence of LHS No "Fabrikam" startswith "fab"
!startswith RHS isn't an initial subsequence of LHS No "Fabrikam" !startswith "kam"
startswith_cs RHS is an initial subsequence of LHS Yes "Fabrikam" startswith_cs "Fab"
!startswith_cs RHS isn't an initial subsequence of LHS Yes "Fabrikam" !startswith_cs "fab"

All operators containing has search on indexed terms of three or more characters, and not on substring matches. A term is created by breaking up the string into sequences of ASCII alphanumeric characters. See understanding string terms.

Performance tips

For better performance, when there are two operators that do the same task, use the case-sensitive one. For example:

  • Use ==, not =~
  • Use in, not in~
  • Use hassuffix_cs, not hassuffix

For faster results, if you're testing for the presence of a symbol or alphanumeric word that is bound by non-alphanumeric characters, or the start or end of a field, use has or in. has works faster than contains, startswith, or endswith.