String Functions
In CLEM, you can perform the following operations with strings:
- Compare strings
- Create strings
- Access characters
In CLEM, a string is any sequence of characters between matching double quotation marks ("string quotes"). Characters (CHAR) can be any single alphanumeric character. They are declared in CLEM expressions using single backquotes in the form of `<character>`, such as `z`, `A`, or `2`. Characters that are out-of-bounds or negative indices to a string will result in undefined behavior.
Note. Comparisons between strings that do and do not use SQL pushback may generate different results where trailing spaces exist.
Function | Result | Description |
---|---|---|
allbutfirst(N, STRING) | String | Returns a string, which is STRING with the first N characters removed. |
allbutlast(N, STRING) | String | Returns a string, which is STRING with the last characters removed. |
alphabefore(STRING1, STRING2) | Boolean | Used to check the alphabetical ordering of strings. Returns true if STRING1 precedes STRING2. |
endstring(LENGTH, STRING) | String | Extracts the last N characters from the specified string. If the string length is less than or equal to the specified length, then it is unchanged. |
hasendstring(STRING, SUBSTRING) | Integer | This function is the same as isendstring(SUBSTRING, STRING). |
hasmidstring(STRING, SUBSTRING) | Integer | This function is the same as ismidstring(SUBSTRING, STRING) (embedded substring). |
hasstartstring(STRING, SUBSTRING) | Integer | This function is the same as isstartstring(SUBSTRING, STRING). |
hassubstring(STRING, N, SUBSTRING) | Integer | This function is the same as issubstring(SUBSTRING, N, STRING), where N defaults to 1. |
count_substring(STRING, SUBSTRING) | Integer | Returns the number of times the specified substring occurs within the string. For example, count_substring("foooo.txt", "oo") returns 3. |
hassubstring(STRING, SUBSTRING) | Integer | This function is the same as issubstring(SUBSTRING, 1, STRING), where N defaults to 1. |
isalphacode(CHAR) | Boolean | Returns a value of true if CHAR is a character in the specified string (often a field name) whose character code is a letter. Otherwise, this function returns a value of 0. For example, isalphacode(produce_num(1)). |
isendstring(SUBSTRING, STRING) | Integer | If the string STRING ends with the substring SUBSTRING, then this function returns the integer subscript of SUBSTRING in STRING. Otherwise, this function returns a value of 0. |
islowercode(CHAR) | Boolean | Returns a value of true if CHAR is a lowercase letter character for the specified string (often a field name). Otherwise, this function returns a value of 0. For example, both islowercode(``) and islowercode(country_name(2)) are valid expressions. |
ismidstring(SUBSTRING, STRING) | Integer | If SUBSTRING is a substring of STRING but does not start on the first character of STRING or end on the last, then this function returns the subscript at which the substring starts. Otherwise, this function returns a value of 0. |
isnumbercode(CHAR) | Boolean | Returns a value of true if CHAR for the specified string (often a field name) is a character whose character code is a digit. Otherwise, this function returns a value of 0. For example, isnumbercode(product_id(2)). |
isstartstring(SUBSTRING, STRING) | Integer | If the string STRING starts with the substring SUBSTRING, then this function returns the subscript 1. Otherwise, this function returns a value of 0. |
issubstring(SUBSTRING, N, STRING) | Integer | Searches the string STRING, starting from its Nth character, for a substring equal to the string SUBSTRING. If found, this function returns the integer subscript at which the matching substring begins. Otherwise, this function returns a value of 0. If N is not given, this function defaults to 1. |
issubstring(SUBSTRING, STRING) | Integer | Searches the string STRING, starting from its Nth character, for a substring equal to the string SUBSTRING. If found, this function returns the integer subscript at which the matching substring begins. Otherwise, this function returns a value of 0. If N is not given, this function defaults to 1. |
issubstring_count(SUBSTRING, N, STRING): | Integer | Returns the index of the Nth occurrence of SUBSTRING within the specified STRING. If there are fewer than N occurrences of SUBSTRING, 0 is returned. |
issubstring_lim(SUBSTRING, N, STARTLIM, ENDLIM, STRING) | Integer | This function is the same as issubstring, but the match is constrained to start on or before the subscript STARTLIM and to end on or before the subscript ENDLIM. The STARTLIM or ENDLIM constraints may be disabled by supplying a value of false for either argument—for example, issubstring_lim(SUBSTRING, N, false, false, STRING) is the same as issubstring. |
isuppercode(CHAR) | Boolean | Returns a value of true if CHAR is an uppercase letter character. Otherwise, this function returns a value of 0. For example, both isuppercode(``) and isuppercode(country_name(2)) are valid expressions. |
last(CHAR) | String | Returns the last character CHAR of STRING (which must be at least one character long). |
length(STRING) | Integer | Returns the length of the string STRING--that is, the number of characters in it. |
locchar(CHAR, N, STRING) | Integer | Used to identify the location of characters in symbolic fields. The function searches the string STRING for the character CHAR, starting the search at the Nth character of STRING. This function returns a value indicating the location (starting at N) where the character is found. If the character is not found, this function returns a value of 0. If the function has an invalid offset (N) (for example, an offset that is beyond the length of the string), this function returns $null$. For example, locchar(`n`, 2, web_page) searches the field called web_page for the `n` character beginning at the second character in the field value. Note: Be sure to use single backquotes to encapsulate the specified character. |
locchar_back(CHAR, N, STRING) | Integer | Similar to locchar, except that the search is performed backward starting from the Nth character. For example, locchar_back(`n`, 9, web_page) searches the field web_page starting from the ninth character and moving backward toward the start of the string. If the function has an invalid offset (for example, an offset that is beyond the length of the string), this function returns $null$. Ideally, you should use locchar_back in conjunction with the function length(<field>) to dynamically use the length of the current value of the field. For example, locchar_back(`n`, (length(web_page)), web_page). |
lowertoupper(CHAR) lowertoupper (STRING) | CHAR or String | Input can be either a string or character, which is used in this function to return a new item of the same type, with any lowercase characters converted to their uppercase equivalents. For example, lowertoupper(`a`), lowertoupper(“My string”), and lowertoupper(field_name(2)) are all valid expressions. |
matches | Boolean | Returns true if a string matches a specified pattern. The pattern must be a string literal; it cannot be a field name containing a pattern. A question mark (?) can be included in the pattern to match exactly one character; an asterisk (*) matches zero or more characters. To match a literal question mark or asterisk (rather than using these as wildcards), a backslash (\) can be used as an escape character. |
replace(SUBSTRING, NEWSUBSTRING, STRING) | String | Within the specified STRING, replace all instances of SUBSTRING with NEWSUBSTRING. |
replicate(COUNT, STRING) | String | Returns a string that consists of the original string copied the specified number of times. |
stripchar(CHAR,STRING) | String | Enables you to remove specified characters from a string or field. You can use this function, for example, to remove extra symbols, such as currency notations, from data to achieve a simple number or name. For example, using the syntax stripchar(`$`, 'Cost') returns a new field with the dollar sign removed from all values. Note: Be sure to use single backquotes to encapsulate the specified character. |
skipchar(CHAR, N, STRING) | Integer | Searches the string STRING for any character other than CHAR, starting at the Nth character. This function returns an integer substring indicating the point at which one is found or 0 if every character from the Nth onward is a CHAR. If the function has an invalid offset (for example, an offset that is beyond the length of the string), this function returns $null$. locchar is often used in conjunction with the skipchar functions to determine the value of N (the point at which to start searching the string). For example, skipchar(`s`, (locchar(`s`, 1, "MyString")), "MyString"). |
skipchar_back(CHAR, N, STRING) | Integer | Similar to skipchar, except that the search is performed backward, starting from the Nth character. |
startstring(LENGTH, STRING) | String | Extracts the first N characters from the specified string. If the string length is less than or equal to the specified length, then it is unchanged. |
strmember(CHAR, STRING) | Integer | Equivalent to locchar(CHAR, 1, STRING). It returns an integer substring indicating the point at which CHAR first occurs, or 0. If the function has an invalid offset (for example, an offset that is beyond the length of the string), this function returns $null$. |
subscrs(N, STRING) | CHAR | Returns the Nth character CHAR of the input string STRING. This function can also be written in a shorthand form as STRING(N). For example, lowertoupper(“name”(1)) is a valid expression. |
substring(N, LEN, STRING) | String | Returns a string SUBSTRING, which consists of the LEN characters of the string STRING, starting from the character at subscript N. |
substring_between(N1, N2, STRING) | String | Returns the substring of STRING, which begins at subscript N1 and ends at subscript N2. |
trim(STRING) | String | Removes leading and trailing white space characters from the specified string. |
trim_start(STRING) | String | Removes leading white space characters from the specified string. |
trimend(STRING) | String | Removes trailing white space characters from the specified string. |
unicode_char(NUM) | CHAR | Input must be decimal, not hexadecimal values. Returns the character with Unicode value NUM. |
unicode_value(CHAR) | NUM | Returns the Unicode value of CHAR |
uppertolower(CHAR) uppertolower (STRING) | CHAR or String | Input can be either a string or character and is used in this function to return a new item of the same type with any uppercase characters converted to their lowercase equivalents. Note: Remember to specify strings with double quotes and characters with single backquotes. Simple field names should be specified without quotes. |