When CODEUNITS32, CODEUNITS16, or OCTETS is specified for
a function or the CAST specification, the length attribute of the
final result string is calculated by applying specific formulas depending
on which function is specified.
To determine the final result of a function or the CAST specification, DB2® might need to use an intermediate
result string if CODEUNITS32 or CODEUNITS16 is specified, depending
on the encoding scheme of the data:
- ASCII and EBCDIC data require the use of a UTF-16 intermediate
result string when either CODEUNITS32 or CODEUNITS16 is specified.
- UTF-8 data requires the use of a UTF-16 intermediate result string
only when CODEUNITS16 is specified.
Regardless of whether an intermediate string is used, when CODEUNITS32,
CODEUNITS16, or OCTETS is specified for a function or the CAST specification,
the length attribute of the final result string is calculated by applying
the formulas that are described in the following table. The length
attributes that are calculated at each step in the formulas are measured
in bytes, unless indicated otherwise.
Table 1. Formulas
for the length attribute of the final result stringFunction |
Determination of the length attribute of the string1 |
CAST specification
CHAR
CLOB
DBCLOB
GRAPHIC
VARCHAR
VARGRAPHIC
|
Follow these three steps to determine the length
attribute of the final result: - 1. Length of the intermediate string (IML)
-
When CODEUNITS32 or CODEUNITS16 is specified:
- If the source string is not in Unicode CCSID 1200, 1208, or 367,
convert the source string to CCSID 1200, using the formulas in Conversion rules for comparisons to
determine the result length of the intermediate string (IML).
- If source string is in Unicode CCSID 1208 or 367, and CODEUNITS16
is specified, convert the source string to CCSID 1200, using the formulas
in Conversion rules for comparisons to
determine the result length of the intermediate string (IML).
- Otherwise, the intermediate string is the same as the source string.
When OCTETS is specified:
- If the CCSID of the source string is different from the CCSID
of the result of the function, convert the source string to the CCSID
of the result of the function, using the formulas in Conversion rules for comparisons to
determine the result length of the intermediate string (IML).
- Otherwise, the intermediate string is the same as the source string.
Exception: For the GRAPHIC and VARGRAPHIC function,
if the source string is EBCDIC, the source is widened with prefix X'42' before
the source string is converted to CCSID 1200 and the length of the
intermediate string is determined.
- 2. Result length attribute of the intermediate string (rl)
- The result length (rl) of the intermediate string depends on whether
a length argument was explicitly specified.
If length was
not specified, the result length (rl) attribute is: rl = IML
If length was
specified, the result length (rl) attribute is: IF (ol * n) < r_IML THEN
rl = ol * n
ELSE
IF intermediate string is in CCSID 1200
(UTF-16) THEN
rl = MIN( ol * n , IML + ( r * 2 ))
ELSE
rl = MIN( ol * n , IML + r )
Where:
- 3. Length of the final result string (the result
of the function)
- The result length attribute of the final string is determined
by converting the result length (rl) of the intermediate string to
the CCSID of the result of the function, using the formulas in Conversion rules for comparisons,
if CCSID conversion is necessary. Otherwise, the result length attribute
of the final string is rl.
|
CHARACTER_
LENGTH
LOCATE
LOCATE_
IN_STRING
POSITION
|
Follow these three steps to determine the length
attribute of the final result: - 1. Length of the intermediate string (IML)
- The length of the intermediate string (IML) is determined the
same way as for the CAST specification. (See Length of the intermediate
string (IML).)
For the LOCATE, LOCATE_IN_STRING,
and POSITION functions, this applies to both the source-string and search-string.
If the CCSIDs of intermediate strings for the converted source-string and search-string differ,
the intermediate string for the search-string is
converted to the CCSID of intermediate string for the source-string.
- 2. Result length attribute of the intermediate string (rl)
- The result length (rl) attribute is always 4 (the length of an
integer):
rl = 4
- 3. Length of the final result string (the result the function)
- The length of the final result of the function is always an integer.
|
INSERT
OVERLAY
|
Follow these three steps to determine the length
attribute of the final result: - 1. Length of the intermediate string (IML)
- The length of the intermediate string (IML) for both the source-string and
the insert-string is determined the same
way as for the CAST specification. (See Length of the intermediate
string (IML).)
If the CCSIDs of the intermediate strings
for the converted source-string and insert-string differ,
the intermediate string for the insert-string is
converted to the CCSID of the intermediate string for the source-string.
- 2. Result length attribute of the intermediate string (rl)
- The result length (rl) attribute of the intermediate string depends
on whether the start and length arguments are constants.
If the start and length arguments
are both constants, the result length attribute is: rl = L1 - MIN ( MAX ( 0, L1 - (V2 - 1)
* n ), V3 * m) + L4
If at least one argument
(the start or length argument) is not a constant, the
result length attribute is: rl = L1 + L4
Where:
- L1 and L4 are the length attributes of the intermediate strings
of the source-string and insert-string,
respectively.
- V2 and V3 are the start and length values, respectively,
expressed in the specified string units.
- m=
- 1 if the intermediate string of the source-string is
not CCSID 1200 (UTF-16)
- 2 if the intermediate string of the source-string is
CCSID 1200 (UTF-16)
- n=
- 4 bytes for CODEUNITS32
-
- 2 bytes for CODEUNITS16
- 3. Length of the final result string (the result the function)
- The length of the final result is the same as the length of the
final result for the CAST specification. (See Length attribute of
the final result string (the result of the function).)
|
LEFT
RIGHT
|
Follow these three steps to determine the length
attribute of the final result: - 1. Length of the intermediate string (IML)
- The length of the intermediate string (IML) is determined the
same way as for the CAST specification. (See Length of the intermediate
string (IML).)
- 2. Result length attribute of the intermediate string (rl)
- The result length (rl) attribute is the same as the length of
the intermediate string:
rl = IML
- 3. Length of the final result string (the result of the function)
- The result length attribute of the final string is determined
by converting the result length (rl) of the intermediate string to
the CCSID of the result of the function, using the formulas in Conversion rules for comparisons,
if CCSID conversion is necessary. Otherwise, the result length attribute
of the final string is rl.
The result length attribute of the
final string is: MIN(length of source string, length of CCSID
converted string )
|
SUBSTRING |
Follow these three steps to determine the length
attribute of the final result: - 1. Length of the intermediate string (IML)
- The length of the intermediate string (IML) is determined the
same way as for the CAST specification. (See Length of the intermediate
string (IML).)
- 2. Result length attribute of the intermediate string (rl)
- The result length (rl) of the intermediate string depends on whether
a length argument was explicitly specified.
If length was
not specified, the result length (rl) attribute is: rl = IML
If length was
specified, the result length (rl) attribute is: rl = MIN(ol * n, IML)
Where:
- ol = original length argument, expressed in the specified
string units
- n =
- 4 bytes for CODEUNITS32
- 2 bytes for CODEUNITS16
- IML = length of the intermediate string
- 3. Length of the final result string (the result of the function)
- The length of the final result string is the same as for LEFT
built-in function.
|
Note: - The final value of the calculation for each length attribute (IML,
rl, and the final result of the function) is limited by the maximum
length of the function or by the maximum length of the corresponding
data type of the result, whichever is applicable. Each length attribute
is expressed in terms of bytes.
|
Example 1: Assume that T1 is a table encoded in EBCDIC and
C1 is a CHAR(26) column (SBCS data with EBCDIC CCSID 37). The CHAR
function is invoked in the following statement:
SELECT CHAR(C1,10,CODEUNITS32) as COL1 FROM T1;
DB2 uses an intermediate string to evaluate the function and determines the intermediate and final result string lengths using these steps:
- C1, which is SBCS EBCDIC 37 data, is converted to Unicode 1200
(UTF-16). The result length of the conversion (using the formula from Conversion rules for comparisons,
X * 2) is 26 * 2. Thus, the length of the intermediate string is 52
bytes (IML = 52).
- The CHAR function is evaluated against the first 10 UTF-32 characters
in this string. The result length attribute is 40 bytes (rl
= ol * n or 10 * 4) because ol * n < r_IML or
40 < 52.
- The 40 bytes of the string are converted back to SBCS EBCDIC 37.
The result length of the conversion (using the formula from Conversion rules for comparisons,
X * .5) is 40 * .5. Thus, the length of the final result of the functions
is 20 bytes.
Example 2: This example is similar to the first example,
except that the specified length for the function is 20 instead of
10. Assume that T1 is a table encoded in EBCDIC and C1 is a CHAR(26)
column (SBCS data with EBCDIC CCSID 37). The CHAR function is invoked
in the following statement:
SELECT CHAR(C1,20,CODEUNITS32) as COL1 FROM T1;
DB2 uses an intermediate string
to evaluate the function and determines the intermediate and final
result string lengths using these steps:
- C1, which is SBCS EBCDIC 37 data, is converted to Unicode 1200
(UTF-16). The result length of the conversion (using the formula from Conversion rules for comparisons,
X * 2) is 26 * 2. Thus, the length of the intermediate result string
is 52 bytes (IML = 52).
- The CHAR function is evaluated against the first 20 UTF-32 characters
in this intermediate string. However, because the estimated number
of characters in the intermediate string, as expressed in the specified
string units, is only 13 characters (r_IML/n or 52/4),
the intermediate string must be padded with 7 padding characters to
satisfy the 20 characters that are requested (r = ol - (r_IML/n)
or 20 - 13). In Unicode 1200 (UTF-16), each padding character
takes 2 bytes.
The result length attribute is then calculated to
be 66 bytes (rl = MIN(ol * n, IML + (r * 2)) or MIN(20 * 4,
52 + 14)) because ol * n < r_IML or 80 < 52 is
not true.
- The 66 bytes of the string are converted back to SBCS EBCDIC 37.
The result length of the conversion (using the formula from Conversion rules for comparisons,
X * .5) is 66 * .5. Thus, the length of the final result of the function
is 33 bytes.