DB2 Version 10.1 for Linux, UNIX, and Windows

Assignments and comparisons

The basic operations of SQL are assignment and comparison.

Assignment operations are performed during the execution of INSERT, UPDATE, FETCH, SELECT INTO, VALUES INTO and SET transition-variable statements. Arguments of functions are also assigned when invoking a function. Comparison operations are performed during the execution of statements that include predicates and other language elements such as MAX, MIN, DISTINCT, GROUP BY, and ORDER BY.

One basic rule for both operations is that the data type of the operands involved must be compatible. The compatibility rule also applies to set operations.

Another basic rule for assignment operations is that a null value cannot be assigned to a column that cannot contain null values, nor to a host variable that does not have an associated indicator variable.

Following is a compatibility matrix showing the built-in data type compatibilities for assignment and comparison operations.

Table 1. Data type compatibility for assignments and comparisons
Operands Binary Integer Decimal Number Floating- point Decimal Floating-point Character String Graphic String Binary String Date Time Time- stamp Boolean UDT
Binary Integer Yes Yes Yes Yes Yes Yes 5 No No No No No 2
Decimal Number Yes Yes Yes Yes Yes Yes 5 No No No No No 2
Floating-point Yes Yes Yes Yes Yes Yes 5 No No No No No 2
Decimal Floating-point Yes Yes Yes Yes Yes Yes 5 No No No No No 2
Character String Yes Yes Yes Yes Yes Yes 5,6 Yes 3 Yes Yes Yes No 2
Graphic String Yes 5 Yes 5 Yes 5 Yes 5 Yes 5,6 Yes No Yes 5 Yes 5 Yes 5 No 2
Binary String No No No No Yes3 No Yes No No No No 2
Date No No No No Yes Yes 5 No Yes No Yes No 2
Time No No No No Yes Yes 5 No No Yes 1 No 2
Timestamp No No No No Yes Yes 5 No Yes 1 Yes No 2
Boolean No No No No No No No No No No Yes 7 2
UDT 2 2 2 2 2 2 2 2 2 2 2 Yes

1 A TIMESTAMP value can be assigned to a TIME value; however, a TIME value cannot be assigned to a TIMESTAMP value and a TIMESTAMP value cannot be compared with a TIME value.

2 For detailed user-defined type information see User-defined type assignments and User-defined type comparisons.

3 Support for assignment only (not comparison) and only for character strings defined as FOR BIT DATA.

4 For information about assignment and comparison of reference types, see Reference type assignments and Reference type comparisons.

5 Only supported for Unicode databases.

6 Bit data and graphic strings are not compatible.

7 Variables of Boolean data type cannot be directly compared; comparison only can be done with the literal values TRUE, FALSE, NULL.

Numeric assignments

For numeric assignments, overflow is not allowed.
  • When assigning to an exact numeric data type, overflow occurs if any digit of the whole part of the number would be eliminated. If necessary, the fractional part of a number is truncated.
  • When assigning to an approximate numeric data type or decimal floating-point, overflow occurs if the most significant digit of the whole part of the number is eliminated. For floating-point and decimal floating-point numbers, the whole part of the number is the number that would result if the floating-point or decimal floating-point number were converted to a decimal number with unlimited precision. If necessary, rounding might cause the least significant digits of the number to be eliminated.

    For decimal floating-point numbers, truncation of the whole part of the number is allowed and results in infinity with a warning.

    For floating-point numbers, underflow is also not allowed. Underflow occurs for numbers between 1 and -1 if the most significant digit other than zero would be eliminated. For decimal floating-point, underflow is allowed and depending on the rounding mode, results in zero or the smallest positive number or the largest negative number that can be represented along with a warning.

    An overflow or underflow warning is returned instead of an error if an overflow or underflow occurs on assignment to a host variable with an indicator variable. In this case, the number is not assigned to the host variable and the indicator variable is set to negative 2.

For decimal floating-point numbers, the CURRENT DECFLOAT ROUNDING MODE special register indicates the rounding mode in effect.

Assignments to integer

When a decimal, floating-point, or decimal floating-point number is assigned to an integer column or variable, the fractional part of the number is eliminated. As a result, a number between 1 and -1 is reduced to 0.

Assignments to decimal

When an integer is assigned to a decimal column or variable, the number is first converted to a temporary decimal number and then, if necessary, to the precision and scale of the target. The precision and scale of the temporary decimal number is 5,0 for a small integer, 11,0 for a large integer, or 19,0 for a big integer.

When a decimal number is assigned to a decimal column or variable, the number is converted, if necessary, to the precision and the scale of the target. The necessary number of leading zeros is added, and in the fractional part of the decimal number the necessary number of trailing zeros is added, or the necessary number of trailing digits is eliminated.

When a floating-point number is assigned to a decimal column or variable, the number is first converted to a temporary decimal number of precision 31, and then, if necessary, truncated to the precision and scale of the target. In this conversion, the number is rounded (using floating-point arithmetic) to a precision of 31 decimal digits. As a result, a number between 1 and -1 that is less than the smallest positive number or greater than the largest negative number that can be represented in the decimal column or variable is reduced to 0. The scale is given the largest possible value that allows the whole part of the number to be represented without loss of significance.

When a decimal floating-point number is assigned to a decimal column or variable, the number is rounded to the precision and scale of the decimal column or variable. As a result, a number between 1 and -1 that is less than the smallest positive number or greater than the largest negative number that can be represented in the decimal column or variable is reduced to 0 or rounded to the smallest positive or largest negative value that can be represented in the decimal column or variable, depending on the rounding mode.

Assignments to floating-point

Floating-point numbers are approximations of real numbers. Hence, when an integer, decimal, floating-point, or decimal floating-point number is assigned to a floating-point column or variable, the result might not be identical to the original number. The number is rounded to the precision of the floating-point column or variable using floating-point arithmetic. A decimal floating-point value is first converted to a string representation, and is then converted to a floating-point number.

Assignments to decimal floating-point

When an integer number is assigned to a decimal floating-point column or variable, the number is first converted to a temporary decimal number and then to a decimal floating-point number. The precision and scale of the temporary decimal number is 5,0 for a small integer, 11,0 for a large integer, or 19,0 for a big integer. Rounding can occur when assigning a BIGINT to a DECFLOAT(16) column or variable.

When a decimal number is assigned to a decimal floating-point column or variable, the number is converted to the precision (16 or 34) of the target. Leading zeros are eliminated. Depending on the precision and scale of the decimal number and the precision of the target, the value might be rounded.

When a floating-point number is assigned to a decimal floating-point column or variable, the number is first converted to a temporary string representation of the floating-point number. The string representation of the number is then converted to decimal floating-point.

When a DECFLOAT(16) number is assigned to a DECFLOAT(34) column or variable, the resulting value is identical to the DECFLOAT(16) number.

When a DECFLOAT(34) number is assigned to a DECFLOAT(16) column or variable, the exponent of the source is converted to the corresponding exponent in the result format. The mantissa of the DECFLOAT(34) number is rounded to the precision of the target.

Assignments from strings to numeric

When a string is assigned to a numeric data type, it is converted to the target numeric data type using the rules for a CAST specification. For more information, see the Related reference section at the end of this topic.

String assignments

There are two types of assignments:
  • In storage assignment, a value is assigned and truncation of significant data is not desirable; for example, when assigning a value to a column
  • In retrieval assignment, a value is assigned and truncation is allowed; for example, when retrieving data from the database

The rules for string assignment differ based on the assignment type.

Storage assignment

The basic rule is that the length of the string assigned to the target must not be greater than the length attribute of the target. If the length of the string is greater than the length attribute of the target, the following actions might occur:
  • The string is assigned with trailing blanks truncated (from all string types except LOB strings) to fit the length attribute of the target
  • An error is returned (SQLSTATE 22001) when:
    • Non-blank characters would be truncated from other than a LOB string
    • Any character (or byte) would be truncated from a LOB string

If a string is assigned to a fixed-length target, and the length of the string is less than the length attribute of the target, the string is padded to the right with the necessary number of single-byte, double-byte, or UCS-2 blanks. The pad character is always a blank, even for columns defined with the FOR BIT DATA attribute. (UCS-2 defines several SPACE characters with different properties. For a Unicode database, the database manager always uses the ASCII SPACE at position x'0020' as UCS-2 blank. For an EUC database, the IDEOGRAPHIC SPACE at position x'3000' is used for padding GRAPHIC strings.)

Retrieval assignment

The length of a string that is assigned to a target can be longer than the length attribute of the target. When a string is assigned to a target, and the length of the string is longer than the length attribute of the target, the string is truncated on the right by the necessary number of characters (or bytes). When this occurs, a warning is returned (SQLSTATE 01004), and the value 'W' is assigned to the SQLWARN1 field of the SQLCA.

Furthermore, if an indicator variable is provided, and the source of the value is not a LOB, the indicator variable is set to the original length of the string.

If a character string is assigned to a fixed-length target, and the length of the string is less than the length attribute of the target, the string is padded to the right with the necessary number of single-byte, double-byte, or UCS-2 blanks. The pad character is always a blank, even for strings defined with the FOR BIT DATA attribute. (UCS-2 defines several SPACE characters with different properties. For a Unicode database, the database manager always uses the ASCII SPACE at position x'0020' as UCS-2 blank. For an EUC database, the IDEOGRAPHIC SPACE at position x'3000' is used for padding GRAPHIC strings.)

Retrieval assignment of C NUL-terminated host variables is handled based on options that are specified with the PREP or BIND command.

Conversion rules for string assignments

A character string or graphic string assigned to a column, variable, or parameter is first converted, if necessary, to the code page of the target. Character conversion is necessary only if all of the following are true:
  • The code pages are different.
  • The string is neither null nor empty.
  • Neither string has a code page value of 0 (FOR BIT DATA).

For Unicode databases, character strings can be assigned to a graphic column, and graphic strings can be assigned to a character column.

MBCS considerations for character string assignments

There are several considerations when assigning character strings that could contain both single and multi-byte characters. These considerations apply to all character strings, including those defined as FOR BIT DATA.
  • Blank padding is always done using the single-byte blank character (X'20').
  • Blank truncation is always done based on the single-byte blank character (X'20'). The double-byte blank character is treated like any other character with respect to truncation.
  • Assignment of a character string to a host variable might result in fragmentation of MBCS characters if the target host variable is not large enough to contain the entire source string. If an MBCS character is fragmented, each byte of the MBCS character fragment in the target is set to a single-byte blank character (X'20'), no further bytes are moved from the source, and SQLWARN1 is set to 'W' to indicate truncation. Note that the same MBCS character fragment handling applies even when the character string is defined as FOR BIT DATA.

DBCS considerations for graphic string assignments

Graphic string assignments are processed in a manner analogous to that for character strings. For non-Unicode databases, graphic string data types are compatible only with other graphic string data types, and never with numeric, character string, or datetime data types. For Unicode databases, graphic string data types are compatible with character string data types. However, graphic and character string data types cannot be used interchangeably in the SELECT INTO or the VALUES INTO statement.

If a graphic string value is assigned to a graphic string column, the length of the value must not be greater than the length of the column.

If a graphic string value (the 'source' string) is assigned to a fixed-length graphic string data type (the 'target', which can be a column, variable, or parameter), and the length of the source string is less than that of the target, the target will contain a copy of the source string which has been padded on the right with the necessary number of double-byte blank characters to create a value whose length equals that of the target.

If a graphic string value is assigned to a graphic string host variable and the length of the source string is greater than the length of the host variable, the host variable will contain a copy of the source string which has been truncated on the right by the necessary number of double-byte characters to create a value whose length equals that of the host variable. (Note that for this scenario, truncation need not be concerned with bisection of a double-byte character; if bisection were to occur, either the source value or target host variable would be an ill-defined graphic string data type.) The warning flag SQLWARN1 in the SQLCA will be set to 'W'. The indicator variable, if specified, will contain the original length (in double-byte characters) of the source string. In the case of DBCLOB, however, the indicator variable does not contain the original length.

Retrieval assignment of C NUL-terminated host variables (declared using wchar_t) is handled based on options specified with the PREP or BIND command.

Assignments from numeric to strings

When a number is assigned to a string data type, it is converted to the target string data type using the rules for a CAST specification. For more information, see the Related reference section at the end of this topic.

If a nonblank character is truncated during the cast of a numeric value to a character or graphic data type, a warning is returned. This truncation behavior is unlike the assignment to a character or graphic data type that follows storage assignment rules, where if a nonblank character is truncated during assignment, an error is returned.

Datetime assignments

A TIME value can be assigned only to a TIME column or to a string variable or string column.

A DATE can be assigned to a DATE, TIMESTAMP or string data type. When a DATE value is assigned to a TIMESTAMP data type, the missing time information is assumed to be all zeros.

A TIMESTAMP value can be assigned to a DATE, TIME, TIMESTAMP or string data type. When a TIMESTAMP value is assigned to a DATE data type, the date portion is extracted and the time portion is truncated. When a TIMESTAMP value is assigned to a TIME data type, the date portion is ignored and the time portion is extracted, but with the fractional seconds truncated. When a TIMESTAMP value is assigned to a TIMESTAMP with lower precision, the excess fractional seconds are truncated. When a TIMESTAMP value is assigned to a TIMESTAMP with higher precision, missing digits are assumed to be zeros.

The assignment must not be to a CLOB, DBCLOB, or BLOB variable or column.

When a datetime value is assigned to a string variable or string column, conversion to a string representation is automatic. Leading zeros are not omitted from any part of the date, time, or timestamp. The required length of the target will vary, depending on the format of the string representation. If the length of the target is greater than required, and the target is a fixed-length string, it is padded on the right with blanks. If the length of the target is less than required, the result depends on the type of datetime value involved, and on the type of target.

When the target is not a host variable and has a character data type, truncation is not allowed. The length attribute of the column must be at least 10 for a date, 8 for a time, 19 for a TIMESTAMP(0), and 20+p for TIMESTAMP(p).

When the target is a string host variable, the following rules apply:
  • For a DATE: If the length of the host variable is less than 10 characters, an error is returned.
  • For a TIME: If the USA format is used, the length of the host variable must not be less than 8 characters; in other formats the length must not be less than 5 characters.

    If ISO or JIS formats are used, and if the length of the host variable is less than 8 characters, the seconds part of the time is omitted from the result and assigned to the indicator variable, if provided. The SQLWARN1 field of the SQLCA is set to indicate the omission.

  • For a TIMESTAMP: If the length of the host variable is less than 19 characters, an error is returned. If the length is less than 32 characters, but greater than or equal to 19 characters, trailing digits of the fractional seconds part of the value are omitted. The SQLWARN1 field of the SQLCA is set to indicate the omission.

When a DATE is assigned to a TIMESTAMP, the time and fractional components of the timestamp are set to midnight and 0, respectively. When a TIMESTAMP is assigned to a DATE, the date portion is extracted and the time and fractional components are truncated.

When a TIMESTAMP is assigned to a TIME, the DATE portion is ignored and the fractional components are truncated.

XML assignments

The general rule for XML assignments is that only an XML value can be assigned to XML columns or to XML variables. There are exceptions to this rule, as follows.

Character string or binary string values cannot be retrieved into XML host variables. Values in XML host variables cannot be assigned to columns, SQL variables, or SQL parameters of a character string data type or a binary string data type.

Assignment to XML parameters and variables in inlined SQL bodied UDFs and SQL procedures is done by reference. Passing parameters of data type XML to invoke an inlined SQL UDF or SQL procedure is also done by reference. When XML values are passed by reference, any input node trees are used directly. This direct usage preserves all properties, including document order, the original node identities, and all parent properties.

User-defined type assignments

Assignments involving user-defined type values generally allow assignment to the same user-defined type name with some additional rules for the different kinds of user-defined types. Additional information about specific user-defined types is in the sections that follow.

Strongly typed distinct type assignments

The rules that apply to the assignments of strongly typed distinct type values to host variables are different than the rules for all other assignments that involve strongly typed distinct type values
Assignments to host variables
The assignment of a strongly typed distinct type value to a host variable (or parameter marker) is based on the source data type of the distinct type. Therefore, the value of a strongly typed distinct type is assignable to a host variable (or parameter marker) only if the source data type of the distinct type is assignable to the variable
For example, the distinct type AGE is created by the following SQL statement:
CREATE TYPE AGE AS SMALLINT
When the statement is executed, the following cast functions are also generated:
AGE (SMALLINT) RETURNS AGE
AGE (INTEGER) RETURNS AGE
SMALLINT (AGE) RETURNS SMALLINT
Next, assume that column STU_AGE was defined in table STUDENTS with distinct type AGE. Now, consider this valid assignment of a student's age to host variable HV_AGE, which has an INTEGER data type:
SELECT STU_AGE INTO :HV_AGE FROM STUDENTS WHERE STU_NUMBER = 200
The strongly typed distinct type value is assignable to host variable HV_AGE because the source data type of the distinct type (SMALLINT) is assignable to the host variable (INTEGER). If strongly typed distinct type AGE had been sourced on a datetime data type such as DATE, the preceding assignment would be invalid because a datetime data type cannot be assigned to an integer type.
Assignments other than to host variables
A strongly typed distinct type can be either the source or target of an assignment. Assignment is based on whether the data type of the value to be assigned is castable to the data type of the target (see Casting between data types for the casts supported when a distinct type is involved). A strongly typed distinct type value can be assigned to any target other than host a variable in the following cases:
  • The target of the assignment has the same distinct type.
  • The distinct type is castable to the data type of the target.
Any value can be assigned to a strongly typed distinct type when:
  • The value to be assigned has the same distinct type as the target.
  • The data type of the assigned value is castable to the target distinct type.
For example, the source data type for strongly typed distinct type AGE is SMALLINT:
CREATE TYPE AGE AS SMALLINT
Next, assume that the tables TABLE1 and TABLE2 are created with four identical column descriptions:
AGECOL AGE
SMINTCOL SMALLINT
INTCOL INTEGER
DECCOL DECIMAL(6,2)
Using the following SQL statement and substituting various values for X and Y to insert values into various columns of TABLE1 from TABLE2. The database manager uses assignment rules in this INSERT statement to determine if X can be assigned to Y. Table 2 shows whether the assignments are valid.
INSERT INTO TABLE1(Y)
   SELECT X FROM TABLE2;
Table 2. Assessment of various assignments
TABLE2.X TABLE1.Y Valid Reason
AGECOL AGECOL Yes Source and target are the same distinct type
SMINTCOL AGECOL Yes SMALLINT can be cast to AGE
INTCOL AGECOL Yes INTEGER can be cast to AGE because the AGE source type is SMALLINT
DECCOL AGECOL No DECIMAL cannot be cast to AGE
AGECOL SMINTCOL Yes AGECOL can be cast to its source type of SMALLINT
AGECOL INTCOL No AGE cannot be cast to INTEGER
AGECOL DECCOL No AGE cannot be cast to DECIMAL

Weakly typed distinct type assignments

The value of a weakly typed distinct type can be assigned according to the assignment rules for the source type of the weakly typed distinct type. When the assignment target has a distinct type defined with data type constraints, the data type constraints are applied to the source value and must evaluate to true or unknown.

Structured type assignments

The value of a structured type can be assigned when the target of the assignment has the same structured type or one of its supertypes except in cases that involve host variables.

Assignment to and from host variables is based on the declared type of the host variable; that is, it follows the rule:
  • A source value of a structured type is assignable to a target host variable if and only if the declared type of the host variable is the structured type or a supertype of the structured type.

If the target of the assignment is a column of a structured type, the source data type must be the target data type or a subtype of the target data type.

Array type assignments

The value for an element of an array must be assignable to the data type of the array elements. The assignment rules for that data type apply to the value assignment. The value specified for an index in the array must be assignable to the data type of the index for the array. The assignment rules for that data type apply to the value assignment. For an ordinary array, the index data type is INTEGER and for an associative array the data type is either INTEGER or VARCHAR(n), where n is any valid length attribute for the VARCHAR data type. If the index value for an assignment to an ordinary array is larger than the current cardinality of the array, then the cardinality of the array is increased to the new index value, provided the value does not exceed the maximum value for an INTEGER data type. An assignment of one new element to an associative array increases the cardinality by exactly 1 since the index values can be sparse.

The validity of an assignment to an SQL variable or parameter is determined according to the following rules:
  • If the right side of the assignment is an SQL variable or parameter, an invocation of the TRIM_ARRAY function, an invocation of the ARRAY_DELETE function, or a CAST expression, then its type must be the same as the type of the SQL variable or parameter on the left side of the assignment.
  • If the right side of the assignment is an array constructor or an invocation of the ARRAY_AGG function, then it is implicitly cast to the type of the SQL variable or parameter on the left side.
For example, assuming that the type of variable V is MYARRAY, the statement:
   SET V = ARRAY[1,2,3];
is equivalent to:
SET V = CAST(ARRAY[1,2,3] AS MYARRAY);
And the statement:
SELECT ARRAY_AGG(C1) INTO V FROM T
is equivalent to:
SELECT CAST(ARRAY_AGG(C1) AS MYARRAY) INTO V FROM T
The following are valid assignments that involve array type values:
  • Array variable to another array variable with the same array type as the source variable.
  • An expression of type array to an array variable, where the array element type in the source expression is assignable to the array element type in the target array variable.

Row type assignments

Assignments to fields within a row variable must conform to the same rules as if the field itself was a variable of the same data type as the field. A row variable can be assigned only to a row variable with the same user-defined row type. When using FETCH, SELECT, or VALUES INTO to assign values to a row variable, the source value types must be assignable to the target row fields. If the source or the target variable (or both) of an assignment is anchored to the row of a table or view, the number of fields must be the same and the field types of the source value must be assignable to the field types of the target value.

Cursor type assignments

Assignments to cursors depend on the type of cursor. The following values are assignable to a variable or parameter of built-in type CURSOR:
  • A cursor value constructor
  • A value of built-in type CURSOR
  • A value of any user-defined cursor type
The following values are assignable to a variable or parameter of a weakly typed user-defined cursor type:
  • A cursor value constructor
  • A value of built-in type CURSOR
  • A value of a user-defined cursor type with the same type name
The following values are assignable to a variable or parameter of strongly typed user-defined cursor type:
  • A cursor value constructor
  • A value of a user-defined cursor type with the same type name

Boolean type assignments

The following keywords represent values that are assignable to a variable, parameter, or return type of built-in type BOOLEAN:
  • TRUE
  • FALSE
  • NULL
The result of the evaluation of a search condition can also be assigned. If the search condition evaluates to unknown, the value of NULL is assigned.

Reference type assignments

A reference type with a target type of T can be assigned to a reference type column that is also a reference type with target type of S where S is a supertype of T. If an assignment is made to a scoped reference column or variable, no check is performed to ensure that the actual value being assigned exists in the target table or view defined by the scope.

Assignment to host variables is done based on the representation type of the reference type. That is, it follows the rule:
  • A value of a reference type on the right hand side of an assignment is assignable to a host variable on the left side if and only if the representation type of this reference type is assignable to this host variable.

If the target of the assignment is a column, and the right side of the assignment is a host variable, the host variable must be explicitly cast to the reference type of the target column.

Numeric comparisons

Numbers are compared algebraically; that is, with regard to sign. For example, -2 is less than +1.

If one number is an integer and the other is decimal, the comparison is made with a temporary copy of the integer, which has been converted to decimal.

When decimal numbers with different scales are compared, the comparison is made with a temporary copy of one of the numbers that has been extended with trailing zeros so that its fractional part has the same number of digits as the other number.

If one number is floating-point and the other is integer or decimal, the comparison is made with a temporary copy of the other number, which has been converted to double-precision floating-point.

Two floating-point numbers are equal only if the bit configurations of their normalized forms are identical.

If one number is decimal floating-point and the other number is integer, decimal, single precision floating-point, or double precision floating-point, the comparison is made with a temporary copy of the other number, which has been converted to decimal floating-point.

If one number is DECFLOAT(16) and the other number is DECFLOAT(34), the DECFLOAT(16) value is converted to DECFLOAT(34) before the comparison is made.

The decimal floating-point data type supports both positive and negative zero. Positive and negative zero have different binary representations, but the = (equal) predicate will return true for comparisons of negative and positive zero.

The COMPARE_DECFLOAT and TOTALORDER scalar functions can be used to perform comparisons at a binary level if, for example, a comparison of 2.0 <> 2.00 is required.

The decimal floating-point data type supports the specification of negative and positive NaN (quiet and signalling), and negative and positive infinity. From an SQL perspective, INFINITY = INFINITY, NAN = NAN, SNAN = SNAN, and -0 = 0.

The comparison and ordering rules for special values are as follows:
  • (+/-) INFINITY compares equal only to (+/-) INFINITY of the same sign.
  • (+/-) NAN compares equal only to (+/-) NAN of the same sign.
  • (+/-) SNAN compares equal only to (+/-) SNAN of the same sign.
The ordering among different special values is as follows:
  • -NAN < -SNAN < -INFINITY < 0 < INFINITY < SNAN < NAN

When string and numeric data types are compared, the string is cast to DECFLOAT(34) using the rules for a CAST specification. For more information, see the Related reference section at the end of this topic. The string must contain a valid string representation of a number.

String comparisons

Character strings are compared according to the collating sequence specified when the database was created, except those with a FOR BIT DATA attribute, which are always compared according to their bit values.

When comparing character strings of unequal lengths, the comparison is made using a logical copy of the shorter string, which is padded on the right with blanks sufficient to extend its length to that of the longer string. This logical extension is done for all character strings, including those tagged as FOR BIT DATA.

Character strings (except character strings tagged as FOR BIT DATA) are compared according to the collating sequence specified when the database was created. For example, the default collating sequence supplied by the database manager may give lowercase and uppercase versions of the same character the same weight. The database manager performs a two-pass comparison to ensure that only identical strings are considered equal to each other. In the first pass, strings are compared according to the database collating sequence. If the weights of the characters in the strings are equal, a second "tie-breaker" pass is performed to compare the strings on the basis of their actual code point values.

Two strings are equal if they are both empty or if all corresponding bytes are equal. If either operand is null, the result is unknown.

LOB strings that have an actual length less than 32673 bytes are now supported as operands in basic predicates, IN, BETWEEN and the simple CASE expression. In comparisons using the LIKE predicate, NULL predicate, and the POSSTR function, LOB strings of any length continue to be supported.

LOB strings are not supported in any other comparison operations such as MAX, MIN, DISTINCT, GROUP BY, and ORDER BY.

Portions of strings can be compared using the SUBSTR and VARCHAR scalar functions. For example, given the columns:
   MY_SHORT_CLOB   CLOB(300)
   MY_LONG_VAR     VARCHAR(8000)
then the following is valid:
   WHERE VARCHAR(MY_SHORT_CLOB) > VARCHAR(SUBSTR(MY_LONG_VAR,1,300))

Examples:

For these examples, 'A', 'Á', 'a', and 'á', have the code point values X'41', X'C1', X'61', and X'E1' respectively.

Consider a collating sequence where the characters 'A', 'Á', 'a', 'á' have weights 136, 139, 135, and 138. Then the characters sort in the order of their weights as follows:
'a' < 'A' < 'á' < 'Á'
Now consider four DBCS characters D1, D2, D3, and D4 with code points 0xC141, 0xC161, 0xE141, and 0xE161, respectively. If these DBCS characters are in CHAR columns, they sort as a sequence of bytes according to the collation weights of those bytes. First bytes have weights of 138 and 139, therefore D3 and D4 come before D2 and D1; second bytes have weights of 135 and 136. Hence, the order is as follows:
D4 < D3 < D2 < D1
However, if the values being compared have the FOR BIT DATA attribute, or if these DBCS characters were stored in a GRAPHIC column, the collation weights are ignored, and characters are compared according to their code points as follows:
 'A' < 'a' < 'Á' < 'á'
The DBCS characters sort as sequence of bytes, in the order of code points as follows:
D1 < D2 < D3 < D4
Now consider a collating sequence where the characters 'A', 'Á', 'a', 'á' have (non-unique) weights 74, 75, 74, and 75. Considering collation weights alone (first pass), 'a' is equal to 'A', and 'á' is equal to 'Á'. The code points of the characters are used to break the tie (second pass) as follows:
'A' < 'a' < 'Á' < 'á'
DBCS characters in CHAR columns sort a sequence of bytes, according to their weights (first pass) and then according to their code points to break the tie (second pass). First bytes have equal weights, so the code points (0xC1 and 0xE1) break the tie. Therefore, characters D1 and D2 sort before characters D3 and D4. Then the second bytes are compared in similar way, and the final result is as follows:
D1 < D2 < D3 < D4
Once again, if the data in CHAR columns have the FOR BIT DATA attribute, or if the DBCS characters are stored in a GRAPHIC column, the collation weights are ignored, and characters are compared according to their code points:
D1 < D2 < D3 < D4

For this particular example, the result happens to be the same as when collation weights were used, but obviously this is not always the case.

Conversion rules for comparison

When two strings are compared, one of the strings is first converted, if necessary, to the encoding scheme and code page of the other string.

Ordering of results

Results that require sorting are ordered based on the string comparison rules discussed in String comparisons. The comparison is performed at the database server. On returning results to the client application, code page conversion may be performed. This subsequent code page conversion does not affect the order of the server-determined result set.

MBCS considerations for string comparisons

Mixed SBCS/MBCS character strings are compared according to the collating sequence specified when the database was created. For databases created with default (SYSTEM) collation sequence, all single-byte ASCII characters are sorted in correct order, but double-byte characters are not necessarily in code point sequence. For databases created with IDENTITY sequence, all double-byte characters are correctly sorted in their code point order, but single-byte ASCII characters are sorted in their code point order as well. For databases created with COMPATIBILITY sequence, a compromise order is used that sorts properly for most double-byte characters, and is almost correct for ASCII. This was the default collation table in DB2® Version 2.

Mixed character strings are compared byte-by-byte. This may result in unusual results for multi-byte characters that occur in mixed strings, because each byte is considered independently.

Example:

For this example, 'A', 'B', 'a', and 'b' double-byte characters have the code point values X'8260', X'8261', X'8281', and X'8282', respectively.

Consider a collating sequence where the code points X'8260', X'8261', X'8281', and X'8282' have weights 96, 65, 193, and 194. Then:
   'B' < 'A' < 'a' < 'b'
and
   'AB' < 'AA' < 'Aa' < 'Ab' < 'aB' < 'aA' < 'aa' < 'ab'

Graphic string comparisons are processed in a manner analogous to that for character strings.

Graphic string comparisons are valid between all graphic string data types except DBCLOB.

For graphic strings, the collating sequence of the database is not used. Instead, graphic strings are always compared based on the numeric (binary) values of their corresponding bytes.

Using the previous example, if the literals were graphic strings, then:
   'A' < 'B' < 'a' < 'b'
and
   'AA' < 'AB' < 'Aa' < 'Ab' < 'aA' < 'aB' < 'aa' < 'ab'

When comparing graphic strings of unequal lengths, the comparison is made using a logical copy of the shorter string which is padded on the right with double-byte blank characters sufficient to extend its length to that of the longer string.

Two graphic values are equal if they are both empty or if all corresponding graphics are equal. If either operand is null, the result is unknown. If two values are not equal, their relation is determined by a simple binary string comparison.

As indicated in this section, comparing strings on a byte by byte basis can produce unusual results; that is, a result that differs from what would be expected in a character by character comparison. The examples shown here assume the same MBCS code page, however, the situation can be further complicated when using different multi-byte code pages with the same national language. For example, consider the case of comparing a string from a Japanese DBCS code page and a Japanese EUC code page.

Datetime comparisons

A date, time, or timestamp value can be compared with another value of the same data type, a datetime constant of the same data type, or with a string representation of a value of that data type. A date value or a string representation of a date can also be compared with a TIMESTAMP, where the missing time information for the date value is assumed to be all zeros. All comparisons are chronological, which means the further a point in time is from January 1, 0001, the greater the value of that point in time. The time 24:00:00 is greater than the time 00:00:00.

Comparisons that involve time values and string representations of time values always include seconds. If the string representation omits seconds, zero seconds are implied.

Comparisons that involve timestamp values are evaluated according to the following rules:
  • When comparing timestamp values with different precisions, the higher precision is used for the comparison and any missing digits for fractional seconds are assumed to be zero.
  • When comparing a timestamp value with a string representation of a timestamp, the string representation is first converted to TIMESTAMP(12).
  • Timestamp comparisons are chronological without regard to representations that might be considered equivalent. Thus, the following predicate is true:
    TIMESTAMP('1990-02-23-00.00.00') > '1990-02-22-24.00.00'

User-defined type comparisons

Information about comparisons involving user-defined types is in the sections that follow.

Strongly typed distinct type comparisons

Values with a strongly typed distinct type only can be compared with values of exactly the same strongly typed distinct type.

For example, given the following YOUTH distinct type and CAMP_DB2_ROSTER table:
  CREATE TYPE YOUTH AS INTEGER

  CREATE TABLE CAMP_DB2_ROSTER
     ( NAME                VARCHAR(20),
       ATTENDEE_NUMBER     INTEGER NOT NULL,
       AGE                 YOUTH,
       HIGH_SCHOOL_LEVEL   YOUTH)
The following comparison is valid:
  SELECT * FROM CAMP_DB2_ROSTER
     WHERE AGE > HIGH_SCHOOL_LEVEL
The following comparison is not valid:
  SELECT * FROM CAMP_DB2_ROSTER
     WHERE AGE > ATTENDEE_NUMBER
However, AGE can be compared to ATTENDEE_NUMBER by using a function or CAST specification to cast between the distinct type and the source type. The following comparisons are all valid:
  SELECT * FROM CAMP_DB2_ROSTER
     WHERE INTEGER(AGE) > ATTENDEE_NUMBER

  SELECT * FROM CAMP_DB2_ROSTER
     WHERE CAST( AGE AS INTEGER) > ATTENDEE_NUMBER

  SELECT * FROM CAMP_DB2_ROSTER
     WHERE AGE > YOUTH(ATTENDEE_NUMBER)

  SELECT * FROM CAMP_DB2_ROSTER
     WHERE AGE > CAST(ATTENDEE_NUMBER AS YOUTH)

Weakly typed distinct type comparisons

Values with a weakly typed distinct type are compared according to the comparison rules for the source type of the distinct type. Data type constraints have no impact on the comparison. It is valid to compare different weakly typed distinct types if the underlying source types are comparable.

Structured type comparisons

Values with a user-defined structured type cannot be compared with any other value (the NULL predicate and the TYPE predicate can be used).

Array type comparisons

Comparisons of array type values are not supported. Elements of arrays can be compared based on the comparison rules for the data type of the elements.

Row type comparisons

A row variable cannot be compared to another row variable even if the row type name is the same. Individual fields within a row type can be compared to other values and the comparison rules for the data type of the field apply.

Cursor type comparisons

A cursor variable cannot be compared to another cursor variable even if the cursor type name is the same.

Boolean type comparisons

A Boolean value can be compared with a Boolean literal value. A value of TRUE is greater than a value of FALSE.

Reference type comparisons

Reference type values can be compared only if their target types have a common supertype. The appropriate comparison function will only be found if the schema name of the common supertype is included in the SQL path. The comparison is performed using the representation type of the reference types. The scope of the reference is not considered in the comparison.

XML comparisons in a non-Unicode database

When performed in a non-Unicode database, comparisons between XML data and character or graphic string values require a code page conversion of one of the two sets of data being compared. Character or graphic values used in an SQL or XQuery statement, either as a query predicate or as a host variable with a character or graphic string data type, are converted to the database code page before comparison. If any characters included in this data have code points that are not part of the database code page, substitution characters are added in their place, potentially causing unexpected results for the query.

For example, a client with a UTF-8 code page is used to connect to a database server created with the Greek encoding ISO8859-7. The expression ΣGΣM is sent as the predicate of an XQuery statement, where ΣG represents the Greek sigma character in Unicode (U+03A3) and ΣM represents the mathematical symbol sigma in Unicode (U+2211). This expression is first converted to the database code page, so that both "Σ" characters are converted to the equivalent code point for sigma in the Greek database code page, 0xD3. We may denote this code point as ΣA. The newly converted expression ΣAΣA is then converted again to UTF-8 for comparison with the target XML data. Since the distinction between these two code points was lost as a result of the code page conversion required to pass the predicate expression into the database, the two initially distinct values ΣG and ΣM are passed to the XML parser as the expression ΣGΣG. This expression then fails to match when compared to the value ΣGΣM in an XML document.

One way to avoid the unexpected query results that may be caused by code page conversion issues is to ensure that all characters used in a query expression have matching code points in the database code page. Characters that do not have matching code points can be included through the use of a Unicode character entity reference. A character entity reference will always bypass code page conversion. For example, using the character entity reference &#2211; in place of the ΣM character ensures that the correct Unicode code point is used for the comparison, regardless of the database code page.