Code page conversion string-length overflow in mixed code set environments

In EUC and DBCS unequal code page environments, situations can occur after conversion takes place when there is not enough space allocated in a column to accommodate the entire string. In this case, the maximum expansion will be twice the length of the string in bytes. In cases where expansion does exceed the capacity of the column, SQLCODE -334 (SQLSTATE 22524) is returned.

This leads to situations that might not be immediately obvious or previously considered as follows:
  • An SQL statement cannot be longer than 32 765 bytes in length. If the statement is complex enough or uses enough constants or database object names that can be subject to expansion upon conversion, this limit might be reached earlier than expected.
  • SQL identifiers are allowed to expand on conversion up to their maximum length, which is eight bytes for short identifiers and 128 bytes for long identifiers.
  • Host language identifiers are allowed to expand on conversion up to their maximum length, which is 255 bytes.
  • When the character fields in the SQLCA structure are converted, they are allowed to expand to no more than their maximum defined length.
When you design applications for mixed code set environments, you should refer to the appropriate documentation if you have any of the following situations:
  • Corresponding string columns in full selects with set operations (UNION, INTERSECT and EXCEPT)
  • Operands of concatenation
  • Operands of predicates (with the exception of LIKE)
  • Result expressions of a CASE statement
  • Arguments of the scalar function COALESCE (and VALUE)
  • Expression values of the IN list of an IN predicate
  • Corresponding expressions of a multiple row VALUES clause

In these situations, conversions might occur according to the application code page instead of the database code page.

Other situations that you need to consider are those in which the character conversion results in a string length beyond the limit for the data type, and code page conversions in stored procedures:
  • Character conversion past a data type limit

    In EUC and DBCS unequal code page environments, situations can occur after conversion takes place in which the length of the mixed character or graphic string exceeds the maximum length allowed for that data type. If the length of the string, after expansion, exceeds the limit of the data type, type promotion does not occur. Instead, an error message is returned indicating that the maximum allowed expansion length has been exceeded. This situation is more likely to occur while evaluating predicates than inserts. With inserts, the column width is more readily known by the application, and the maximum expansion factor can be readily taken into account. In many cases, this side effect of character conversion can be avoided by casting the value to an associated data type with a longer maximum length. For example, the maximum length of a CHAR value is 255 bytes, while the maximum length of a VARCHAR is 32 672 bytes. In cases where expansion does exceed the maximum length of the data type, SQLCODE -334 (SQLSTATE 22524) is returned.

  • Code page conversion in a stored procedure

    Mixed character or graphic data specified in host variables and SQLDAs in sqleproc() or SQL CALL invocations are converted in situations where the application and database code pages are different. In cases where string length expansion occurs as a result of conversion, you receive an SQLCODE -334 (SQLSTATE 22524) if there is not enough space allocated to handle the expansion. Thus you must be sure to provide enough space for potentially expanding strings when developing stored procedures. You should use variable-length data types with enough space allocated to allow for expansion.