Graphic strings
A graphic string is a sequence of code units that represents double-byte character data.
The length of the string is the number of code units in the sequence. If the length is zero, the value is called the empty string. This value should not be confused with the null value.
Graphic strings are not supported in a database that is defined with a single-byte code page.
Graphic strings are not checked to ensure that their values contain only double-byte character code points. (The exception to this rule is an application that is precompiled with the WCHARTYPE CONVERT option. In this case, validation does occur.) Rather, the database manager assumes that double-byte character data is contained in graphic data fields. The database manager does check that a graphic string value is an even number of bytes long.
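The length check described above can be pictured with a short Python sketch. This is not the database manager's implementation. The function name `has_even_byte_length` is hypothetical, and the sample bytes assume a UTF-16 big-endian encoding purely for illustration.

```python
def has_even_byte_length(value: bytes) -> bool:
    """Return True if the byte length is even.

    Only the length is examined, mirroring the rule that the content is
    not validated as double-byte character data.
    """
    return len(value) % 2 == 0


# Two double-byte code units (4 bytes) pass the length check.
print(has_even_byte_length("日本".encode("utf-16-be")))
# A single stray byte fails it.
print(has_even_byte_length(b"\x41"))
# The empty string (zero code units) has an even length of zero bytes.
print(has_even_byte_length(b""))
```

Note that a value can pass this check and still contain bytes that are not valid double-byte code points, which is the situation the paragraph above describes.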
NUL-terminated graphic strings that are found in C are handled differently, depending on the standards level of the precompile option. The NUL-terminated graphic string data type cannot be created in a table. It can be used only to insert data into and retrieve data from the database.
Fixed-length graphic strings (GRAPHIC)
All values in a fixed-length graphic string column have the same length, which is determined by the length attribute of the column. The length attribute must be 1 - 127, inclusive, unless the string unit is CODEUNITS32, in which case it must be 1 - 63, inclusive.
Varying-length graphic strings
- A VARGRAPHIC value can be up to 16 336 double-byte code units long. If the string unit is CODEUNITS32, the length can be up to 8 168 string units.
- A DBCLOB (double-byte character large object) value can be up to 1 073 741 823 double-byte code units long. If the string unit is CODEUNITS32, the length can be up to 536 870 911 string units. A DBCLOB is used to store large DBCS character-based data (such as documents written with a single character set) and, therefore, has a DBCS code page that is associated with it.
Special restrictions apply to an expression that results in a DBCLOB data type. These restrictions are the same as the restrictions specified in Varying-length character strings.
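The length limits stated for GRAPHIC, VARGRAPHIC, and DBCLOB can be collected into a small lookup table. The Python sketch below is only an illustration: the names `MAX_LENGTH` and `length_attribute_ok` are hypothetical, the key `"DOUBLE_BYTES"` is assumed to cover both double bytes and CODEUNITS16 (which count the same way), and the lower bound of 1 is stated in this section only for GRAPHIC, so applying it to the other types is an assumption.

```python
# Maximum lengths per data type and string unit, as stated in this section.
# "DOUBLE_BYTES" is an assumed label covering double bytes and CODEUNITS16.
MAX_LENGTH = {
    ("GRAPHIC", "DOUBLE_BYTES"): 127,
    ("GRAPHIC", "CODEUNITS32"): 63,
    ("VARGRAPHIC", "DOUBLE_BYTES"): 16_336,
    ("VARGRAPHIC", "CODEUNITS32"): 8_168,
    ("DBCLOB", "DOUBLE_BYTES"): 1_073_741_823,
    ("DBCLOB", "CODEUNITS32"): 536_870_911,
}


def length_attribute_ok(data_type: str, length: int,
                        unit: str = "DOUBLE_BYTES") -> bool:
    """Check a length attribute against the documented maximum.

    The lower bound of 1 is an assumption for VARGRAPHIC and DBCLOB.
    Raises KeyError for a data type or unit that is not in the table.
    """
    return 1 <= length <= MAX_LENGTH[(data_type, unit)]


print(length_attribute_ok("GRAPHIC", 127))                 # within range
print(length_attribute_ok("GRAPHIC", 64, "CODEUNITS32"))   # over the limit
print(length_attribute_ok("VARGRAPHIC", 8_168, "CODEUNITS32"))
```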
String units specification for graphic strings
- Double bytes
  Indicates that the units for the length attribute are double bytes. This unit of length applies to all graphic string data types in a non-Unicode database. In a Unicode database, CODEUNITS16 is used.
- CODEUNITS16
  Indicates that the units for the length attribute are Unicode UTF-16 code units, which is the same as counting in double bytes. This unit of length does not affect the underlying code page of the data type. A string unit of CODEUNITS16 can be used only with graphic string data types in a Unicode database. CODEUNITS16 can be explicitly specified or determined based on an environment setting.
- CODEUNITS32
  Indicates that the units for the length attribute are Unicode UTF-32 code units, which approximates counting in characters. This unit of length does not affect the underlying code page of the data type. The actual length of a data value is determined by counting the UTF-32 code units as if the data were converted to UTF-32. A string unit of CODEUNITS32 can be used only in a Unicode database. CODEUNITS32 can be explicitly specified or determined based on an environment setting.
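The difference between counting in CODEUNITS16 and CODEUNITS32 shows up only for characters outside the Basic Multilingual Plane, which need two UTF-16 code units but a single UTF-32 code unit. The following Python sketch illustrates the counting rule only. The function names are hypothetical, and the encoding calls stand in for "as if the data were converted"; they are not how the database manager computes lengths.

```python
def length_in_codeunits16(text: str) -> int:
    """Count UTF-16 code units (2 bytes each) needed for the text."""
    return len(text.encode("utf-16-be")) // 2


def length_in_codeunits32(text: str) -> int:
    """Count UTF-32 code units (4 bytes each) needed for the text."""
    return len(text.encode("utf-32-be")) // 4


samples = [
    "abc",                 # three BMP characters
    "日本語",               # three BMP characters
    "\U00020BB7野家",       # first character is U+20BB7, outside the BMP
]

for s in samples:
    print(length_in_codeunits16(s), length_in_codeunits32(s))
# prints:
# 3 3
# 3 3
# 4 3
```

In the last sample the CODEUNITS32 length matches the three visible characters, while the CODEUNITS16 length is four because U+20BB7 is stored as a surrogate pair.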
In a non-Unicode database, the string unit is always double bytes and cannot be changed. In a Unicode database, the string unit can be explicitly specified with the length attribute of a graphic string data type, or it can default based on an environment setting.
The environment setting for string units is based on the value of the NLS_STRING_UNITS global variable or the string_units database configuration parameter. The database configuration parameter can be set to either SYSTEM or CODEUNITS32. The global variable accepts the same two values and can also be set to NULL. The NULL value indicates that the SQL session should use the string_units database configuration parameter setting. If the value for the environment setting is SYSTEM, then CODEUNITS16 is used as the default string unit in a Unicode database, and double bytes is used in a non-Unicode database.
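The defaulting rules above can be summarized as a small decision function. This Python sketch only models the rules as described in this section for graphic strings; the function name `default_graphic_string_unit` and its parameters are hypothetical, and Python's `None` stands in for the SQL NULL value.

```python
from typing import Optional


def default_graphic_string_unit(nls_string_units: Optional[str],
                                db_cfg_string_units: str,
                                unicode_database: bool) -> str:
    """Return the default string unit for a graphic string data type.

    nls_string_units:    value of the NLS_STRING_UNITS global variable,
                         or None to model the NULL value.
    db_cfg_string_units: value of the string_units configuration parameter.
    """
    # In a non-Unicode database the unit is always double bytes.
    if not unicode_database:
        return "double bytes"

    # A NULL global variable defers to the database configuration parameter.
    setting = (nls_string_units if nls_string_units is not None
               else db_cfg_string_units)

    if setting == "SYSTEM":
        return "CODEUNITS16"
    if setting == "CODEUNITS32":
        return "CODEUNITS32"
    raise ValueError(f"unsupported string units setting: {setting!r}")


# Global variable is NULL, so the configuration parameter decides.
print(default_graphic_string_unit(None, "SYSTEM", True))
# A non-NULL global variable takes effect for the session.
print(default_graphic_string_unit("CODEUNITS32", "SYSTEM", True))
# Non-Unicode database: always double bytes.
print(default_graphic_string_unit(None, "SYSTEM", False))
```

An explicitly specified string unit on the data type is outside the scope of this function, because it applies only when no unit is given with the length attribute.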