Data Manipulation Language effects

Using the nchar and nvarchar character sets affects ordering, sorting, comparing, joining, and aggregating.

  • ORDER BY and GROUP BY

    The collation is binary.

  • Comparisons

    All comparisons, which include joining, filtering, general expressions, and those implicit in grouping and aggregations, are done through binary collation.

  • JOIN

    Join comparisons use binary collation.

  • Aggregations

    You can do min, max, and distinct aggregations of character columns through binary collation.

  • Casting

    You can cast between char and nchar data types. Since every character that can be represented in char can also be represented in nchar, casting from char to nchar is lossless. For example, cast (<char-col> as nchar(10)). Casting from nchar to char might have characters that cannot be represented in a char column since nchar can store all Unicode and char stores Latin-9. Characters that do not have a Latin-9 representation are converted to a question mark.

  • SQL pattern matching

    You can do standard like predicate pattern matching on nchar class data.

  • Mixing char and nchar

    You can join across char and nchar columns.

  • String functions for nchar data

    You can use the standard string manipulation functions, such as to_char, to_date, upper, and lower. For more information about these functions, see Table 1 and Table 1.

    Note: String conversion functions such as upper and lower do not use the Unicode Organization SpecialCasing rules; instead, they use the rules in UnicodeData.txt. The SpecialCasing rules sometimes cause unwanted results, such as altering the length of a string.

Characters in (var)char and n(var)char columns can be compared and joined. Netezza Performance Server performs the required encoding conversions. So, for example an À in a char column and an À in an nchar column compare as equal, even though they have different encodings. Similarly data from a char column may be inserted into an nchar column. The reverse is true also, but since Unicode can represent many thousands of characters, and Latin-9 only about 200, many characters that could be in an nchar column are not representable in a char column. These characters are replaced with a question mark in the char column.