Delimiter considerations for moving data
When moving delimited ASCII (DEL) files, it is important to ensure that the data being moved is not unintentionally altered because of problems with delimiter character recognition. To help prevent these errors, Db2® enforces several restrictions and provides a number of file type modifiers.
Delimiter restrictions
There are a number of restrictions in place that help prevent the chosen delimiter character from being treated as a part of the data being moved. First, delimiters are mutually exclusive. Second, a delimiter cannot be binary zero, a line-feed character, a carriage-return, or a blank space. As well, the default decimal point (.) cannot be a string delimiter. Finally, in a DBCS environment, the pipe (|) character delimiter is not supported.
- The Shift-In (0x0F) and the Shift-Out (0x0E) character cannot be delimiters for an EBCDIC MBCS data file.
- Delimiters for MBCS, EUC, or DBCS code pages cannot be greater than 0x40, except the default decimal point for EBCDIC MBCS data, which is 0x4b.
- Default delimiters for data files in ASCII code pages or EBCDIC
MBCS code pages are:
- string delimiter:
"
(0x22, double quotation mark) - column delimiter:
,
(0x2c, comma)
- string delimiter:
- Default delimiters for data files in EBCDIC SBCS code pages are:
- string delimiter:
"
(0x7F, double quotation mark) - column delimiter:
,
(0x6B, comma)
- string delimiter:
- The default decimal point for ASCII data files is 0x2e (period).
- The default decimal point for EBCDIC data files is 0x4B (period).
- If the code page of the server is different from the code page
of the client, it is recommended that the hex representation of non-default
delimiters be specified. For example,
db2 load from ... modified by chardel0x0C coldelX1e ...
Issues with delimiters during data movement
- Double character delimiters
By default, for character-based fields of a DEL file, any instance of the character delimiter found within the field is represented by double character delimiters. For example, assuming that the character delimiter is the double quote, if you export the text I am 6" tall., the output text in the DEL file reads "I am 6"" tall." Conversely, if the input text in a DEL file reads "What a ""nice"" day!", the text is imported as What a "nice" day!
- nodoubledel
- Double character delimiter behavior can be disabled for the import,
export, and load utilities by specifying the
nodoubledel
file type modifier. However, be aware that double character delimiter behavior exists in order to avoid parsing errors. When you usenodoubledel
with export, the character delimiter is not doubled if it is present in character fields. When you usenodoubledel
with import and load, the double character delimiter is not interpreted as a literal instance of the character delimiter. - nochardel
- When you use the
nochardel
file type modifier with export, the character fields are not surrounded by character delimiters. Whennochardel
is used import and load, the character delimiters are not treated as special characters and are interpreted as actual data. - chardel
- Other file type modifiers can be used to manually prevent confusion
between default delimiters and the data. The
chardel
file type modifier specifies x, a single character, as the character string delimiter to be used instead of double quotation marks (as is the default). - coldel
- Similarly, if you wanted to avoid using the default comma as a
column delimiter, you could use
coldel
, which specifies x, a single character, as the column data delimiter. - delprioritychar
- Another concern in regards to moving DEL files is maintaining
the correct precedence order for delimiters. The default priority
for delimiters is: row, character, column. However, some applications
depend on the priority: character, row, column. For example, using
the default priority, the DEL data file:
would be interpreted as having two rows: Vincent, and is a manager, since <row delimiter> takes precedence over the character delimiter ("). Using"Vincent <row delimiter> is a manager",<row delimiter>
delprioritychar
gives the character delimiter (") precedence over the row delimiter (<row delimiter>), meaning that the same DEL file would be interpreted (correctly) as having one row: Vincent is a manager.