mortenb

IXF utf-8 character issues

‏2012-10-09T09:30:50Z |
I assumed the IXF format was binary, so a text column should be identical after import to what was exported,
independent of the export shell configuration.

If the code page differs between the exporting and the importing session, UTF-8 characters
in text/char columns show up as strange characters after the import.

db2 "export to '/usr/tmp/test.ixf' of IXF select * from tab"
db2 "import from '/usr/tmp/test.ixf' of IXF insert into tab"

Example: the exported value 'NOR_TEST_JØRGEN' shows up like this after import: 'NOR_TEST_J▒RGEN',
and the data in the database is corrupted. All relevant db configs are of course the same.

Are there any guidelines for this? How do I properly export/import text/character columns in DB2?
Or are there some tricks I have not discovered?

Thanks
MortenB
  • mor

    Re: IXF utf-8 character issues

    ‏2012-10-09T17:22:11Z  
    Every time a program connects to a DB2 LUW database, that connection has an "application code page".

    How DB2 determines the application code page depends on the client operating system. On Linux/Unix, for example, the value of the environment variable LANG and the installed locale(s) determine the application code page for connections made from a bash/ksh shell. On Windows there are a couple of ways, depending on the Windows version, in which DB2 can determine the application code page.
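    To see which code set a shell session would present, it can help to look at the locale variables before running export or import. The following is only a sketch: locale_codeset is a hypothetical helper, and the LC_ALL, LC_CTYPE, LANG precedence is the general POSIX rule rather than anything specific to DB2.

```shell
# Hypothetical helper: print the codeset part of a locale name such as
# "en_US.UTF-8" (the text between "." and an optional "@modifier").
locale_codeset() {
    name=$1
    case $name in
        *.*) cs=${name#*.}; cs=${cs%%@*}; printf '%s\n' "$cs" ;;
        *)   printf '%s\n' "(none)" ;;   # e.g. "C" carries no explicit codeset
    esac
}

# POSIX precedence for character handling: LC_ALL, then LC_CTYPE, then LANG.
effective=${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}
printf 'effective locale: %s, codeset: %s\n' "$effective" "$(locale_codeset "$effective")"
```

    If the exporting shell and the importing shell report different codesets here, that is a first hint that the two sessions may be using different application code pages.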

    Every DB2 LUW database also has its own codepage, territory and code set.

    Whenever the application code page differs from the database code page, a code page conversion takes place, which may result in the kind of symptoms you describe (although there can be other causes as well). This is true even when the client and the DB2 server are both running on the same machine.
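    As an illustration of why a mismatch matters, the character Ø from the question has different byte representations in UTF-8 and in a single-byte code page such as ISO-8859-1. The sketch below only prints those bytes; it does not reproduce DB2's own conversion, and hex_bytes is a hypothetical helper.

```shell
# Hypothetical helper: print stdin as a continuous string of hex digits.
hex_bytes() {
    od -An -tx1 | tr -d ' \t\n'
    echo
}

# "JØRGEN" with Ø written as explicit octal byte escapes:
utf8_hex=$(printf 'J\303\230RGEN' | hex_bytes)    # Ø as two bytes (UTF-8)
latin1_hex=$(printf 'J\330RGEN' | hex_bytes)      # Ø as one byte (ISO-8859-1)

printf 'UTF-8:      %s\n' "$utf8_hex"
printf 'ISO-8859-1: %s\n' "$latin1_hex"
```

    A program that expects one of these byte sequences and receives the other cannot display the character correctly, which is one plausible way a placeholder glyph like the one in the question can appear.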

    Where possible, aim to make the application code page the same as the database code page, because then no code page conversion takes place. Sometimes conversion is inevitable, for example in distributed, global or heterogeneous environments; in that case, ensure that both code pages use a Unicode encoding of some sort.
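    One way to compare the two code pages is to read the database code page from the database configuration and, if needed, set the application code page through the DB2CODEPAGE registry variable. This is a sketch under assumptions: cfg_codepage is a hypothetical helper, MYDB is a placeholder database alias, and 1208 (UTF-8) is only an example value that must match your own database.

```shell
# Hypothetical helper: read "db2 get db cfg" output on stdin and print
# the value of the "Database code page" line.
cfg_codepage() {
    awk -F= '/Database code page/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Usage on a machine with a DB2 client (MYDB is a placeholder alias):
#   db2 get db cfg for MYDB | cfg_codepage
#   db2set DB2CODEPAGE=1208   # example: set the application code page to UTF-8
#   db2 terminate             # restart the CLP back end so the setting is picked up
```

    Running the same check in both the exporting and the importing environment shows whether the two sessions agree before any data is moved.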

    As usual, careful study of the DB2 Infocenter should give you the information you need. If it does not explain something, raise feedback with IBM from the Infocenter itself.