IBM Content Analytics with Enterprise Search, Version 3.0.0. Operating systems: Linux

Preparing Linux

Before you crawl and index documents, ensure that the default encoding value of the Linux operating system matches the language encoding of the documents that you want to crawl and index.

On Linux operating systems (including Linux on System z®), IBM® Content Analytics with Enterprise Search cannot properly crawl and index documents whose file names use a character encoding other than the default encoding of the operating system. To ensure that the system properly crawls and indexes documents on Linux, convert the file names of the documents to the default encoding of the operating system. Alternatively, change the default encoding of the operating system to the language encoding of the documents that you want to crawl and index.
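Before you change anything, it can help to see which encoding the operating system currently uses. The following is a minimal sketch, not part of the product: locale_codeset is a hypothetical helper that extracts the encoding part of a locale name such as zh_CN.GB18030, and the commented usage lines compare it with the output of the standard locale charmap command.

```shell
#!/bin/sh
# locale_codeset NAME (hypothetical helper, not an IBM command):
# print the codeset part of a locale name, for example GB18030 for
# zh_CN.GB18030. Prints an empty line if the name has no codeset part.
locale_codeset() {
    case $1 in
        *.*)
            codeset=${1#*.}                  # drop language_territory and the dot
            printf '%s\n' "${codeset%%@*}"   # drop an optional @modifier
            ;;
        *)
            printf '\n'
            ;;
    esac
}

# Usage (uncomment to run):
# echo "LANG requests: $(locale_codeset "${LANG:-}")"
# echo "In effect:     $(locale charmap)"
```

The two values might be spelled differently for the same encoding (for example utf8 and UTF-8), so treat the comparison as a quick check rather than a strict test.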

To change the default encoding value of the Linux operating system:

  1. In the /etc/sysconfig/i18n file, set the LANG property to the language encoding of the documents that you want to crawl and index. For example, set LANG=zh_CN.GB18030 for Chinese documents.
  2. To register the new encoding setting, run the command source /etc/sysconfig/i18n or restart the computer.
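The steps above can be sketched as a small script. This is an illustration under assumptions, not an IBM-provided tool: set_lang is a hypothetical helper name, the file is assumed to hold plain NAME=value lines, and the script keeps a backup copy with a .bak suffix. Editing /etc/sysconfig/i18n normally requires root authority.

```shell
#!/bin/sh
# set_lang FILE VALUE (hypothetical helper): set the LANG property in a
# sysconfig-style file, replacing an existing LANG line or adding one.
set_lang() {
    file=$1
    value=$2
    cp -p "$file" "$file.bak"            # keep a backup of the original file
    if grep -q '^LANG=' "$file.bak"; then
        # Rewrite the existing LANG line and leave all other lines unchanged
        sed "s|^LANG=.*|LANG=$value|" "$file.bak" > "$file"
    else
        # No LANG line yet, so append one
        printf 'LANG=%s\n' "$value" >> "$file"
    fi
}

# Usage as root (uncomment to run):
# set_lang /etc/sysconfig/i18n zh_CN.GB18030
# . /etc/sysconfig/i18n      # same effect as: source /etc/sysconfig/i18n
```

Sourcing the file changes the setting only in the shell where you run the command, so processes that are already running keep the old value until they are restarted from that shell or the computer is restarted.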

Last updated: May 2012

© Copyright IBM Corporation 2004, 2012.