IBM Support

Invalid XML character error

Question & Answer


Question

I was saving item data (or running migrateDataToXml.sh) and got an error(s) saying invalid XML character. I have never received this error before. Why am I getting these errors and how can I fix them?

Cause

From version 10.0.0 Fix Pack 1 onwards, IBM InfoSphere Master Data Management Collaboration Server uses XML format to store item data in the database. XML has universal standards and only supports a clearly defined set of Unicode characters: http://en.wikipedia.org/wiki/Valid_characters_in_XML.

If I try to save an item with an invalid XML character in one of the attributes, then the application will not be able to store the item attribute in XML format. This can happen during:
1. Entering data into the system e.g. data entry in the user interface or import jobs
2. Running scripts like $TOP/bin/migration/migrateDataToXml.sh. This script is run during migration to any post 10.0.0 Fix Pack 1 version of the product to create XML format of all existing data. If this script finds any invalid characters while creating XML format, it will give an error stack similar to the following.

Error Stack for Oracle:
java.sql.SQLException: ORA-31061: XDB error: XML event error
ORA-19202: Error occurred in XML processing
In line 240 of orastream:
LPX-00217: invalid character 19 (U+0013)

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:1044)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1329)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3584)
at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3665)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1352)
at com.ibm.ccd.common.util.oracle.OracleUtils.mergeXML(Unknown Source)
at com.ibm.ccd.content.common.EntryXmlProcessor.sendPendingXmlToDb(Unknown Source)
at com.ibm.ccd.common.context.common.DBContext.ensurePendingXMLWrittenToDB(Unknown Source)
at com.ibm.ccd.common.context.common.DBContext.commit(Unknown Source)
at com.ibm.ccd.common.context.common.DBContext.commit(Unknown Source)

Answer

To correct this error, remove the invalid character from the item attribute. Since only a certain type of attribute can hold these characters, so we can ignore a lot of attribute types like integers, enumerations, lookup tables, sequences etc. These bad attributes are probably from string type attributes and may contain long strings or external URL. These invalid attributes can be introduced with a feed file import or be present in the database from previous versions.

The following list shoes some of the ways to detect them:
1. Open the item and browse through the attributes to look for the invalid character.
2. Find the attribute through trial and error. This can be done by removing an attribute and save the item. If it gives the same error then the attribute is good and so paste the value back. Repeat this process till you able to save the item and the last removed attribute will be the one with the bad attribute.
3. Use the following script to print out the item and browse through it to look for attributes which might be a candidate for invalid characters:
var ctg;
ctg = getCtgByName("Catalog_Name");
if(ctg != null)
{
out.writeln(ctg1.getEntryByPrimaryKey("pk1"));
out.writeln(ctg1.getEntryByPrimaryKey("pk2"));
....
}
4. You may also use any third party XML tool to detect the invalid character.

To fix this attribute, you may:
1. Open the item in the user interface, remove the bad character from the attribute and save the item.
2. Use SQL, WQL, or a script to delete the attribute value and set it to NULL.
3. If there are lots of items, then write an export job to export values of all the item(s). Then edit the bad attribute values, and import these values again using a feed file or an import job.

[{"Product":{"code":"SS2U2U","label":"InfoSphere Master Data Management Collaboration Server"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"--","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"}],"Version":"10.1;10.0","Edition":"All Editions","Line of Business":{"code":"","label":""}}]

Product Synonym

WPC ;MDMCS;MDMPIM;MDM Server for PIM;WebSphere Product Center;InfoSphere MDM Server for PIM;Infosphere Master Data Management Collaboration Server;InfoSphere Master Data Management Server for Product Information Management

Document Information

Modified date:
16 June 2018

UID

swg21619894