Simple XML Parser

The Simple XML Parser reads and writes XML documents whose data is no more than two levels deep. Use the information and links provided here to learn more about the Simple XML Parser.

This Parser uses the Apache Xerces and Xalan libraries. The Parser gives access to the XML document through a script object called xmldom. The xmldom object is an instance of the org.w3c.dom.Document interface. Refer to http://docs.oracle.com/javase/6/docs/api/ for a complete description of this interface.

You can also use the XPathAPI (see http://xml.apache.org/xalan-j/apidocs/index.html), accessing its Java™ classes from your scripts, to search and select nodes from the XML document. selectNodeList, a convenience method in the system object, can be used to select a subset of nodes from the XML document.

When the Connector is initialized, the Simple XML Parser tries to perform Document Type Definition (DTD) verification if a DTD tag is present.

Use the Connector's override functions to interpret or generate the XML document yourself. Create the necessary script in either the Override GetNext hook or the GetNext Successful hook in your AssemblyLine's hook definitions. If you do not override, the Parser reads or writes a very simple XML document that mimics the entry object model. The default Parser only permits you to read or write XML files two levels deep. It also reads multi-valued attributes, although only one of the values is shown when you browse the data in the Schema tab.
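To make the idea of a flat document that mimics the entry object model more concrete, here is a minimal sketch in plain JavaScript. It is not the Parser's implementation, and the exact layout the Parser writes is not shown in this section. The function names escapeXml and entriesToXml, and the rootTag and entryTag parameters, are assumptions made for illustration only.

```javascript
// Illustrative only: these helpers are hypothetical and are not part of
// IBM Security Directory Integrator. The real Parser's output may differ.

// Escape the characters that must not appear literally in element content.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Serialize an array of flat objects (one per entry) as a root element
// containing one element per entry, each holding one element per attribute.
function entriesToXml(entries, rootTag, entryTag) {
    var lines = ["<" + rootTag + ">"];
    for (var i = 0; i < entries.length; i++) {
        var entry = entries[i];
        lines.push("  <" + entryTag + ">");
        for (var name in entry) {
            if (Object.prototype.hasOwnProperty.call(entry, name)) {
                lines.push("    <" + name + ">" + escapeXml(entry[name]) +
                           "</" + name + ">");
            }
        }
        lines.push("  </" + entryTag + ">");
    }
    lines.push("</" + rootTag + ">");
    return lines.join("\n");
}
```

The sketch has no place for nested structures: each entry is a flat set of name and value pairs, which is the kind of document the default Parser is limited to. Anything deeper calls for the override hooks described above.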

Note that certain methods, such as setAttribute, are available in both the IBM Security Directory Integrator entry object and the objects returned by xmldom.createElement. Although these methods share a name or signature, they belong to different object types. Do not confuse the xmldom objects with the IBM Security Directory Integrator objects.

Note:
  1. This Parser was called "XML Parser" in releases before IBM Security Directory Integrator 7.0. In IBM Security Directory Integrator 7.0 it was renamed to Simple XML Parser and a new XML Parser was added; see XML Parser. The new Parser has many improvements and is now the main IBM Security Directory Integrator XML Parser.
  2. If you read XML files larger than 4 MB or write XML files larger than 14 MB, your Java VM might run out of memory. Refer to java.lang.OutOfMemoryError exception in the Troubleshooting section for a solution to this. Alternatively, use the XML Parser or the XML SAX Parser.
  3. The Parser silently ignores empty entries.
  4. When reading a CDATA attribute, no blank space is trimmed from the value. However, blank space is trimmed from attributes that are not CDATA.
  5. Certain characters, such as $, are illegal in XML tags. Avoid these characters in your attribute names when using this Parser, because they might produce illegal XML.
  6. When reading from an LDAP directory or an LDIF file, the distinguished name (DN) is typically returned in an attribute named $dn. If you map this attribute into an XML file without changing its name, the write fails because $dn is not a legal tag in an XML document. If you use explicit mapping, you must change "$dn" to "dn" (or another name without a special character) in your output Connector. If you use implicit mapping (for example, *, or Automatically map all attributes selected in the AssemblyLine Settings, reached through the Config... tab of the AssemblyLine), you can translate the distinguished name attribute (for example, $dn) to a different name in a hook script. For example, you can add something like this in the Before GetNext Hook:
    conn.setAttribute("dn", work.getAttribute("$dn")); 
    conn.removeAttribute("$dn");
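The same rename can be generalized to any attribute name that contains characters that are illegal in XML tags. The following is a minimal sketch in plain JavaScript, not a product feature. The helper name toXmlTagName and its replacement rule (drop every character outside a conservative ASCII subset, then prefix an underscore if the result does not start with a letter or underscore) are assumptions chosen for illustration.

```javascript
// Hypothetical helper: not part of IBM Security Directory Integrator.
// Maps an attribute name to a string that is safe to use as an XML tag,
// using a conservative ASCII-only subset of the characters XML allows.
function toXmlTagName(name) {
    // Drop every character that is not a letter, digit, underscore,
    // period, or hyphen (this removes "$", spaces, and so on).
    var cleaned = String(name).replace(/[^A-Za-z0-9_.\-]/g, "");
    // A tag must start with a letter or underscore; prefix one if needed.
    if (cleaned.length === 0 || !/^[A-Za-z_]/.test(cleaned)) {
        cleaned = "_" + cleaned;
    }
    return cleaned;
}
```

With this helper, toXmlTagName("$dn") yields "dn", which could replace the hard-coded "dn" in the hook script above. Note that two different attribute names can map to the same tag name under this rule, so check for collisions before relying on it.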