IBM Support

Working with Document Keyword Replace service to replace characters on your documents

Technical Blog Post


Body

The Document Keyword Replace service in IBM Sterling B2B Integrator lets you replace or delete any character or string of characters in your Primary Document.

This blog post provides detailed examples of how to use the service.

For detailed documentation of all the service parameters, see the Document Keyword Replace Service documentation.

Most important parameters:

keyword<x> - Specifies the characters to replace.
             To perform multiple replacements, add multiple instances of this parameter in numeric order.
             For example: keyword1, keyword2, keyword3, and so on.
replace<x> - Specifies the characters or string that replace the existing ones.
             Specify one replace parameter for each keyword parameter.
             For example: replace1, replace2, replace3, and so on.
keywordtype<x> and replacetype<x> - Specify the type of the keyword and replacement values.
             You can indicate whether a value is a string or a hex value and, if necessary, its encoding.

Note: You cannot use wildcards when specifying the keyword to be replaced.

Examples of the service usage:

---------------Simple string replacement---------------

The example below replaces all instances of "test" with "prod" and all instances of "ABC" with "X".

<operation name="DocKeywordReplace">
<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1">test</assign>
<assign to="replace1" from="string('prod')"></assign>
<assign to="keyword2">ABC</assign>
<assign to="replace2" from="string('X')"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>

Before the replacements, the document looks like this:

"This is a test connection file on system ABC"

After the replacements, the document looks like this:

"This is a prod connection file on system X"

---------------Simple string deletion---------------

The example below deletes all instances of 1234 and ABC.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1">1234</assign>
<assign to="replace1" from="string('')"></assign>
<assign to="keyword2">ABC</assign>
<assign to="replace2" from="string('')"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>

Before the replacements, the document looks like this:

"This is document number 21234 of system ABCD.
System ABCprod"

After the replacements, the document looks like this:

"This is document number 2 of system D.
System prod"
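
The replacement and deletion examples above can be reproduced outside the service with a few lines of Python. This is only an illustrative sketch: the function name replace_keywords is hypothetical, and it assumes the service applies each keyword/replace pair as a plain substring replacement, one pair after another. It also shows why "21234" becomes "2": matching is on substrings, not whole words.

```python
def replace_keywords(document, pairs):
    """Apply each (keyword, replacement) pair to the whole document, in order.

    Assumption: pairs are applied sequentially as literal substring matches.
    No wildcards or patterns are interpreted.
    """
    for keyword, replacement in pairs:
        document = document.replace(keyword, replacement)
    return document


# Simple string replacement (keyword1/replace1 and keyword2/replace2).
print(replace_keywords(
    "This is a test connection file on system ABC",
    [("test", "prod"), ("ABC", "X")],
))

# Simple string deletion: an empty replacement removes the keyword.
print(replace_keywords(
    "This is document number 21234 of system ABCD.\nSystem ABCprod",
    [("1234", ""), ("ABC", "")],
))
```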

---------------Delete all CRLF---------------

The example below deletes all carriage return/line feed (CRLF) sequences, which turns the Primary Document into a single continuous stream with no line breaks.

The hexadecimal representation of CRLF is 0d0a. To specify a hexadecimal value directly in the keyword parameter, you can use its XML character reference, for example &#x0d; for the carriage return character.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="string('&#x0d;&#x0a;')"></assign>
<assign to="replace1" from="string('')"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
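
As a side check on the configuration above, the following Python sketch shows how a standard XML parser resolves the character references in the from attribute into the actual CR and LF characters. It uses Python's xml.etree.ElementTree only as a stand-in for XML parsing in general; the helper name keyword_from_assign is hypothetical, and how the BPML engine parses the process definition internally is an assumption.

```python
import xml.etree.ElementTree as ET


def keyword_from_assign(assign_xml):
    """Return the from attribute of an <assign> element as the XML parser sees it."""
    return ET.fromstring(assign_xml).get("from")


assign = "<assign to=\"keyword1\" from=\"string('&#x0d;&#x0a;')\"></assign>"
value = keyword_from_assign(assign)

# The two character references have become a real CR followed by a real LF.
print(repr(value))

# Deleting that two-character sequence joins the lines of the document.
print("thisIsATestFile\r\nContainingCRLF".replace("\r\n", ""))
```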

Before the replacements, the document looks like this:

"thisIsATestFile
ContainingCRLF"

After the replacements, the document looks like this:

"thisIsATestFileContainingCRLF"

---------------Replacing LF with CRLF---------------

The example below replaces all instances of LF with CRLF, but this time it uses the keywordtype and replacetype parameters to indicate that the values are hexadecimal.

This approach is useful when you need to replace characters that are invalid in XML, because you cannot write those characters as XML character references in the keyword parameter.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="string('0a')"></assign>
<assign to="keywordtype1">hex</assign>
<assign to="replace1" from="string('0d0a')"></assign>
<assign to="replacetype1">hex</assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
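
The hex values used above can be checked with a small Python sketch. The function name replace_hex is hypothetical, and the sketch assumes that hex mode simply decodes each pair of hex digits into one byte before a byte-level replacement. Under that assumption, a document that already contains CRLF would end up with a doubled CR, so it may be worth confirming how your input is terminated before running the replacement.

```python
def replace_hex(data, keyword_hex, replacement_hex):
    """Decode both hex strings to bytes, then do a literal byte replacement."""
    keyword = bytes.fromhex(keyword_hex)
    replacement = bytes.fromhex(replacement_hex)
    return data.replace(keyword, replacement)


# LF-only input: every 0a becomes 0d0a.
print(replace_hex(b"line one\nline two\n", "0a", "0d0a"))

# Input that already has one CRLF: in this naive sketch the existing
# CR is kept and another CR is added in front of the LF.
print(replace_hex(b"already\r\nconverted\n", "0a", "0d0a"))
```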

---------------Obtain the status report of the service in ProcessData---------------

The example below writes the service status report, which includes the number of replacements the service performed, to the ProcessData node that you specify; in this case, StatusReport/STATRPT.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1">!</assign>
<assign to="replace1" from="string(' ')"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
<assign to="StatusReport" from="Status_Rpt('STATRPT')"></assign>
</input>
</operation>

Example of generated ProcessData:

<StatusReport>
<STATRPT><![CDATA[literal_mode: Processed 19 bytes with 4 replacements in 0 seconds.]]></STATRPT>
</StatusReport>
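
If you need the numbers from the status report outside the business process, a sketch like the following could extract them. The pattern is inferred only from the single sample line above, so treat the message format as an assumption; the name parse_status_report is hypothetical.

```python
import re

# Pattern inferred from the one sample report shown above (an assumption).
STATUS_PATTERN = re.compile(
    r"Processed (?P<bytes>\d+) bytes? with (?P<replacements>\d+) "
    r"replacements? in (?P<seconds>\d+) seconds?"
)


def parse_status_report(text):
    """Return the counts from a status report line, or None if it does not match."""
    match = STATUS_PATTERN.search(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


report = "literal_mode: Processed 19 bytes with 4 replacements in 0 seconds."
print(parse_status_report(report))
```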

---------------Replace XML representation with the character itself---------------

This example replaces the XML representations (&quot; &apos; &lt; &gt; &amp;) with the characters themselves (" ' < > &).

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="'&amp;quot;'"></assign>
<assign to="replace1" from="string('22')"></assign>
<assign to="replacetype1">hex</assign>
<assign to="keyword2" from="'&amp;apos;'"></assign>
<assign to="replace2" from="string('27')"></assign>
<assign to="replacetype2">hex</assign>
<assign to="keyword3" from="'&amp;lt;'"></assign>
<assign to="replace3" from="string('3C')"></assign>
<assign to="replacetype3">hex</assign>
<assign to="keyword4" from="'&amp;gt;'"></assign>
<assign to="replace4" from="string('3E')"></assign>
<assign to="replacetype4">hex</assign>
<assign to="keyword5" from="'&amp;amp;'"></assign>
<assign to="replace5" from="string('26')"></assign>
<assign to="replacetype5">hex</assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>

If you pass a file containing &quot; &apos; &lt; &gt; &amp; to this BP, the resulting file contains " ' < > &
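
The hex codes in this configuration, and the reason &amp; is handled last, can be illustrated with a short Python sketch. It assumes the keyword/replace pairs are applied one after another in numeric order, which the source does not state explicitly; the names ENTITY_TO_HEX and unescape_entities are hypothetical.

```python
# Same keyword/replace pairs as the configuration above, in the same order.
ENTITY_TO_HEX = [
    ("&quot;", "22"),
    ("&apos;", "27"),
    ("&lt;", "3C"),
    ("&gt;", "3E"),
    ("&amp;", "26"),
]


def unescape_entities(text, pairs=ENTITY_TO_HEX):
    """Replace each entity with the byte named by its hex code, pair by pair."""
    data = text.encode("ascii")
    for entity, hex_value in pairs:
        data = data.replace(entity.encode("ascii"), bytes.fromhex(hex_value))
    return data.decode("ascii")


print(unescape_entities("&quot; &apos; &lt; &gt; &amp;"))

# With &amp; last, an escaped entity is unescaped exactly once.
print(unescape_entities("&amp;lt;"))

# With &amp; first (hypothetical reordering), it would be unescaped twice.
print(unescape_entities("&amp;lt;", list(reversed(ENTITY_TO_HEX))))
```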

---------------Replace characters with their XML representation---------------

Special characters (such as <, >, &, ", and ') in XML documents can be replaced with their XML representations using the Document Keyword Replace service.

However, because XML representations used within BPML are converted to the corresponding character, the string mode of Document Keyword Replace does not work in this case.

In other words, if you try to replace the < character with string('&lt;'), the BPML interpreter converts &lt; to the < character before doing the replacement.

So, in effect, it replaces '<' with '<'.

To make the replacement work correctly, use the hex mode of Document Keyword Replace.

The following service configuration replaces the < character with the ASCII characters of the XML representation &lt; (26 = &, 6C = l, 74 = t, and 3B = ;).

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="string('&lt;')"></assign>
<assign to="replace1">266C743B</assign>
<assign to="replacetype1">hex</assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>

If you pass a file containing '<<<' to this BP, the resulting file contains '&lt;&lt;&lt;'.
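
Both halves of this explanation can be verified with a short Python sketch: a string-mode value collapses to the bare character once the XML is parsed, while the hex value decodes to the four characters of the entity. Python's xml.etree.ElementTree stands in for XML parsing in general here; the helper names are hypothetical, and the exact behavior of the BPML interpreter is an assumption.

```python
import xml.etree.ElementTree as ET


def parsed_attribute(element_xml, name):
    """Return an attribute value after the XML parser has resolved entities."""
    return ET.fromstring(element_xml).get(name)


def hex_to_bytes(hex_value):
    """Decode a hex string such as 266C743B into raw bytes."""
    return bytes.fromhex(hex_value)


# String mode: the entity is resolved before the service ever sees it.
print(parsed_attribute("<assign to=\"replace1\" from=\"string('&lt;')\"/>", "from"))

# Hex mode: the replacement is the literal four-byte sequence & l t ;
replacement = hex_to_bytes("266C743B")
print(replacement)
print(b"<<<".replace(b"<", replacement))
```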

---------------Use the keyword type to specify the encoding---------------

For some characters, the service does not perform any replacement even though you specified the characters correctly and the Primary Document contains them.

For this type of issue, try specifying the encoding of the string to be replaced.

You can do this with the keywordtype parameter.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="string('&#xC1;')"/>
<assign to="keywordtype1">UTF-8</assign>
<assign to="replace1">A</assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
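
One plausible reason for such a mismatch, offered here as an assumption rather than a documented cause, is that the same character has different byte sequences in different encodings. The Python sketch below uses the character from the example above (U+00C1); the function name replace_encoded and the sample document are hypothetical.

```python
def replace_encoded(document, keyword, replacement, encoding):
    """Replace keyword in a byte document, encoding both strings with one encoding."""
    return document.replace(keyword.encode(encoding), replacement.encode(encoding))


# A document stored as UTF-8 that starts with U+00C1.
document = "\u00c1rea".encode("utf-8")

# The same character is one byte in ISO-8859-1 but two bytes in UTF-8.
print("\u00c1".encode("iso-8859-1"))
print("\u00c1".encode("utf-8"))

# Searching with the wrong encoding finds nothing; the right one matches.
print(replace_encoded(document, "\u00c1", "A", "iso-8859-1"))
print(replace_encoded(document, "\u00c1", "A", "utf-8"))
```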

---------------Replacing invalid XML Characters---------------

Some characters, such as the bell character (hex 07), are not valid XML characters.

Using <assign to="keyword1" from="string('&#x07;')"></assign> is the correct procedure for assigning most hex values to a keyword, but it does not work for the bell character.

Because the bell character is invalid in XML, no replacement is done if you follow the usual procedure for assigning hex values to keywords. Use the keywordtype parameter instead.

<operation name="DocKeywordReplace">

<participant name="MyDocKeywordReplace"/>
<output message="outmsg">
<assign to="." from="*"></assign>
<assign to="literal_mode">true</assign>
<assign to="literal_bufferSize">102400</assign>
<assign to="literal_readAheadSize">8192</assign>
<assign to="keyword1" from="string('07')"/>
<assign to="keywordtype1">hex</assign>
<assign to="replace1">A</assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
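
The difference between the two forms can be seen with a generic XML parser. In the Python sketch below, xml.etree.ElementTree rejects the character reference for the bell character, while the plain hex string 07 parses without problems and decodes to the same byte. How the BPML engine itself reacts to the invalid reference is not shown here; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if a standard XML parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def hex_keyword(hex_value):
    """Decode a hex keyword such as 07 into the byte it stands for."""
    return bytes.fromhex(hex_value)


# A character reference to the bell character is not allowed in XML 1.0.
print(is_well_formed("<assign to=\"keyword1\" from=\"string('&#x07;')\"/>"))

# The hex form contains only ordinary characters, so it parses cleanly.
print(is_well_formed("<assign to=\"keyword1\" from=\"string('07')\"/>"))
print(hex_keyword("07"))
```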

---------------Removing an XML prolog---------------

Some XML documents contain an XML prolog (or preamble) that might cause issues when processing the file.

It is possible to remove the prolog from an XML document using Document Keyword Replace, but the keyword must match the prolog exactly, including its attributes.

The following Document Keyword Replace service removes the standard prolog created by the DOMToDoc function.

If the prolog looks like this:

<?xml version="1.0" encoding="UTF-8"?>

This Document Keyword Replace service configuration will remove it:

<operation name="Document Keyword Replace">

<participant name="DocKeywordReplace"/>
<output message="DocKeywordReplaceInputMessage">
<assign to="literal_mode">true</assign>
<assign to="keyword1" from="string(&apos;&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;&apos;)"></assign>
<assign to="replace1" from="string('')"></assign>
<assign to="." from="*"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
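
The exact-match requirement can be illustrated with a small Python sketch. It treats the removal as a literal substring replacement, which is an assumption about the service, and the function name strip_prolog is hypothetical. A prolog that differs in quoting or attributes is left untouched.

```python
PROLOG = '<?xml version="1.0" encoding="UTF-8"?>'


def strip_prolog(document, prolog=PROLOG):
    """Remove the prolog only where it matches character for character."""
    return document.replace(prolog, "")


# Exact match: the prolog is removed.
print(strip_prolog('<?xml version="1.0" encoding="UTF-8"?><root/>'))

# Single quotes instead of double quotes: no match, nothing is removed.
print(strip_prolog("<?xml version='1.0' encoding='UTF-8'?><root/>"))
```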

---------------Replacing a string with the contents of another document---------------

Using Document mode with the Document Keyword Replace service allows you to replace a string in the PrimaryDocument with the complete contents of another document in ProcessData.

For that to work, either set the keystart and keyend parameters in the service BPML configuration or use the service default keystart and keyend values. See the examples below.

Note: The replace1 parameter should point to the ProcessData node that holds the document you want to replace your string with.

Example 1: Assigning a specific keystart and keyend

Assuming that the string to be replaced in your PrimaryDocument looks like: ThisIsTheStringToBeReplaced

From that string, you need to assign a keystart, a keyend, and a keyword. This example assigns 'This' as the keystart, 'Replaced' as the keyend, and the remaining 'IsTheStringToBe' as the keyword.

Your service BPML configuration should look like:

<operation name="Document Keyword Replace">

<participant name="SyncEngine_DocKeywordReplace"/>
<output message="DocKeywordReplaceInputMessage">
<assign to="literal_mode">false</assign>
<assign to="mode">Document</assign>
<assign to="keyword1">IsTheStringToBe</assign>
<assign to="replace1">/ProcessData/PrimDocSave</assign>
<assign to="keystart">This</assign>
<assign to="keyend">Replaced</assign>
<assign to="." from="*"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>

Example 2: Using the default keystart and keyend

You will need to make sure that the string to be replaced on your PrimaryDocument contains the default keystart and keyend values.

It should look like: ${ThisIsTheStringToBeReplaced}

Your service BPML configuration should look like:

<operation name="Document Keyword Replace">

<participant name="SyncEngine_DocKeywordReplace"/>
<output message="DocKeywordReplaceInputMessage">
<assign to="literal_mode">false</assign>
<assign to="mode">Document</assign>
<assign to="keyword1">ThisIsTheStringToBeReplaced</assign>
<assign to="replace1">/ProcessData/PrimDocSave</assign>
<assign to="." from="*"></assign>
</output>
<input message="inmsg">
<assign to="." from="*"></assign>
</input>
</operation>
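
The two Document mode examples can be pictured with the following Python sketch. It assumes the service looks for keystart, keyword, and keyend joined together and replaces that whole marker with the other document's contents, and it takes ${ and } as the defaults because that is what Example 2 suggests. The function name and sample strings are hypothetical.

```python
def replace_with_document(primary, keyword, replacement_doc, keystart="${", keyend="}"):
    """Replace keystart + keyword + keyend with the contents of another document."""
    marker = keystart + keyword + keyend
    return primary.replace(marker, replacement_doc)


# Example 1: explicit keystart and keyend.
print(replace_with_document(
    "A ThisIsTheStringToBeReplaced B",
    "IsTheStringToBe",
    "DOC",
    keystart="This",
    keyend="Replaced",
))

# Example 2: default keystart and keyend.
print(replace_with_document(
    "Header\n${ThisIsTheStringToBeReplaced}\nFooter",
    "ThisIsTheStringToBeReplaced",
    "BODY",
))
```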

Please leave a comment if you have interesting usage examples of the Document Keyword Replace service that would be useful additions to this blog.
