XML format
For the rule scoring apply function, item sets and sequences must be encoded as XML strings.
XML string for an item set
The XML string for an item set looks like this:
<itemset>
<item>...</item>
<item>...</item>
...
</itemset>
Example: XML string of an item set
The item
set {cheese, milk, butter} is encoded in the following
XML string:
<itemset>
<item>cheese</item>
<item>milk</item>
<item>butter</item>
</itemset>
XML string for a sequence
The XML-format for a sequence looks like this:
<sequence>
<itemset group="...">
<item>...</item>
<item>...</item>
...
<itemset>
...
<itemset group="...">
<item>...</item>
<item>...</item>
...
<itemset>
</sequence>
Example: XML string of a sequence
The sort order of the item sets is determined by the values of the group attribute.
The
sequence {cheese, milk}>>>{bread, butter} is
encoded in the following XML string:
<sequence>
<itemset group="2005-08-12">
<item>cheese</item>
<item>milk</item>
<itemset>
<itemset group="2005-08-13">
<item>bread</item>
<item>butter</item>
<itemset>
</sequence>
- 2005-08-12
- is the purchase date of the items cheese and milk
- 2005-08-13
- is the purchase date of the items bread and butter
Building the XML strings
If you use Db2 Version 8.2 or higher, you can build the XML strings by using one of the following functions:
- XMLELEMENT
- XMLCONCAT
- XMLSERIALIZE
If
you use XMLSERIALIZE to build the XML string for the item
set {cheese, milk, butter}, the expression looks
like this:
XMLSERIALIZE(CONTENT
XMLELEMENT(NAME "itemset",
XMLCONCAT(
XMLELEMENT(NAME "item", 'cheese'),
XMLELEMENT(NAME "item", 'milk'),
XMLELEMENT(NAME "item", 'butter')))
AS CLOB(512000))
If you use Db2 Version 8.1, you can use XML2CLOB instead of XML serialize.
If you use XML2CLOB, this expression looks like this:
cast(XML2CLOB(
XMLELEMENT(NAME "itemset",
XMLCONCAT(
XMLELEMENT(NAME "item", 'cheese'),
XMLELEMENT(NAME "item", 'milk'),
XMLELEMENT(NAME "item", 'butter'))))
AS CLOB(512000))
SELECT "TRANSID",
XMLSERIALIZE(CONTENT
XMLELEMENT(NAME "itemset",
XMLAGG(XMLELEMENT(NAME
"item", "ITEMID")))
AS CLOB(512000))
FROM "RETAIL_SCORING” GROUP BY "TRANSID"
{cheese, milk}>>>{bread,
butter}, you can use the following calls of XML functions
to build the XML string:XMLSERIALIZE(CONTENT
XMLELEMENT(NAME "sequence",
XMLCONCAT(
XMLELEMENT(NAME "itemset",
XMLATTRIBUTES('2005-08-12' AS "group"),
XMLCONCAT(
XMLELEMENT(NAME "item", 'cheese'),
XMLELEMENT(NAME "item", 'milk')
)),
XMLELEMENT(NAME "itemset",
XMLATTRIBUTES('2005-08-13' AS "group"),
XMLCONCAT(
XMLELEMENT(NAME "item", 'bread'),
XMLELEMENT(NAME "item", 'butter')
))
))
AS CLOB(512000))
WITH "TRANSACTIONS" ("CUSTOMER_ID", "TRANSDATE", "ITEMSET") AS (
SELECT "CUSTOMER_ID", "TRANSDATE",
XMLELEMENT(NAME "itemset",
XMLATTRIBUTES("TRANSDATE" AS "group"),
XMLAGG(XMLELEMENT(NAME "item", "ITEMID")))
FROM "RETAIL_SCORING"
GROUP BY "CUSTOMER_ID", "TRANSDATE"
)
SELECT
“CUSTOMER_ID”,
XMLSERIALIZE( CONTENT XMLELEMENT(NAME "sequence",
XMLAGG( "ITEMSET" ))
AS CLOB(512000)) AS "SEQUENCE"
FROM “TRANSACTIONS”
GROUP BY “CUSTOMER_ID”