Topic
2 replies Latest Post - ‏2013-05-20T17:10:01Z by Sigmazen
Sigmazen
Sigmazen
19 Posts
ACCEPTED ANSWER

Pinned topic XML index - To create or not to create

‏2013-05-14T20:05:58Z |

Hi there.  Apologies if this is a basic question but wanted to confirm the best approach for the following query (which has been simplified for purposes of example).  I am creating xmltables (T1 and T2) from Table E.  I have an index on E.MAIN-ID and therefore pull the required rows out from the data-pages I need.  Assume 50 rows for customer and 50 for card.

For the first part of the query I just need the Customer portion, but the second part I need the Card portion (again ignore the reasoning behind this for now).

The question is this ... if there is no xml-index on TargetingLvl then my assumption is that the DB engine pulls the customer *and* card rows into the bufferpool so the second part of the query does not need to make another data-page call since the data is already in the BP.

Whereas if I create an xml-index (or regular index if I tweaked the data) on TargetingLvl it first retrieves the customer data, does that query, then goes back via the index to the data-page to retrieve the card data.

My assumption is that using an index in this instance would be less efficient?

Does this sound correct?

SELECT T2.ATTRIBUTE_01 AS ATTRIBUTE
  FROM MY_TABLE AS E
     , XMLTABLE('$xml/Treatment' passing E.XML_TX1 as "xml"
        COLUMNS CONTENT_NM VARCHAR(40) PATH 'Content'
              , TARGET_LEVEL VARCHAR(40) PATH 'TargetingLvl'
     ) AS T1
     , XMLTABLE('$xml/Targeting' passing E.XML_TX2 as "xml"
        COLUMNS ATTRIBUTE_01 VARCHAR(40) PATH 'Product'
     ) AS T2
 WHERE E.MAIN_ID = 'Web'
   AND T1.TARGET_LEVEL = 'Customer'
UNION ALL

 

SELECT T2.ATTRIBUTE_02 AS ATTRIBUTE
  FROM MY_TABLE AS E
     , XMLTABLE('$xml/Treatment' passing E.XML_TX1 as "xml"
        COLUMNS CONTENT_NM VARCHAR(40) PATH 'Content'
              , TARGET_LEVEL VARCHAR(40) PATH 'TargetingLvl'
     ) AS T1
     , XMLTABLE('$xml/Targeting' passing E.XML_TX2 as "xml"
        COLUMNS ATTRIBUTE_02 VARCHAR(40) PATH 'Indicator'
     ) AS T2
 WHERE E.MAIN_ID = 'Web'
   AND T1.TARGET_LEVEL = 'Card'

Thanks in advance

Simon

  • MatthiasNicola
    MatthiasNicola
    309 Posts
    ACCEPTED ANSWER

    Re: XML index - To create or not to create

    ‏2013-05-15T23:59:08Z  in response to Sigmazen

    Hi Simon,

    if you only have an index on E.MAIN-ID but no other indexes (esp. no XML indexes) then DB2 will:

    • Fetch all rows for E.MAIN_ID = 'Web' into the bufferpoolvia that index. Let's say that's 100 documents. 
    • Then it will navigate these 100 documents to find those documents where the element TargetingLvl = 'Customer' and extract what's needed.
    • Then for the second leg of your UNION ALL it will access the index on E.MAIN-ID again to get the same 100 documents and will most likely find that they are all in the BP already.
    • Then it will navigate these 100 documents again find those documents where the element TargetingLvl = 'Card' and extract what's needed.

    If you add an XML index on TargetingLvl,  then DB2 has the ability to fetch the documents for "Customer" and "Card" separately, and only those.

    If all documents for E.MAIN_ID = 'Web' have either TargetingLvl = 'Card' or TargetingLvl = 'Card' then the additional XML index will not reduce the I/O in this specific case. However, it would reduce the amount of XML navigation in each of the two subqueries, which might be a moderate performance advantage.

    As a different case, let's assume you have 100 documents for for E.MAIN_ID = 'Web', and let's say only 5 of them have TargetingLvl = 'Card' and another 5 have TargetingLvl = 'Card' and 90 have a different TargetingLvl that is not needed for this query, then the additional XML index can be very beneficial for performance because it means you would fetch 90% fewer documents in the buffer pool.

     - - -

    On a seperate note, maybe you don't need the UNION ALL at all. How about combining into a single SELECT with a CASE expression, as in the following example (not tested, just an idea):

    SELECT CASE WHEN T1.TARGET_LEVEL = 'Customer' THEN T2.ATT_01
                WHEN T1.TARGET_LEVEL = 'Card'     THEN T2.ATT_02
            END   
      FROM MY_TABLE AS E
         , XMLTABLE('$xml/Treatment' passing E.XML_TX1 as "xml"
            COLUMNS CONTENT_NM VARCHAR(40)   PATH 'Content'
                  , TARGET_LEVEL VARCHAR(40) PATH 'TargetingLvl'
         ) AS T1
         , XMLTABLE('$xml/Targeting' passing E.XML_TX2 as "xml"
            COLUMNS ATT_01 VARCHAR(40) PATH 'Product',
            COLUMNS ATT_02 VARCHAR(40) PATH 'Indicator'               
         ) AS T2
     WHERE E.MAIN_ID = 'Web'
     AND XMLEXISTS('$XML_TX1/Treatment[TargetingLvl = ("Customer","Card")]')
    ;

    What do you think?

    - Matthias

    • Sigmazen
      Sigmazen
      19 Posts
      ACCEPTED ANSWER

      Re: XML index - To create or not to create

      ‏2013-05-20T17:10:01Z  in response to MatthiasNicola

      Hi Matthias.

      Thanks for a great explanation.

      In terms of the xml we store, the UI to create it will only allow Customer or Card, so therefore it's either 40/60 50/50 or even 100/0 ... if we get to include other values then we'll definitely include an xmlindex.

      So in summary it looks as thought it will be less IO to *not* have an xmlindex because the records will already be in the BP (which I guess is similar in concept to not indexing a tiny lookup table which takes up less than a page of data).

      The second point is interesting ... the SQL I attached was vastly simplified but I'll have a look at whether the concept works (especially now I'm using some common-table-expressions to simplify the query still further).

      We're still seeing significant cpu increases converting from xmlquery to xmltable though :-(   But they'll be raised through the regular support channels.

      Thanks again for your continued help.

      Cheers

      Simon