Topic
  • 5 replies
  • Latest Post - ‏2013-12-18T09:13:50Z by MatthiasNicola
AlanDovey
AlanDovey
4 Posts

Pinned topic Using XML Indexes with XMLTABLE

‏2013-11-19T08:33:50Z |

My question is one of performance and trying to use XML indexes instead of shredding to relational tables.

Due to an application restraint, I am creating views that return data from XML documents using XMLTABLE..

Here is very simplyfied DDL.

CREATE TABLE T1 (KEY SMALLINT NOT NULL, XMLDOC XML);

CREATE UNIQUE INDEX I1 ON T1 (KEY);

CREATE TABLE T2 (KEY SMALLINT NOT NULL, XMLDOC XML);

CRETE UNIQUE INDEX I2 ON T2 (KEY);

CREATE VIEW V1 (KEY, USERNAME, ORDERNUMBER) AS (

    WITH R1 AS (Select T1.KEY, X1.* from T1, XMLTABLE('$XMLDOC/Request' COLUMNS UserName VARCHAR(40) PATH 'UserName') X1),

    WITH R2 AS (Select T2.KEY, X2.* from T2, XMLTABLE('$XMLDOC/Request' COLUMNS OrderNumber INTEGER PATH 'OrderNumber') X2)

    Left Outer Join

    Select R1.KEY, R1.Username, R2.OrderNumber from R1 outer join R2 on r1.key = r2.key);

 

 

The application now needs to filter the view on UserName and/or OrderNumber e.g.

select from v1 where username = ?

I would like to add XML indexes to service these queries but reading the Cookbook and testing with explain would suggest that these indexes would not be used without the application using the XMLEXISTS predicate.

Is this true?  Is the optimizer unable to rewrite the query so that XML Indexes can be used?

  • MatthiasNicola
    MatthiasNicola
    321 Posts

    Re: Using XML Indexes with XMLTABLE

    ‏2013-11-28T09:37:43Z  

    >> I would like to add XML indexes to service these queries but reading the Cookbook and testing with explain would suggest that these indexes would not be used without the application using the XMLEXISTS predicate.

     

    Hi Alan,

    what you say is correct for DB2 9.1 and 9.5, but in DB2 9.7 and higher the DB2 query compiler is enhanced so that SQL predicates against most XMLTABLE views can use underlying XML indexes. This is also explained at the top of page 306 in the DB2 pureXML Cookbook.

    Which version of DB2 are you using?

    Maybe you can try this with a simpler view definition first, without the outer join. For example:

    CREATE VIEW V1 (KEY, USERNAME) AS (
    Select T1.KEY, X1.* from T1, XMLTABLE('$XMLDOC/Request' COLUMNS UserName VARCHAR(40) PATH 'UserName') X1;

     

    On this view, the query "select from v1 where username = ?" should use an XML index defined as VARCHAR on $XMLDOC/Request/UserName .

     

    Does that work?

     

    Thx,

     

    - Matthias

     

  • AlanDovey
    AlanDovey
    4 Posts

    Re: Using XML Indexes with XMLTABLE

    ‏2013-11-28T16:43:46Z  

    Hi

    I'm working on V10.1

    I've simplified things down to the following:

    create table ajd1 (xmldoc xml);
    CREATE INDEX IX_AJD1 ON AJD1(XMLDOC) GENERATE KEY USING XMLPATTERN '/AJD1/SID' AS SQL VARCHAR HASHED collect detailed statistics;
    insert into ajd1 select XMLDOCUMENT(XMLELEMENT(NAME "AJD1", XMLELEMENT(NAME "SID", sessionid))) from event_search;
    runstats on table AJD1 on all columns and detailed indexes all;
    create Or replace view ajdv as (select X.* from AJD1, XMLTABLE('$XMLDOC/AJD1' COLUMNS SessionID CHAR(23) PATH 'SID') AS X);

    explain all for select * from ajdv where sessionid = ? gives:

    TBSCAN on AJD1 and Total Cost = 360302

    explain all for select * from ajd1 where xmlexists('$XMLDOC/AJD1[SID = "?"]') gives:

    XISCAN on IX_AJD1 and Total Cost = 291

     

    Confused!
     

  • MatthiasNicola
    MatthiasNicola
    321 Posts

    Re: Using XML Indexes with XMLTABLE

    ‏2013-11-29T14:20:55Z  
    • AlanDovey
    • ‏2013-11-28T16:43:46Z

    Hi

    I'm working on V10.1

    I've simplified things down to the following:

    create table ajd1 (xmldoc xml);
    CREATE INDEX IX_AJD1 ON AJD1(XMLDOC) GENERATE KEY USING XMLPATTERN '/AJD1/SID' AS SQL VARCHAR HASHED collect detailed statistics;
    insert into ajd1 select XMLDOCUMENT(XMLELEMENT(NAME "AJD1", XMLELEMENT(NAME "SID", sessionid))) from event_search;
    runstats on table AJD1 on all columns and detailed indexes all;
    create Or replace view ajdv as (select X.* from AJD1, XMLTABLE('$XMLDOC/AJD1' COLUMNS SessionID CHAR(23) PATH 'SID') AS X);

    explain all for select * from ajdv where sessionid = ? gives:

    TBSCAN on AJD1 and Total Cost = 360302

    explain all for select * from ajd1 where xmlexists('$XMLDOC/AJD1[SID = "?"]') gives:

    XISCAN on IX_AJD1 and Total Cost = 291

     

    Confused!
     

    Hi Alan,

    for the predicate pushdown to work I think the index needs to be defined as VARCHAR(N) instead of VARCHAR HASHED.

    For example:

     

    create table ajd1 (xmldoc xml);

    CREATE INDEX IX_AJD1 ON AJD1(XMLDOC) GENERATE KEY USING XMLPATTERN '/AJD1/SID' AS SQL VARCHAR(23);

    create Or replace view ajdv AS
      (  select X.*
         from AJD1,
              XMLTABLE('$XMLDOC/AJD1'
                          COLUMNS
                             SessionID CHAR(23) PATH 'SID') AS X
       );

    select * from ajdv where sessionid = ?;

     I just ran this test and the query against the view uses the XML index ix_ajd1.

     

    I also tried the following example which you might find useful because the view V2 resembles the original example in your first post:

     

    CREATE TABLE T1 (KEY SMALLINT NOT NULL, XMLDOC XML);
    CREATE TABLE T2 (KEY SMALLINT NOT NULL, XMLDOC XML);

    CREATE INDEX IX_username ON T1(XMLDOC) GENERATE KEY USING XMLPATTERN '/Request/UserName' AS SQL VARCHAR(40);


    CREATE INDEX IX_orderno ON T2(XMLDOC) GENERATE KEY USING XMLPATTERN '/Request/OrderNumber' AS SQL INTEGER;


    CREATE VIEW V1 AS (
       Select R1.KEY, R1.Username, R2.OrderNumber
       from
         (Select T1.KEY, X1.* from T1, XMLTABLE('$XMLDOC/Request' COLUMNS UserName VARCHAR(40) PATH 'UserName') X1) AS R1,
         (Select T2.KEY, X2.* from T2, XMLTABLE('$XMLDOC/Request' COLUMNS OrderNumber INTEGER PATH 'OrderNumber') X2) As R2
       where r1.key = r2.key );
      
      
    CREATE VIEW V2 AS (
       Select KEY1, Username, OrderNumber
       from
         (Select T1.KEY as key1, X1.* from T1, XMLTABLE('$XMLDOC/Request' COLUMNS UserName VARCHAR(40) PATH 'UserName') X1)
         outer join
         (Select T2.KEY as key2, X2.* from T2, XMLTABLE('$XMLDOC/Request' COLUMNS OrderNumber INTEGER PATH 'OrderNumber') X2)
       ON key1 = key2 );

     

    -- Query 1:
    select * from v1 where username = ? and ordernumber = ?; 

    -- Query 2:
    select * from v2 where username = ? and ordernumber = ?; 

     

    Both Query 1 and Query 2 in this example use both XML indexes, as shown in the db2exfmt output like this:

     

                                    Rows
                                   RETURN
                                   (   1)
                                    Cost
                                     I/O
                                     |
                                      0
                                   HSJOIN
                                   (   2)
                                   13.6731
                                      2
                         /-----------+-----------\
                        0                           0
                     NLJOIN                      NLJOIN
                     (   3)                      (   9)
                     6.83657                     6.83657
                        1                           1
                   /---+----\                  /---+----\
                  0      1.86264e-011         0      1.86264e-011
               FETCH        XSCAN          FETCH        XSCAN
               (   4)       (   8)         (  10)       (  14)
              0.0272551     6.80932       0.0272551     6.80932
                  0            1              0            1
            /----+----\                 /----+----\
           0             0             0             0
        RIDSCN    TABLE: MNICOLA    RIDSCN    TABLE: MNICOLA
        (   5)          T1          (  11)          T2
       0.0269542        Q4         0.0269542        Q3
           0                           0
          |                           |
           0                           0
        SORT                        SORT
        (   6)                      (  12)
       0.0265286                   0.0265286
           0                           0
          |                           |
           0                           0
        XISCAN                      XISCAN
        (   7)                      (  13)
       0.0256359                   0.0256359
           0                           0
          |                           |
           0                           0
    XMLIN: MNICOLA              XMLIN: MNICOLA
      IX_USERNAME                 IX_ORDERNO
          Q4                          Q3

     

    If you now also define relational indexes on T1.KEY and T2.KEY then it becomes harder to do these tests with empty or almost empty tables, because cost-based index selection in the access plan requires a meaningful amount of data and statistics.

     

     

    For what it's worth, here is another example on the 2nd view, and you see that access to table T1 is based on index AND'ing between the relational index I1 and the XML index on username:

     

    CREATE UNIQUE INDEX I1 ON T1 (KEY);

    CREATE UNIQUE INDEX I2 ON T2 (KEY);

    select * from v2 where username = ? and ordernumber = ?;

    -->                            Rows
                                 RETURN
                                 (   1)
                                  Cost
                                   I/O
                                   |
                                6.4e-005
                                 NLJOIN
                                 (   2)
                                 61.6449
                                  9.04
                       /-----------+-----------\
                    0.04                       0.0016
                   NLJOIN                      NLJOIN
                   (   3)                      (   9)
                   34.0821                     27.5627
                      5                         4.04
                   /-+--\                      /-+--\
                  1      0.04               0.04     0.04
               FETCH    XSCAN              FETCH    XSCAN
               (   4)   (   8)             (  10)   (  16)
               6.83701  27.2451           0.317599  27.2451
                  1        4                0.04       4
            /----+----\                 /----+----\
           1            25           0.04           25
        RIDSCN    TABLE: MNICOLA    RIDSCN    TABLE: MNICOLA
        (   5)          T2          (  11)          T1
       0.0271083        Q3         0.0449079        Q4
           0                           0
          |                           |
           1                         0.04
        SORT                        SORT
        (   6)                      (  12)
       0.0266827                   0.0444824
           0                           0
          |                           |
           1                         0.04
        XISCAN                      IXAND
        (   7)                      (  13)
       0.0256359                   0.0435853
           0                           0
          |                     /-----+------\
          25                   1                1
    XMLIN: MNICOLA          IXSCAN           XISCAN
      IX_ORDERNO            (  14)           (  15)
          Q3               0.0149079        0.0256359
                               0                0
                              |                |
                              25               25
                        INDEX: MNICOLA   XMLIN: MNICOLA
                              I1           IX_USERNAME

                              Q4               Q4

     

    This access plan might be different with actual data and statistics, but it illustrates that DB2 is able to use regular indexes and XML indexes on the same table at the same time.

    - Matthias

    Updated on 2013-11-29T14:22:03Z at 2013-11-29T14:22:03Z by MatthiasNicola
  • AlanDovey
    AlanDovey
    4 Posts

    Re: Using XML Indexes with XMLTABLE

    ‏2013-12-17T18:01:42Z  

    Hi Matthias

    Not using VARCHAR HASHED certainly helps and I have been able to progress nicely.

    I am also using Range Partitioning on my tables and have noticed that if I create my XML indexes as PARTITIONED then they do not seem to be considered for access. Is this what you would expect?

    Of course I can create as NOT PARTITIONED but that then affects my rollout performance.

     

  • MatthiasNicola
    MatthiasNicola
    321 Posts

    Re: Using XML Indexes with XMLTABLE

    ‏2013-12-18T09:13:50Z  
    • AlanDovey
    • ‏2013-12-17T18:01:42Z

    Hi Matthias

    Not using VARCHAR HASHED certainly helps and I have been able to progress nicely.

    I am also using Range Partitioning on my tables and have noticed that if I create my XML indexes as PARTITIONED then they do not seem to be considered for access. Is this what you would expect?

    Of course I can create as NOT PARTITIONED but that then affects my rollout performance.

     

    Hi Alan,

     

    glad to hear you were able to make progress.

    Defining XML indexes as partitioned indexes should generally not impact they eligibility for access. 

    If you find that PARTITIONED vs NOT PARTITIONED makes the difference for whether an XML index is being used, and everything is else is equal, then this sounds like a bug. In that case, please open a PMR.

     

    Thanks,

     

    Matthias