IBM Support

Direct I/O and XML Query Performance

Technical Blog Post


Abstract

Direct I/O and XML Query Performance

Body

   When assessing performance for queries,  particularly when performance is insufficient or when doing query performance tuning,  assessing both CPU utilization and bufferpool I/O (especially data,index, and temporary physical reads)  via snapshots or monitor routines are essential metrics which are fundamental to identifying what areas of the query processing are of greatest importance.  When running queries which involve XML data,  direct I/O can also be an important factor in query performance

 

  Queries which have XML operations such as XMLquery predicates,  or which fetch XML column data (for example via XML2CLOB) may end up doing temporary processing which involves direct I/O during query execution.  As a simple example of a single table select which includes XML constructs:  

 

SELECT C1, C2...

       XML2CLOB(my_xml_column) as xml_data

FROM IMAIONE.T1

WHERE <other local predicates> AND

      XMLcast(XMLquery('$d/node1/node2/node3/node4' passing  T1.my_xml_column as "d") as varchar(25)) is null AND
      XMLcast(XMLquery('$d/node1/node2/node3/node5' passing T1.my_xml_column as "d") as varchar(25)) = C4

         

  This query is assumed to contain some regular (non-XML) predicates on T1,  as well as two XMLQUERY predicates which have to be applied on the XML data (in column MY_XML_COLUMN).  The query also selects data from the XML column using the XML2CLOB scalar function.  An explain plan for this query might look similar to the following: 

                250
               NLJOIN
               (   2)
               340.973
               45.0008
               /-+--\
            1000    0.25
           FETCH    XSCAN                                                      
           (   3)   (   5)                                                    
           22.7255  318.247                                                   
           3.00078    42                                                      
         /---+---\                                                            
      1000      100000                                                   
     IXSCAN  TABLE: IMAIONE                                                      
     (   4)        T1                                                      
     15.152        Q4                                                         
        2                                                                    
       |                                                                      
     100000                                                               
 INDEX: IMAIONE                                                                  
       IX5                                                                 
       Q4         
                                                            

 

  The query plan includes an XSCAN operator which corresponds to the fetching of the XML column data to be returned via XMLCLOB, as well as the application of the XMLquery based predicates.   When monitoring this type of query,  check for direct I/O activity associated with the application handle and ensure that the I/O response times look adequate.  Note also that these direct I/Os would generally be associated with temporary tablespace activity,  although there is no explicit temp operation indicated in the explain graph: 

            Application Snapshot


Application handle                         = 32693

Direct reads                               = 1167414
Direct writes                              = 1167414
Direct read requests                       = 25334
Direct write requests                      = 46870
Direct reads elapsed time (ms)             = 9574
Direct write elapsed time (ms)             = 7002210

 

             Tablespace Snapshot

 

Tablespace name                            = TEMPSPACE1
  Tablespace ID                            = 1

 

  Direct reads                             = 9251864
  Direct writes                            = 9270058
  Direct read requests                     = 402363
  Direct write requests                    = 259955
  Direct reads elapsed time (ms)           = 21822
  Direct write elapsed time (ms)           = 11263002

 

  In this example,  although the average direct read time indicated by the snapshots looks fast,  the direct write times (average 7002210 ms/1167414 = ~5ms/write) which is a very high number for write I/O.  This would indicate a potential issue in the I/O subsystem which would likely create a serious bottleneck for the query's performance. 

 

  In addition to ensuring that direct I/O performance of the I/O subsystem is adequate,  in some cases it may be possible to reduce the amount of direct I/O incurred by the query by creating appropriate XML indexes, if applicable.  Note that using XML indexes may not always prevent such an XSCAN however.  Depending on the type of XQuery predicates specified by the query,  this may restrict whether an XML index can be used to apply those predicates.  Also,  if the query selects the full XML column data, as this one does,  an XSCAN may still be required to fetch the data. 

     

 

 

 

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

UID

ibm11140700