Collection of statistics on XML objects

You can use separate RUNSTATS control statements to collect statistics on XML table spaces, or on their associated base table spaces.

You can specify that RUNSTATS collects space statistics on an XML table space. You can use those statistics to determine when the XML table space must be reorganized. Statistics that are collected on the XML table space also affect access path selection.

The following tables shows statistics values that are collected for the implicitly created XML DOCID columns. The

Table 1. Statistics that are collected on the XML DOCID column.
Catalog Table Name Column Name Meaning
SYSCOLSTATS COLCARD The number of distinct values in the XML column. This value is set to -2 for other types of XML objects.
SYSCOLSTATS HIGHKEY The highest value of the column. This value is set to blanks for other types of XML objects.
SYSCOLSTATS HIGH2KEY The second highest value of the column. This value is set to blanks for other types of XML objects.
SYSCOLSTATS LOW2KEY The second lowest value of the column. This value is set to blanks for other types of XML objects.
SYSCOLSTATS LOWKEY The lowest value of the column. This value is set to blanks for other types of XML objects.
SYSCOLUMNS COLCARDF Estimated number of distinct values in the column. This value is set to -2 for other types of XML objects.
SYSCOLUMNS HIGH2KEY Second highest value in the column. This value is set to blanks for other types of XML objects.
SYSCOLUMNS LOW2KEY Second lowest value in the column. This value is set to blanks for other types of XML objects.

The following table shows that statistics values that are collected when RUNSTATS is run for XML indexes:

Table 2. Statistics that RUNSTATS collects for XML indexes
Catalog Table Name Column Name Meaning
SYSINDEXES FIRSTKEYCARDF The number of unique key values in the index.
SYSINDEXES FULLKEYCARDF The total number of key entries in the index.
SYSINDEXES FIRSTKEYCARDF The number of documents in the XML column. This value is collected only for implicitly created node ID indexes.
SYSINDEXES CLUSTERRATIOF This value is set to -2.
SYSKEYTARGETS HIGH2KEY For KEYSEQ=1, the second-highest key value.
SYSKEYTARGETS LOW2KEY For KEYSEQ=1, the second-lowest key value.
SYSKEYTARGETS CARDF The number of unique DOCID values in the index.
SYSKEYTARGETS KEYCARDF This value is set to -2.
Start of changeYou can also specify that RUNSTATS collects frequency and histogram statistics for the key values of XML indexes. For example, each of the following example RUNTSTATS control statements specifies that Db2 collects frequency and histogram statistics for XML indexes:
RUNSTATS
	INDEX(ALL) FREQVAL NUMCOLS 1 COUNT 10 MOST
		HISTOGRAM NUMCOLS 1 NUMQUANTITLES 5 
RUNSTATS
	INDEX (xml-index-name) 
FREQVAL NUMCOLS 1 COUNT 10 MOST
		HISTOGRAM NUMCOLS 1 NUMQUANTITLES 5
LISTDEF MYLIST1 INCLUDE TABLESPACES DATABASE MYDB1  

RUNSTATS TABLESPACE LIST MYLIST1 TABLE(ALL) INDEX(ALL)    
         FREQVAL NUMCOLS 1 COUNT 10 MOST                  
         HISTOGRAM NUMCOLS 1 NUMQUANTILES 5 
End of change

Start of changeRUNSTATS collects the following frequency statistics for XML indexes:End of change

Start of change
Table 3. Frequency statistics that RUNSTATS collects for XML indexes
Catalog Table Name Column Name Meaning
SYSKEYTGTDIST KEYVALUE The frequently occurring value in the distribution (most frequent means the key appear the largest percentage in XML value index).
SYSKEYTGTDIST FREQUENCYF The percentage of index entries that contain the value that is identified in the KEYVALUE column.
SYSINDEXES TYPE The type of statistic. 'F' for frequent value
End of change

Start of changeRUNSTATS collects the following histogram statistics for XML indexes:End of change

Start of change
Table 4. Histogram statistics that RUNSTATS collects for XML indexes
Catalog Table Name Column Name Meaning
SYSKEYTGTDIST CARDF The number of distinct values in the column group of the interval that is identified by the value of the QUANTILENO column.
SYSKEYTGTDIST FREQUENCYF The percentage of index entries that contain the value that is identified by the value of the KEYVALUE column.
SYSKEYTGTDIST HIGHVALUE The high bound of the interval.
SYSKEYTGTDIST LOWVALUE The low bound of the interval.
SYSKEYTGTDIST TYPE The type of statistic. 'H' for histogram statistics.
SYSKEYTGTDIST QUANTILENO The ordinary sequence number of the quantile in the whole consecutive value range from low to high.
End of change

XML indexes are related to XML tables, and not to the associated base tables. If you specify a base table space and an XML index in the same RUNSTATS control statement, Db2 generates an error. When you run RUNSTATS against a base table, RUNSTATS collects statistics only for indexes on the base table, including the document ID index.