Analyzing spatial grid index statistics

Analyzing statistics on an existing spatial grid index can tell you whether the index is efficient, or whether it should be replaced by a more efficient index.

Before you begin

Before you can analyze the data that you want to index:

  • Your user ID must hold the SELECT privilege on this table.
  • If your table has more than one million rows, you might want to use the ANALYZE clause to analyze a subset of the rows to have reasonable processing time. You must have a USER TEMPORARY table space available to use the ANALYZE clause. Set the page size of this table space to at least 8 KB and ensure that you have USE privileges on it. For example, the following DDL statements create a buffer pool with the same page size as the user temporary table space and grant the USE privilege to anyone:
    CREATE BUFFERPOOL bp8k  SIZE 1000 PAGESIZE 8 K; 
    CREATE USER TEMPORARY TABLESPACE usertempts
       PAGESIZE 8K
       MANAGED BY SYSTEM USING ('c:\tempts')
       BUFFERPOOL bp8k
    GRANT USE OF TABLESPACE usertempts TO PUBLIC;
    

About this task

Use the Index Advisor to obtain statistics on an existing spatial grid indexes. Analyze these statistics and determine if you should replace any indexes.

Tip: Equally important to tuning your index is verifying that it is being used by your queries. To determine if a spatial index is being used, run a command line tool such as db2exfmt on your query. In the Access Plan section of the explain output, if you see an EISCAN operator and the name of your spatial index, then the query uses your index.

Procedure

To analyze statistics on an existing spatial grid indexes and determine whether they should be replaced by more efficient indexes:

Obtain statistics on a spatial grid index and, if necessary, to replace the index:
  1. Have the Index Advisor collect statistics based on the grid cell sizes of the existing index. You can ask for statistics on either a subset of the indexed data or all of the data.
    • To obtain statistics on indexed data in a subset of rows, enter the gseidx command and specify the ANALYZE keyword and its parameters in addition to the existing-index clause and DETAIL keyword. You can specify either the number or percentage of rows that the Index Advisor is to analyze to obtain statistics. For example, to obtain statistics on a subset of the data indexed by the COUNTIES_SHAPE_IDX index, enter:
      gseidx CONNECT TO mydb USER userID USING password GET GEOMETRY 
      STATISTICS FOR INDEX userID.counties_shape_idx DETAIL ANALYZE 25 PERCENT
       ADVISE
      
    • To obtain statistics on all indexed data, enter the gseidx command and specify its existing-index clause. Include the DETAIL keyword. For example, to invoke the Index Advisor for the COUNTIES_SHAPE_IDX index, enter:
      
      gseidx CONNECT TO mydb USER userID USING password GET GEOMETRY 
      STATISTICS FOR INDEX userID.counties_shape_idx DETAIL SHOW HISTOGRAM ADVISE
      
    The Index Advisor returns statistics, a histogram of the data, and recommended cell sizes for the existing index. For example, the preceding gseidx command for all data indexed by COUNTIES_SHAPE_IDX returns the following statistics:
    Grid Level 1
    ------------
    
    Grid Size                     : 0.5
    Number of Geometries          : 2936
    Number of Index Entries       : 12197
    
    Number of occupied Grid Cells : 2922
    Index Entry/Geometry ratio    : 4.154292
    Geometry/Grid Cell ratio      : 1.004791
    Maximum number of Geometries per Grid Cell: 14
    Minimum number of Geometries per Grid Cell: 1
    
    Index Entries :  1      2      3      4      10
    ---------------  ------ ------ ------ ------ ------
    Absolute      :  86     564    72     1519   695
    Percentage (%):  2.93   19.21  2.45   51.74  23.67
    
    
    
    Grid Level 2
    ------------
    
    Grid Size                     : 0.0
    No geometries indexed on this level.
    
    
    Grid Level 3
    ------------
    
    Grid Size                     : 0.0
    No geometries indexed on this level.
    
    
    Grid Level X
    ------------
    
    Number of Geometries          : 205
    Number of Index Entries       : 205
    
  2. Determine how well the grid cell sizes of the existing index facilitate retrieval. Assess the statistics returned in the previous step.
    Tip:
    • The statistic Index Entry/Geometry ratio should be a value in the range of 1 to 4, preferably values closer to 1.
    • The number of index entries per geometry should be less that 10 at the largest grid size to avoid the overflow level.

      The appearance of the Grid Level X section in the Index Advisor output indicates that an overflow level exists.

    The index statistics obtained in the previous step for the COUNTIES_SHAPE_IDX indicate that the grid sizes (0.5, 0, 0) are not appropriate for the data in this column because:
    • For Grid Level 1, the Index Entry/Geometry ratiovalue 4.154292 is greater than the guideline of 4.

      The Index Entries line has the values 1, 2, 3, 4, and 10, which indicates the number of index entries per geometry. The Absolute values below each Index Entries column indicates the number of geometries that have that specific number of index entries. For example, the output in the previous step shows 1519 geometries have 4 index entries. The Absolute value for 10 index entries is 695 which indicates that 695 geometries have between 5 and 10 index entries.

    • The appearance of the Grid Level X section indicates that an overflow index level exists. The statistics show that 205 geometries have more than 10 index entries each.
  3. If the statistics are not satisfactory, look at the Histogram section and the appropriate rows in the Query Window Size and Suggested Grid Sizes columns in the Index Advisor output.
    1. Find the MBR size with the largest number of geometries. The Histogram section lists the MBR sizes and the number of geometries that have that MBR size. In the following sample histogram, the largest number of geometries (437) is in MBR size 0.5.
      
      Histogram:
      ----------
          MBR Size             Geometry Count
          -------------------- --------------------
                      0.040000                    1
                      0.045000                    3
                      0.050000                    1
                      0.055000                    3
                      0.060000                    3
                      0.070000                    4
                      0.075000                    3
                      0.080000                    4
                      0.085000                    1
                      0.090000                    2
                      0.095000                    1
                      0.150000                   10
                      0.200000                    9
                      0.250000                   15
                      0.300000                   23
                      0.350000                   83
                      0.400000                  156
                      0.450000                  282
                      0.500000                  437
                      0.550000                  397
                      0.600000                  341
                      0.650000                  246
                      0.700000                  201
                      0.750000                  154
                      0.800000                  120
                      0.850000                   66
                      0.900000                   79
                      0.950000                   59
                      1.000000                   47
                      1.500000                  230
                      2.000000                   89
                      2.500000                   34
                      3.000000                   10
                      3.500000                    5
                      4.000000                    3
                      5.000000                    3
                      5.500000                    2
                      6.000000                    2
                      6.500000                    3
                      7.000000                    2
                      8.000000                    1
                     15.000000                    3
                     25.000000                    2
                     30.000000                    1
      
    2. Go to the Query Window Size row with the value 0.5 to obtain the suggested grid sizes (1.4, 3.5, 14.0).
      
      Query Window Size  Suggested Grid Sizes             Cost
      -----------------  --------------------------       ----
             0.1          0.7,       2.8,      14.0        2.7
             0.2          0.7,       2.8,      14.0        2.9
             0.5          1.4,       3.5,      14.0        3.5
               1          1.4,       3.5,      14.0        4.8
               2          1.4,       3.5,      14.0        8.2
               5          1.4,       3.5,      14.0         24
              10          2.8,       8.4,      21.0         66
              20          4.2,      14.7,      37.0        190
              50          7.0,      14.0,      70.0        900
             100         42.0,         0,         0       2800
      
  4. Verify that the recommended sizes meet the guidelines in step 2. Run the gseidx command with the suggested grid sizes:
    gseidx CONNECT TO mydb USER userID USING  password GET GEOMETRY
     STATISTICS FOR COLUMN userID.counties(shape) USING GRID SIZES (1.4, 3.5, 14.0)
    
    Grid Level 1
    ------------
    
    Grid Size                     : 1.4
    Number of Geometries          : 3065
    Number of Index Entries       : 5951
    
    Number of occupied Grid Cells : 513
    Index Entry/Geometry ratio    : 1.941599
    Geometry/Grid Cell ratio      : 5.974659
    Maximum number of Geometries per Grid Cell: 42
    Minimum number of Geometries per Grid Cell: 1
    
    Index Entries :  1      2      3      4      10
    ---------------  ------ ------ ------ ------ ------
    Absolute      :  1180   1377   15     493    0
    Percentage (%):  38.50  44.93  0.49   16.08  0.00
    
    
    Grid Level 2
    ------------
    
    Grid Size                     : 3.5
    Number of Geometries          : 61
    Number of Index Entries       : 143
    
    Number of occupied Grid Cells : 56
    Index Entry/Geometry ratio    : 2.344262
    Geometry/Grid Cell ratio      : 1.089286
    Maximum number of Geometries per Grid Cell: 10
    Minimum number of Geometries per Grid Cell: 1
    
    Index Entries :  1      2      3      4      10
    ---------------  ------ ------ ------ ------ ------
    Absolute      :  15     28     0      18     0
    Percentage (%):  24.59  45.90  0.00   29.51  0.00
    
    
    Grid Level 3
    ------------
    
    Grid Size                     : 14.0
    Number of Geometries          : 15
    Number of Index Entries       : 28
    
    Number of occupied Grid Cells : 9
    Index Entry/Geometry ratio    : 1.866667
    Geometry/Grid Cell ratio      : 1.666667
    Maximum number of Geometries per Grid Cell: 10
    Minimum number of Geometries per Grid Cell: 1
    
    Index Entries :  1      2      3      4      10
    ---------------  ------ ------ ------ ------ ------
    Absolute      :  7      5      1      2      0
    Percentage (%):  46.67  33.33  6.67   13.33  0.00
    
    The statistics now show values within the guidelines:
    • The Index Entry/Geometry ratio values are 1.941599 for Grid Level 1, 2.344262 for Grid Level 2, and 1.866667 for Grid Level 3. These values are all within the guideline value range of 1 to 4.
    • The absence of the Grid Level X section indicates that no index entries are in the overflow level.
  5. Drop the existing index and replace it with an index that specifies the advised grid sizes. For the sample in the previous step, run the following DDL statements:
    DROP INDEX userID.counties_shape_idx;
    CREATE INDEX counties_shape_idx ON userID.counties(shape) EXTEND USING
    		DB2GSE.SPATIAL_INDEX(1.4,3.5,14.0);