Analyzing statistics on an existing spatial grid index
can tell you whether the index is efficient, or whether it should
be replaced by a more efficient index.
Before you begin
Before you can analyze the data that you want to index:
About this task
Use the Index Advisor to obtain statistics
on an existing spatial grid indexes. Analyze these statistics and
determine if you should replace any indexes.
Tip: Equally important to tuning
your index is verifying that it is being used by your queries. To
determine if a spatial index is being used, run a command line tool such as db2exfmt on
your query. In the Access Plan
section of the explain output,
if you see an EISCAN operator and the name of your spatial index,
then the query uses your index.
Procedure
To analyze statistics on an existing spatial grid indexes
and determine whether they should be replaced by more efficient indexes:
Obtain statistics on a spatial grid index and, if necessary,
to replace the index:
- Have the Index Advisor collect statistics based on the
grid cell sizes of the existing index. You can ask for statistics
on either a subset of the indexed data or all of the data.
- To obtain statistics on indexed data in a subset of rows, enter
the gseidx command and specify the ANALYZE keyword
and its parameters in addition to the existing-index clause and DETAIL keyword.
You can specify either the number or percentage of rows that the Index
Advisor is to analyze to obtain statistics. For example, to obtain
statistics on a subset of the data indexed by the COUNTIES_SHAPE_IDX
index, enter:
gseidx CONNECT TO mydb USER userID USING password GET GEOMETRY
STATISTICS FOR INDEX userID.counties_shape_idx DETAIL ANALYZE 25 PERCENT
ADVISE
- To obtain statistics on all indexed data, enter the gseidx command
and specify its existing-index clause. Include the DETAIL keyword.
For example, to invoke the Index Advisor for the COUNTIES_SHAPE_IDX
index, enter:
gseidx CONNECT TO mydb USER userID USING password GET GEOMETRY
STATISTICS FOR INDEX userID.counties_shape_idx DETAIL SHOW HISTOGRAM ADVISE
The Index Advisor returns statistics, a histogram of the
data, and recommended cell sizes for the existing index. For example,
the preceding
gseidx command for all data indexed
by COUNTIES_SHAPE_IDX returns the following statistics:
Grid Level 1
------------
Grid Size : 0.5
Number of Geometries : 2936
Number of Index Entries : 12197
Number of occupied Grid Cells : 2922
Index Entry/Geometry ratio : 4.154292
Geometry/Grid Cell ratio : 1.004791
Maximum number of Geometries per Grid Cell: 14
Minimum number of Geometries per Grid Cell: 1
Index Entries : 1 2 3 4 10
--------------- ------ ------ ------ ------ ------
Absolute : 86 564 72 1519 695
Percentage (%): 2.93 19.21 2.45 51.74 23.67
Grid Level 2
------------
Grid Size : 0.0
No geometries indexed on this level.
Grid Level 3
------------
Grid Size : 0.0
No geometries indexed on this level.
Grid Level X
------------
Number of Geometries : 205
Number of Index Entries : 205
- Determine how well the grid cell sizes of the existing
index facilitate retrieval. Assess the statistics returned in the
previous step.
The index statistics obtained in the previous step
for the COUNTIES_SHAPE_IDX indicate that the grid sizes (0.5, 0, 0)
are not appropriate for the data in this column because:
- For Grid Level 1, the
Index Entry/Geometry ratio
value 4.154292
is
greater than the guideline of 4. The Index Entries
line
has the values 1, 2, 3, 4, and 10, which indicates the number of index
entries per geometry. The Absolute
values below each Index
Entries
column indicates the number of geometries that have that
specific number of index entries. For example, the output in the previous
step shows 1519 geometries have 4 index entries. The Absolute
value
for 10 index entries is 695 which indicates that 695 geometries have
between 5 and 10 index entries.
- The appearance of the
Grid Level X
section indicates that
an overflow index level exists. The statistics show that 205 geometries
have more than 10 index entries each.
- If the statistics are not satisfactory, look at the
Histogram section and the appropriate rows in the Query Window Size
and Suggested Grid Sizes columns in the Index Advisor output.
- Find the MBR size with the largest number of geometries. The
Histogram
section
lists the MBR sizes and the number of geometries that have that MBR
size. In the following sample histogram, the largest number of geometries
(437) is in MBR size 0.5.
Histogram:
----------
MBR Size Geometry Count
-------------------- --------------------
0.040000 1
0.045000 3
0.050000 1
0.055000 3
0.060000 3
0.070000 4
0.075000 3
0.080000 4
0.085000 1
0.090000 2
0.095000 1
0.150000 10
0.200000 9
0.250000 15
0.300000 23
0.350000 83
0.400000 156
0.450000 282
0.500000 437
0.550000 397
0.600000 341
0.650000 246
0.700000 201
0.750000 154
0.800000 120
0.850000 66
0.900000 79
0.950000 59
1.000000 47
1.500000 230
2.000000 89
2.500000 34
3.000000 10
3.500000 5
4.000000 3
5.000000 3
5.500000 2
6.000000 2
6.500000 3
7.000000 2
8.000000 1
15.000000 3
25.000000 2
30.000000 1
- Go to the Query Window Size row with the value 0.5 to obtain the
suggested grid sizes (1.4, 3.5, 14.0).
Query Window Size Suggested Grid Sizes Cost
----------------- -------------------------- ----
0.1 0.7, 2.8, 14.0 2.7
0.2 0.7, 2.8, 14.0 2.9
0.5 1.4, 3.5, 14.0 3.5
1 1.4, 3.5, 14.0 4.8
2 1.4, 3.5, 14.0 8.2
5 1.4, 3.5, 14.0 24
10 2.8, 8.4, 21.0 66
20 4.2, 14.7, 37.0 190
50 7.0, 14.0, 70.0 900
100 42.0, 0, 0 2800
- Verify that the recommended sizes meet the guidelines
in step 2. Run the gseidx command with the suggested
grid sizes:
gseidx CONNECT TO mydb USER userID USING password GET GEOMETRY
STATISTICS FOR COLUMN userID.counties(shape) USING GRID SIZES (1.4, 3.5, 14.0)
Grid Level 1
------------
Grid Size : 1.4
Number of Geometries : 3065
Number of Index Entries : 5951
Number of occupied Grid Cells : 513
Index Entry/Geometry ratio : 1.941599
Geometry/Grid Cell ratio : 5.974659
Maximum number of Geometries per Grid Cell: 42
Minimum number of Geometries per Grid Cell: 1
Index Entries : 1 2 3 4 10
--------------- ------ ------ ------ ------ ------
Absolute : 1180 1377 15 493 0
Percentage (%): 38.50 44.93 0.49 16.08 0.00
Grid Level 2
------------
Grid Size : 3.5
Number of Geometries : 61
Number of Index Entries : 143
Number of occupied Grid Cells : 56
Index Entry/Geometry ratio : 2.344262
Geometry/Grid Cell ratio : 1.089286
Maximum number of Geometries per Grid Cell: 10
Minimum number of Geometries per Grid Cell: 1
Index Entries : 1 2 3 4 10
--------------- ------ ------ ------ ------ ------
Absolute : 15 28 0 18 0
Percentage (%): 24.59 45.90 0.00 29.51 0.00
Grid Level 3
------------
Grid Size : 14.0
Number of Geometries : 15
Number of Index Entries : 28
Number of occupied Grid Cells : 9
Index Entry/Geometry ratio : 1.866667
Geometry/Grid Cell ratio : 1.666667
Maximum number of Geometries per Grid Cell: 10
Minimum number of Geometries per Grid Cell: 1
Index Entries : 1 2 3 4 10
--------------- ------ ------ ------ ------ ------
Absolute : 7 5 1 2 0
Percentage (%): 46.67 33.33 6.67 13.33 0.00
The statistics now show values within the guidelines:
- The
Index Entry/Geometry ratio
values are 1.941599 for
Grid Level 1, 2.344262 for Grid Level 2, and 1.866667 for Grid Level
3. These values are all within the guideline value range of 1 to 4.
- The absence of the
Grid Level X
section indicates that
no index entries are in the overflow level.
- Drop the existing index and replace it with an index
that specifies the advised grid sizes. For the sample in the previous
step, run the following DDL statements:
DROP INDEX userID.counties_shape_idx;
CREATE INDEX counties_shape_idx ON userID.counties(shape) EXTEND USING
DB2GSE.SPATIAL_INDEX(1.4,3.5,14.0);