Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your profile (name, country/region, and company) is displayed to the public and will accompany any content you post. You may update your IBM account at any time.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

developerWorks Community:

  • Close [x]

Manage spatial data with IBM DB2 Spatial Extender, Part 1: Acquiring spatial data and developing applications

Tips and techniques for developing efficient spatial applications

David Adler (dadler@us.ibm.com), Senior Software Engineer, IBM China
David Adler
David Adler has been responsible for the development of spatial database technology in IBM for over 20 years, the past 10 years in DB2 Spatial Extender Development.

Summary:  This tutorial series describes common tasks to manage spatial data with IBM DB2® Spatial Extender, including importing and creating spatial data, constructing and executing spatial queries, working with IBM, third party, and open source spatial tools, tuning performance, and considering special circumstances in a data warehouse environment. In this first article in the series, learn how to acquire spatial data and build applications. Learn how to use shapefiles, spatial data tables, and spatial indexes.

View more content in this series

Date:  16 Feb 2012
Level:  Intermediate PDF:  A4 and Letter (862 KB | 30 pages)Get Adobe® Reader®

Comments:  

Working with spatial indexes

In most database applications it is important to define indexes on columns in order to avoid scanning all the rows of a table and checking each row to see if it meets the criteria of a query. This is especially important with spatial queries, because the spatial checking is more complex than checking simple numeric or character values.

In a relational database, an index is typically created on the values in a column where there is a simple linear sequencing of values from smallest to largest. This allows a binary tree or B-tree index to be created, which allows the database to quickly navigate to a particular value or sequential range of values.

Spatial data is inherently two-dimensional (or greater), which is not directly applicable to a B-tree index, so different techniques have been developed. The approach used in Spatial Extender is a grid index, which associates spatial values with a square grid. Line and polygon spatial values may intersect more than one grid cell, while a point spatial value can intersect at most one grid cell. When defining the grid cell sizes, you need to trade off the need to minimize the number of cells that intersect a spatial value against the need to minimize the number of spatial values contained within a grid cell. In order to efficiently handle spatial values with significantly different sizes, you can specify up to three grid cell sizes. The grid cells are then indexed by Spatial Extender using the DB2 B-tree index.

A spatial index is created with the DB2 create index command with an additional extend using clause, as shown in Listing 9.


Listing 9. Using the create index command
create index test.countiesidx on test.counties(shape)
extend using db2gse.spatial_index(1.2, 3.0, 14.0);

The units to specify the grid cell sizes are the same units as the coordinate system, which are degrees of latitude and longitude for the counties data. As you might expect, the counties have quite a range of sizes, so it makes sense to specify different grid cell sizes that best fit the different county sizes.

A degree of latitude is approximately 70 miles or 110 kilometers. A degree of longitude varies from approximately 70 miles at the equator to about 45 miles or 72 kilometers at 50 degrees of latitude.

Determining spatial index grid cell sizes

Determining an optimal grid cell size or grid cell sizes may not be obvious. In general, for line and polygon spatial values, the grid cell size should be a little larger than the size that would hold an average size line or polygon. For point spatial values, the grid cell size should be about 1/10 of the width of a typical query region.

In order to simplify this determination, two tools are provided that analyze the spatial values in a table and provide recommended grid cell sizes to use when creating the index. Don't worry about getting this precisely correct; the performance will be good for a fairly wide variation in the index parameters.

Using the gseidx index advisor tool

The gseidx command-line tool is provided with the installation of Spatial Extender and is available on all of the supported platforms. Significant information about spatial indexes and the details of gseidx is provided in the DB2 Infocenter in the section Using indexes and views to access spatial data.

Listing 10 shows a simple example of using gseidx.


Listing 10. Contents of advise_counties.sql
!gseidx "connect to sample
  get geometry statistics
  for column test.counties(shape)
  advise";

This command can be executed using the command db2 -tvf advise_counties.sql and will result in the output shown in Listing 11.


Listing 11. Output of db2 -tvf advise_counties.sql command
Number of Rows: 3141
Number of non-empty Geometries: 3141
Number of empty Geometries: 0
Number of null values: 0

Extent covered by data:
    Minimum X: -178.217598
    Maximum X: -66.969271
    Minimum Y: 18.921786
    Maximum Y: 71.406235

Query Window Size:     Suggested Grid Sizes:           Index Entry Cost:
--------------------   -----------------------------   ----------------------
      0.01:              1.2,         3,        14             4.6
      0.02:              1.2,         3,        14             4.6
      0.05:              1.2,         3,        14             4.7
       0.1:              1.2,         3,        14             4.9
       0.2:              1.2,         3,        14             5.3
       0.5:              1.2,         3,        14             6.6
         1:              1.2,         3,        14             9.3
         2:              1.2,         3,        14              16
         5:              1.8,       3.6,        13              49
        10:              2.9,       8.7,        26             130
        20:              4.6,        14,        49             390
        50:               12,        72,         0            1700

The first two sections of Listing 11 provide statistics about the spatial data that was analyzed. The middle three columns of the last section provide suggested grid sizes to use in creating a spatial index.

The first column contains values that correspond to typical query window sizes. In the case of an application that is displaying a map, this is the typical width of the region that would be displayed. For example, if a user would most commonly display a map that was 10 miles across, this would correspond to about 0.25 degrees. Looking at the table, you can see that the suggested grid sizes are the same for query window sizes of 0.2 and 0.5 degrees, so you would use (1.2, 3, 14) to create the index.

The last column isn't particularly important. It is an estimate of the number of index entries that would be referenced in satisfying a query.

Java index advisor

You can also download a Java-based spatial index advisor from the Spatial Extender web site. This tool provides a user interface to select a spatial table and column to analyze. See Resources for where to obtain this tool.

An example of the user interface for the Java index advisor is shown in Figure 1.


Figure 1. Java index advisor user interface
Index Advisor

The algorithm used in this index advisor varies from the gseidx tool and results in different recommendations. However, the suggested values should work as well. The cost value is the estimated number of index pages that would be referenced, not the number of index entries as reported by the gseidx tool.

5 of 11 | Previous | Next

Comments



static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=793988
TutorialTitle=Manage spatial data with IBM DB2 Spatial Extender, Part 1: Acquiring spatial data and developing applications
publish-date=02162012
author1-email=dadler@us.ibm.com
author1-email-cc=