Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your profile (name, country/region, and company) is displayed to the public and will accompany any content you post. You may update your IBM account at any time.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

developerWorks Community:

  • Close [x]

Manage spatial data with IBM DB2 Spatial Extender, Part 1: Acquiring spatial data and developing applications

Tips and techniques for developing efficient spatial applications

David Adler (dadler@us.ibm.com), Senior Software Engineer, IBM China
David Adler
David Adler has been responsible for the development of spatial database technology in IBM for over 20 years, the past 10 years in DB2 Spatial Extender Development.

Summary:  This tutorial series describes common tasks to manage spatial data with IBM DB2® Spatial Extender, including importing and creating spatial data, constructing and executing spatial queries, working with IBM, third party, and open source spatial tools, tuning performance, and considering special circumstances in a data warehouse environment. In this first article in the series, learn how to acquire spatial data and build applications. Learn how to use shapefiles, spatial data tables, and spatial indexes.

View more content in this series

Date:  16 Feb 2012
Level:  Intermediate PDF:  A4 and Letter (862 KB | 30 pages)Get Adobe® Reader®

Comments:  

Importing from shapefiles

Obtaining shapefiles

The de-facto standard for storing and exchanging spatial data in a file format is the shapefile representation created by Esri. Esri provides shapefiles of sample data to download and use with Spatial Extender (see Resources), although the shapefile needed for this tutorial is included in Downloads for your convenience. Esri shapefiles represent the following types of data:

  • Country boundaries
  • State boundaries
  • County boundaries
  • Zipcode boundaries and centroids
  • Major highways
  • Cities, airports, hospitals, schools
  • Rivers

You can download many other shapefiles from other sources. There is no charge to use some shapefiles, while some require a license fee.

A shapefile is actually a collection of related files that have the same name but different file extensions. The most common files, and the ones used by Spatial Extender, include the following file types:

<name>.shp
This is a required binary file that contains a header describing the type of data (points, lines, polygons, and other geometries), the number of records, and the actual coordinate data.
<name>.dbf
This is a required binary file that contains the character and numeric attributes associated with each record. This actually uses the DBASE format and can be viewed with applications that support this format. There are significant limitations with this format. The attribute names are limited to 11 characters in length, and character data can be no longer than 255 bytes.
<name>.prj
This is an optional text file that contains the definition of the coordinate system associated with the spatial data. The coordinate system in the .prj file must match the coordinate system of the SRS of the column into which the data is being loaded. If this file is not present, Spatial Extender assumes that the data is in the coordinate system of the specified SRS.
<name>.shx
This is an optional binary file that provides an index into the .shp file. The main use of this file with Spatial Extender is to start importing at a record other than the very first record.

You aren't required to know the details of these files, but if you are interested, follow the link to the Esri specification in the Resources section.


Getting details about shapefiles

You can use the db2se shape_info Spatial Extender command to get information about a particular shapefile, including the type geometries, the geographic extents of the data, and all of the attribute names and types. Listing 2 gives an example of the command used with the sample counties shapefile provided with this tutorial (see Downloads). The list of attributes in Listing 2 has been abbreviated to save space.


Listing 2. Shapefile attributes
db2se shape_info -fileName counties.shp

Shape file information
----------------------
File code                  = 9994
File length (16-bit words) = 758872
Shape file version         = 1000
Shape type                 = 5 (ST_MULTIPOLYGON)
Number of records          = 3141

Minimum X coordinate = -178.217598
Maximum X coordinate = -66.969271
Minimum Y coordinate = 18.921786
Maximum Y coordinate = 71.406235

Shapes do not have Z coordinates.
Shapes do not have M coordinates.
Shape index file (extension .shx) is present.

Attribute file information
--------------------------
dBase file code                = 3
Date of last update            = 2002-02-04
Number of records              = 3141
Number of bytes in header      = 1569
Number of bytes in each record = 461
Number of columns              = 48

Column Number  Column Name      Data Type       Length  Decimal
-------------  ---------------  --------------  ------  -------
            1  NAME             C ( Character)      32        0
            2  STATE_NAME       C ( Character)      25        0
            3  STATE_FIPS       C ( Character)       2        0
...
           48  AVG_SALE97       N (   Numeric)       7        2

Coordinate system definition: "GEOGCS["GCS_North_American_1983",
DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137,298.257222101]],
PRIMEM["Greenwich",0],UNIT["Degree",0.0174532925199433]]"


Importing a shapefile using the simplest approach

The db2se import_shape Spatial Extender command provides an easy way to import shapefiles. You can write this command into an SQL script file, such as db2 -tvf import_counties.sql. Listing 3 shows a simple example.


Listing 3. Contents of import_counties.sql
!erase counties.msg;

!db2se import_shape sample
-fileName         counties.shp
-srsName          nad83_srs_1
-tableSchema      test
-tableName        counties
-spatialColumn    shape
-client           1
-messagesFile     counties.msg
;

!db2se register_spatial_column sample
-tableSchema      test
-tableName        counties
-columnName       shape
-srsName          nad83_srs_1
;

connect to sample;

create index test.countiesidx on test.counties(shape)
extend using db2gse.spatial_index(.5, 2.0, 0);

runstats on table test.counties and indexes all;

select count(*) from test.counties;

describe table test.counties;

Notes:

  1. Operating system commands can be included in a DB2 script if they are prefixed by an exclamation mark.
  2. If the message file exists, it is erased with the erase counties.msg command. The import_shape operation will terminate if the message file already exists.
  3. Most of the parameters of the db2se import_shape command are fairly intuitive. The table with the specified schema and name is created with attribute columns corresponding to the attributes of the shapefile. If the table already exists, the operation will terminate. The spatial column name is arbitrary, but shape is commonly used.
  4. The -client 1 parameter is important. This command runs the import operation on the DB2 client, accessing the import file locally. If the parameter is defined as -client 0, the import operation will run on the server and requires the file specification to be a location on the server that is available to the DB2 server.
  5. The db2se register_spatial_column command creates a constraint that all spatial values use the SRS nad83_srs_1. It also creates an entry in the db2gse.st_geometry_columns view that associates this SRS with this spatial column. This is helpful and sometimes required by some visualization applications.
  6. The connect to sample code connects to the sample database.
  7. A spatial index is created with the create index command and reasonable values. Spatial indexes are discussed in a later section.
  8. The runstats command updates the table statistics, which is always a good idea after loading or after making significant updates.
  9. The select count(*) command lets you gather how many rows were imported.
  10. The describe table command helps you gather the table columns created.

Importing a shapefile using a slightly more complex example

Sometimes it is necessary to implement more flexibility in the import process. In particular, you might want to import into different but compatible data type columns, select a subset of columns to import, or use column names different than the shapefile attribute names. Listing 4 illustrates these capabilities.


Listing 4. Contents of import_counties2.sql
!erase counties.msg;
connect to sample;

drop table test.counties;

create table test.counties(
   county_name    varchar(32)
  ,state_name     varchar(25)
  ,state_fips     varchar(2)
  ,county_fips    varchar(3)
  ,area           double
  ,population2000 integer
  ,shape          db2gse.st_multipolygon
  )
  ;

!db2se import_shape sample
-fileName         counties.shp
-inputAttrColumns N(name,state_name,state_fips,cnty_fips,area,pop2000)
-srsName          nad83_srs_1
-tableSchema      test
-tableName        counties
-tableAttrColumns county_name,state_name,state_fips,county_fips,area,population2000
-createTableFlag  0
-spatialColumn    shape
-typeSchema       db2gse
-typeName         st_multipolygon
-messagesFile     counties.msg
-client 1
;

!db2se register_spatial_column sample
-tableSchema      test
-tableName        counties
-columnName       shape
-srsName          nad83_srs_1
;

create index test.countiesidx on test.counties(shape)
extend using db2gse.spatial_index(.5, 2.0, 0);

runstats on table test.counties and indexes all;

select count(*) from test.counties;

describe table test.counties;

Notes:

  1. The table is created with the desired columns, data types, and column names.
  2. The -inputAttrColumns parameter specifies the names of the attributes in the shapefile to be imported.
  3. The -tableAttrColumns parameter specifies the names of the table columns into which to import data. Note that the -inputAttrColumns and -tableAttrColumns names are different.
  4. The -createTableFlag parameter indicates that the table already exists and should not be created.
  5. The -typeSchema and -typeName parameters specify the type of the spatial column. One of the peculiarities of the shapefile format is that it does not specify whether the spatial type is a collection (multi), so import_shape defaults to the multi type. For the example, specifying multi type isn't necessary because the county polygons actually are MultiPolygons. But for some applications, the data to be imported is not a multi type, and you can import it into a non-multi column.

3 of 11 | Previous | Next

Comments



static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=793988
TutorialTitle=Manage spatial data with IBM DB2 Spatial Extender, Part 1: Acquiring spatial data and developing applications
publish-date=02162012
author1-email=dadler@us.ibm.com
author1-email-cc=