Visualize spatial data in DB2

Use ArcExplorer to access DB2 for Linux, UNIX, and Windows and DB2 for z/OS

Visualizing spatial and geometric data stored and managed in DB2 databases is often crucial for users to understand the data itself. Jointly with ESRI, IBM® provides the no-charge ArcExplorer™ for such tasks. Learn how to set up and use ArcExplorer on various platforms. The data to be displayed can be stored in a spatially-enabled DB2® database hosted on Linux®, UNIX®, or Windows®. Also learn how to access spatial data stored on DB2 for z/OS using the spatial support feature available with Version 9.

Share:

Knut Stolze (stolze@de.ibm.com), DB2 z/OS Utilities Development, IBM Germany

Photo: Knut StolzeKnut Stolze started his work with DB2 when he joined IBM as a visiting scientist at the Silicon Valley Lab where he worked on the DB2 Image Extender. He moved on to the DB2 Spatial Extender Version 8 and was responsible for several enhancements to improve the usability, the performance, and the standard-conformance of the Extender for more than two years.

Currently, he works as a teaching assistant at the University of Jena, Germany, and continues his work for IBM in the area of federated databases. He can be reached through the newsgroups comp.databases.ibm-db2 and ibm.software.db2.udb.spatial or at stolze@de.ibm.com.



03 May 2007

Introduction

Virtually every database contains spatial information. Address information is the most prominent representative. Other examples include streets, which can be represented in the database not only by their name but also by their exact location on Earth. Linestrings are the means to model such objects. Locations of branches and customer residences can be described with points. Similarly, state and county boundaries can be represented with polygons. Non-geographic applications can also benefit from spatial data stored in a relational database. Warehouses and the placement of products in shelves can be handled with explicitly modeled geometric information. Distance calculations in space offer the capabilities to optimally schedule the placement of products in the warehouse, depending on various criteria.

DB2 for Linux, UNIX, and Windows supports (since Version 7) the storage, management, and retrieval of spatial data in databases using the DB2 Spatial Extender product. The extender allows you to model geometric primitives like points, lines, and polygons as well as collections of those primitives in the same way as any other relational data (for instance, numbers and strings) using dedicated data types. The DB2 Spatial Extender does not only provide data types but also comes with a wide variety of spatial functions and methods to perform spatial operations. For example, you can determine distances between two geometries, perform spatial set operations like unioning or intersecting geometries, and retrieving the length of a linestring.

Spatial support for DB2 for z/OS provides mostly the same functionality as the extender. There are differences in the internal implementation, but the usage of spatial types and functions is similar. The major distinction is that spatial functions in DB2 for Linux, UNIX, and Windows can usually be invoked with the method notation (for example, geometry..ST_Buffer(10, 'METER')..ST_AsText()). DB2 for z/OS does not yet support structured types and methods (used to implement the extender), so that spatial support on that platform only accepts the functional invocation syntax (for example, ST_AsText(ST_Buffer(geometry, 10, 'METER'))).

Although spatial computations and comparisons are sufficient for many application, it is sometimes also very helpful for humans to visualize the data in your database. A tabular representation -- which is used for traditional relational data -- is not the most suitable way to make spatial information accessible. Instead, spatial data requires the rendering of maps. There are many different ways to extract geometries and generate such a map. Some commercial tools -- Geographic Information Systems (GIS) -- are solely dedicated to view and modify spatial and non-spatial data. Other systems offer a lean interface by generating still images (like JPEG or TIFF) from geometries. Those images can later on be displayed by a Web browser, for example.

The ESRI ArcExplorer that you can download from the DB2 Spatial Extender Web site (see the Resources section) is implemented in Java™. Thus, it can be used on any platform with a suitable Java Virtual Machine (JVM). However, ArcExplorer comes with a Windows-specific installation routine. This is an obstacle to deploying and using ArcExplorer on other platforms, like Linux and AIX®. The following sections describe how to overcome this issue. An ArcExplorer is set up on a Linux system and spatial data stored in a DB2 database on the same system is accessed, as well as another database maintained on DB2 for z/OS.

The article first explains how to install, start, and set up ArcExplorer. Next, it lays out the requirements to be followed to get access to data stored in the database. Finally, the article gives some guidelines for the efficient use of ArcExplorer. Everything is illustrated based on screenshots taken on a Linux system. However, the same steps can be applied to other systems as well.

Deploy ArcExplorer

ArcExplorer is a light-weight tool for visualizing spatial data. It is comprised of five .jar files, which you will find in an archive in the Download section. If you compare that to a native install on Windows, you may notice that the Windows installation is more complex. On Windows, a fully working product is installed, which includes the Java Runtime Environment (JRE). These days, it can be assumed that each system has Java installed already. Therefore, you can rely on existing JRE and JVM. That helps to reduce the footprint of the installation image, which becomes a mere 2.5MB -- from about 100MB with the original Windows installation. At the same time, ArcExplorer automatically benefits from updates to the existing Java package.

Similar to the JVM, a copy of the db2java.zip file is not required to exist in the ArcExplorer installation directory. This file is essential for the communication of ArcExplorer with DB2. Although the README instructs you to copy that file from the DB2 installation directory, it is simpler and better to directly use the file from its original location, such as ~/sqllib/java/db2java.zip. You can take advantage of that by simply referencing it in the CLASSPATH environment variable.

To summarize, the only files that are needed to operate the ESRI ArcExplorer are aejava_help.jar, arcims_aej.jar, arcims_sdk.jar, sde_sql3.jar, and xml.jar. Additionally, the download provides the shell script arcexplorer that sets up the classpath and starts the JVM with ArcExplorer.

There are some other files you will find in the download. The file aejava.ico is an image that you can use as an icon in menus if you want to start ArcExplorer that way. The file LICENSE contains the license accompanying ArcExplorer, to which the user has to adhere. The script host.sql contains SQL statements that use federation to establish the connection to a remote DB2 for z/OS subsystem and to access three remote tables with spatial data through nicknames. Finally, the file import.jcl is a JCL script that you can use to import spatial data from shapefiles into your DB2 for z/OS subsystem. The shell script import.sh does the same on DB2 for Linux, UNIX, and Windows.

With all that precursory information, you can now download the provided ZIP archive and unpack its content into a directory of your choice. To start ArcExplorer, go to that directory and invoke the arcexplorer script as illustrated in Listing 1. You will see the (still unconfigured) ArcExplorer coming up, as in Figure 1.

Listing 1. Starting ArcExplorer
$ cd /usr/local/arcexplorer
$ ./arcexplorer
Figure 1. Started ArcExplorer
Started ArcExplorer

Prepare spatial data in the database

Now that you've verified the basic functionality of the Java application, this section explains how to set up your database, import spatial data from the sample shapefiles, and finally visualize the data.

Spatially enabling the database

In order to manage spatial data in your database, the database must be enabled for spatial operations. You can use the tool db2se to initiate the enabling process on DB2 for Linux, Unix, and Windows, as Listing 2 demonstrates.

Listing 2. Enable the database
$ db2 create database spatial
DB20000I  The CREATE DATABASE command completed successfully.
$ db2se enable_db spatial
Enabling database. Please wait ...
GSE1036W  The operation was successful.  But values of certain
database manager and database configuration parameters should be
increased.

You can ignore the warning GSE1036W for now. It is due to the enable-step, which verifies that some database manager (DBM) configuration parameters are not below a certain minimum to ensure the proper function of the extender in more complex scenarios. Since only simple queries are used henceforth, it is not mandatory to adjust those parameters now.

Spatially enabling a DB2 for z/OS subsystem requires more steps to be executed. The database administrator has to modify the JCL scripts to enable the database. For example, the name of the DB2 subsystem needs to be used. Once the scripts are adjusted, the scripts are executed and the DB2 subsystem is enabled for the spatial support feature. The details of this process go beyond the scope of this article. Refer to the documentation (see the Resources section) instead.

Import spatial data

Three of the sample shapefiles are used -- namely sjMainStreets, sjZipCodes, and sjCensusBlocks -- that are shipped with the DB2 Spatial Extender to illustrate how to use ArcExplorer. The first step is to import those shapefiles into the spatially enabled database. This is discussed separately for DB2 for Linux, UNIX, and Windows and DB2 for z/OS, because the process differs significantly.

DB2 for Linux, UNIX, and Windows

Listing 3 shows how to import a shapefile using the db2se command line tool. Only the shapefile sqMainStreets is imported into the database named SPATIAL. The commands to import the other two shapefiles can be found in the import.sh file, found in the Download section. The same functionality is available in the DB2 Control Center, of course. You can also initiate the import from your own applications by calling the db2gse.ST_import_shape stored procedure. For more information, refer to the Spatial Extender documentation.

Listing 3. Import sample shapefiles into DB2 for Linux, UNIX, and Windows
$ db2se import_shape SPATIAL -filename sjMainStreets -srsName NAD83_SRS_1 \
        -tableName MAIN_STREETS -createTableFlag 1 -spatialColumn SHAPE \
        -idColumn SE_ROW_ID -commitScope 1000 -client 1
GSE0000I  The operation was completed successfully.

$ db2se register_spatial_column SPATIAL -tableName MAIN_STREETS \
        -columnName SHAPE -srsName NAD83_SRS_1
GSE0000I  The operation was completed successfully.

There are two things to note with respect to the import operation. ArcExplorer mandates that a column named SE_ROW_ID exists in the table. That column must have the declared type INTEGER. The values in this column are used to uniquely identify each row, that is it must be an ID column. Choose to let shape import automatically add the column and generate unique values for it (option idColumn). The second is that ArcExplorer requires all spatial columns to have a single spatial reference system assigned. This is accomplished and enforced by registering the spatial column, as done in the last step of Listing 3.

There are other alternatives to provide the mandatory SE_ROW_ID column. For example, you can use a view over your spatial table and derive the SE_ROW_ID values from an already existing ID column by simply renaming the column in the view definition. Another approach could be to generate values for a new SE_ROW_ID column using various numbering techniques like the online analytical processing (OLAP) function ROW_NUMBER(). Implementing this is left as an exercise to the interested reader.

DB2 for z/OS

Spatial support for DB2 for z/OS provides a tool similar to db2se. It is named DSN5SCLP and -- because it runs on z/OS -- has to be started from a JCL script. Shapefiles to be imported must reside in a hierarchical file system (HFS) data set on z/OS. The script in Listing 4 assumes that the mount point is named /u/stolze/. You may have to adjust that in your own environment. The shapefiles are stored as binary data. Thus, all data in the .dbf files, which contain non-spatial attribute information, is still encoded in ASCII format. (On the other hand, the generated messages file store all messages from the import operation in EBCDIC.)

The same shapefiles as before are used on DB2 for Linux, UNIX, and Windows. Listing 4 shows how to import the shapefile sjMainStreets. The other two shapefiles can be imported in the same manner with adjusted file names and table names. You will find the complete JCL script import.jcl for all three tables in the Download section.

Listing 4. Import sample shapefiles into DB2 for z/OS
//IMPORT JOB 'USER=$$USER','',CLASS=A,MSGCLASS=H,MSGLEVEL=(1,1)
//JOBLIB  DD  DSN=DB2A.SDSNEXIT,DISP=SHR
//        DD  DSN=DB2A.SDSNLOAD,DISP=SHR
//        DD  DSN=CEE.SCEERUN,DISP=SHR
//        DD  DSN=CBC.SCLBDLL,DISP=SHR
//*
//* IMPORT SHAPE FILES
//*
//IMPORT   EXEC PGM=IKJEFT01,REGION=0M,DYNAMNBR=20
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//DSNAOINI DD DSN=DB2A.SDSNSAMP(DSNAOINI),DISP=SHR
//CEEDUMP  DD SYSOUT=*
//SYSTSIN  DD *

DSN5SCLP /import_shape STLEC1 +
   -fileName /u/stolze/sjMainStreets +
   -createTableFlag 1 +
   -srsName NAD83_SRS_1 +
   -tableName MAIN_STREETS +
   -tableCreationParameters "IN DATABASE SPATIAL" +
   -spatialColumn SHAPE -idColumn SE_ROW_ID +
   -messagesFile /u/stolze/mainStreets.msg

//*
//* REGISTER SPATIAL COLUMNS
//*
//REGISTER EXEC PGM=IKJEFT01,REGION=0M,DYNAMNBR=20
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//DSNAOINI DD DSN=DB2A.SDSNSAMP(DSNAOINI),DISP=SHR
//CEEDUMP  DD SYSOUT=*
//SYSTSIN  DD *

DSN5SCLP /register_spatial_column STLEC1 +
   -tableName MAIN_STREETS -columnName SHAPE -srsName NAD83_SRS_1

//*

DSN5SCLP requires that you specify the DB2 location name -- STLEC1 in the listing -- for each operation, where db2se expected the name of a database. This is needed to establish a connection to the respective DB2 subsystem. The only other noticeable difference is that an additional clause can and should be appended to the CREATE TABLE statement to specify in which database the table shall be placed. The option tableCreationParameters is used to provide the information for database SPATIAL.

The JCL script does not only import spatial data but also registers a dedicated SQL Reporting Service (SRS) for the spatial columns in those tables. It is the same as in the preceding section. This step is not really necessary because ArcExplorer does not access DB2 for z/OS natively. Instead, a federation layer is involved, as explained in more detail in the next section.

Set up federation to access spatial data on DB2 for z/OS

Unfortunately, it is not an option to let ArcExplorer access the spatial data in DB2 for z/OS directly. The explorer relies on catalog views known from DB2 for Linux, UNIX, and Windows and also uses (internally) a DB2 for Linux, UNIX, and Windows specific syntax to deal with spatial data, namely method invocations. While it is possible to simulate the required catalog views from DB2 for Linux, UNIX, and Windows, the query syntax turns out to be a show-stopper.

DB2's federated capabilities provide a solution for us. We keep the spatial and non-spatial data in DB2 for z/OS, but use a DB2 for Linux, UNIX, and Windows instance as a federated server. This federated server transparently accesses the data in DB2 for z/OS through nicknames, while providing the desired catalog views and accepting the syntax for method invocations.

However, a work-around is needed to transfer the spatial data and convert it between the different internal representations in DB2 for z/OS and DB2 for Linux, UNIX, and Windows. The DRDA wrapper, which implements access to DB2 data sources, does not yet provide this conversion functionality. Therefore, you employ views for the conversion steps by transferring the spatial data in its well-known binary (WKB) encoding. Figure 2 shows the architecture of how ArcExplorer accesses the spatial data stored in DB2 for z/OS.

Figure 2. Federated setup
Federated setup

The arrows indicate the flow of spatial and non-spatial data through the system. A query from ArcExplorer is always directed against DB2 for Linux, UNIX, and Windows. More specifically, ArcExplorer accesses a view that has a (registered) spatial column. It is completely hidden from ArcExplorer where the actual data originates from. The view itself is based on a nickname. That nickname has a BLOB column with the geometries available in their WKB representation. The BLOB values are retrieved upon request by the federated server from the remote data source, which is the DB2 for z/OS subsystem. There, the wrapper does not access the base table MAIN_STREETS with the geometry information, but rather the view MAIN_STREET_VIEW. This view is introduced to inject a call to the function ST_AsBinary to convert the spatial data from the base table to its WKB representation.

The illustrated architecture is realized with a series of SQL statements that are summarized in Listing 5. The listing only reflects the setup for the table MAIN_STREETS and making it available in DB2 for Linux, UNIX, and Windows through the view MAIN_STREETS_HOST. Other tables, like ZIP_CODES and CENSUS_BLOCKS, can be handled in the same fashion. The SQL script host.sql, which you will find in the Download section, contains the SQL statements for all three tables.

Listing 5. Set up federated access to DB2 for z/OS
CREATE WRAPPER drda;

CREATE SERVER host TYPE DB2/ZOS VERSION '9' WRAPPER DRDA
   AUTHORIZATION sysadm PASSWORD password OPTIONS ( DBNAME 'HOST' );

CREATE USER MAPPING FOR USER SERVER host
   OPTIONS ( REMOTE_AUTHID 'SYSADM', REMOTE_PASSWORD 'PASSWORD' );

SET PASSTHRU host;
CREATE VIEW main_streets_view AS
   SELECT census1, census2, cfcc, cfcc1, cfcc2, fedirp, fedirs, fename,
          fetype, INTEGER(fnode) AS fnode, fraddl, fraddr, length,
          se_row_id, source, tlid, INTEGER(tnode) AS tnode, toaddl,
          toaddr, zipl, zipr,
          db2gse.ST_AsBinary(shape) AS wkb, db2gse.ST_SrId(shape) AS srsId
   FROM   main_streets;
SET PASSTHRU RESET;

CREATE NICKNAME main_streets_nick
   FOR host.sysadm.main_streets_view;

CREATE VIEW main_streets_host AS
   SELECT census1, census2, cfcc, cfcc1, cfcc2, fedirp, fedirs, fename,
          fetype, fnode, fraddl, fraddr, length, se_row_id, source, tlid,
          tnode, toaddl, toaddr, zipl, zipr,
          db2gse.ST_MultiLineString(wkb, srsId) AS shape
   FROM   main_streets_nick;

! db2se register_spatial_column spatial -tableName MAIN_STREETS_HOST
        -columnName SHAPE -srsName NAD83_SRS_1;

The total effect of this setup is that all data is physically stored in DB2 for z/OS only, and that all access is directed to DB2 for Linux, UNIX, and Windows only. Upon query time, DB2 for Linux, UNIX, and Windows transforms the query to suit DB2 for z/OS and also convert the spatial data with the views along the way.

Visualize spatial data

With all previous preparations being completed, you can now move on to render maps from the spatial data. First, you want to add a layer. A layer is a collection of spatial objects (and their non-spatial attributes) of a common type. For example, all streets may form a street layer. In terms of a relational DBMS, a layer corresponds to a spatial table, that is a table that has one geometry column and additional non-geometry columns. All rows in a table are entities of a certain entity type that is represented by that table. Values in the additional columns describe properties (like street names) for each single geometry.

Adding a layer in ArcExplorer can be done either through the menu (Layer > Add Layers), or by clicking the Add Layers icon in the toolbar. The first step in the newly opened Catalog window is to add a database connection. The database you want to connect to must be cataloged in the local DB2 (client) instance. It is also mandatory that you specify a username and password. Figure 3 shows this step.

Figure 3. Set up DB2 connection
Set up DB2 connection

The result of the previous step is a new connection object. Setting up the connection has only to be done once for each database you want to access. If you use the federated setup from above, you only need a connection to the federated server, that is DB2 for Linux, UNIX, and Windows. Adding new layers for existing database connections can be done right away. You select the connection and get a list of all layers (registered spatial columns) in the associated database. For this scenario, select all three layers, click the Add Layers icon and close the Catalog window. Figure 4 demonstrates this. Note that the nicknames for the spatial tables in DB2 for z/OS had not been added, so they will not show up here.

Figure 4. Add layers
Adding layer

On the left panel of the main window, you can now select the layers to be rendered, choose the color and other properties to be used for the layer, zoom in and out, identify single geometries and retrieve associated attributes, as well as creating an image from the currently selected map area. You can also move layers up and down (by right-clicking on them in the left-hand side overview). This influences the order in which layers are drawn and, thus, which layers appear on top of others. Figure 5 demonstrates how the Identify icon was used to show information about a selected census block. The layer of interest has to be selected on the left-most panel before the Identify feature can be used.

Figure 5. Retrieve attributes for geometries
Retrieve attributes for geometries

Exactly the same operations can be performed on spatial data from a DB2 for z/OS subsystem. Your federated setup hides the origin of the data itself. ArcExplorer only works with a DB2 for Linux, UNIX, and Windows system. Figure 6 illustrates the results from selecting the layer ZIP_CODES_HOST and rendering it. There are no functional differences to layers from base tables stored in the DB2 for Linux, UNIX, and Windows database.

Figure 6. Render spatial data from DB2 for z/OS
Render spatial data from DB2 for z/OS

Performance considerations for ArcExplorer

ArcExplorer incorporates the Query Builder feature. That allows you to filter geometries to be displayed based on various predicates. Figure 7 illustrates this for all 17 census blocks where column HOUSHO_29 contains the value 6.

Figure 7. Filter geometries using ArcExplorer's Query Builder
Filter geometries

Analyzing the CLI trace shows that predicates specified in the query builder are not pushed down to DB2. ArcExplorer loads all the data into main memory, and all filtering is applied there. While this may be sufficient for small amounts of geometries and just a few tables, it quickly leads to performance problems with large tables. As a consequence, the initial loading of an ArcExplorer project can result in long delays.

You can easily overcome this inefficiency at the database level. To that end, you define a view in DB2 and register the spatial column of that view. Subsequently, you make the view known to ArcExplorer as a layer. Listing 6 summarizes this setup.

Listing 6. Filter at database level
$ db2 CREATE VIEW census_blocks_6 AS \
        SELECT blockgroup, population, househo_28, househo_29, per_capita, \
               median_hou, median_fam, average__1, average__2, se_row_id, \
               shape \
        FROM   census_blocks \
        WHERE  househo_29 = 6
DB20000I  The SQL command completed successfully.

$ db2se register_spatial_column spatial -tableName CENSUS_BLOCKS_6 \
        -columnName SHAPE -srsName NAD83_SRS_1
GSE0000I  The operation was completed successfully.

Add the new layer after refreshing the list for the database connection. The new layer is selected for display, and the result in Figure 8 is the same as before when ArcExplorer performed the filtering itself. Now that the view injects the filter predicate at the SQL level, DB2 can take advantage of the usual database techniques, that is index access.

Figure 8. Filter geometries at database level
Filter geometries at database Level

Another performance consideration becomes relevant if there are spatial tables that will not be accessed with ArcExplorer at all. Spatial columns in such tables should not be registered. If they are, then ArcExplorer will access and scan those tables upon startup or when a new layer is to be added. This is done to provide a consistent and complete view of the available data. Unregistering such tables avoids the mentioned scans because ArcExplorer will not be aware of the existence of those tables.

Summary

This article provided a introduction on how to use the ESRI ArcExplorer to visualize spatial data managed with the DB2 Spatial Extender or with Spatial Support for DB2 for z/OS. It explained how to import shapefiles in either DB2 system and prepare it so that ArcExplorer can access the data. The access itself was demonstrated afterward and the basic functionality of ArcExplorer introduced. The last section discussed how to apply filter predicates at the database level to improve query performance and response times.


Download

DescriptionNameSize
ArcExplorer and sample code1arcexplorer.zip2.2MB

Note

  1. Contains ArcExplorer and sample scripts host.sql, import.jcl, and import.sh.

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=216794
ArticleTitle=Visualize spatial data in DB2
publish-date=05032007