Tutorial 2: Creating new maps

This is the second of a series of three tutorials that show you how to analyze geospatial data with Db2® and Esri ArcGIS:

This tutorial shows you how to carry out the following tasks:

  • Create a new map.
  • Obtain information about an area on the map.

Time required

10 minutes

Scenario

You are a data scientist working for a large insurance company, and you want to analyze occurrences of tornadoes in the conterminous United States. You want to determine which counties have the most tornadoes. You also want to correlate tornado occurrences with customer data to determine in which counties your company is exposed to the highest level of risk from potential tornado damage settlements.

Difficulty

Beginner

Audience

Data scientists

System requirements

  • Windows 7 Ultimate, Professional and Enterprise (32 bit and 64 bit [EM64T])
  • Windows 8 Pro and Enterprise (32 bit and 64 bit [EM64T])
  • Windows 8.1 Pro and Enterprise (32 bit and 64 bit [EM64T])

Prerequisites

This scenario uses an Esri ArcGIS for Desktop installation that is connected to Db2 to visualize and analyze geospatial data (see Connecting Esri ArcGIS for Desktop). The data for this scenario is already loaded into your Db2 database. This scenario assumes that you already know how to work with base maps.

Creating a new map that shows tornado density

Our first query will be used to create a map of tornadoes that occurred in the year 2009 or later. The corresponding SQL query will be transferred to Db2 and run there as an in-database query, which lets you benefit from the performance and scalability of Db2.

Procedure

  1. We'll begin by returning the map to its base state. To do this, clear the check boxes for the tornado and county query layers. Because the current map scale is greater than the minimum scale setting for the customer query layer, the data points for that query layer are not displayed, so there's no need to clear its check box. Then, click File > Add Data > Add Query Layer.
  2. In the window that is displayed, select the connection to the Db2 database. The window then lists the tables in the database. Assign a name to the new layer. Here, we will use the name COUNTY_TORNADO_DENSITY. Then, enter the SQL query in the query field:

SELECT COUNTY.OBJECTID, COUNTY.NAMELSAD, COUNTY.SHAPE, 
       COALESCE(tornadoes_PER_COUNTY.TORNADO_COUNT,0) 
       / ((COUNTY.ALAND + COUNTY.AWATER) 
       * DECIMAL('3.86102159E-7', 20, 15)) AS COUNTY_TORNADO_DENSITY
       FROM SAMPLES.GEO_COUNTY AS COUNTY
       LEFT OUTER JOIN (
       SELECT COUNTY.OBJECTID, COUNT(COUNTY.OBJECTID) AS TORNADO_COUNT
       FROM SAMPLES.GEO_COUNTY AS COUNTY, SAMPLES.GEO_TORNADO AS TORNADO
       WHERE INT(TORNADO.YR) >= 2009
       AND DB2GSE.ST_INTERSECTS(TORNADO.SHAPE, COUNTY.SHAPE) = 1 SELECTIVITY 0.00001
       GROUP BY COUNTY.OBJECTID) AS tornadoes_PER_COUNTY
       ON COUNTY.OBJECTID = tornadoes_PER_COUNTY.OBJECTID

  1. Click Validate to verify that the query is entered correctly. Click Finish, then click Use Spatial Reference Extent.
    Validating the query layer
  2. To see a table that shows the result of the query, right-click on the query layer and select Open Attribute Table.
    Selecting Open Attribute Table from the query layer menu
  3. The attribute table displays information about each county, including the new information about tornado density that was calculated by the query.
    The attribute table
  4. You can add the information about tornado density to the map by color-coding the counties. To do this, right-click the query layer and select Properties.
    Selecting Properties from the query layer menu
  5. Select the Symbology tab, then select Quantities from the list. On the Graduated colors page, the Value field specifies which column of the query layer output is to be used to determine the color and shading of each county. Select COUNTY_TORNADO_DENSITY. In the Color Ramp field, you can choose among different predefined color gradients. In this scenario, we choose the blue color gradient. We want there to be 4 density classes, so we set the number of classes to 4 and confirm by clicking OK.
    Specifying the color that visualizes the tornado density
  6. Now, the map shows the tornado density for each county.
    The tornado density of the counties
  7. Use the pan function, remember to press and hold the Shift key, to zoom in for a closer look. When you're done, set the scale back to 1:25000000.
    Zooming into a selected area

Creating a new map that indicates high risk areas

It might also be interesting for an insurance company to identify high risk counties, that is, counties that have both a high tornado density, and therefore a high risk of tornado damage, and a high total value of insured property. To do this, we can create a new query layer that joins data from the customer, tornado, and county tables, and uses that data to calculate a risk factor. In this scenario, we will calculate a simple risk factor based on the tornado density and the total value of all policies.

Procedure

  1. Begin by clearing the check box for all query layers in the Table of Contents, except for the base map. Add a query layer, select the connection, and specify the name COUNTY_TORNADO_RISK. Then, enter the SQL query in the query field:


       SELECT COUNTY.OBJECTID, COUNTY.SHAPE, COUNTY.NAMELSAD, COALESCE(CUSTOMERS,0) CUSTOMERS, 
       COALESCE(tornadoes_PER_COUNTY.TORNADO_COUNT,0) 
       / ((COUNTY.ALAND + COUNTY.AWATER) 
       * DECIMAL('3.86102159E-7', 20, 15)) 
       * COALESCE(CUSTOMERS,1) AS COUNTY_TORNADO_RISK
       FROM SAMPLES.GEO_COUNTY COUNTY 
       LEFT OUTER JOIN (
       SELECT COUNTY.OBJECTID, COUNT(COUNTY.OBJECTID) AS TORNADO_COUNT
       FROM SAMPLES.GEO_COUNTY AS COUNTY, SAMPLES.GEO_TORNADO AS TORNADO
       WHERE INT(TORNADO.YR) >= 2009
       AND DB2GSE.ST_INTERSECTS(TORNADO.SHAPE, COUNTY.SHAPE) = 1 SELECTIVITY 0.00001
       GROUP BY COUNTY.OBJECTID) AS tornadoes_PER_COUNTY
       ON COUNTY.OBJECTID = tornadoes_PER_COUNTY.OBJECTID
       LEFT OUTER JOIN (
       SELECT COUNTY.OBJECTID, COUNT(CUSTOMER.OBJECTID) AS CUSTOMERS
       FROM SAMPLES.GEO_COUNTY AS COUNTY, SAMPLES.GEO_CUSTOMER AS CUSTOMER
       WHERE CUSTOMER.INSURANCE_VALUE > 2000000
       AND DB2GSE.ST_INTERSECTS(CUSTOMER.SHAPE, COUNTY.SHAPE) = 1 SELECTIVITY 0.00001
       GROUP BY COUNTY.OBJECTID) AS CUSTOMERS_PER_COUNTY
       ON COUNTY.OBJECTID = CUSTOMERS_PER_COUNTY.OBJECTID
        

  1. Open the Layer Properties window to specify how the results are to be displayed. The column of the query layer output that is to be used to determine the color and shading of each county is also called COUNTY_TORNADO_RISK. In the Color Ramp field, choose the red color gradient. Specify that four classes are to be used. Then, click OK.
    Specifying the visualization of high risk areas
  2. In the resulting map, the darkness of each county indicates the degree to which it has a high tornado risk combined with a high total value of insurance policies.
    A map displaying high risk areas in dark red tone
    By toggling the check marks in the check boxes of the two query layers, you can switch between the two maps, and compare the results of the two queries.

Getting detailed information on counties

Procedure

  1. If you open the attribute table for the new query layer, you can see, in the CUSTOMERS column, how many customers there are in each county.
    The attribute table of the new query layer
  2. You can use the ArcMap identify tool to view information about an area on the map. First, let's zoom in on an area of the map, then select the identify tool in the menu bar.
    Zooming into an area with several high risk counties
  3. Note the small icon with the letter "i" beside the pointer when it hovers over the map. Use the identify tool the same way you use the zoom-function: Press and hold the Shift-key and the left-mouse-key, and drag the pointer to create a rectangle frame on the map. Then, release the left-mouse-key.
    Selecting an area with the identify tool
  4. A window appears that displays information about each county in the selected area, such as its ID, name, the number of customers, and the tornado density. Each time you select a county from the list, the values in the table are updated and the county flashes green on the map.
    Viewing a selected county
    When you've finished exploring the data for individual counties, close the table and set the map scale back to 1:25000000. Close the identify table and clear all check boxes except the base map.

Tutorial summary

To create new maps, delete all unneeded query layers, and replace them with query layers that provide the information that the new map should contain. Use the identify tool to obtain detailed information about an area on your map.