Setting up geocoding operations
Db2® Spatial Extender lets you set, in advance, the work that must be done when a geocoder is invoked.
Before you begin
- DATAACCESS authority on the database that contains the tables that the geocoder will operate on
- CONTROL privilege on each table that the geocoder operates on
- SELECT privilege and UPDATE privilege on each table that the geocoder operates on
About this task
- What column the geocoder is to provide data for.
- Whether the input that the geocoder reads from a table or view should be limited to a subset of rows in the table or view.
- The range or number of records that the geocoder should geocode in batch sessions within a unit of work.
- Requirements for geocoder-specific operations. For example, a geocoder may geocode only those records that match their counterparts in the reference data to a specified degree or higher. This degree is called the minimum match score.
You must specify the parameters in the preceding list before you set up the geocoder to run in automatic mode. From then on, each time the geocoder is invoked (not only automatically, but also for batch runs), geocoding operations will be performed in accordance with your specifications. For example, if you specify that 45 records should be geocoded in batch mode within each unit of work, a commit will be issued after every forty-fifth record is geocoded. (Exception: You can override your specifications for individual sessions of batch geocoding.)
You do not have to establish defaults for geocoding operations before you run the geocoder in batch mode. Rather, at the time that you initiate a batch session, you can specify how the operations are to be performed for the length of the run. If you do establish defaults for batch sessions, you can override them, as needed, for individual sessions.
Procedure
To set up geocoding operations:
- Issue the db2se setup_gc command.
- Run an application that calls the DB2GSE.ST_SETUP_GEOCODING procedure.
What to do next
The minimum degree to which the street names must match is referred to as spelling sensitivity. The minimum degree to which the entire addresses must match is called the minimum match score. For example, if the spelling sensitivity is 80, then the match between the street names must be at least 80 percent accurate before the geocoder will search for the entire address. If the minimum match score is 60, then the match between the addresses must be at least 60 percent accurate before the geocoder will geocode the record.
You can specify what the spelling sensitivity and minimum match score should be. Be aware that you might need to adjust them. For example, suppose that the spelling sensitivity and minimum match score are both 95. If the addresses that you want geocoded have not been carefully validated, matches of 95 percent accuracy are highly unlikely. As a result, the geocoder is likely to return a null when it processes these records. In such a case, it is advisable to lower the spelling sensitivity and minimum match score, and run the geocoder again. Recommended scores for spelling sensitivity and the minimum match score are 70 and 60, respectively.
- You invoke the geocoder to geocode addresses in a table in batch mode. Unfortunately, the minimum match score is too high, causing the geocoder to return a null when it processes most of the addresses. You reduce the minimum match score when you run the geocoder again. To limit its input to those addresses that were not geocoded, you specify that it should select only those rows that contain the null that it had returned earlier.
- The geocoder selects only rows that were added after a certain date.
- The geocoder selects only rows that contain addresses in a particular area; for example, a block of counties or a state.
- You have less control over the size of the unit of work than the former alternative affords. Consequently, you cannot control how many locks are held or how many log entries are made as the geocoder operates.
- If the geocoder encounters an error that necessitates a rollback, you need to run the geocoder to run against all the records again. The resulting cost in resources can be expensive if the table is extremely large and the error and rollback occur after most records have been processed.