Geo-routing tasks with PostgreSQL, PostGIS, and pgRouting on IBM Cloud Databases for PostgreSQL

If you need to plan your UK castle tour but don’t know how to start, let us show you how PostgreSQL, PostGIS, and pgRouting on IBM Cloud Databases for PostgreSQL can get you started on this and any other geo-routing tasks you may need your database to perform.

With IBM Cloud Databases (ICD) for PostgreSQL, you have the capability to run complex geospatial queries using PostGIS and pgRouting. In this article, we’ll show you how to get started using pgRouting by getting the shortest paths between castles in the United Kingdom (UK).

While PostGIS is an extension for PostgreSQL, pgRouting can be viewed as an extension for PostGIS. What pgRouting does is extend PostGIS’s functionality by providing you with several functions based on popular network algorithms to solve problems like the shortest path and the traveling salesperson problem.

In order to do that, however, pgRouting must have a valid topology. Take, for example, a road network. Each road segment is joined together at intersections, which we’ll call “nodes.” In order for pgRouting to work, each node must be connected to a road segment so that it can figure out how to get from one node to the next. pgRouting helps us do that using a few functions create network topologies and inspect their validity.

In addition to topology functions, pgRouting gives us various routing functions that we can use to find the shortest path between nodes. For this article, we’ll only look at two of them— pgr_dijkstra (or Dijkstra’s algorithm) and pgr_ksp (or Yen’s algorithm).

Before using these pgRouting functions, we’ll give you a short introduction to setting up a dataset so that it’s capable of using pgRouting because having only a dataset of geographic points is not enough. First, we’ll create a network of nodes using the locations of castles in the UK and connect them together. These connections are called “edges.” From there, we’ll let pgRouting do all the heavy lifting by analyzing our routing topology and computing the shortest routes from one node to the next.

To show you how this is done, let’s first work on a sample dataset of castles in the UK where we’ll create the nodes and edges manually. In the next article, we’ll move on to using an ESRI shapefile that has nodes and edges already established and how to make sure that it has a valid topology.

Getting set up

You’ll need an IBM Cloud Databases (ICD) for PostgreSQL database up and running first. Then, add the PostGIS and pgRouting extensions from the PostgreSQL shell – psql. To do that, we’ll create the PostGIS and pgRouting extensions together using the CASCADE parameter for CREATE EXTENSION that comes with PostgreSQL 9.6. It will automatically download all the extensions the pgRouting depends on:

CREATE EXTENSION pgrouting CASCADE; -- adds pgRouting and PostGIS 2.2.5
SELECT upgrade_postgis_23x(); -- upgrade PostGIS
Scroll to view full table

Now that we have the extensions set up, let’s create a table to hold some data. For this exercise, our table will contain a list of the names and coordinates of 12 popular castles in the UK based on The Telegraph’s twelve best castles in the UK. So, the table will contain three columns named idname, and geomgeom is defined as a geometry data type containing a POINT object since each castle is a point on the map.

    name VARCHAR(75),
Scroll to view full table

Next, we’ll insert the castles into the table. Notice that we use the PostGIS functions ST_SetSRID and ST_Point. We’ll have to use ST_Point so that PostGIS can process the coordinates. Then, we’ll set each coordinate to the Esri code WGS 84 or SRID 4326. For more information on Esri and SRIDs, see the article on spatial reference systems and databases.

insert into castles (name, geom)
('Edinburgh Castle', ST_SetSRID(ST_Point(-3.200833, 55.948611), 4326)),
('Dover Castle', ST_SetSRID(ST_Point(1.3214, 51.1297), 4326)),
('Tower of London', ST_SetSRID(ST_Point(-0.076111, 51.508056), 4326)),
('Warwick Castle', ST_SetSRID(ST_Point(-1.584722, 52.279167), 4326)),
('Caernarfon Castle', ST_SetSRID(ST_Point(-4.2769, 53.1393), 4326)),
('Stirling Castle', ST_SetSRID(ST_Point(-3.948889, 56.123889), 4326)),
('Carrickfergus Castle', ST_SetSRID(ST_Point(-5.806446, 54.713314), 4326)),
('Cardiff Castle', ST_SetSRID(St_Point(-3.1811, 51.4824), 4326)),
('York Castle', ST_SetSRID(ST_Point(-1.08, 53.9558), 4326)),
('Leeds Castle', ST_SetSRID(ST_Point(0.628889, 51.248056), 4326)),
('Lancaster Castle', ST_SetSRID(ST_Point(-2.80562, 54.04981), 4326)),
('Arundel Castle', ST_SetSRID(ST_Point(-0.553611, 50.856111), 4326));
Scroll to view full table

The output of this table will look something like the following:

Map of UK castles

Connecting the castles

Now that we have castles and their locations, we can set up a table that will connect them all together called castle_routes.

    (LEAST( ||, ||
        OVER(ORDER BY LEAST( ||, || AS id, AS name_1, AS name_2,
    ST_MakeLine(X.geom, Y.geom)::GEOMETRY(LineString,4326) AS geom,
    ROUND(ST_Distance(X.geom::GEOGRAPHY, Y.geom::GEOGRAPHY)/1000) AS distance, AS source, AS target
    castles X CROSS JOIN castles Y
Scroll to view full table

What we’re doing above is creating a new table called castle_routes that contains edges that connect each castle to one another, but not to themselves. Using the PostgreSQL window function dense_rank, the idcolumn of our edges table will create a sequence of edges. We define the origin and destination names of each edge with and The geom column contains the line string we created to connect each castle one another. The line strings are also set to SRID 4326, the same as our castle coordinates, using the geometry typemod construct. The distance column contains the length of the edge, which is the distance between one castle and another in kilometers. To get the distance in kilometers, we’ll have to cast our geometry type data to geography. Why? Because the geography data type calculates distances in meters, unlike the geometry data type which uses degrees. Then, we’ll use the ids from the castles table to populate the source and target columns, which indicate the beginning and end nodes of each edge, respectively.

When this SQL query is executed, we’ll get a table containing 66 edges like the following (we’ve kept the geom column out for brevity):


Since we’re creating our own map that has routes connecting from one castle to the next, we’ll have to reconstruct the table of nodes (our castles) based on the edges table we’ve created (castle_routes). To do that, use the pgRouting function pgr_createVerticesTable using the castle_routes as the table name, the geometry column geom, and the source and target column names.

select pgr_createVerticesTable('castle_routes','geom','source','target');
Scroll to view full table

The only name the function requires is the edges table name castle_routes. If you omit geom, the function will look for the_geom as the geometry column name; therefore, it must be defined. source and target also are default names that the function looks for, but for showcasing how the function works, they’ve been added.

The output of running this function will show the process of creating a vertices table with containing an id and the_geom column with the geometries of each castle location. The output will show that it has created 12 vertices from the 66 edges, which corresponds to the number of castles we have.

From this, a new table called castle_routes_vertices_pgr is created that looks like this:


Map of UK castles with vertices

That’s all that’s needed now for the general set up of the routes. Creating your own points and generating the edges is not really necessary if you’re using a shapefile or OpenStreetMap data. But, it’s always best to understand how routes are generated if you decide to generate your own routes from your maps.

Getting the shortest path

Now that we have the routes prepared, we can start finding out the optimal path to get from one castle to the next. To do that, pgRouting provides various functions that’ll calculate the distance and cost of each route, but we’ll focus on perhaps the most basic shortest path function — pgr_dijkstra.

To use the function to calculate the distance from the Tower of London to Stirling Castle, we’d write something like this:

    p.seq, p.node, p.cost, r.geom as edge_geom,
        'SELECT id, source, target, distance AS cost FROM castle_routes',
        (SELECT id FROM castles WHERE name = 'Tower of London'),
        (SELECT id FROM castles WHERE name = 'Stirling Castle'),
    ) AS p
    LEFT JOIN castle_routes AS r ON p.edge =
    LEFT JOIN castles AS c ON p.node =
Scroll to view full table

The first argument of the function takes a string which needs the idsourcetarget, and cost columns from the edge table. Since the castle_routes table doesn’t include a cost column, we’re assigning cost to the distance column. The second and third arguments are the source and target columns for the route. We can either hard code an id, or if we don’t remember the id of the castle, we can select the id of the castle we want using an SQL query like above with a WHERE filter. The fourth argument we provided is set to FALSE since we’re using an undirected graph. If you experiment with the function and set it to TRUE, the function will expect another column in castle_routes called reverse_cost that shows the distance in reverse, which we don’t need here. For maps with road networks or OpenStreetMap data, you’ll usually find the reverse cost added in the data they give you.

The output from this query will give us the node (castle ids), the cost (distance) of 573 kilometers, the castle names, and the edge geometry connecting both castles.

It’s well known that shortest path between two points is a straight line. Selecting the shortest path becomes more interesting, however, if we select other castles we’d like to visit in between the Tower of London and Stirling Castle. Let’s say that we want to determine what the cost of going to Stirling Castle and visiting other castles along the way. For that, we’d use the pgRouting function pgr_ksp. Unlike pgr_dijkstra, it will give you multiple paths to your destination that we can choose from.

Below is an example that will calculate five of the shortest paths between the Tower of London and Stirling Castle. You can substitute the number 5 in the SQL query with any number to get a different number of paths.

SELECT x.path_id, x.path_seq, COALESCE( || ' - ' ||, 'Total Trip') as route,
    WHEN edge = -1 THEN agg_cost ELSE NULL END AS "total_cost(distance)"
        'SELECT id, source, target, distance AS cost FROM castle_routes',
        (SELECT id FROM castles WHERE name = 'Tower of London'),
        (SELECT id FROM castles WHERE name = 'Stirling Castle'),
        directed := TRUE
    ) as x
    LEFT JOIN castle_routes AS r ON x.edge =
    LEFT JOIN castles AS s ON r.source =
    LEFT JOIN castles AS t ON =
    x.path_id, x.path_seq;
Scroll to view full table

Using the pgr_ksp function, we return a path_id that groups the rows containing the same route together. The path_seq column gives us the sequence of the route (i.e., 1, 2, …). For the route column, we’ll show the names of the origin and destination of each route segment with “Total Trip” as the last row of each route. The last column is total_cost, which gives us the total aggregated distance (agg_cost) of a route. We probably don’t want the distance for each segment of a route, so those have been taken out and only the aggregated distance is shown next to the “Total Trip.”

The function pgr_ksp takes as the first argument the same SQL query string to our castle_routes table. Also, we select the origin and destination of our castles in the next two arguments. The number of routes that you want is the next argument. We’ve chosen five, but you can experiment with as many as you’d like. Finally, the last argument we added is directed := TRUE, which means that the route goes from the Tower of London to Stirling Castle. If this is set to FALSE, then we’ll still go to Stirling Castle, but Stirling Castle might not be the end of the route.

Executing this query, we’ll get five of routes with the total distance of each.

Summing up

We’ve introduced some of the basics of pgRouting in this article: nodes, edges, and creating paths between them with pgr_dijkstra and pgr_ksp. Next time, we’ll use ESRI shapefiles to see what else pgRouting can do.

Learn more

Enjoyed this article? Get started with Databases for PostgreSQL now.

Databases for PostgreSQL is a fully managed, enterprise-ready PostgreSQL service with built-in security, high availability, and backup orchestration. Learn more here.

More from Cloud

Modernizing child support enforcement with IBM and AWS

7 min read - With 68% of child support enforcement (CSE) systems aging, most state agencies are currently modernizing them or preparing to modernize. More than 20% of families and children are supported by these systems, and with the current constituents of these systems becoming more consumer technology-centric, the use of antiquated technology systems is archaic and unsustainable. At this point, families expect state agencies to have a modern, efficient child support system. The following are some factors driving these states to pursue modernization:…

7 min read

IBM Cloud Databases for Elasticsearch End of Life and pricing changes

2 min read - As part of our partnership with Elastic, IBM is announcing the release of a new version of IBM Cloud Databases for Elasticsearch. We are excited to bring you an enhanced offering of our enterprise-ready, fully managed Elasticsearch. Our partnership with Elastic means that we will be able to offer more, richer functionality and world-class levels of support. The release of version 7.17 of our managed database service will include support for additional functionality, including things like Role Based Access Control…

2 min read

Connected products at the edge

6 min read - There are many overlapping business usage scenarios involving both the disciplines of the Internet of Things (IoT) and edge computing. But there is one very practical and promising use case that has been commonly deployed without many people thinking about it: connected products. This use case involves devices and equipment embedded with sensors, software and connectivity that exchange data with other products, operators or environments in real-time. In this blog post, we will look at the frequently overlooked phenomenon of…

6 min read

SRG Technology drives global software services with IBM Cloud VPC under the hood

4 min read - Headquartered in Ft. Lauderdale, Florida, SRG Technology LLC. (SRGT) is a software development company supporting the education, healthcare and travel industries. Their team creates data systems that deliver the right data in real time to customers around the globe. Whether those customers are medical offices and hospitals, schools or school districts, government agencies, or individual small businesses, SRGT addresses a wide spectrum of software services and technology needs with round-the-clock innovative thinking and fresh approaches to modern data problems. The…

4 min read