Getting Started with PgRouting
Some of the most common questions to answer when opening a map are probably “Where are we?”, “How do we get from point A to point B?”, “Should we take I-84 or I-95?”, “Which trail will give us the best views on the hike up?”. These questions can be answered by charting the best route based on mapped paths.
Using paper maps, finding the best route is a manual process, as well as using word-of-mouth advice (“You’re going to take the Bronx expressway at rush hour? You’re crazy!”). Nowadays, mapping apps use routing engines to tell us the best ways to get from point A to point B.
We can also do this on our own using open-source software. In this guide we will explore pgRouting, an open-source routing engine that enables least-cost-path analysis and calculating routes based on OpenStreetMap data.
This is a sequel to “Getting Started with PostGIS”. It is highly recommended that you follow this guide to install pgAdmin 4 with PostGIS, and to load OpenStreetMap data into PostgreSQL before continuing here.
- Installation of pgAdmin 4 and Postgres with the PostGIS spatial extension
- OpenStreetMap data loaded into a PostgreSQL database using “osm2pgsql”
Quick Introduction to Graph Data Structure and Routing
Before getting into the practical applications, let’s cover the basics of graph data structures. Graph data structures are designed to model movement along a set of paths. This has been widely used for routing in public transit and navigation but has branched out beyond to fields like biomedical research, social media, and machine learning. At its core, a graph data structure is composed of two basic units, nodes, and edges.
- Nodes are points that represent junctions connecting one or more possible paths through a routing network. Think of an intersection that connects multiple segments. A node is where we stop along a route and decide where to turn next. Nodes cannot stand alone and must connect to one or more edges.
- An edge is a line segment that connects two or more nodes, and represents the distance between them. Edges can be directed (movement in one direction) or undirected (movement in any direction), and must have explicit start and end nodes. The “distance” along edges is represented by attribute values that reflect the cost to travel from node to node across a given edge.
- PgRouting contains several least-cost path algorithms for routing. As the name suggests, these algorithms find a path along a graph network from point A to point B that minimizes the total cost. Cost is an attribute value that describes some property of each edge and/or node in a graph. Cost can be any numerical parameter, the most common is travel distance, followed by travel time. The cost could also reflect the slope of each edge or a rating of the scenic views along an edge.
- A Routing Algorithm is a set of rules for determining the best path.
Diagram showing graph topology of edges and nodes showing a least-cost path based on total aggregate cost.
Install PgRouting & OSM tool
Now that we have a basic understanding of routing through a graph data structure, let’s take a look at an example. PgRouting comes included in recent PostGIS distributions. If you installed your environment following this guide, you should be good to go.
Open pgAdmin 4 and highlight the database containing the OSM data loaded in the last PostGIS guide. Open the SQL window from the browser toolbar( ).
Enable the PostGIS and pgRouting extensions with the following commands.
CREATE EXTENSION postgis;
CREATE EXTENSION pgrouting;
Check to make sure they are installed and enabled correctly.
SELECT * FROM pgr_version();
Preprocess OSM Data
We already have an OSM dataset from the city of Lisbon, Portugal, imported into a spatial database using “osm2pgsql”. OSM data contains sets of points and lines, but these are not ready to run with pgRouting in their raw form. In this section, we will extract a subset of our OSM data, and preprocess it to have the proper topology for routing with “osm2po”.
The “osm2po” tool ingests raw OSM data and outputs a SQL file which can be imported into a spatial database and run with pgRouting.
We need to adjust some configurations before running the tool. Open the “osm2po.config” file and make the following edits:
- Change line 190 to “wtr.finalMask = car,foot,bike”. This will filter the OSM data to segments with these tags.
- Uncomment (remove the “#” at the beginning) of line 341 “postp.0.class = de.cm.osm2po.plugins.postp.PgRoutingWriter”. This will have “osm2po” output topology in a format that we can read with pgRouting.
Note: The edits here are based on “osm2po.config” version 5.5.5. Line numbers and config parameters may be different for other versions.
Save the config file and execute “osm2po” from the source directory in the command line using the following syntax:
java -jar <osm2po jar file> cmd=c prefix=<results folder> <OSM .pbf filepath>
This is the command used on Windows Subsystem for Linux:
java -Xmx512m -jar osm2po-core-5.5.5-signed.jar cmd=c prefix=lisbon /mnt/c/osm2pgsql_guide/Lisbon.pbf
Once the job is complete, the results will be saved in the folder titled “lisbon” (or whatever you defined as the prefix). We will now import the output .sql file into Postgres, and run pgRouting.
Load the “osm2po” .sql file into the PostgreSQL database containing the OSM data imported in the last tutorial with the following.
psql -host localhost -port 5432 -U postgres -d postgres -q -f lisbon_2po_4pgr.sql
Note: if you have trouble running “psql”, try updating your environment variables
You should see a new table added to the database in pgAdmin
Looking at the contents of the Lisbon_2po_4pgr table, we see that each record has attributes for a source node, a target node, and an integer classification. Additionally, each record has an associated cost that will be used to find the least-cost path. By default, the cost is travel time based on the distance (km) and speed (kmh) defined in the “osm2po.config” file.
Contents of the routing table output by “osm2po”
Finally, let’s try out pgRouting! We will utilize the Dijkstra shortest path algorithm included in pgRouting. The required parameters for the routing function are the routing table, a starting node, and an ending node.
pgr_dijkstra(Edges SQL, start_vid, end_vid)
The code below will take input latitudes and longitudes for the starting and ending points, and find the closest node in the routing table. For this example, we will find a route from the historic Castelo de Sao Jorge to the Parque Eduardo VII in the center of Lisbon. You can easily get coordinates in Google Maps by right-clicking. The copied coordinates that roughly match the points in the Google Maps route can be found below.
Google Maps walking route from Castelo de Sao Jorge to Parque Eduardo VII.
— find the nearest node to the start point
WITH start AS (
FROM lisbon_2po_4pgr as topo
ORDER BY topo.geom_way <-> ST_SetSRID(
ST_GeomFromText(‘POINT (-9.1333334 38.7133289)’), 4326)
— find the nearest node to the endpoint
destination AS (
FROM lisbon_2po_4pgr as topo
ORDER BY topo.geom_way <-> ST_SetSRID(
ST_GeomFromText(‘POINT (-9.1545999491376 38.73023147131511)’), 4326)
— use Dijsktra least-cost-path algorithm and join with the geometries
SELECT ST_Union(geom_way) as route
ST_Length(ST_Transform(geom_way, 3857)) AS cost
array(SELECT source FROM start),
array(SELECT source FROM destination),
directed := false) AS di
JOIN lisbon_2po_4pgr AS pt
ON di.edge = pt.id;
Note: code modified from this guide.
The output of this script should be a single row of binary geometry data.
Highlight the first row, and click the map icon in the top left of the results to show the geometry on a map with the pgAdmin Geometry Viewer.
Route from Castelo de Sao Jorge to Parque Eduardo VII from pgRouting.
There’s the route! In this guide, we have covered the basics of graph data structures, how they are used in least-cost-path routing, how to preprocess OpenStreepMap data for routing analysis using “osm2po”, and how to run routing algorithms using “pgRouting”. Try this out for yourself with some other points and explore the dataset further.
This guide has only touched the surface of least-cost-path routing. Many parameters can be adjusted to match any specific use case. For example, the cost value here is based on time, but you could adjust the cost value to prioritize scenic view, gradient, walkability, bike safety, etc.
If you enjoyed this guide, share any of your results with us on Twitter @MapScaping!
Happy routing 😊