Getting Started with PostGIS
SQL (Structured Query Language), usually pronounced “see-quel”, is the standard language for interacting with relational databases, and many cloud data warehouses including Databricks, Snowflake, AWS Redshift, and Google BigQuery.
Knowing SQL means that you can analyze data stored in tables and work cross functionally across all of the major cloud data warehouses – a very marketable skill for potential employers!
In this guide we will be using PostgreSQL, an open-source object-relational database system, to load data from OpenStreetMap (OSM).
Learning SQL takes some time and effort, but if you’re interested in working with any data contained in tables, putting in the time to learn some SQL will benefit you tenfold. The basic syntax of a SQL query is something like this:
SELECT * FROM dinner_guests;
Translation: Select all (* means all in SQL speak) columns (fields) from a table of invited dinner guests.
You can build upon this basic syntax with one of several standard SQL commands like WHERE, CREATE, UPDATE, DELETE, or INSERT. This tutorial assumes you have a basic grasp of SQL query syntax. The most commonly used command to filter data is the WHERE clause.
SELECT name, entree, dessert FROM dinner_guests WHERE rsvp=‘yes’;
Translation: Select the name, entrée choice, and dessert order fields from a table of invited dinner guests for only invitees that RSVP’d yes.
There are many extensions that add additional functionality to SQL. We will get started with PostGIS, a PostgreSQL extension that enables us to use Spatial SQL queries.
What is PostGIS?
PostGIS is an extension to the core functionality of PostgreSQL that adds functions to work with spatial geographies and geometries like points, lines, and polygons.
Install PostgreSQL and PgAdmin4
Download and install the PostgreSQL installer GUI. To do this, navigate to this page and download the installer for the latest version of PostgreSQL on your operating system of choice. For reference, we have downloaded version 15.0 on a Windows OS. Installation steps on other operating systems may differ slightly.
Run and follow the installer steps (see this for a full guide). While running the installer, make sure to note the listening port (5432 by default). On the “Components” page, make sure that all of the boxes are checked. This will install:
- pgAdmin – a GUI for working with databases and executing SQL, and
- Stack Builder – a GUI for installing PostgreSQL extensions.
Follow the installer and keep the default configuration. Before clicking “finish” in the setup wizard, check the box to launch Stack Builder at exit. We will use Stack Builder to install spatial extensions like PostGIS.
Install the PostGIS Extension
Once you close the PostgreSQL installer, Stack Builder will open. Follow the GUI prompts until you reach the page with the prompt “Please select the applications you would like to install”. Under “Spatial Extensions” select “PostGIS” then click “Next”.
Wait for the extensions to be downloaded then click “Next” to install. On the following screen, check the box to create a spatial database, then click “Next >”.
Follow the GUI, and enter your password when prompted. You have successfully installed PostgreSQL, pgAdmin, and PostGIS. You’re doing great! Now that everything is set up, let’s take PostGIS for a test drive.
Open pgAdmin. If this is the first time you open pgAdmin, you’ll be prompted to set a new password. Choose something easy to enter. You’ll have to do it often. We chose “postgres”.
The localhost of PostgreSQL running on port 5432 should be set up and available in the sidebar of pgAdmin.with the empty spatial database that Stack Builder created for us.
Let’s try some basic statements then move onto spatial operations. PostGIS has two core datatypes – Geographies and Geometries.
- Geographies represent data on a spheroid and store points, lines, and polygons using angular units of latitude and longitude. This is great because you don’t have to worry about projections at all. However, some processing works better when using units of distance rather than angles.
- Geometries represent points, lines, and polygons on a flat plane using one of many projected coordinate systems. Projected coordinate systems distort shapes and areas. Be sure you use an appropriate coordinate system for your area of analysis. See this article for more information on using projections with PostGIS.
Take a look at this link if you’re deciding whether to use geometries or geographies.
In pgAdmin Browser, select the database, then navigate to the SQL query tab in the upper left-hand corner .
Make sure you have the PostGIS extension installed by running this command:
CREATE EXTENSION postgis;
Click the execute button to run the command.
Let’s test PostGIS by creating a new table and populating it with four point geometries. We can then run a query using a PostGIS Spatial Type (ST) function, “ST_Intersects”, to find the points that lie within a specified polygon geometry.
CREATE TABLE test_points (name varchar, geom geometry);
INSERT INTO test_points VALUES
(‘one-one’, ‘POINT(1 1)’),
(‘two-two’, ‘POINT(2 2)’),
(‘three-three’, ‘POINT(3 3)’),
(‘ten-ten’, ‘POINT(10 10)’);
SELECT name, ST_AsText(geom) FROM test_points WHERE ST_Intersects(geom, ‘POLYGON((0 0, 0 5, 5 5, 5 0, 0 0))’)
The code above creates a new table with fields for name and a geometry, adds four records to the table, then runs the “ST_Intersects” spatial query. The table returned contains three records with coordinates that lie within the specified polygon (0<x<5 and 0<y<5). The point outside of the polygon (10, 10) is not included.
You’ll also see the new table you have created under “Schemas” in the browser tab. See this page for a full list of all ST functions included in PostGIS.
If you have this working, great! Next let’s load some real-life data.
Load Data from OpenStreetMap
OSM is a database of open crowdsourced data for streets, buildings, railroads, walking paths, you name it. This section will load data from OSM to a PostgreSQL database using the osm2pgsql command.
Loading OSM using osm2pgsql requires an additional install. You can download osm2pgsql on your OS of choice here. We are using Windows – so we install the latest binaries by unzipping it to the C drive (or anywhere that it will not be moved/deleted).
Before using the tool on Windows we will need to add it to the Path Environment Variable. In the Windows search bar, look for “Edit the system environment variables”. Add the path to your osm2pgsql-bin folder (the directory that contains osm2pgsql.exe) to the Path system variable.
Test your installation by opening the command prompt and entering “osm2pgsql”. The response should be something like below.
If you see something similar to above, osm2pgsql is installed correctly. The next step is to use that command to load OSM data. We will use OSM data for the city of Lisbon, Portugal. A .pdf (compressed .osm file) is available here. You can also download data from a city of your choice here. You should also download the default OSM style file here.
Move these to a directory of your choosing and run the following line in the command prompt (where we tested osm2pqsql earlier). Replace the path parameters to the locations of your style and pbf files, respectively.
osm2pgsql -c -d postgres -U postgres -H localhost -r pbf -W -S C:\osm2pgsql_guide\default.style C:\osm2pgsql_guide\Lisbon.osm.pbf
You’ll be prompted for your password (we used “postgres”). You should see a return like this.
Open pgAdmin. You should see the tables listed above in your postgres database.
We can query this data with something like this:
SELECT name, highway, surface FROM planet_osm_roads
We can also visualize these data with QGIS. Make a new QGIS project. Under the Browser, right click on “PostgreSQL” and select “New Connection”
Enter a connection name and set the database to “postgres”. After clicking “OK” it will prompt you for your PostgreSQL username and password.
Now the OSM data is available in your QGIS browser.
The image below shows the OSM roads (red) and OSM lines (green) features.
Let’s zoom into the Castelo Sao Jorge, a historic district of the city, to see the details of OSM data.
We have covered how to install PostgreSQL with PostGIS, how to create a spatial database, how to run a spatial query, how to load in data from OpenStreetMap, and how to visualize those data in QGIS. Please let us know if you found this tutorial helpful by tweeting @mapscaping! In a future post we will expand upon this guide by creating a road network using the pgRouting tool so stay tuned.
Happy mapping 😊