Spatial clustering with PostGIS
Here are a few examples of how you can use PostGIS to perform spatial clustering:
DBSCAN clustering
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is an algorithm that clusters points based on density. Points in a high-density region are considered to be part of the same cluster, while points in a low-density region are considered to be noise (i.e., not part of any cluster).
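The ST_ClusterDBSCAN function in PostGIS takes the geometry column plus two parameters: eps and minpoints. eps is the neighborhood radius, that is, the maximum distance at which two geometries count as neighbors, and minpoints is the minimum number of geometries (counting the geometry itself) that must lie within eps of a geometry for it to act as the core of a cluster.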
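ST_ClusterDBSCAN is a window function, so it must be called with an OVER clause, and its result cannot be used in a GROUP BY in the same SELECT. For example, in the following query:
SELECT geom, ST_ClusterDBSCAN(geom, eps := 0.1, minpoints := 3) OVER () AS cluster_id
FROM points;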
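a point with at least 3 points (itself included) within 0.1 units becomes a core point. Core points within 0.1 units of each other fall into the same cluster, together with any other point that lies within 0.1 units of a core point. Distances are measured in the units of the geometry's spatial reference system. Points that do not meet these criteria are considered noise and are assigned a NULL value for cluster_id.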
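Because the cluster id comes from a window function, summarizing clusters takes a second step: compute the ids in a subquery, then aggregate in the outer query. The following is a minimal sketch, assuming the points table with a geom column used in this section; the output column names n_points and hull are illustrative only.

```sql
-- Step 1 (inner query): label every point with a DBSCAN cluster id.
-- Step 2 (outer query): aggregate per cluster, skipping noise (NULL ids).
SELECT cluster_id,
       count(*)                        AS n_points,  -- points in the cluster
       ST_ConvexHull(ST_Collect(geom)) AS hull       -- outline of the cluster
FROM (
    SELECT geom,
           ST_ClusterDBSCAN(geom, eps := 0.1, minpoints := 3) OVER () AS cluster_id
    FROM points
) AS clustered
WHERE cluster_id IS NOT NULL
GROUP BY cluster_id
ORDER BY cluster_id;
```

Dropping the WHERE clause would add one extra row that groups all noise points under a NULL cluster_id.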
K-means clustering
K-means clustering is an algorithm that divides a set of points into a specified number of clusters by assigning each point to the cluster whose centroid (mean) is nearest, so that points end up close to the center of their own cluster.
The ST_ClusterKMeans function in PostGIS takes the geometry column and the number of clusters to create. It is a window function, so it needs an OVER clause. For example, in the following query:
SELECT geom, ST_ClusterKMeans(geom, 2) OVER () AS cluster_id FROM points;
the points in the points table will be divided into 2 clusters, with each point receiving the cluster_id of the centroid it is closest to.
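To inspect the result, the cluster ids can be computed in a subquery and then aggregated. The sketch below assumes the same points table with a geom column; the aliases n_points and center are illustrative. For point data, the centroid of the collected members is their mean position, which approximates the center K-means converged to.

```sql
-- Inner query: assign each point to one of 2 K-means clusters.
-- Outer query: report the size and mean position of each cluster.
SELECT cluster_id,
       count(*)                      AS n_points,
       ST_Centroid(ST_Collect(geom)) AS center
FROM (
    SELECT geom,
           ST_ClusterKMeans(geom, 2) OVER () AS cluster_id
    FROM points
) AS clustered
GROUP BY cluster_id
ORDER BY cluster_id;
```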
Voronoi diagram
A Voronoi diagram is a partitioning of a plane into regions based on distance to a set of points. Each region represents the area of the plane that is closest to a particular point.
PostGIS computes Voronoi diagrams with the ST_VoronoiPolygons function. It takes a single geometry whose vertices are the input points, so a set of rows has to be gathered into one geometry first, for instance with ST_Collect. For example, in the following query:
SELECT ST_VoronoiPolygons(ST_Collect(geom)) AS geom FROM points;
ST_VoronoiPolygons returns a GeometryCollection containing one polygon for each distinct input point, with each polygon representing the region of the plane that is closest to that point. The polygons do not overlap, and together they cover an envelope somewhat larger than the extent of the input points.
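In practice it is often useful to have one row per cell, paired with the point that generated it. The following is a rough sketch using ST_VoronoiPolygons, the PostGIS function that builds Voronoi cells, and again assuming a points table with a geom column; the names cells, cell, and point are made up for the example.

```sql
-- Build the diagram once, then split the resulting collection into one row per cell.
WITH cells AS (
    SELECT (ST_Dump(ST_VoronoiPolygons(ST_Collect(geom)))).geom AS cell
    FROM points
)
-- Each cell contains exactly the input point it was generated from.
SELECT p.geom AS point,
       c.cell
FROM points AS p
JOIN cells  AS c ON ST_Contains(c.cell, p.geom);
```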
For example, suppose we have the following set of points:
POINT(0 0)
POINT(1 0)
POINT(1 1)
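The Voronoi diagram created from these points has three cells that meet at (0.5, 0.5), the one location that is equidistant from all three points. The cell boundaries lie on the perpendicular bisectors between pairs of points: x = 0.5 separates the cells of POINT(0 0) and POINT(1 0), y = 0.5 separates the cells of POINT(1 0) and POINT(1 1), and x + y = 1 separates the cells of POINT(0 0) and POINT(1 1).
Each cell extends outward from (0.5, 0.5) to the edge of the diagram's bounding envelope and represents the region of the plane that is closest to one of the input points.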
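The three-point example can be tried without creating a table by passing the points as a MULTIPOINT literal. This is a small sketch rather than a reference result: the exact outer coordinates of the cells depend on the envelope PostGIS chooses, so no output is shown here. The aliases cell and cell_wkt are illustrative.

```sql
-- Build the diagram for the three example points and print each cell as WKT.
SELECT ST_AsText(cell.geom) AS cell_wkt
FROM ST_Dump(
         ST_VoronoiPolygons(
             ST_GeomFromText('MULTIPOINT((0 0), (1 0), (1 1))')
         )
     ) AS cell;
```

The query should return three rows, one polygon per input point.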
For more information on PostgreSQL and PostGIS check out these podcast episodes