Navigating Geospatial Open Source: A Guide to an OGC Stack
As a technical discipline that relies heavily on high-resolution imagery, powerful software, and enormous amounts of robust data, the geospatial realm appears to be a field with a high barrier to entry and a steep price tag. The commercial satellite imagery upon which geospatial analysis relies can be challenging to acquire. Adjacent industries like engineering and graphic design are also infamous for their pricey software.
Yet GIS has a unique advantage: a thriving open source community. The online forums, the various boards and committees, the shareable servers and storage mechanisms, and the crowd-sourced and community-vetted databases are all critical to the evolution of the technology itself, and to the considerable body of publicly available data and resources.
With such a vast variety of open source technology at the fingertips of every user, where do we begin? How can we know what to trust, and how to leverage it together with data and software we already have?
GIS Data and Software Standards
This is where the Open Geospatial Consortium (OGC) standards come in. OGC is a global organization made up of hundreds of members who are all committed to developing and adhering to standards for geospatial data definitions, formatting, sharing, and storage. Unlike other governing bodies, OGC doesn’t take a top-down approach to regulation; rather, everyone has the chance to be involved.
“The implementers are with you in the room…it’s very, very community-engaged,” says Dr. Nadine Alameh, CEO of OGC. “That’s how standards are evolving.”
The pace of this data revolution is too fast for even large companies like Apple, Google, and Microsoft to develop standards for rapidly evolving formats on their own. “Nobody can do it alone,” says Dr. Alameh. “Even Google needs other people’s data.”
That data needs to be hosted somewhere. For such a wide audience, full of diverse users with different needs, it is crucial to be able to access and share data universally.
Relational Databases
Traditionally, GIS users would house their data in nested folders, and as John Bryant, founder of Mammoth Geospatial, says, there’s no shame in that. But shapefiles and other traditional spatial data formats have their limitations: structurally, spatially, and computationally. The alternative is a relational database management system like PostgreSQL, together with its spatial extension, PostGIS.
First and foremost, relational databases allow multiple users to edit simultaneously, saving time and allowing for better collaboration and adherence to OGC standards. Plus, by leveraging Structured Query Language (SQL), users can extract meaningful insights from data that may otherwise be too large to make sense of.
Use PostGIS to explore and summarize large volumes of geospatial data, combine and compare attributes, and develop highly advanced “views” to be visualized in desktop GIS software or even shared with non-GIS users.
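To make the idea of a summarizing “view” concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 module purely as a stand-in, so that it runs without a database server; it is not PostGIS. The parcels table, its columns, and the sample rows are hypothetical. In PostGIS, the area would more likely be computed from a geometry column with a function such as ST_Area rather than stored as a plain number.

```python
import sqlite3

# Hypothetical sample rows: (parcel_id, land_use, area_ha).
SAMPLE_PARCELS = [(1, "forest", 12.5), (2, "forest", 7.5), (3, "urban", 3.0)]


def summarize_land_use(rows):
    """Load parcel rows into an in-memory database and query a summary view."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE parcels ("
        "parcel_id INTEGER PRIMARY KEY, land_use TEXT, area_ha REAL)"
    )
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", rows)
    # The view stores the query, not the results, so it always reflects
    # the current contents of the parcels table.
    conn.execute(
        "CREATE VIEW land_use_summary AS "
        "SELECT land_use, COUNT(*) AS parcel_count, SUM(area_ha) AS total_area_ha "
        "FROM parcels GROUP BY land_use"
    )
    result = conn.execute(
        "SELECT land_use, parcel_count, total_area_ha "
        "FROM land_use_summary ORDER BY land_use"
    ).fetchall()
    conn.close()
    return result


summary = summarize_land_use(SAMPLE_PARCELS)
for land_use, parcel_count, total_area_ha in summary:
    print(land_use, parcel_count, total_area_ha)
```

A non-GIS colleague could query land_use_summary like any other table, without needing to know how the underlying parcels are stored.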
Web Services
Building on top of relational databases, web server packages are the next step in sharing and accessing data.
GeoServer is an open source, Java-based server that allows users to publish data to the web for public or moderated access. “It’s a platform if you want, but it’s also a product,” says Simone Giannecchini, founder and director of GeoSolutions, a company that is among the core contributors to GeoServer.
In terms of inputs, GeoServer supports PostgreSQL, Oracle, and other databases, as well as file formats such as shapefiles and GeoPackages. “The number of plugins that GeoServer makes available, and the number of formats that it can serve, is actually pretty extensive,” says Giannecchini.
GeoNode, another open source platform and content management system, is also a key part of the GeoSolutions portfolio. Built on top of a GeoServer instance and a PostGIS database, it enables users to search data, access the data via OGC web services (e.g., WMS and WFS), and add data to their maps; think of it as an open source version of ArcGIS Online. Since GeoNode is integrated with MapStore, users can visualize and edit data online, and that data can be shared and validated by other users, or groups of users.
Considering that GeoServer and GeoNode integrate so many data formats, it is important to pre-process the data in software such as QGIS before it’s shared with the world, in order to make sure it functionally aligns with the OGC standards for that data type.
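As an illustration of what accessing data through a service like WFS looks like in practice, the sketch below assembles a WFS 2.0.0 GetFeature request URL. The host example.org, the path /geoserver/ows, and the layer name demo:roads are placeholders, not real endpoints, and the JSON output format is an assumption about what the server has enabled.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, max_features=10):
    """Build a WFS 2.0.0 GetFeature URL for one feature type."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,      # WFS 2.0.0 parameter name for the layer
        "count": max_features,       # limit how many features come back
        # Assumption: the server offers JSON output for this layer.
        "outputFormat": "application/json",
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
url = wfs_getfeature_url("https://example.org/geoserver/ows", "demo:roads", 5)
print(url)
```

Because the request is just a URL with standardized parameters, any client that understands WFS can ask for the same features, regardless of which software sits behind the endpoint.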
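One small example of such a check: in the OGC Simple Features model, a polygon ring must be closed, meaning its first and last positions are identical. The sketch below tests for that and for a minimum of four positions. It is an illustration only, with hypothetical function names, and it is no substitute for the geometry validity tools that QGIS itself provides.

```python
def ring_problems(ring):
    """Return a list of problems found in one polygon ring.

    The ring is a list of (x, y) tuples.
    """
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 positions")
    if ring and ring[0] != ring[-1]:
        problems.append("ring is not closed")
    return problems


def close_ring(ring):
    """Return a copy of the ring, repeating the first position at the end if needed."""
    if ring and ring[0] != ring[-1]:
        return list(ring) + [ring[0]]
    return list(ring)


# A triangle whose closing position was left off.
open_ring = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(ring_problems(open_ring))
print(ring_problems(close_ring(open_ring)))
```

A full validity check covers much more, such as self-intersections and ring orientation, which is why a dedicated tool is the better choice for real data.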
Desktop Software
Chances are you’re familiar with QGIS, an open source geospatial desktop application available for practically every major operating system. It can be used to generate, modify, visualize, and analyze geospatial data, and it is compatible with nearly all spatial data formats.
In addition to allowing users to work with local data, QGIS is equipped with a simple tool for connecting to data servers (via WMS, WCS, WFS, etc.); for example, once the data you’re using is published through GeoServer, it is just a matter of inputting the link to that instance, and the data will immediately be accessible.
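Behind the scenes, a client like QGIS learns what a WMS server offers by requesting its capabilities document, an XML listing of the available layers. The sketch below parses a hand-written, heavily trimmed WMS 1.3.0 capabilities fragment and lists the named layers. The layer names and titles are invented for illustration; a real document contains far more detail.

```python
import xml.etree.ElementTree as ET

WMS_URI = "http://www.opengis.net/wms"  # XML namespace used by WMS 1.3.0
NS = {"wms": WMS_URI}

# Hand-written, trimmed sample; not output from a real server.
SAMPLE_CAPABILITIES = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Demo server</Title>
      <Layer queryable="1">
        <Name>demo:vegetation</Name>
        <Title>Vegetation classes</Title>
      </Layer>
      <Layer queryable="1">
        <Name>demo:roads</Name>
        <Title>Roads</Title>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def list_layers(capabilities_xml):
    """Return (name, title) pairs for every layer that has a Name element."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for layer in root.iter(f"{{{WMS_URI}}}Layer"):
        name = layer.find("wms:Name", NS)
        title = layer.find("wms:Title", NS)
        # Grouping layers without a Name cannot be requested directly.
        if name is not None:
            layers.append((name.text, title.text if title is not None else ""))
    return layers


for name, title in list_layers(SAMPLE_CAPABILITIES):
    print(name, "-", title)
```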
In the example below, a dataset published via GeoServer is being displayed in QGIS. By right-clicking on the WMS/WMTS option in the Browser panel, the user can add a connection to an online WMS service; displayed here is the U.S. National Vegetation Classification, which would be far too large to download and use locally. GeoServer and the WMS feature of QGIS make it possible to work with the data without taking up valuable hard drive space and RAM.
There is really no shortage of projects that can be tackled using open source GIS. From finding your data via GeoNode and publishing it via GeoServer to extracting valuable statistics in PostGIS and conducting analysis in QGIS, all the open source tools work hand-in-hand, thanks to the oversight and organization of OGC standards.
If you’re interested in taking any of these tools for a test drive before downloading them, OSGeoLive, a project of the non-profit Open Source Geospatial Foundation (OSGeo), presents an opportunity to do so. It is available via DVD, USB drive, or VM and includes all the aforementioned software, systems, and services for you to explore and experiment with in one easy-to-set-up, self-contained package.
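The reason a WMS layer stays light on local resources is that the server renders the map and sends back only an image of the requested extent. A rough sketch of such a GetMap request follows. The endpoint and layer name are placeholders, and the bounding box is an arbitrary longitude/latitude rectangle chosen for illustration.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in CRS:84 axis order.
    """
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",                # empty value asks for the default style
        "crs": "CRS:84",             # longitude/latitude axis order
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,              # output image size in pixels
        "height": height,
        "format": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
map_url = wms_getmap_url(
    "https://example.org/geoserver/ows",
    "demo:vegetation",
    (-125.0, 24.0, -66.0, 50.0),
)
print(map_url)
```

However large the source dataset is, the response is a single image at the requested pixel size, which is what keeps the desktop client responsive.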