GRASS GIS probably doesn’t get the attention it deserves
Markus Neteler is the founder of Mundialis.de. He is also the Chairman of the GRASS GIS Project Steering Committee. GRASS is free and open source geospatial software with a well-documented history and plenty of functionality.
Markus has been using GRASS since the 90s. First, as a student for academic and research purposes, and then as a developer monitoring infectious disease and processing satellite data.
WHAT IS GRASS?
GRASS stands for Geographic Resources Analysis Support System, and it’s a founding member of the Open Source Geospatial Foundation (OSGeo).
It’s open source software that’s been around for over 35 years. The US Army developed it originally, and in the early 90s academia took over the development. Today a global team on GitHub keeps it up to date with the geospatial world, integrating new formats, ideas, and algorithms all the time.
The project is open to anybody and businesses use it as a back end tool for analysis.
It’s a powerful raster, vector, and geospatial processing engine, suited for all kinds of analysis. It’s a toolbox for geospatial topics. It works on a desktop or laptop, as a standalone app, or together with other GIS software.
It’s used as a back end in many projects in QGIS, where you have access to much of the GRASS functionality.
TELL US ABOUT THE UI
The GRASS database is a folder/directory on your desktop or network drive. It’s there for you to organize and store your data.
Most users do this by projects. These are called Locations in GRASS, a legacy name from the good old days.
A Location (or Project) refers to only one geographic place or area of interest, but that area can be as large as the globe.
Or Mars. Mars data is also available.
This is where all your data is stored, and each Location has exactly one coordinate reference system. Mapsets help you organize your data even further.
ISN’T THIS LIMITING?
The system of Locations keeps data clean and avoids mixing. If you need to reproject data you have to consider precision and the method of resampling the raster data, among many other things.
Suppose your data comes from various places and arrives in different coordinate reference systems. In that case, you can reproject it on the fly.
When you open up GRASS, the first thing you’re asked is to choose which Location and Mapset you want to work in.
Coming soon, in GRASS 8.0, we’ll do away with this opening screen and you’ll start as you would with most other GIS—a menu.
WHAT’S THE DATA LIKE IN GRASS?
Because it’s existed since the 80s, GRASS has its own data format. It’s never been standardized or widely exposed at an international level, like GeoTIFF, GeoPackage or shapefile, but it’s similar to those.
Why keep this oldie?
Every format comes with limitations, and this format—a better representation in many regards—minimizes the loss when storing data.
Whatever format you bring to GRASS, it will be reprojected on the fly, and you’re ready to work.
WHAT IS THE VECTOR TOPOLOGY ENGINE OF GRASS?
Let’s start with what topology is for vector data.
Imagine two adjacent parcels—land use parcels, agricultural fields, that sort of thing. They’re next to each other, and they share one border.
In the Simple Features model, you digitize the shared border twice, once for each field. Each parcel has its own complete border.
In a topological model, a shared line is one, not two.
What’s the big deal?
When you digitize your data in a non-topological model, the lines rarely match. You need to be precise when you set your lines and vector points on one another. This doesn’t always happen, and there’s a lot of data out there, such as official sites or administrative boundaries, that hasn’t been appropriately digitized.
Topological errors, overlaps, and slivers are a big problem, especially with politically relevant or cadastral data. Using the topological model and only one line, you reduce the occurrence of errors.
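To make the difference concrete, here is a minimal sketch in plain Python. It is not GRASS’s internal data structure; the parcel coordinates, the 0.02 digitizing offset, and all names are made up for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Simple Features style: every parcel carries its own complete ring,
# so the shared border is digitized and stored twice.
parcel_a_ring: List[Point] = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
# Hypothetical digitizing slip: x = 10.02 instead of 10 on the shared side.
parcel_b_ring: List[Point] = [
    (10.02, 0), (20, 0), (20, 10), (10.02, 10), (10.02, 0)
]


def ring_segments(ring: List[Point]) -> set:
    """Return the undirected segments of a closed ring."""
    return {frozenset((ring[i], ring[i + 1])) for i in range(len(ring) - 1)}


# The two copies of the border do not match, so no segment is shared.
# The gap between x = 10 and x = 10.02 is a sliver.
shared_sf = ring_segments(parcel_a_ring) & ring_segments(parcel_b_ring)

# Topological style: each boundary is stored once and areas refer to it by id.
boundaries: Dict[str, List[Point]] = {
    "b_shared": [(10, 0), (10, 10)],
    "b_a_rest": [(10, 10), (0, 10), (0, 0), (10, 0)],
    "b_b_rest": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
areas: Dict[str, List[str]] = {
    "A": ["b_shared", "b_a_rest"],
    "B": ["b_shared", "b_b_rest"],
}

# Both areas reference the same boundary, so a mismatch cannot occur.
shared_topo = set(areas["A"]) & set(areas["B"])

print("shared segments (Simple Features):", len(shared_sf))
print("shared boundaries (topological):", sorted(shared_topo))
```

Moving a vertex of `b_shared` changes both parcels at once, which is why the one-line model leaves no room for slivers along that border.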
When you import data into GRASS with topological errors, the engine alerts you to the error. It prompts you to either fix it or let the engine fix it with a snapping engine that’s included to remove duplicate lines. You clean your data while importing it.
As a bonus, GRASS can store 3D vector data, such as inclined lines, which could be relevant to many questions later.
Also coming soon, 3D bodies can be stored with topology.
AM I BUILDING A DATA MODEL WHILE CALCULATING THE TOPOLOGICAL ATTRIBUTES FOR EACH FEATURE AS THEY COME IN?
In a Simple Features model, you don’t know the neighbor because you are a self-contained polygon, an area bounded by a closed line.
Is there a neighbor at all? What is it?
No idea.
In a topological model, you know about your neighbors because of the common border between the two adjacent polygons.
When you import the data, GRASS looks at it as a ring, and it follows the ring until it reaches the starting point. GRASS knows what’s left and what’s right, and it looks at a closed ring with the centroid inside.
The option to have multiple attribute tables lets you have essential information, such as the length of the shared borders or other geometry related information.
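The left/right idea can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions, not GRASS’s actual topology engine: each boundary is a hypothetical dictionary holding its coordinates plus the id of the area on its left and on its right, with `None` meaning "outside".

```python
import math
from collections import defaultdict

# Each boundary is stored once, together with the area on either side.
# Left and right are relative to the direction in which the line is stored.
boundaries = [
    {"coords": [(10, 0), (10, 10)], "left": "A", "right": "B"},
    {"coords": [(10, 10), (0, 10), (0, 0), (10, 0)], "left": "A", "right": None},
    {"coords": [(10, 0), (20, 0), (20, 10), (10, 10)], "left": "B", "right": None},
]


def line_length(coords):
    """Sum of the straight segment lengths along a line."""
    return sum(math.dist(p, q) for p, q in zip(coords, coords[1:]))


def shared_border_lengths(boundaries):
    """Map each pair of adjacent areas to the length of their common border."""
    lengths = defaultdict(float)
    for b in boundaries:
        if b["left"] is not None and b["right"] is not None:
            pair = frozenset((b["left"], b["right"]))
            lengths[pair] += line_length(b["coords"])
    return dict(lengths)


def neighbors(area_id, boundaries):
    """Return the ids of all areas that share a boundary with area_id."""
    found = set()
    for b in boundaries:
        sides = (b["left"], b["right"])
        if area_id in sides:
            found.update(s for s in sides if s not in (None, area_id))
    return found


print(neighbors("A", boundaries))
print(shared_border_lengths(boundaries))
```

Answering "who is my neighbor?" becomes a lookup on the boundary records instead of a geometric comparison of two separately digitized rings.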
IS THIS ALL CALCULATED ON INPUT?
Yes.
It’s also immediately generated in the topological format while inside GRASS—digitizing data, vectorizing raster data, or creating a data task.
WHERE CAN I EXPORT MY MODEL TO?
We rely on GDAL/OGR for connecting vector data to the external world.
OGR does the job of translating the topological model back into Simple Features through which you can write out in any format you can imagine—shapefile, GeoPackage, and the list goes on. OGR supports them all.
WHAT ABOUT RASTER DATA?
For raster data, you import a grid or matrix that comes in a GeoTIFF or HDF format. GeoTIFF has its limitations; you cannot attach color tables to floating-point maps.
GRASS has its own 2D raster engine with no such limitations. In fact, it’s a multi-layer engine—you can stack multi-band aerial or satellite imagery or time series from Landsat or Sentinel.
Suppose you look at temperature data coming from climatic stations. You have a stack of raster maps and they refer to daily temperatures in each pixel. In that case, you can quickly compute aggregates and determine average or minimum monthly or annual temperatures.
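As a rough illustration of that kind of aggregation, here is a sketch in plain Python. It does not use GRASS or its temporal tools, and the tiny 2x2 grids and their dates are invented for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical stack of daily temperature maps: date -> small 2x2 grid.
daily_maps = {
    date(2020, 1, 1): [[1.0, 2.0], [3.0, 4.0]],
    date(2020, 1, 2): [[3.0, 4.0], [5.0, 6.0]],
    date(2020, 2, 1): [[10.0, 10.0], [10.0, 10.0]],
}


def aggregate_monthly(maps, func=mean):
    """Aggregate a stack of daily grids per (year, month), pixel by pixel."""
    groups = defaultdict(list)
    for day, grid in maps.items():
        groups[(day.year, day.month)].append(grid)

    result = {}
    for month, grids in groups.items():
        rows, cols = len(grids[0]), len(grids[0][0])
        # For every pixel, apply func to the values of all days in the month.
        result[month] = [
            [func([g[r][c] for g in grids]) for c in range(cols)]
            for r in range(rows)
        ]
    return result


monthly_mean = aggregate_monthly(daily_maps, mean)
monthly_min = aggregate_monthly(daily_maps, min)
print(monthly_mean[(2020, 1)])
print(monthly_min[(2020, 1)])
```

Swapping `func` for `max` or `min` gives other aggregates, and grouping by year instead of month works the same way.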
GRASS’s voxel engine is raster 3D. You can store volumetric xyz pixels—perfect for storing soil profiles, atmospheric profiles, or even medical scans. You’re not limited to environmental data; you can import brain scans, reconstruct a 3D volumetric model, and visualize it.
WHAT CAN YOU DO WITH 2D AND 3D?
For raster and vector 2D data, there’s an impressive application set.
For volumetric raster and 3D vector data, the application set is limited because of its complexity.
For example, you can compute the instant influx of energy on the surface with GRASS tools for solar radiation.
If you have an elevation model, you can compute how much energy will reach it, and correct for clouds with available cloud cover data.
If you have a surface model featuring houses, you can include the inclination of their roofs. Thus, it’s not a pure 3D application; it’s more 2.5D: a surface with houses and objects on it. Surface flow, hydrological, and flood modeling can include buildings too.
Groundwater flow modeling is done by the true 3D raster engine.
WHAT’S THE GRASS ECOSYSTEM LIKE?
It’s a friendly one.
To start off, rather than importing raster data, which is easy to accumulate, you can register the data to avoid duplication of disk space consumption.
How?
Tell GRASS that you have a file, perhaps a GeoTIFF. It registers the file as a link and considers it a true raster map being imported—but it is not. It is sitting somewhere else, and thus the speed penalty is low.
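A small sketch of that choice from Python, assuming the GRASS tools `r.external` (link a file) and `r.import` (copy it in) with their `input` and `output` parameters. The helper names, file path, and map name are hypothetical, and `run` only works inside a running GRASS session.

```python
def raster_ingest_command(path, name, link=True):
    """Return (tool, parameters) for linking or importing a raster file.

    link=True  -> register the file in place (no copy on disk)
    link=False -> import a full copy into the GRASS database
    """
    tool = "r.external" if link else "r.import"
    return tool, {"input": path, "output": name}


def run(tool, params):
    """Execute the tool. Requires an active GRASS session."""
    # Imported lazily so the sketch can be read and tested without GRASS.
    import grass.script as gs

    gs.run_command(tool, **params)


# Hypothetical file and map name, for illustration only.
tool, params = raster_ingest_command("/data/dem.tif", "dem", link=True)
print(tool, params)
```

Linking keeps a single copy of the data on disk; importing is the better fit when the source may move or disappear.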
Of course, you can save the file in GRASS. That might be the case when, for analytical reasons, you want to retrieve data into your own database from the remote OGC service.
You can read data from WMS (for raster data), WFS (for vector data), and Web Coverage Service (WCS). You can show such data in the display as a backdrop map.
GRASS talks to PostGIS. The OGR interface translates between the topological GRASS engine and other non-topological engines and databases out there.
WHAT ABOUT PROGRAMMING INTERFACES TO GRASS?
A third of GRASS is written in Python. It’s a key part of the system.
Several interfaces are available. You can write your Python script, install the grass-session library from PyPI with pip, import it, and use the functionality straight away. It’s handy.
Suppose you want to develop additional functionality or combine existing tools. In that case, you can write your GRASS script with Python, Shell, C, C++, Octave, or even PHP.
Virtually no limitations.
WHAT DOES THE PARSER DO?
When you say,
“Execute this functionality and use this input map,”
you enter a command into the terminal and the system has to read it.
What’s doing the communication?
It’s done by a parser.
GRASS comes with its own powerful parser that understands parameters and distinguishes them from flags. It doesn’t mind if you mix the order. For each parameter, it knows whether it’s mandatory, and it fills in a default value where none is given. Usually, people go with the default value and the parser handles that.
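The behavior described here can be imitated with a toy parser in plain Python. This is emphatically not the GRASS parser, just a sketch of the idea; the `SPEC` dictionary is a hypothetical description loosely modelled on a slope tool.

```python
# Hypothetical tool description: two mandatory parameters, one with a
# default value, and a single one-letter flag.
SPEC = {
    "params": {
        "elevation": {"required": True},
        "slope": {"required": True},
        "format": {"required": False, "default": "degrees"},
    },
    "flags": {"a"},
}


def parse_command(argv, spec):
    """Parse key=value parameters and -x flags given in any order."""
    # Start with the defaults, then let the user's tokens override them.
    params = {
        name: opt["default"]
        for name, opt in spec["params"].items()
        if "default" in opt
    }
    flags = set()

    for token in argv:
        if "=" in token:
            key, value = token.split("=", 1)
            if key not in spec["params"]:
                raise ValueError(f"unknown parameter: {key}")
            params[key] = value
        elif token.startswith("-"):
            for letter in token[1:]:
                if letter not in spec["flags"]:
                    raise ValueError(f"unknown flag: -{letter}")
                flags.add(letter)
        else:
            raise ValueError(f"cannot interpret token: {token}")

    missing = [
        name
        for name, opt in spec["params"].items()
        if opt.get("required") and name not in params
    ]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    return params, flags


# The order of flags and parameters does not matter.
print(parse_command(["-a", "slope=myslope", "elevation=dem"], SPEC))
print(parse_command(["elevation=dem", "slope=myslope", "-a"], SPEC))
```

Because the tool description is data, the same `SPEC` could also drive generated help text or a dialog, which is the point the following paragraphs make about the real parser.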
Include the parser in whatever programming language you use. It will do an awful lot more than just understanding what the user wants. It can generate output and the description of the script.
If you want to do something that’s already out there without writing the thing yourself, you can call up an existing script with the parameter. It will output a script in R or Python. Copy and paste from there and continue to extend with your own functionality.
Actinia is software available on GitHub. When you generate process chains and use them in the cloud, you don’t need to write everything yourself. You can generate JSON instead: it will output the script you wrote in Shell or Python as JSON, and you can drop it to the cloud.
The description of how the button should look and where it should be is also done inside the parser.
The UI partially exists, but a lot of it, like all the buttons, is generated on the fly.
CAN I TELL THE COMMAND LINE TO GIVE ME “A” IN PYTHON, “B” IN JSON, OR EXPORT “C” AS SOMETHING ELSE?
Yes.
During your first demo session in GRASS, you get a button and a choice to see a road map, elevation model, and others.
You click on a button in a menu that’s displayed in a dialogue box. You also see a copy button, which exposes the command line.
The graphical user interface is also using the command line internally, only it’s hidden.
You could copy over this command and put it into the terminal and run it. It will do exactly the same thing. It’s a Shell Command. If you then want to see it Python style, it will write out this command in Python, and you copy it over to the Python script. Same for JSON, XML, HTML and so on.
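To show how mechanical that translation is, here is a sketch in plain Python that turns a shell-style command into a Python call and into JSON. It is an illustration, not the converter GRASS ships; the example command and map names are assumptions, `gs` stands for the `grass.script` module, and the JSON layout is invented for the example rather than taken from any official schema.

```python
import json
import shlex


def parse_shell(command):
    """Split a shell-style command into tool name, parameters, and flags."""
    tokens = shlex.split(command)
    tool, rest = tokens[0], tokens[1:]
    params, flags = {}, ""
    for token in rest:
        if "=" in token:
            key, value = token.split("=", 1)
            params[key] = value
        elif token.startswith("-") and not token.startswith("--"):
            flags += token[1:]
        else:
            raise ValueError(f"cannot interpret token: {token}")
    return tool, params, flags


def to_python(command):
    """Render the command as a grass.script-style Python call."""
    tool, params, flags = parse_shell(command)
    args = [repr(tool)]
    if flags:
        args.append(f"flags={flags!r}")
    args += [f"{key}={value!r}" for key, value in params.items()]
    return f"gs.run_command({', '.join(args)})"


def to_json(command):
    """Render the command as JSON (illustrative layout only)."""
    tool, params, flags = parse_shell(command)
    doc = {
        "module": tool,
        "inputs": [{"param": k, "value": v} for k, v in params.items()],
    }
    if flags:
        doc["flags"] = flags
    return json.dumps(doc, indent=2)


command = "r.slope.aspect -a elevation=dem slope=myslope"
print(to_python(command))
print(to_json(command))
```

The same parsed structure feeds every output format, so adding XML or HTML would be one more rendering function.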
With the graphical model in the user interface, you can graphically combine commands and create your workflow. Import data, generate a report, store that as a plot or a CSV file, or export to Python.
You can now write your first-ever Python script. And this can then be re-run, as a standard command.
IS THERE A WAY TO VISUALIZE DATA IN GRASS?
Yes. GRASS Monitors do that.
As with other GIS software, it’s a graphical output interface. Buttons, legend, map, styling, and so forth. The display is also like any other GIS software; there’s a 3D viewer to display your data or volumetric data.
If you digitize with a digitizer, you can check on the fly—it will flag if your data is topologically correct or not. Red being incorrect, green being correct. You can query data, plot time series, or do all kinds of plotting.
If you’d prefer to view your data elsewhere, you export to ParaView, which is a powerful open source visualization tool. You can use the interface to R or QGIS or run it in Jupyter Notebook. Connect to all the plotting tools such as Matplotlib, Octave, and so on. Or do the visualization right away in GRASS.
Earlier this year, the NASADEM—the reprocessing of the SRTM elevation model—was completed. It’s about 250 gigabytes of data.
Importing it into GRASS takes a bit of time and also depends on your hardware’s speed.
But then you get the entire map in mere seconds in the GRASS display.
How does it happen so fast?
GRASS has an on-the-fly reduction tool, also known as lazy computation in other systems. You can reduce the resolution for the follow-up computation and try something quickly on a huge data set without waiting around forever.
When you’re sure everything’s in order and you will get the results you expect, you set it back to the original data set resolution, and you can retrieve the result.
The display is also reduced on the fly. If you have a monitor with a limited number of pixels, there is no point in displaying 30-meter pixels of a global dataset. You won’t see the difference.
That’s how it’s fast.
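The idea of reading a big grid at a coarser resolution can be sketched in plain Python with nearest-neighbor sampling. This only illustrates the principle under simple assumptions and is not how GRASS implements it; the function name and the 4x4 grid are made up.

```python
def resample_nearest(grid, out_rows, out_cols):
    """Read a grid at a coarser (or finer) resolution.

    For each output cell, take the value of the input cell that contains
    the output cell's center. No averaging is done, so the cost depends on
    the output size, not on the size of the source grid.
    """
    in_rows, in_cols = len(grid), len(grid[0])
    out = []
    for r in range(out_rows):
        src_r = min(in_rows - 1, int((r + 0.5) * in_rows / out_rows))
        row = []
        for c in range(out_cols):
            src_c = min(in_cols - 1, int((c + 0.5) * in_cols / out_cols))
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


# A hypothetical 4x4 "full resolution" grid with values 0..15.
full = [[r * 4 + c for c in range(4)] for r in range(4)]

# Quick look at half the resolution, e.g. for a small display or a test run.
preview = resample_nearest(full, 2, 2)
print(preview)
```

Once the quick run looks right, asking for the original number of rows and columns returns the data unchanged.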
WHO IS GRASS FOR?
It’s for data scientists who want to analyze vast amounts of data that may come from heterogeneous sources. GRASS funnels the data into one system and does the cleanup at the same time. It’s interoperable and you can bring your data into different final visualization systems.
GRASS is not limited to academia. It’s for anyone who needs simple to complex geospatial processing in whatever format or data model—raster, vector, 2D, 3D, and time series. A wide variety of businesses are using GRASS for data extraction, and provision of conclusions or analysis on top of extensive data.
It’s a reliable tool for high-level abstraction for topical applications: fire simulation, solar energy, hydrology, geology, etc. There’s no need to reinvent the wheel. Just use the tool that’s been tested in many environments and optimized for speed.
WHERE IS GRASS HEADING IN THE NEXT FIVE YEARS?
It’s well settled in the open source sphere and it’s staying there.
It’s an excellent tool to draw conclusions from data. Because of its modular architecture and flexibility, it’s ideal as a back end solution.
As people realize the spatial components of their problems, they’ll need software to address that.
We will continue to develop the graphical user interface and other interfaces in ways accessible to many.
GRASS has a ton of functionality. It’s free. It’s open source. You can download and start experimenting with it.