Hex Tiles – The Data Tiling System Built for Analysis
This episode is about Hex Tiles, a unique map tiling system. The guest speaker is Shan He, one of the creators of Hex Tiles. She is also known for creating Kepler GL, an open-source, browser-based geospatial visualization application, which she developed while working on Uber’s visualization team. Shan is currently Senior Director of Engineering at Foursquare, where she leads the geospatial platform Unfolded.
Why Do We Need to Tile Data?
When handling a large amount of data, it is not feasible to load all of it into a browser at once.
Tiling systems were created as a way of abstracting data so that only the data needed on the client side is loaded.
For instance, a web map loads data only for the area the viewer is currently looking at, and a geoprocessing tool pulls data only for a specific extent.
For a map of the Earth, the browser does not load every single feature on the ground; it loads only a high-level geometry for the area the viewer is interested in.
For example, at the full scale of the United States, only interstate highways may be visible in a streets dataset. At the state-wide scale, state highways would also be visible. Tiling is a way to encode large-scale geospatial data efficiently by breaking it down into hierarchical pieces (tiles).
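The idea of loading only the tiles that cover the current view can be sketched in a few lines of Python. The source does not say which tile addressing scheme is meant, so this sketch assumes the widely used XYZ ("slippy map") scheme on Web Mercator; the function names are illustrative only.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a point.

    Assumes the common Web Mercator "slippy map" scheme, where zoom
    level z splits the world into 2**z by 2**z tiles.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tiles_for_bbox(west, south, east, north, zoom):
    """List the (zoom, x, y) tiles needed to cover a viewport."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]
```

A client would call something like tiles_for_bbox with the current map extent and zoom level, then request only those tiles. A higher zoom level means smaller tiles with more detail, which is the hierarchy described above.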
Raster and Vector Tiling Systems
Raster tiles were the earliest version of modern tiles. They are image-based tiles that encode data into every single pixel, in the RGB (colour) channels of the image.
Raster tiles are effective and lightweight because they are usually sent to the client as an image. All that remains is to stream the image into the browser.
A major downside of raster tiles is that they are less interactive than the source data. Because raster tiles are rendered at fixed resolutions, the image blurs when zooming between them. It is hard to transition smoothly from a lower resolution to a higher one, and there is often a wait for a new image to load before the blur clears.
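To make the idea of encoding data in a pixel's RGB channels concrete, here is a minimal sketch. It uses the Terrain-RGB convention for elevation (a base of -10000 m and a step of 0.1 m) purely as an illustration; the source does not name a specific encoding, and the function names are hypothetical.

```python
def encode_elevation(meters):
    """Pack an elevation in meters into one (R, G, B) pixel.

    Illustrative only: follows the Terrain-RGB convention of a
    -10000 m base and 0.1 m steps, spread over three 8-bit channels.
    """
    value = round((meters + 10000.0) * 10)
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF


def decode_elevation(r, g, b):
    """Recover the elevation in meters from one (R, G, B) pixel."""
    return -10000.0 + (r * 256 * 256 + g * 256 + b) * 0.1
```

Because each pixel holds one value at one fixed resolution, the client has to fetch a new image to see more detail, which is the source of the blur described above.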
A vector tiling system was later developed to cover the gaps in raster tiles. Vector tiles encode data in vector form. They load the precise shape of each feature, so the geometry transitions to higher resolutions much more smoothly than with raster tiles.
What Are Hex Tiles?
Hex Tiles are an analytics-focused geospatial tiling system built on top of H3. The base unit of a Hex Tile is a hexagon-shaped cell whose address is encoded in H3. The finest cell size that can be created is H3 resolution 15 (Hex 15), which is about half a meter in radius.
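For a sense of scale, the number of cells in the underlying H3 grid can be computed directly. This is a property of H3 itself, not something stated in the episode: H3 has 122 base cells at resolution 0, and each finer resolution has roughly seven times as many cells. The function name below is illustrative.

```python
def h3_cell_count(resolution):
    """Total number of H3 cells covering the globe at a resolution.

    H3 defines resolutions 0 (coarsest) through 15 (finest).
    """
    if not 0 <= resolution <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7 ** resolution


# Print the cell count at the coarsest and finest resolutions.
for res in (0, 15):
    print(res, h3_cell_count(res))
```

The count at resolution 15 runs into the hundreds of trillions, which is why data is normally aggregated into coarser cells for analysis.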
Analysis within Hex Tiles is a lot more efficient, especially for analyses that benefit from aggregated geometries.
Compared with square grid cells, traversing hexagon-shaped cells is a lot more consistent because the distance from one hexagon to each of its six neighbours is the same.
In a square grid, the distance from a cell to the four neighbours that share its edges is different from the distance to the four neighbours at its vertices. This makes traversal calculations a lot more complicated because, when performing tasks like calculating distance or smoothing, the four diagonal neighbours have to be treated differently from the four edge neighbours.
This affects cost distance and cost accumulation workflows, which are significant in hydrology and other geographic modelling use cases.
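The neighbour-distance argument can be checked numerically. The sketch below works on an idealized flat plane with regular hexagons in axial coordinates, which is a simplification: real H3 cells on the globe vary slightly in size and shape. All names are illustrative.

```python
import math

# Offsets to the six neighbours of a hexagon in axial (q, r) coordinates.
HEX_NEIGHBOURS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# Offsets to the eight neighbours of a square cell (edges and vertices).
SQUARE_NEIGHBOURS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_centre(q, r, size=1.0):
    """Planar centre of a pointy-top hexagon with the given circumradius."""
    return size * math.sqrt(3) * (q + r / 2), size * 1.5 * r


def hex_neighbour_distances(size=1.0):
    """Centre-to-centre distances from the origin hexagon to its neighbours."""
    return [math.hypot(*hex_centre(q, r, size)) for q, r in HEX_NEIGHBOURS]


def square_neighbour_distances(size=1.0):
    """Centre-to-centre distances from a square cell to its eight neighbours."""
    return [math.hypot(dx * size, dy * size) for dx, dy in SQUARE_NEIGHBOURS]


def distinct(values, ndigits=9):
    """Distinct values after rounding away floating-point noise."""
    return sorted({round(v, ndigits) for v in values})


print(distinct(hex_neighbour_distances()))     # one distinct distance
print(distinct(square_neighbour_distances()))  # two distinct distances
```

With a single neighbour distance, a cost-accumulation or smoothing step can weight every neighbour the same way. On a square grid, the diagonal neighbours need a separate weight.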
How to Create Hex Tiles
Hex Tiles can be created from raw data in two ways. One way is to upload the raw data to Unfolded’s cloud and use a Hex operation to specify how the aggregation should be done. This kicks off a pipeline that builds a Hex Tile in the Unfolded Studio data portal.
The other way is to use the Unfolded Data SDK, a Python SDK that can be installed in a data science workbench and used to post data to Unfolded’s cloud. After the data is associated with its latitude and longitude and the aggregation method is specified, kicking off the pipeline builds a Hex Tile in the client’s Unfolded data portal.
Why Hex Tiles Are Better for Analytical Tasks
Even though raster and vector tiles encode precise geometries, they are cumbersome to work with in analytical tasks where the interest is in the aggregated attributes of the geometry rather than in individual geometries.
Think of wanting generalized data or analysis for a neighbourhood or block group, rather than a single address.
With raster and vector tiles, we usually must first convert the geometry into a unified shape, calculate the average or sum over all the geometries, then join the result back to the original dataset. As a result, it is not possible to do quick analytics on raster or vector tiled data. Hex Tiles were designed to overcome this when the goal is quick and efficient statistical analysis.
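The benefit of a shared cell grid can be sketched in plain Python. Once every record carries the ID of the cell it falls in, aggregation becomes a group-by and joining datasets becomes a key lookup, with no geometry operations involved. This is not the Unfolded pipeline, only an illustration; the cell IDs, field names, and helper functions are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations, each already tagged with a cell ID.
# "cell_a" and "cell_b" are placeholders, not real H3 addresses.
observations = [
    {"cell": "cell_a", "value": 10.0},
    {"cell": "cell_a", "value": 14.0},
    {"cell": "cell_b", "value": 3.0},
]

# A second hypothetical dataset keyed by the same cell IDs.
population = {"cell_a": 1200, "cell_b": 300}


def aggregate_by_cell(rows, key="cell", field="value"):
    """Group rows by cell ID and compute count, sum, and mean per cell."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[field])
    return {
        cell: {"count": len(vals), "sum": sum(vals), "mean": mean(vals)}
        for cell, vals in groups.items()
    }


def join_on_cell(stats, other, name):
    """Attach a column from another cell-keyed dataset to the statistics."""
    return {
        cell: {**values, name: other.get(cell)}
        for cell, values in stats.items()
    }


summary = join_on_cell(aggregate_by_cell(observations), population, "population")
```

Here summary holds one row per cell with the aggregated statistics and the joined population value, all produced without touching a polygon.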
Are Hex Tiles Open Source or Proprietary?
The Hex tiling system is not open source; the pipeline for building Hex Tiles is proprietary. Right now, Hex Tiles can only be created in the Foursquare Unfolded pipeline. It may be open sourced in the future if the market finds more value in it, much like how Kepler GL started out proprietary before being released as open source.
Who Are Hex Tiles Built For?
Hex Tiles are built for statistical analysis. They are generally used by people in data science and by those who handle large-scale geospatial data. They are best suited to statistical analysis that is more interested in aggregated features than in the actual shape of each individual geospatial feature.
Hex Tiles are also built for analysis of temporal data. Nowadays, almost all location data has a timestamp associated with it, so we often want to analyze the change that has happened over a certain period.
Hex Tiles support time because they can encode tabular data formats.
They can encode not only average monthly temperature but also day-by-day, hour-by-hour, or minute-by-minute temperatures, which can be played back on the client side. This makes Hex Tiles functionally similar to a multidimensional data format.
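The time dimension can be pictured as a table with one row per cell and timestamp. The sketch below is not the Hex Tile format itself, which the source does not describe in detail. It only shows how such a table can be regrouped into frames for playback; the cell IDs, timestamps, and temperatures are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (cell ID, ISO 8601 timestamp, temperature in Celsius).
rows = [
    ("cell_a", "2023-01-01T01:00", 5.5),
    ("cell_a", "2023-01-01T00:00", 4.0),
    ("cell_b", "2023-01-01T00:00", 7.0),
    ("cell_b", "2023-01-01T01:00", 6.5),
]


def frames_by_time(table):
    """Regroup (cell, timestamp, value) rows into time-ordered frames.

    Each frame maps cell ID to value for one timestamp. ISO 8601
    timestamps in the same format sort correctly as plain strings.
    """
    frames = defaultdict(dict)
    for cell, timestamp, value in table:
        frames[timestamp][cell] = value
    return [(t, frames[t]) for t in sorted(frames)]


# Step through the frames in time order, as a playback control would.
for timestamp, values in frames_by_time(rows):
    print(timestamp, values)
```

Stepping through the frames in order is the playback described above. Changing the granularity from hours to days or months only changes how the timestamps are bucketed.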
Where Can Hex Tile Formats Be Used?
To visualise Hex Tile data, a special analytical visual layer is required. Right now, it can only be visualised in Unfolded Studio, a client-side application built on top of Kepler GL. Once the Hex Tile data is built, however, analysis does not depend on that layer: the data can be loaded into any kind of data science workbench without having to visualise it again.
The Future of Hex Tiles
The future with Hex Tiles could look like a world where administrative boundaries are broken down using a unified degree system for any kind of geospatial analysis. More and more types of geospatial data are being developed, and there is a need to unify them in order to unleash the full power of geospatial: being able to look at data across different data sources and boundaries, and to zoom quickly from low to high resolutions. The invention of Hex Tiles brings this ideal world a little closer to reality.
Contact Foursquare: connect.foursquare.com/mapscaping
Introducing Hex Tiles: https://foursquare.com/article/introducing-hex-tiles-our-next-gen-tiling-system/