Adler Santos is the Platform Engineering Manager at SkyWatch’s TerraStream, a data management and distribution platform for satellite operators. His goal is this: build a machine-learning algorithm to detect clouds.
Adler has a background in physics with a thesis about polarization optics, which ties in neatly with satellite imaging technology. His experience also translates to machine learning, which is perfect for detecting clouds on satellite images.
The biggest value you get from satellite images is that you can see what’s on the ground. If there is too much water vapor in front of the satellite imaging sensors, it obscures the Earth’s surface, and you might lose all value of the platform.
If you have cloudy images, you can’t see what you need to see.
So why not just look at each image, decide whether there are clouds in the image and if they are obscuring your area of interest?
Because this is a slow, manual process. Satellite imagery is being collected at an ever-increasing velocity, and if the industry is going to keep the promise of real-time analytics based on satellite images, we need programmatic access to those images.
One of the first steps in programmatically accessing satellite imagery is to decide whether or not clouds are covering the area of interest.
You need cloud detection for satellite imagery to ensure that your images are as cloud-free as possible. Only then can you extract insights or apply analytics to the data.
Because it’s a matter of convenience, mainly.
Let’s say you need a satellite image of New York on a specific date at a given time. Before you make your order, you’ll be offered an image preview of what it’s going to look like. You have a chance to eyeball the image previews before placing your order to make sure it’s clear, and there are no clouds. If you’re happy with what you see, you can order.
This is a manual process.
What happens when you need hundreds or thousands of images regularly?
You need a convenient, effective, and programmatic way to determine whether those images are cloudy.
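As a rough illustration of what "programmatic" could mean here, the sketch below assumes a cloud mask already exists for each image (1 for a cloud pixel, 0 otherwise) along with a mask marking the area of interest. The function names and the 10% threshold are hypothetical, chosen only for this example, and are not taken from TerraStream.

```python
def cloud_fraction(cloud_mask, aoi_mask):
    """Share of area-of-interest pixels flagged as cloud (0.0 to 1.0)."""
    aoi_pixels = 0
    cloudy_pixels = 0
    for cloud_row, aoi_row in zip(cloud_mask, aoi_mask):
        for is_cloud, in_aoi in zip(cloud_row, aoi_row):
            if in_aoi:
                aoi_pixels += 1
                if is_cloud:
                    cloudy_pixels += 1
    return cloudy_pixels / aoi_pixels if aoi_pixels else 0.0


def is_usable(cloud_mask, aoi_mask, max_cloud_fraction=0.1):
    """Hypothetical rule: accept the image if the AOI is mostly cloud-free."""
    return cloud_fraction(cloud_mask, aoi_mask) <= max_cloud_fraction


# Toy 3x4 image: clouds sit in the top-left corner.
cloud_mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# The area of interest is the right half of the image.
aoi_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# If the whole image were the area of interest instead:
whole_image = [[1, 1, 1, 1] for _ in range(3)]

print(is_usable(cloud_mask, aoi_mask))     # clouds miss the AOI
print(is_usable(cloud_mask, whole_image))  # 3 of 12 pixels are cloudy
```

The point of the toy data is that the same image can be fine for one area of interest and too cloudy for another, which is why the check runs per order rather than per image.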
There are too many ways you can represent or misrepresent clouds on satellite images.
Cloud detection on satellite imagery is still an open problem. Many people are looking into it, but there’s no single way of detecting clouds at the moment. Everyone has their own approach, and every model has its own features to look for. Certain aspects of data are not even considered yet.
It’s an open problem, making it difficult to solve when you’re trying to build a model that spans a wide range of satellite sources or data.
Some satellites have bands that specifically detect clouds. These are called cirrus bands.
Why can’t we just use them? It’s because not all satellites have cirrus bands, and not all satellites can detect vapor.
We need an automated mechanism to detect clouds from all satellite sources, including those that don’t have cirrus bands.
RGB. Our ultimate cloud detection goal is to build a model that we can use across different satellite images from various sources, with the least number of bands.
Every satellite has its own distinct set. For example, Sentinel-2 has thirteen bands, Landsat 8 has around nine. There is always going to be a least common denominator across all of them.
We found that the RGB bands are an excellent set that fits most satellite data. It’s what most people are familiar with, and we can apply analytics to it.
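The "least common denominator" idea can be sketched as a set intersection. The sensor names and band lists below are made up for illustration and do not describe any real satellite's band layout.

```python
# Hypothetical sensors and band names, for illustration only.
BANDS_BY_SENSOR = {
    "sensor_a": {"coastal", "blue", "green", "red", "nir", "swir1", "swir2", "cirrus"},
    "sensor_b": {"blue", "green", "red", "nir"},
    "sensor_c": {"blue", "green", "red"},
}

# Bands that every sensor provides: the only inputs a shared model can rely on.
common_bands = set.intersection(*BANDS_BY_SENSOR.values())

print(sorted(common_bands))
```

Adding a sensor can only shrink this set, which is one way to see why a model meant to span many sources ends up built on the fewest bands.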
There are many object detection algorithms and pre-trained models out there running on some optical or RGB bands. Like the ones people use to detect faces, cats and dogs, houses, and other objects.
We can’t really use those models for satellite images.
You can download collections of pre-trained models from the internet and fine-tune them to your use case. They are openly available to everyone. But the fact is, they’re trained using data that’s not representative of the top-down view that every satellite sees. Those images are conventional, everyday stuff you capture with a smartphone or a camera.
Satellite images are high-resolution, top-down views with a very different profile and presentation of the data.
Say you just want to detect tables in an image.
You can get a pre-trained model for conventional images from the internet to identify sharp edges, which are most likely to be a table.
For satellite images, sharp edges could translate to roads, crossings, rectangular-shaped buildings, or objects.
There’s quite a difference. The model from the internet doesn’t work for clouds, smoke, or snow. Or anything with a variety of shapes or sizes. You really need something satellite specific.
Gathering high-resolution satellite training data from multiple sources is not as easy as downloading millions of pictures of people, dogs, or cats.
They’re not openly or freely available on the internet. You need to source them internally, probably from partnerships or from integrations, with a set of policies around them.
Can you use an image from a commercial provider? Can you use certain data for deep learning applications and analytics?
There’s a lot to consider when preparing the training data.
Part of our roadmap is to explore alternate datasets because so many factors need to be considered. We can’t rely on pixel data only.
Suppose you still want that satellite image of New York in three days.
You can wait until the day for the satellite to capture it and then run your algorithm on the pixel data for clouds.
Or you could check the weather forecast for the New York area in advance, which would be just as good an indicator of cloudiness.
Other factors to consider could be things like altitude. Not all objects on the ground have the same altitude, and somewhere at a higher altitude could be more (or less) cloudy.
We need to explore as many factors as we can to ensure accuracy.
With accuracy, the challenge we face is not so much the model but how large the training data is.
The larger your training data gets, the longer it will take to train your model.
Performance is an issue. A single satellite image can go up to a few gigabytes in size. When you think about thousands of images, your training data can blow up to terabytes or petabytes and further scale. That could lead to an exceptionally long training time. There’s a cost element there.
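To make the scale concrete, here is a back-of-the-envelope calculation. The figures of 2 GB per image and 5,000 images are assumptions picked for the example, not numbers from the interview.

```python
GB = 10**9   # bytes in a gigabyte (decimal)
TB = 10**12  # bytes in a terabyte (decimal)


def dataset_size_tb(num_images, gb_per_image):
    """Total size of a training set in terabytes."""
    return num_images * gb_per_image * GB / TB


# Assumed values: 5,000 images at 2 GB each.
print(dataset_size_tb(5_000, 2))
```

Under those assumptions a few thousand images already reach roughly ten terabytes, before any copies made for preprocessing or augmentation.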
More data means better accuracy, but the length of training you need to do also needs to be considered carefully before you can deploy.
For accuracy, it’s the variety of resolution that has the most significant impact.
When you train a model on low-resolution images, you’ll get remarkably high accuracy, but only when you feed it low-resolution reference images. The same goes for high resolution.
Your model will only be as accurate as the data you train it on. You need a variety of resolutions. Going back to those pictures with faces, dogs, and cats: some of them are taken from far away, some from close up, and some from an angle. For the best results, you need all sorts.
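One common way to get a mix of resolutions from a single source is to downsample the images you already have. The sketch below uses simple block averaging on a single-band image stored as nested lists. It illustrates the general idea only; it is not the approach described in the interview, and the function name is hypothetical.

```python
def downsample(image, factor):
    """Average each factor x factor block into one pixel (coarser resolution)."""
    height, width = len(image), len(image[0])
    result = []
    # Ignore leftover rows/columns that do not fill a whole block.
    for r in range(0, height - height % factor, factor):
        row = []
        for c in range(0, width - width % factor, factor):
            block = [image[r + i][c + j] for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


image = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]

# Keep the original and add a copy at half the resolution.
training_variants = [image, downsample(image, 2)]
print(training_variants[1])
```

Synthetic downsampling is not the same as imagery captured by a lower-resolution sensor, so it would supplement varied source data rather than replace it.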
When we started building our initial version of the model, it took five to ten seconds per image. Not as performant as we expected it to be.
It was down to the semantic segmentation algorithm behind the scenes. We labeled every image on a per-pixel basis, which meant that if you had an image with clouds on it, it detected every pixel and told you if that was a cloud pixel or not.
Let’s apply that to a high-resolution image. It will take a while, and a lot of CPU and memory resources, to label every pixel in a single image.
We’ve now moved to object detection, even if accuracy suffers slightly. Instead of labeling per pixel, we draw a box around the clouds.
If you draw a square around a cloud, it’s not representative of the cloud’s shape. In terms of performance, though, it’s good enough for us to use it.
We’re now clocking less than a second per image.
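The trade-off between a per-pixel mask and a box can be shown on a toy example. The sketch below takes a small cloud mask, draws the tightest axis-aligned box around it, and measures how much of that box is actually cloud. It only illustrates the geometry; a real object detector predicts boxes directly rather than deriving them from a mask, and the function names are hypothetical.

```python
def bounding_box(mask):
    """Tightest axis-aligned box (row_min, col_min, row_max, col_max) around cloud pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # no cloud pixels at all
    return min(rows), min(cols), max(rows), max(cols)


def box_fill_ratio(mask, box):
    """Fraction of the box area that is really cloud."""
    r0, c0, r1, c1 = box
    area = (r1 - r0 + 1) * (c1 - c0 + 1)
    cloudy = sum(mask[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
    return cloudy / area


# Per-pixel labels: 1 = cloud, 0 = clear.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

box = bounding_box(mask)
print(box)
print(box_fill_ratio(mask, box))
```

Here the box covers nine pixels while only seven are cloud, which is the kind of small accuracy loss traded for speed.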
And so, we’ll also be rotating those bounding boxes. If there’s a diagonal cloud and you draw a bounding box, much of that box is likely not cloud. We’re exploring rotating these bounding boxes at various angles, such as 30, 45, or 60 degrees, to see if we can fit clouds better using those rotated boxes.
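A minimal sketch of why rotation helps, assuming the cloud is given as a list of pixel-centre coordinates: for each candidate angle, project the points onto the rotated axes and measure the enclosing rectangle. The candidate angles come from the text; the function names and the toy cloud are hypothetical, and the area is measured between pixel centres rather than pixel edges.

```python
import math


def rotated_box_area(points, angle_deg):
    """Area of the tightest rectangle, rotated by angle_deg, that encloses the points."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Coordinates of each point along the rotated box's two axes.
    u = [x * cos_t + y * sin_t for x, y in points]
    v = [-x * sin_t + y * cos_t for x, y in points]
    return (max(u) - min(u)) * (max(v) - min(v))


def best_angle(points, candidates=(0, 30, 45, 60)):
    """Candidate angle whose enclosing box is smallest."""
    return min(candidates, key=lambda angle: rotated_box_area(points, angle))


# A diagonal band of cloud pixels in a 6x6 image.
cloud = [(x, y) for x in range(6) for y in range(6) if abs(x - y) <= 1]

for angle in (0, 30, 45, 60):
    print(angle, round(rotated_box_area(cloud, angle), 2))
print(best_angle(cloud))
```

For this diagonal band the unrotated box is far larger than the one rotated to match the band, so much less of the rotated box is empty sky or ground.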
Once we have our model, we can detect anything with a top-to-bottom view.
One extension of these models is that it’ll be easy to detect other objects, not only clouds. Clouds are a bigger challenge than other objects. Buildings, for example, are easier to find on satellite imagery because they have well-defined shapes. They’re almost always rectangular. Clouds are dispersed with irregular shapes. If we can pull off a cloud detection model, it will be easier to build models that detect other objects, such as cars, planes, buildings, and so on.
At the moment, there isn’t an enormous demand for satellite imagery. Suppose you’re a real estate developer or a city planner. In that case, you can get by with weekly or monthly progress images of your construction site. When you need one and it’s cloudy, you’ll just try another day or week.
People will soon need hourly and daily satellite images as the industry demands more frequent data. Will you wait another day or another week for your perfect cloud-free day? It’s inconvenient, and you’ll be looking for alternative solutions.
Synthetic Aperture Radar (SAR) is a promising technology for satellite data and Earth observation.
SAR data is independent of weather and daylight conditions: it can penetrate through clouds. It doesn’t discriminate between night or day, and that’s a win for everyone. Right now, we have to schedule a satellite passing through our areas of interest during daylight hours so we can have a clear capture. SAR increases the chance of acquiring clear and usable satellite images of the area.
As a bonus, SAR can detect and quantify motions of objects on both land and sea. This motion detection could work well with your algorithm, since you’ve already trained it to detect objects from the top down.
For now, optical imaging is vastly more prominent than radar imaging. For every 500 satellites that can offer optical imaging, there are only 50 satellites that offer standard radar imaging. Even fewer offer SAR.
There’s still a long way to go for SAR, so in the meantime, cloud detection remains especially important.
Once we can detect clouds and fuse datasets together, some really big wins will happen.
Real estate and city planning. Economic and financial use cases. We’ll be able to measure oil tanks and roads, and count cars, airplanes, ships, and so on. There are humanitarian use cases, such as disaster response and detecting wildfires like California’s recent one.
The number of satellites launched to perform Earth observation will grow exponentially in the next few years.
Materials and technologies are getting cheaper. The cost of launching a payload up to the atmosphere is getting cheaper. Instead of waiting for another image or another satellite to pass by your area of interest, we could have near real-time access to satellite data.
This tremendously speeds up the value chain and the value creation that we can get from Earth observation.
Hopefully, not everyone has to build their own models.
We’re getting there pretty fast. The performance or the accuracy of your model depends on the size of your training data. The number of satellites is growing, so we’ll witness growth in Earth observation data too. Eventually, we’ll have even faster and more accurate models.
All this may seem trivial today for most people, until they need satellite imagery. It’s always going to include a cloud mask or other objects in that image by default. It’s an exciting and promising time to be in Earth observation.
Did you know that RGB bands are the common denominator when it comes to satellite imaging platforms? Or that people are still going through these images manually to detect clouds? Let me know if you’re just as curious about SAR as I am, and I’ll do my best to have an episode to demystify it.