Urban Estimation, Earth Observation and Deep Learning

Elizabeth Duffy is a project scientist for the DECIDER Project and a product marketing specialist for UP42.

Elizabeth has seven years of combined education and professional experience developing, implementing, and promoting geospatial tech for private and public applications.

She’s gotten her hands dirty with a bit of everything geospatial-related. Most recently, she’s been focusing on automating insight extraction with machine learning and interferometrics. She recently published work on the latter in a peer-reviewed journal.

WHAT ARE INTERFEROMETRICS?

Interferometrics, or InSAR, means taking stacks of radar data over a certain timescale and seeing how slight changes in elevation occur. It’s used for subsidence analysis and for understanding when there’s a risk of karstification or a possible landslide. It’s about looking at where there are minor changes in the movement of the Earth’s surface.
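To make that concrete, here is a minimal, illustrative sketch of the basic InSAR relationship: converting an unwrapped phase difference into line-of-sight displacement and fitting a trend over a stack of acquisitions. The wavelength, the synthetic arrays, and the sign convention are assumptions for illustration, not the DECIDER processing chain.

```python
import numpy as np

# Illustrative only: convert an unwrapped interferometric phase difference
# into line-of-sight displacement. Wavelength is roughly that of a C-band
# sensor such as Sentinel-1 (~5.6 cm); sign conventions vary by processor.
WAVELENGTH_M = 0.056

def phase_to_los_displacement(unwrapped_phase_rad: np.ndarray) -> np.ndarray:
    """Displacement (metres) along the line of sight for a repeat-pass pair."""
    return (WAVELENGTH_M / (4 * np.pi)) * unwrapped_phase_rad

# Synthetic stand-in for a stack of 30 unwrapped interferograms. A phase
# change of one radian corresponds to only a few millimetres of motion,
# which is why stacks over time are needed to separate a slow subsidence
# trend from noise.
phase_stack = np.random.normal(0.0, 0.5, size=(30, 256, 256))
displacement_stack = phase_to_los_displacement(phase_stack)

# Fit a per-pixel linear trend over the stack (slope = motion per epoch).
trend_m_per_epoch = np.polyfit(
    np.arange(30), displacement_stack.reshape(30, -1), deg=1
)[0].reshape(256, 256)
```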

WHAT IS THE DECIDER PROJECT?

It’s a consortium of eight different universities across Germany and several partners based in Ho Chi Minh City, Vietnam.

They gather the most current data and analytics to create a decision support tool for flood risk in Ho Chi Minh City, where flooding is a significant problem for many reasons — from sea level rise to increasing precipitation to groundwater extraction due to urbanization.

CAN YOU MAKE SIGNIFICANT DECISIONS OR CALCULATIONS ABOUT WHAT MIGHT HAPPEN PURELY BASED ON THESE SLIGHT CHANGES IN THE EARTH’S SURFACE OVER TIME?

Or are you adding something else to the mix to help decide where infrastructure should be built or moved? What kinds of data layers are you using apart from SAR data?

Yes, we add other layers.

For example, other people in the research group are looking at geospatial data and financial metrics or hydrologic data.

My contribution to the DECIDER Project is to look at urban structure types to determine, first and foremost, where buildings are and where they are not.

This is a significant issue for rapidly developing cities — their governments lack the cadastral data that many places, like the United States or Europe, have readily available. It’s important to discern where these buildings are and what types they are.

The where component is the most important. We can see rapid urbanization happening in areas where we also assess subsidence. We flag these areas and evaluate how this trend contributes to future risks.

There are also illegal settlements along the rivers which governments turn a blind eye to — these may be the situations of greatest need. We need to assess where these buildings are because a lot of the time they’re along rivers where they’re prone to flooding.

WHY IS THIS A COMPLICATED PROBLEM TO SOLVE? CAN’T WE ALREADY DO THIS WITH SATELLITE DATA?

This is a complex problem.

Ho Chi Minh City is home to 9 million people and growing.

They need geospatial data that can scale, so they can understand the situation at hand and discern the right solutions, and so the government can start implementing new infrastructure immediately and continue to monitor and adapt to changes.

When you have the right data, you can apply the right analytics to get the right solution.

WHAT KIND OF URBANIZATION IS HAPPENING THERE? WHAT BUILDINGS ARE BEING BUILT? IS THAT IMPORTANT?

Absolutely.

When you look at the types of buildings being built, you can use that to estimate how many people live in that building.

With the help of the other partners we work with, we also collect survey data, like financial metrics, that we assess:

What is the quality of life? What is the expense of this building being damaged? How many people are likely to be affected by a flood? How much would they be willing to spend on protective measures — putting sandbags outside their buildings or shops or installing rain barrels?

HAVE YOU IDENTIFIED NOT JUST BUILDINGS BUT THE TYPES OF BUILDINGS?

Yes.

We outlined two different methods of attacking this problem.

First, the machine learning approach with various derivatives, like PCA, a topographic position index, water masking, and edge detection of where the buildings could be delineated. Plus an NDVI, but that’s relatively standard.

These components had to go in before even getting started with the machine learning process of identifying where there was a building and where there wasn’t.

And that’s not even counting the actual categorization we do.

WHY DO YOU NEED TO CREATE THESE DERIVATIVES TO RUN A MACHINE LEARNING PROCESS?

The goal is to identify where the buildings are and where they aren’t.

You feed these different types of datasets, which help the computer make the most educated guess about where a building is or isn’t, into your machine learning algorithm.

It can use NDVI to tell you where the vegetation is and where it isn’t. But you might also end up including things you don’t want to include or exclude things you don’t want to exclude.

Perhaps a building with a green roof, a building with biomass or a tar-type roof, or even terracotta clay — the same material as soil on the ground.

We feed these different components into the machine learning algorithms to get the best quality of results.
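As a rough illustration of that derivative-stacking idea, here is a minimal sketch that computes NDVI, an edge layer, and a crude water mask, then feeds them to a pixel classifier for building versus non-building. The file names, band order, crude water mask, and the choice of a random forest are assumptions for illustration, not the project’s actual workflow.

```python
import numpy as np
import rasterio
from skimage.filters import sobel
from sklearn.ensemble import RandomForestClassifier

# Hypothetical file name; any multispectral raster with red and NIR bands works.
with rasterio.open("scene_multispectral.tif") as src:
    red = src.read(3).astype("float32")   # band order is an assumption
    nir = src.read(4).astype("float32")

ndvi = (nir - red) / (nir + red + 1e-6)       # vegetation derivative
edges = sobel(red)                            # edge derivative hinting at building outlines
water_mask = (ndvi < 0.0).astype("float32")   # crude stand-in for a proper water mask

# Stack the derivatives into a per-pixel feature matrix.
features = np.stack([ndvi, edges, water_mask], axis=-1).reshape(-1, 3)

# Hypothetical labels from manually digitised building / non-building samples,
# assumed to be in the same pixel order as the features.
labels = np.load("training_labels.npy")

clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
clf.fit(features, labels.ravel())
building_probability = clf.predict_proba(features)[:, 1].reshape(ndvi.shape)
```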

IT DOESN’T LOOK LIKE MACHINE LEARNING WAS YOUR ANSWER. WAS THERE SOMETHING ELSE THAT YOU USED? OR DID YOU GO THROUGH THESE STEPS TO SOLVE THIS PROBLEM?

We also tested what our opportunities to leverage deep learning were and where the most potential lies.

We have these two datasets that we’re working with — high-resolution imagery and a normalized DSM — and we derived these components with machine learning.

With the deep learning methods, you take the optical imagery, and you can build up a training database doing the same thing.

Sure, it takes time to build up a repository, but we have a well-performing urban estimation algorithm — produced by one of the partners of UP42. You can go to UP42 and analyze 75 square kilometers in four hours.

I just tested it today.

Imagine the time it takes to build a framework to estimate where buildings are or are not. With machine learning, you save three months of somebody’s time: derivatives are prepared and adequately calibrated and executed, and a machine learning algorithm is built (e.g., eCognition, ArcGIS Pro capabilities, etc.). The manual work is cut down significantly.

You build a repository, feed the data into the algorithm, and four hours later, you have a reasonable answer to start off your analysis.

WHAT DOES BUILDING UP A REPOSITORY LOOK LIKE? IS IT DIGITIZING DATA LAYERS; HIGH-RESOLUTION AERIAL IMAGERY BEING MANUALLY TRACED AROUND BUILDINGS BY HUMANS?

That’s generally how deep learning is done. You set up a framework with TensorFlow or PyTorch, and somebody manually labels the data. After training reaches a certain level of performance on the labeled data and you do your analysis, you can go back and pick up where you left off to continue the training and refine the data.
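As a rough illustration of that label-train-refine loop, here is a minimal PyTorch-style sketch. The tiny network, the tensors standing in for labeled image chips, and the checkpoint file name are placeholders, not the UP42 or DECIDER model.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Tiny stand-in network; real building-footprint models are usually U-Net-like.
model = nn.Sequential(
    nn.Conv2d(4, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),           # one output channel: building vs. not
)

# Hypothetical tensors standing in for manually labeled chips
# (optical bands plus an nDSM as a fourth channel) and their building masks.
chips = torch.randn(64, 4, 128, 128)
masks = torch.randint(0, 2, (64, 1, 128, 128)).float()
loader = DataLoader(TensorDataset(chips, masks), batch_size=8, shuffle=True)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):                 # training continues as more labels arrive
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# Checkpointing is what lets you pick up where you left off after new labeling rounds.
torch.save(model.state_dict(), "building_model.pt")
```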

It creates a positive feedback loop — the more that you train it, the better it gets. You can also make it region-specific. For example, within Southeast Asia, buildings can be similarly structured.

Southeast Asia is one of the most challenging locations to do building detection. Buildings are highly clustered, unlike in the US or in Australia, where everything is nice and neat — grids spaced equally.

CAN I JUST TAKE THIS AND USE IT FOR OTHER GEOGRAPHIC LOCATIONS WITHIN VIETNAM OR A SIMILAR GEOGRAPHIC AREA?

Sure. That’s the entire idea.

Especially for urban estimation, where you’re just answering the questions:

Is this a building? Is this not a building?

It can definitely be replicated for urban environments similar in geography and infrastructure.

WHAT KIND OF GROUND-TRUTHING DO YOU DO TO VALIDATE THE MODEL AND THE RESULTS?

For geospatial data, that could look like taking a normalized DSM — subtracting a DTM from a DSM — to get the heights of buildings. Plus, using LiDAR data or surface models generated from high-resolution imagery gives me a certain level of accuracy.

But I still need to go on-site to measure rooftop information, actual height information just to be sure that the answers I’m getting from my normalized DSM match what we can see in reality.

That’s the physical validation.
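For reference, the normalized-DSM step she describes can be sketched in a few lines. The file names are placeholders, and the rasters are assumed to be co-registered on the same grid.

```python
import rasterio

# Hypothetical, co-registered elevation rasters on the same grid.
with rasterio.open("dsm.tif") as dsm_src, rasterio.open("dtm.tif") as dtm_src:
    dsm = dsm_src.read(1).astype("float32")   # surface heights (tops of buildings, trees)
    dtm = dtm_src.read(1).astype("float32")   # bare-earth terrain heights
    profile = dsm_src.profile

# Normalized DSM: object heights above ground, i.e. approximate building heights.
ndsm = dsm - dtm
ndsm[ndsm < 0] = 0          # clamp small negative artefacts from interpolation noise

profile.update(dtype="float32", count=1)
with rasterio.open("ndsm.tif", "w", **profile) as dst:
    dst.write(ndsm, 1)
```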

Then, there is the validation of financial metrics after the survey campaigns. We run these broad surveys that collect personal information: How much did your roof cost? When you had repairs to your roof, how much did that cost? What is your income? Are you running a business from home? How many people live in your house?

We split the survey results into the portion used to draw conclusions about what we think these buildings actually match up to, and the portion we’re going to use for the validation tests.

It’s essential to make sure that we split up that data correctly. We’re not validating with the same information we used to form our hypotheses.

It’s a general issue of scientific rigor, more than anything else, of ensuring that we split up the data sets properly.
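A minimal sketch of that kind of hold-out split, assuming a hypothetical survey table with made-up column names:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical survey table; the file and column names are made up for illustration.
surveys = pd.read_csv("household_surveys.csv")

# Hold back a portion of the responses purely for validation, so the same
# households are never used both to form the building-type hypothesis and
# to test it afterwards.
hypothesis_set, validation_set = train_test_split(
    surveys, test_size=0.3, random_state=42, stratify=surveys["building_type"]
)
```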

SO YOU USE IN SITU MEASUREMENTS, FIELD MEASUREMENTS, SURVEYS. YOU GO OUT TO OBJECTS YOU’VE LOCATED AND SAY, “THIS IS AN OBJECT OF A CERTAIN SIZE, SHAPE, OR ROOF TYPE”. PLUS, WITH THE OTHER SURVEYS, YOU CAN MAKE ASSUMPTIONS, “BUILDINGS THAT LOOK LIKE THIS SHOULD BE GIVEN THESE ATTRIBUTES.”

Yes — but not necessarily in that order.

We derived the buildings first, just to get an idea of the overall distribution of different buildings. Then, we did the survey and a building type analysis.

WHAT DO YOU IMAGINE THE RELATIONSHIP BETWEEN IN SITU MEASUREMENTS AND REMOTE SENSING WORK WILL LOOK LIKE IN THE FUTURE? DO YOU THINK WE WILL ALWAYS NEED TO DO THESE IN SITU MEASUREMENTS? OR WILL WE BE ABLE TO REMOTELY SENSE EVERYTHING?

In the past, we’ve relied heavily on in situ measurements and manual labor. They should still be leveraged, but for little more than calibration and validation.

It shouldn’t be where our answers lie; it should be for us to correct where we’ve already determined some answers as we calibrate and validate the insights we’re getting.

In situ measurements and manual labor just don’t scale. Therefore, it makes less and less sense to use these on-site measurements for scaling.

HOW FAR ALONG IS THE PROJECT?

Our contribution to it will end in October 2021. After that, the entire DECIDER Project will extend for another two years to develop the data we extracted and interpreted.

They import it into a decision support tool — a web mapping platform anybody can access. So not only governmental stakeholders, or small businesses and corporations that have an interest, but everyday people as well so they can have the data they need to make proper decisions about flood risk in their community.

WHAT DECISIONS DO YOU THINK PEOPLE WILL MAKE WITH THIS?

It’s one contribution that will let people make better real estate decisions.

For governments, it could be pertinent for zoning; for businesses, it could help them understand where their companies are exposed, so they understand future risks and invest in their infrastructure to make it more resilient.

Maybe they want to build rainwater catchment systems, a district-level community rain garden, or something in place for when there’s an excess of rainfall, so the water has somewhere to flow that, hopefully, isn’t their small business.

Questions like this are complex and have down-the-line implications for the decisions people make today. It’s eye-opening to understand how little we know about the entire socio-economic impact of Earth observation data and services, let alone how deep learning can contribute to our socio-economic well-being.

The best estimates people come out with are $100 to $150 billion over the next 10 years. That’s not even understanding what the full capabilities of deep learning will be. By then, satellite data will advance.

There’s potential out there; I’m excited that we’re helping people answer these questions. I hope people take on Earth observation data more and more.

THESE ARE BIG NUMBERS, A LOT OF DATA. MODELS RUN CONTINUOUSLY; IMPLEMENTATION TAKES TIME. IS THIS EXCITING OR OVERWHELMING FOR YOU?

A bit of both.

A recent global digital elevation model tripled the estimate of global vulnerability to sea-level rise and coastal flooding.

We already knew the problem of sea-level rise was going to be a big one. Now we understand that many of these buildings were creating a level of backscatter that was influencing what we thought the elevation was.

So it made these cities look like they were higher than they were.

Through machine learning, we have determined they’re much lower.

The amount we’ll have to spend in the future on infrastructure changes is daunting.

The excitement comes from understanding that the more time we invest in deep learning, the better it’s going to get. When we make the right decisions about where to invest in our infrastructure, it can save money in the long run.

Governments sometimes conclude, too quickly, that they need to look at things like dikes and levees, and they use in situ measurements that don’t scale to justify these projects against risks like flooding.

In New Orleans in the aftermath of Hurricane Katrina, the levees did more damage than they did good after they broke.

If we use Earth observation data to help us make the right decisions for infrastructure, then we can keep investing in deep learning, knowing that the more we invest, the better it will get, which will result in even better answers. That will be a more exciting situation.

Even though it is daunting, I see it as a David and Goliath situation: this is a Goliath of a problem, but Earth observation data, especially when paired with deep learning, is the slingshot that will facilitate David’s success.

IS THERE EVER A TIME WHEN YOU FEEL LIKE YOU DON’T KNOW IF THESE PEOPLE AND GOVERNMENTS REALLY WANT THE ANSWER?

That can be the case sometimes, especially if there are private interests.

Too much groundwater extraction causes a lot of subsidence. We’ve seen it in Ho Chi Minh City, Beijing and even California quite extensively. There can be vested interests in not discovering how rapidly that subsidence is occurring.

People see what’s happening in their neighborhoods.

Things catch up with you.

Most governments do want to find the right answers. It’s just that not everybody recognizes these answers are readily achievable and scalable in the information and technology age we live in.

Right now, a lot of these technologies are very new. For example, the Pleiades data set wasn’t around a full decade ago. So it’s catching up a bit in raising awareness of what is achievable.

WHAT IS YOUR ROLE AS A SCIENTIST? TO DELIVER THE DATA TO DO THE RESEARCH? OR DO YOU SEE IT AS BEING SOMETHING MORE?

My primary goal is to deliver and explain the data and how our answers were achieved so that people understand how we’ve come to these conclusions. Not only so that they can try to replicate them and improve on them but also so that we demystify how the answers are actually drawn.

We want people to understand where there might be biases or flaws. For the DECIDER Project, we conducted surveys in some places and not others. There could be a bias because we focused on areas we found to be more prone to flooding rather than those that weren’t, simply because we knew that was the highest priority for our research purpose.

There is definitely also the component of conveying what that information actually tells us, and of advocating for what the change should be.

We should take our best knowledge from our wheelhouse and invite private-sector people and government bodies, who are more attuned to the actual budgets people have and what is financially reasonable for them, to make those decisions on what’s best for their communities.

DO YOU HAVE THE FEELING THAT PEOPLE WILL THINK MACHINE LEARNING IS A CATCH-ALL “WE’LL JUST DO SOME MACHINE LEARNING, DEEP LEARNING, AND THAT WILL SOLVE OUR PROBLEM?”

In the scientific community, it can be the opposite for deep learning. People become scared because they don’t know what the computer actually does, and they can’t understand its decisions within the system.

They can’t explain where the error occurs or why certain things are detected better than others. Maybe the training data itself is biased.

In the scientific community, people can be afraid of deep learning and not see its potential.

As applying these datasets is demystified within the private sector, they adopt new things a lot more readily. However, what we see at UP42 as being the larger issue is that people who are interested in exploring these things, such as developers, don’t have the resources to set up these frameworks.

It takes a lot of power to run deep learning algorithms. Hopefully, with more and more platforms, like UP42, popping up, we see greater recognition of the potential for deep learning.

SOMETIMES, THE TECHNICAL REQUIREMENTS JUST STOP PEOPLE FROM INNOVATING

For complex situations, such as the one in Ho Chi Minh City, that definitely can be the case.

But there are some opportunities and use cases, such as vegetation management. Unfortunately, it can be an enormous problem that goes unseen.

Utility companies do in situ measurements — people fly helicopters over kilometers upon kilometers of power lines to assess where there’s maybe some vegetation growing nearby. This is a significant issue — in 2017, 12 wildfires in California were caused directly by utility equipment, with dry vegetation interfering with electricity and causing sparks and wildfires.

And all people need to do is apply deep learning or machine learning to assess with optical imagery how close this vegetation is to a power line. Then, with the normalized DSM, they can see the height of a tree.

How close is it to a utility line?

And you have your answer. But, again, it’s a straightforward approach; there are many opportunities just like this example.
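As a sketch of that vegetation-to-power-line check, assuming hypothetical tree-crown and power-line layers and a UTM projection so distances come out in metres:

```python
import geopandas as gpd

# Hypothetical layers: tree crowns detected from imagery (with an nDSM-derived
# height attribute) and the utility's power-line centrelines.
trees = gpd.read_file("tree_crowns.gpkg")        # columns: geometry, height_m
lines = gpd.read_file("power_lines.gpkg")

# Reproject to a metric CRS; the EPSG code (UTM zone 48N) is an assumption.
trees = trees.to_crs(epsg=32648)
lines = lines.to_crs(epsg=32648)

# Distance from each tall tree to the nearest power line.
tall = trees[trees["height_m"] > 5].copy()
tall["dist_to_line_m"] = tall.geometry.apply(lambda g: lines.distance(g).min())

# Flag trees close enough to warrant a maintenance visit (threshold is illustrative).
at_risk = tall[tall["dist_to_line_m"] < 10]
print(at_risk[["height_m", "dist_to_line_m"]])
```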

We do see developers working with utility companies to help protect the environment and protect their assets and reputation.

This is a one-and-done type situation. It doesn’t have to be complex — no field surveys, no going to individual households and asking how much money they’re making, and so on.

OH YES. THE “ALL YOU HAVE TO DO IS A QUICK X OR Y.” IT’S EASY TO SAY IT WHEN YOU KNOW WHAT YOU’RE DOING.

True. Developers and people I hang out with are highly familiar with the internet and web mapping.

The people who will benefit the most from this technology are the people working at the utility company or, in places like Ho Chi Minh City, the people running shops at a very low-lying elevation.

These are the people most affected. They must educate themselves — and they do, as we’ve seen people studying maps, analysis, and risk mitigation during COVID.

About the Author
I'm Daniel O'Donohue, the voice and creator behind The MapScaping Podcast (a podcast for the geospatial community). With a professional background as a geospatial specialist, I've spent years harnessing the power of spatial data to unravel the complexities of our world, one layer at a time.