Removing the Complexity of Deep Learning
The last few years have seen a marked increase in the use of deep learning within the remote sensing community. As this emerging technology has started to catch on, professionals in the geospatial industry often ask, "What is deep learning and how can I use it for my application?" This article demystifies some of the unknowns around deep learning while showing examples of how we have applied it to remote sensing imagery at Harris Geospatial Solutions.
(Left) Orthophoto of a neighborhood in Port-au-Prince, Haiti, after the January 2010 earthquake;
(Right) Class activation map created in ENVI, showing areas of rubble identified by a deep-learning model.
The concept of deep learning has been around for many years, but only recently have people begun to explore its full potential for solving geospatial problems with imagery. When we consider how deep learning applies to images, we often think of object recognition, such as identifying faces or vehicles in digital photographs. While that is still a popular use, there is a growing need to identify and categorize objects over large geographic areas. An internet search for "deep learning in remote sensing" reveals some of the applications where it has been used to date, namely image classification, vegetation mapping, and urban planning. So what is deep learning and why is there so much hype around it?
Deep learning is a sophisticated form of machine learning that enables a system to automatically discover representations in data. It can continue to improve its predictions as it is trained, without extensive guidance. It learns patterns by passing data through multiple layers in a neural network in order to draw conclusions, loosely similar to how the brain processes information. When applied to remote sensing imagery, it can be used to find features such as vehicles, utility structures, or road markings. At a larger scale, it can be used to find specific land-use patterns, road networks, and clouds in optical imagery. The result is a special type of classification image called a class activation map, which indicates the probability that each pixel matches a given feature. The following figure shows an example that identifies vehicles in a high-resolution orthophoto.
Compared to traditional supervised classification methods such as support vector machines (SVMs), deep learning can extract more robust representations of features, which improves classification accuracy. Deep-learning algorithms are well suited to extracting features from a complex background, regardless of their shape, color, size, and other attributes.
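To make the idea of a class activation map concrete, here is a minimal sketch of how per-pixel probabilities might be turned into a yes/no feature mask. It is not how ENVI does it; the function name `threshold_activation_map`, the 0.5 cutoff, and the tiny 4 x 4 grid of values are all assumptions made for illustration, and the code uses only the Python standard library so that it runs anywhere.

```python
def threshold_activation_map(cam, threshold=0.5):
    """Return a binary mask: 1 where the probability is >= threshold, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in cam]


# Hypothetical 4 x 4 class activation map: each value is the probability
# (0 to 1) that the pixel belongs to the feature of interest.
cam = [
    [0.05, 0.10, 0.20, 0.10],
    [0.10, 0.85, 0.90, 0.15],
    [0.05, 0.80, 0.95, 0.20],
    [0.00, 0.10, 0.30, 0.10],
]

mask = threshold_activation_map(cam, threshold=0.5)

# Count the pixels that were assigned to the feature.
feature_pixels = sum(sum(row) for row in mask)
```

Raising the threshold keeps only the pixels the model is most confident about, while lowering it captures more of the feature at the cost of more false positives.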
As with any classification problem that involves training a neural network, users must provide samples of the features they are interested in, a process referred to as labeling. As the amount of data from small satellites and drones continues to grow rapidly, labeling features can become costly and time-consuming. Once the labels have been created, how are they input to a deep-learning model so that it can be trained to identify the same features in other images? Again, an internet search on this subject reveals a steep learning curve with lots of complex diagrams and unfamiliar terms. Researchers sometimes develop their own algorithms and architectures, but mostly they use open-source deep-learning libraries, which require extensive programming in Python or C++.
The ENVI Deep Learning module, available in May of 2019, is designed specifically to overcome these limitations and make deep learning more widely available to the mainstream remote sensing community. It leverages the widely used and proven TensorFlow™ deep-learning technology without requiring users to write a single line of API code. Instead, a simple user interface guides users through the process of creating a labeled dataset, training a model, and creating a class activation map of the result:
Here is an example of how deep learning can be used in remote sensing: Suppose you want to identify all of the rows of agricultural crops in an image. In many areas of the world, crops are planted along curved rows, which makes it difficult to extract the rows automatically using traditional classification methods. A deep-learning model is well suited to this task. However, labeling rows by hand could take hours just to provide training samples for the model. To show how ENVI's analytics can be used to solve this problem, two small spatial subsets were selected for training from an image of an agricultural field that was 4200 x 6400 pixels in size. In each subset, the ENVI Region of Interest (ROI) Tool was used to draw polylines along the crop rows. This labeling process took only a few minutes. The labeled examples were used to train a deep-learning model to identify the remaining crop rows in the full image (shown with blue lines below).
Using only a handful of labeled examples, the model learned to identify all of the crop rows. Training was a one-time process. The trained model can now be applied to other, similar images.
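For readers curious about what happens to polyline labels like those in the crop-row example, the sketch below shows one common way to turn a drawn polyline into per-pixel training labels: burning each segment into a mask with Bresenham's line algorithm. This is an illustrative assumption, not a description of ENVI's internals; the function names `draw_line` and `rasterize_polyline` and the small 5 x 6 subset are hypothetical.

```python
def draw_line(mask, r0, c0, r1, c1):
    """Mark the pixels on the segment (r0, c0)-(r1, c1) with Bresenham's algorithm."""
    dr, dc = abs(r1 - r0), abs(c1 - c0)
    step_r = 1 if r0 < r1 else -1
    step_c = 1 if c0 < c1 else -1
    err = dc - dr
    r, c = r0, c0
    while True:
        mask[r][c] = 1
        if (r, c) == (r1, c1):
            break
        e2 = 2 * err
        if e2 > -dr:   # step along the column axis
            err -= dr
            c += step_c
        if e2 < dc:    # step along the row axis
            err += dc
            r += step_r


def rasterize_polyline(vertices, rows, cols):
    """Burn a polyline, given as (row, col) vertices, into a rows x cols label mask."""
    mask = [[0] * cols for _ in range(rows)]
    for (r0, c0), (r1, c1) in zip(vertices, vertices[1:]):
        draw_line(mask, r0, c0, r1, c1)
    return mask


# Hypothetical polyline traced along a gently curving crop row.
row_vertices = [(0, 0), (1, 2), (1, 4), (3, 5)]
label_mask = rasterize_polyline(row_vertices, rows=5, cols=6)

for row in label_mask:
    print("".join("#" if v else "." for v in row))
```

Pixels set to 1 serve as positive examples of the feature and everything else is treated as background, which is why a few quickly drawn polylines can yield many labeled pixels.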
ENVI's preprocessing tools augment the deep-learning process. Preprocessing steps such as calibration, stretching, and color space transformation produce the consistent data that deep-learning models need. Spectral classification and target detection tools can be used to create labeled datasets without the need to hand-draw ROIs on images. Some of our engineers experimented with using building footprints from OpenStreetMap® as input to a deep-learning model for rooftop extraction in a large urban scene. The following image shows the resulting class activation map overlaid on an orthophoto:
The Deep Learning module was designed to hide the complexity of convolutional neural networks from image analysts who regularly use ENVI. Yet it allows users who want more control over the training process to fine-tune parameters to achieve the best accuracy. Users can also take advantage of the ENVITask API framework and ENVI Modeler to customize deep-learning workflows. The image-driven insights that ENVI Deep Learning provides will help professionals solve geospatial problems that can't be solved with GIS data alone.