Unlocking the Potential of Deep Learning in the ArcGIS System

April 18, 2023

Mohamed Ahmed

How can you unlock the potential of big data? With the deep learning tools in ArcGIS, you can build algorithms that can make predictions and find patterns in massive amounts of data. Learn best practices in this cutting-edge field of AI to ensure your algorithms are efficient and produce accurate results.

Artificial Intelligence (AI), Machine Learning, and Deep Learning are transforming the way we live and work. With their capacity to process vast amounts of data and perform tasks faster and sometimes more accurately than humans, these cutting-edge technologies are creating new possibilities and predicting upcoming trends in key fields including healthcare, finance, technology, and geography. Although AI, machine learning, and deep learning are related and can overlap in some ways, there are key differences. In this blog post, I will define these terms to provide a clearer understanding of them, with a particular focus on deep learning and best practices to unlock its full potential in the ArcGIS system.

Differences between AI, Machine Learning, and Deep Learning

Figure 1 illustrates how the three concepts relate to one another. AI is a broad field that encompasses various approaches to making computers perform tasks that normally require human intelligence. Machine learning is a subset of AI that involves training computers to learn from data and find patterns. Deep learning is a type of machine learning that uses artificial neural networks to learn and make predictions, particularly for complex tasks like image and speech recognition.

AI, machine learning and deep learning can be visualized as a set of concentric circles, with AI in the outermost circle and deep learning in the innermost circle.

Figure 1: Intersection between AI, Machine Learning, and Deep Learning.

Why Does Deep Learning Matter and How Does it Work?

Deep learning has revolutionized many aspects of our society over the last decade, from identifying and classifying objects in images and allowing fluent conversations with Siri and Alexa, to being a key technology behind autonomous cars, enabling them to recognize a stop sign or to distinguish a pedestrian from a lamppost. Deep learning uses artificial neural networks, which are computer algorithms inspired by the structure and function of the human brain. A neural network, illustrated in Figure 2, is composed of multiple layers of interconnected nodes, each of which performs a simple calculation on the data it receives. The outputs of one layer serve as inputs to the next, allowing the network to learn increasingly complex relationships between the inputs and the outputs. This is why it is called "deep" learning: the network is able to learn deep representations of the data and make predictions based on them.

The input layer of a neural network consists of a set of nodes, each of which is connected to nodes in the next layer. Sets of nodes within the neural network between the input and output layers are called hidden layers.

Figure 2: Artificial neural networks consist of layers of interconnected nodes.

Traditional neural networks contain only two or three hidden layers, while deeper and more complex networks can have 150 or more. Each additional hidden layer increases the complexity of the features the network can learn. For example, in an image recognition neural network, the first hidden layer could learn to detect edges, while the last could learn to detect the specific complex shapes of the object you are trying to recognize.
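To make the layer-by-layer idea concrete, here is a minimal sketch of a forward pass through a small network written in plain Python. It is an illustration only, not how ArcGIS implements its models: the layer sizes, the random weights, and the helper names (dense_layer, init_layer, forward) are all assumptions chosen for the example, and a real network would learn its weights during training.

```python
import math
import random


def relu(z):
    """Activation used in the hidden layers: passes positives, zeroes negatives."""
    return max(0.0, z)


def sigmoid(z):
    """Activation used in the output layer: squashes a value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases, activation):
    """Each node computes a weighted sum of its inputs plus a bias,
    then applies the activation function to the result."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def init_layer(n_inputs, n_nodes, rng):
    """Hypothetical helper: random weights and zero biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_nodes)]
    biases = [0.0] * n_nodes
    return weights, biases


def forward(inputs, layers):
    """The outputs of one layer serve as the inputs to the next."""
    activations = inputs
    for weights, biases, activation in layers:
        activations = dense_layer(activations, weights, biases, activation)
    return activations


# Assumed architecture: 4 inputs -> 5 hidden nodes -> 3 hidden nodes -> 1 output.
rng = random.Random(0)
sizes = [4, 5, 3, 1]
layers = []
for i in range(len(sizes) - 1):
    weights, biases = init_layer(sizes[i], sizes[i + 1], rng)
    is_output_layer = i == len(sizes) - 2
    layers.append((weights, biases, sigmoid if is_output_layer else relu))

output = forward([0.2, 0.7, 0.1, 0.9], layers)
print(output)  # a single value between 0 and 1
```

Adding more entries to sizes adds more hidden layers, which is all that "deeper" means structurally; the cost is more weights to train and more computation per prediction.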

Differences between Machine Learning and Deep Learning

Deep learning is a specialized form of machine learning. While a traditional machine learning workflow typically requires manual extraction of relevant features from images, a deep learning process automatically extracts these features and uses them to create a model that categorizes objects in the image (as shown in Figure 3). One of the defining characteristics of deep learning is that its performance tends to keep improving as it is trained on more data. This makes it particularly well-suited for complex and rapidly changing applications, where traditional machine learning methods may struggle to keep up.

In machine learning (upper workflow), the objects to be classified, for example bicycles, must first be manually extracted. In deep learning (lower workflow), object identification is performed by the neural network.

Figure 3: A machine learning approach to categorize bicycles (top) versus deep learning (bottom).

Integration of AI in ArcGIS

The integration of AI in Geographic Information Systems (GIS) has resulted in improved data analysis, allowing for better decision-making and solutions to real-world problems in various fields, such as emergency response and disaster management. Additionally, AI-powered GIS can automate time-consuming tasks, freeing up resources for more critical activities. ArcGIS is an interoperable system that allows the integration of complementary methods and techniques using Python or R. In ArcGIS Pro, machine learning is a core component of many geoprocessing tools that help solve problems in classification (e.g., Support Vector Machine), clustering (e.g., Density-based Clustering), and prediction (e.g., Forest-based Classification and Regression).

Deep learning in ArcGIS offers two key benefits. First, you can analyze a variety of geospatial data (e.g., satellite imagery, motion imagery, and more). Second, you can run an end-to-end workflow (Figure 4): from preparing data, performing exploratory data analysis, and training the model, to performing spatial analysis and sharing results with colleagues and stakeholders through web layers, maps, apps, and dashboards. In addition, the ArcGIS Living Atlas provides access to a large collection of Esri-curated, pre-trained deep learning models that can be valuable for your deep learning workflow.

An example of a deep learning workflow that can be performed in ArcGIS starts with image management, then labelling and data preparation, model training, inferencing from the model, analysis of the results, and finally sharing the results with field workers or through monitoring applications.

Figure 4: An end-to-end workflow using deep learning in the ArcGIS system.

Best Practices for Deep Learning

Deep learning is a rapidly evolving field. To unlock its full potential, it is important to follow best practices, which include the following steps:

Choose the appropriate hardware: Deep learning algorithms require significant computing resources, so it's important to choose hardware that can handle the demands of the chosen model. We recommend a graphics processing unit (GPU) with enough memory (8 GB or more) to reduce processing time.
Set up data storage: Training deep learning models requires large amounts of data, so you need properly designed and maintained infrastructure to store and manage it. This is a critical step to ensure the data is properly formatted and that storage is efficient, scalable, and affordable. One modern storage option for deep learning is cloud storage (e.g., Amazon S3, Google Cloud, and Microsoft Azure).
Preprocess data: Preprocessing the raw data can help improve the performance of deep learning models. This can include tasks such as normalization and augmentation.
Select high-quality training data: In order for deep learning algorithms to be effective, they need to be trained on high-quality data. Training data should be diverse and representative of the problem you are trying to solve. Tools in the Training Samples Manager in ArcGIS Pro can be a great help at this step.
Split your data into training and validation sets: You need a validation data set in order to properly evaluate the performance of your model. A common split is 90% training and 10% validation. However, the optimal split depends on factors such as the structure of the model and the amount of training data.
Select a model: There are many different types of deep learning algorithms, each with its own strengths and weaknesses. To select the best model for a particular problem, consider the type of data you are working with, the nature of the problem, and the computational resources available. You can learn more about the available deep learning models and their use cases in the ArcGIS Pro online documentation.
Tune hyperparameters: Hyperparameters are parameters that are set before training the model (i.e., not learned from data during the training), and they control the learning process and the overall architecture of the model. Some of the common hyperparameters in deep learning are learning rate, batch size, number of epochs, and activation function.
Monitor model performance: Regularly monitoring a model’s performance will help ensure that it is functioning as expected. This can include monitoring metrics such as accuracy, training loss, and validation loss. In ArcGIS Pro, the Compute Accuracy For Object Detection tool calculates the accuracy of a deep learning model by comparing the detected objects to ground reference data.
Use early stopping: Monitor the model's performance on the validation set and stop training when that performance begins to degrade (e.g., validation loss begins to increase or accuracy begins to decrease). This approach helps prevent overfitting, where the model becomes too specialized to the training data and performs poorly on unseen data.
Test your model on new data: It's important to test your model on a dataset that it has not seen before in order to get a more realistic estimate of its performance. This can help you identify any issues or limitations with your model.
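
Two of the practices above, splitting the data and early stopping, can be sketched in a few lines of plain Python. This is a generic illustration under stated assumptions, not the ArcGIS implementation: the function names, the patience parameter (how many epochs without improvement to tolerate), and the validation loss values are all hypothetical.

```python
import random


def split_train_validation(samples, validation_fraction=0.1, seed=42):
    """Shuffle the samples and hold out a fraction for validation.
    The default 0.1 mirrors the 90% / 10% split mentioned above."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def early_stopping_epoch(validation_losses, patience=2):
    """Given the validation loss recorded after each epoch, return a tuple
    (stop_epoch, best_epoch) of 0-based epoch indices.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; the model from best_epoch is the one to keep."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # The loss never stalled long enough, so training ran to the last epoch.
    return len(validation_losses) - 1, best_epoch


# 100 hypothetical labelled image chips, identified here by number.
train_set, validation_set = split_train_validation(range(100))
print(len(train_set), len(validation_set))  # 90 10

# Made-up validation losses: improving at first, then degrading.
losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=2)
print(stop_epoch, best_epoch)  # 4 2
```

In this example the loss is lowest at epoch 2 and rises afterward, so training stops at epoch 4 and the weights saved at epoch 2 are kept. A larger patience value tolerates noisier loss curves at the cost of extra training time.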