For detailed documentation and source code, visit this GitHub Repository.

For a more comprehensive understanding of the implementation, refer to this notebook on Kaggle.

The model is trained to recognize and classify the following classes:

Building: #3C1098
Land (unpaved area): #8429F6
Road: #6EC1E4
Vegetation: #FEDD3A
Water: #E2A929
Unlabeled: #9B9B9B
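
To train on these masks, the color-coded annotations have to be turned into integer class indices. The snippet below is a minimal sketch of one way to do that with NumPy. The class order, the names `CLASS_COLORS`, `hex_to_rgb`, and `rgb_mask_to_labels`, and the choice to map unmatched pixels to "Unlabeled" are assumptions made for illustration, not necessarily what the notebook does.

```python
import numpy as np

# Hex colors copied from the class list above; the index order is an assumption.
CLASS_COLORS = {
    "Building": "#3C1098",
    "Land": "#8429F6",
    "Road": "#6EC1E4",
    "Vegetation": "#FEDD3A",
    "Water": "#E2A929",
    "Unlabeled": "#9B9B9B",
}
CLASS_NAMES = list(CLASS_COLORS)


def hex_to_rgb(hex_color):
    """Convert a '#RRGGBB' string into an (R, G, B) tuple of ints."""
    hex_color = hex_color.lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


# One RGB row per class, in the same order as CLASS_NAMES.
PALETTE = np.array([hex_to_rgb(c) for c in CLASS_COLORS.values()], dtype=np.uint8)


def rgb_mask_to_labels(mask):
    """Map an (H, W, 3) RGB mask to an (H, W) array of class indices.

    Pixels that match no palette color fall back to 'Unlabeled' (an assumption).
    """
    labels = np.full(mask.shape[:2], CLASS_NAMES.index("Unlabeled"), dtype=np.uint8)
    for index, color in enumerate(PALETTE):
        labels[np.all(mask == color, axis=-1)] = index
    return labels
```

The resulting label array can then be one-hot encoded (for example with `tf.keras.utils.to_categorical`) before being fed to a model with a softmax output.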

Project Overview

This project implements a semantic segmentation model for aerial imagery using the U-Net architecture. U-Net is a convolutional neural network designed specifically for image segmentation, where every pixel in an image is assigned a class label. The code in this repository uses TensorFlow to implement the U-Net model and perform segmentation on aerial imagery.

U-Net Architecture

The U-Net architecture is characterized by its unique encoder-decoder structure. The encoder progressively downsamples the input image using convolutional and pooling layers, extracting hierarchical feature maps. The decoder then upsamples these features, gradually reconstructing the segmented image. A key feature of U-Net is the use of skip connections, which combine feature maps from the encoder with corresponding decoder layers. This allows the model to retain fine-grained spatial information from the input, improving segmentation accuracy.
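
The encoder, decoder, and skip connections described above can be sketched in Keras as follows. This is a reduced illustration, not the exact model used in this project: the function names `conv_block` and `build_unet`, the filter counts, the depth, and the input size are all assumptions chosen to keep the example small.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model


def conv_block(x, filters):
    """Two 3x3 convolutions with ReLU; 'same' padding keeps the spatial size."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x


def build_unet(input_shape=(256, 256, 3), num_classes=6, base_filters=16, depth=3):
    """Build a small U-Net; all default hyperparameters are illustrative."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    skips = []

    # Encoder: convolve, remember the feature map, then halve the resolution.
    for level in range(depth):
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)

    # Bottleneck at the lowest resolution.
    x = conv_block(x, base_filters * 2 ** depth)

    # Decoder: double the resolution, then concatenate the matching encoder map.
    for level in reversed(range(depth)):
        filters = base_filters * 2 ** level
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skips[level]])  # skip connection
        x = conv_block(x, filters)

    # Per-pixel class probabilities.
    outputs = layers.Conv2D(num_classes, 1, activation="softmax")(x)
    return Model(inputs, outputs)
```

Calling `build_unet().summary()` prints the layer shapes, which makes it easy to see the resolution halve on the way down and double on the way back up. The input height and width must be divisible by `2 ** depth` so that the upsampled maps line up with the stored encoder maps.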

This architecture makes U-Net particularly effective for semantic segmentation tasks involving complex structures and boundaries, as seen in aerial imagery. The U-Net architecture was originally proposed in the paper “U-Net: Convolutional Networks for Biomedical Image Segmentation” by Olaf Ronneberger, Philipp Fischer, and Thomas Brox.

Dataset

This project utilizes the “Semantic segmentation of aerial imagery dataset”, hosted on Kaggle. This Public Domain dataset consists of aerial imagery of Dubai, UAE, captured by MBRSC satellites and annotated with pixel-wise semantic segmentation into 6 classes. The total volume of the dataset is 72 images grouped into 6 larger tiles.

Project Structure

Complete Notebook hosted on Kaggle: https://www.kaggle.com/code/eshansurendra/semantic-segmentation-using-u-net

This notebook provides a comprehensive implementation of the semantic segmentation project using the U-Net model, covering data preprocessing, model building, training, and prediction steps.

Results

The U-Net model, trained on the “Semantic segmentation of aerial imagery” dataset, achieved promising results on the segmentation task. This section presents the key performance metrics and visualizations to illustrate the model’s capabilities.

Performance Metrics:

Test Accuracy: 87%
Test Jaccard Coefficient (IoU): 0.7308
Test Loss: 0.8974
Validation Accuracy: 84%
Validation Jaccard Coefficient (IoU): 0.6869
Validation Loss: 0.9149

These metrics suggest that the model generalizes reasonably well: accuracy and IoU on the test set (87%, 0.7308) are close to, and slightly above, those on the validation set (84%, 0.6869). The accuracy and IoU scores indicate that the model learns to segment aerial images into the six classes listed above.
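
For reference, the Jaccard coefficient (IoU) is the size of the overlap between the predicted and true masks divided by the size of their union. The sketch below shows a common "soft" formulation on one-hot arrays using NumPy. The function name `jaccard_coefficient` and the `smooth` term are assumptions for illustration; the notebook's own metric may be defined differently.

```python
import numpy as np


def jaccard_coefficient(y_true, y_pred, smooth=1.0):
    """Soft Jaccard / IoU between two arrays of the same shape.

    y_true: one-hot ground truth, e.g. shape (H, W, num_classes).
    y_pred: predicted probabilities or one-hot predictions, same shape.
    smooth: small constant that avoids division by zero (an assumed default).
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    intersection = np.sum(y_true * y_pred)
    union = np.sum(y_true) + np.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)


# Tiny example: 4 pixels, 2 classes, one pixel predicted incorrectly.
true_labels = np.eye(2)[[0, 1, 1, 0]]
pred_labels = np.eye(2)[[0, 1, 0, 0]]
score = jaccard_coefficient(true_labels, pred_labels)
```

With hard one-hot predictions and `smooth=0`, this reduces to the usual intersection-over-union of the two masks.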

Visualization of Segmentation Results:

Here are examples of segmentation predictions made by the model on test images: