Getting the Most out of U-Net Architecture for Glacier (Front) Segmentation

Glaciers and ice sheets currently contribute two-thirds of the observed global sea level rise. Many glaciers in glaciated regions, e.g., Antarctica, have already shown considerable ice mass loss over the last decade. Most of this mass loss is caused by the dynamic adjustment of glaciers, with considerable glacier retreat and elevation change being the major observables. The continuous and precise extraction of glacier calving fronts is hence of paramount importance for monitoring these rapid glacier changes.
This project aims to close the gap toward fully automatic, end-to-end, deep learning-based glacier (front) segmentation from synthetic aperture radar (SAR) imagery. U-Net has recently been used in its simple form for this task and showed promising results [1]. In this thesis, we would like to thoroughly study the fundamentals and incorporate more advanced ideas to improve the segmentation performance of the simple U-Net. In other words, this thesis investigates approaches that enhance image segmentation performance without deviating from the U-Net’s root architecture. The outcome of this thesis is expected to be a comparative study, similar to [11], on glacier (front) segmentation. To this end, the following ideas will be investigated:

1. Pre-processing: So far in the literature, only simple denoising/multi-looking algorithms have been used as pre-processing. A more thorough study of the effect of further pre-processing algorithms is therefore of interest:

1.1. Attribute Profiles (APs) [2, 3] have led to performance gains in very high-resolution remote sensing image classification and have also been applied to SAR image segmentation [4]. Their extension, Feature Attribute Profiles [5], has been shown to outperform APs in most scenarios and has also been used for pixel-wise classification of SAR images [6]. We would like to study the performance of APs and their extension in SAR image segmentation. This task is optional and will be addressed if time allows.
1.2. There are multiple classical denoising algorithms, such as the median filter, Gaussian filter, bilateral filter, Lee filter, and Kuan filter. The denoised images may then be passed through contrast enhancement algorithms, e.g., contrast limited adaptive histogram equalization (CLAHE). Different combinations will be studied quantitatively and qualitatively.
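
As an illustration of the classical denoising step in 1.2, the following is a minimal NumPy sketch of a basic Lee filter. It is not the pipeline used in [1]: the window size, the reflect padding, and the way the noise variance is estimated (mean of the local variances) are assumptions made for this example, and the function name lee_filter is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lee_filter(img, size=5, noise_var=None):
    """Basic Lee speckle filter for a 2-D float image (odd window size)."""
    img = np.asarray(img, dtype=np.float64)
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    # One (size x size) window per pixel: shape (H, W, size, size).
    windows = sliding_window_view(padded, (size, size))
    local_mean = windows.mean(axis=(-2, -1))
    local_var = windows.var(axis=(-2, -1))
    if noise_var is None:
        # Assumption: estimate the noise level as the mean local variance.
        noise_var = local_var.mean()
    # Weight is close to 0 in flat regions (strong smoothing)
    # and close to 1 near edges (detail is preserved).
    weight = local_var / (local_var + noise_var + 1e-12)
    return local_mean + weight * (img - local_mean)
```

A contrast enhancement step such as CLAHE could follow the filter; scikit-image, for instance, offers one implementation as skimage.exposure.equalize_adapthist. Which combination of filter and enhancement works best is exactly what the planned study should determine.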

2. Different network architectures in the U-Net’s bottleneck:

2.1. Dilated (atrous) convolution: dilated convolution [7] has been shown to introduce multi-scale context into the network without increasing the number of parameters,
2.2. Dilated ResNet [8],
2.3. Pre-trained networks (VGG, ResNet, etc.).
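
To make the claim in 2.1 concrete, the sketch below implements a single-channel dilated convolution in NumPy. The kernel keeps the same number of weights for every dilation rate, while the area it covers grows as k + (k - 1)(d - 1) per axis. The function names are hypothetical, and a real bottleneck would use the convolution layers of a deep learning framework instead.

```python
import numpy as np


def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2-D cross-correlation with a dilated kernel (single channel)."""
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - effective_kernel_size(k_h, dilation) + 1
    out_w = image.shape[1] - effective_kernel_size(k_w, dilation) + 1
    out = np.zeros((out_h, out_w))
    for i in range(k_h):
        for j in range(k_w):
            # Each kernel tap reads the input `dilation` pixels apart.
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out
```

With a 3x3 kernel, dilation rates of 1, 2, and 4 cover 3x3, 5x5, and 9x9 pixels respectively, always with nine weights.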

3. Different normalization algorithms: One common issue in training deep CNNs is internal covariate shift, i.e., the change in the distribution of each layer's inputs during training. It reduces both training speed and performance. As a remedy, multiple normalization techniques have been proposed, such as batch normalization, instance normalization, layer normalization, and group normalization [9]. In this thesis, we will study the effect of these techniques on the segmentation results of the U-Net, both qualitatively and quantitatively.
4. The most suitable loss function for this application:
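
The four normalization techniques differ mainly in the axes over which the mean and variance are computed. The NumPy sketch below shows this for a feature tensor of shape (N, C, H, W). It is a simplified illustration: the learnable scale and shift parameters and the running statistics used at inference time are omitted, and the function names are hypothetical.

```python
import numpy as np


def normalize(x, axes, eps=1e-5):
    """Standardize x with statistics computed over the given axes."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def batch_norm(x):
    # Statistics per channel, shared across the batch and spatial positions.
    return normalize(x, (0, 2, 3))


def instance_norm(x):
    # Statistics per sample and per channel.
    return normalize(x, (2, 3))


def layer_norm(x):
    # Statistics per sample, shared across all channels.
    return normalize(x, (1, 2, 3))


def group_norm(x, groups):
    # Statistics per sample and per group of channels.
    n, c, h, w = x.shape
    grouped = x.reshape(n, groups, c // groups, h, w)
    return normalize(grouped, (2, 3, 4)).reshape(n, c, h, w)
```

Group normalization with one group reduces to layer normalization, and with as many groups as channels it reduces to instance normalization, which makes the group count a convenient knob for the comparison.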
• (Binary) Cross Entropy
• Dice coefficient
• Focal loss
• Weighted combination of the loss functions above
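
The candidate losses above can be sketched in NumPy as follows, where p holds predicted foreground probabilities and y the binary ground truth. Several choices here are assumptions for illustration only: the Dice coefficient is turned into a loss as 1 - Dice with a smoothing term, the focal loss defaults (gamma = 2, alpha = 0.25) are commonly used values rather than tuned ones, and the combination weights are placeholders.

```python
import numpy as np

EPS = 1e-7  # keeps log() away from zero


def bce_loss(p, y):
    """Binary cross entropy averaged over all pixels."""
    p = np.clip(p, EPS, 1 - EPS)
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))


def dice_loss(p, y, smooth=1.0):
    """1 - soft Dice coefficient (assumed loss form)."""
    inter = np.sum(p * y)
    return float(1 - (2 * inter + smooth) / (np.sum(p) + np.sum(y) + smooth))


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss: cross entropy that down-weights easy pixels."""
    p = np.clip(p, EPS, 1 - EPS)
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))


def combined_loss(p, y, w_bce=0.5, w_dice=0.5):
    """Weighted combination; the weights are placeholders to be tuned."""
    return w_bce * bce_loss(p, y) + w_dice * dice_loss(p, y)
```

Because calving fronts occupy very few pixels, the Dice and focal terms are the natural candidates for handling the class imbalance, whereas cross entropy alone treats every pixel equally.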
5. Effect of dropout and drop connect: In which layer is dropout most effective? Is applying it in all layers the best approach? Is using dropout in combination with normalization techniques (e.g., batch normalization) advantageous at all?
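
The difference between the two regularizers can be sketched in NumPy: dropout zeroes activations, whereas drop connect zeroes individual weights. This is a simplified training-time illustration on a dense layer. The function names are hypothetical, and rescaling the drop connect output by 1 / (1 - rate) is a simplification chosen for this sketch, not necessarily the formulation that would be used in the experiments.

```python
import numpy as np


def dropout(activations, rate, rng):
    """Inverted dropout: zero random activations, rescale the rest."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)


def drop_connect_linear(x, weights, rate, rng):
    """Dense layer whose individual weights are randomly zeroed."""
    keep = rng.random(weights.shape) >= rate
    # Simplification: rescale so the expected output matches x @ weights.
    return x @ (weights * keep) / (1.0 - rate)
```

In a U-Net, either function would be applied per layer, so the question of which layers benefit most can be studied by switching the rate on and off layer by layer.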
6. Effect of different data augmentation techniques, e.g., flip, rotate, random crop, random transformation, etc. on the segmentation performance.
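
For segmentation, every geometric augmentation has to be applied identically to the image and its label mask. The NumPy sketch below shows this for flips, 90-degree rotations, and random crops. The function name augment_pair, the 50% flip probabilities, and the restriction to 90-degree rotations are assumptions for this example.

```python
import numpy as np


def augment_pair(image, mask, rng, crop_size=None):
    """Apply the same random flip, rotation, and crop to image and mask."""
    if rng.random() < 0.5:
        image, mask = np.fliplr(image), np.fliplr(mask)
    if rng.random() < 0.5:
        image, mask = np.flipud(image), np.flipud(mask)
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if crop_size is not None:
        h, w = crop_size
        top = int(rng.integers(0, image.shape[0] - h + 1))
        left = int(rng.integers(0, image.shape[1] - w + 1))
        image = image[top:top + h, left:left + w]
        mask = mask[top:top + h, left:left + w]
    # Copies detach the results from the views created above.
    return image.copy(), mask.copy()
```

Passing a seeded generator, e.g., np.random.default_rng(0), makes the augmentations reproducible, which helps when comparing techniques quantitatively.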
7. Effect of transfer learning:

7.1. Is pre-training the encoder, bottleneck, and decoder of the U-Net on other datasets, either separately or all at once, beneficial? Does it help with the limited training data and the class-imbalance problem in the dataset?
7.2. The effect of transfer learning from the high-quality images (quality factor=[1:3]) to the low-quality ones (quality factor=[1:3]).

8. Improved architectures of U-Net: For a thorough review of several of these architectures in one place, please refer to Taghanaki et al. [11].

8.1. Feedforward Auto-Encoder
8.2. FCN
8.3. Seg-Net
8.4. U-Net
8.5. U-Net++ [10]
8.6. Tiramisu Network [12]

References

[1] Zhang et al. “Automatically delineating the calving front of Jakobshavn Isbræ from multitemporal TerraSAR-X images: a deep learning approach.” The Cryosphere 13, no. 6 (2019): 1729-1741.

[2] Dalla Mura, Mauro, et al. “Morphological attribute profiles for the analysis of very high resolution images.” IEEE Transactions on Geoscience and Remote Sensing 48.10 (2010): 3747-3762.
[3] Ghamisi, Pedram, Mauro Dalla Mura, and Jon Atli Benediktsson. “A survey on spectral–spatial classification techniques based on attribute profiles.” IEEE Transactions on Geoscience and Remote Sensing 53.5 (2014): 2335-2353.
[4] Boldt, Markus, et al. “SAR image segmentation using morphological attribute profiles.” The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 40.3 (2014): 39.
[5] Pham, Minh-Tan, Erchan Aptoula, and Sébastien Lefèvre. “Feature profiles from attribute filtering for classification of remote sensing images.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11.1 (2017): 249-256.
[6] Tombak, Ayşe, et al. “Pixel-Based Classification of SAR Images Using Feature Attribute Profiles.” IEEE Geoscience and Remote Sensing Letters 16.4 (2018): 564-567.
[7] Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).
[8] Zhang, Qiao, et al. “Image segmentation with pyramid dilated convolution based on ResNet and U-Net.” International Conference on Neural Information Processing. Springer, Cham, 2017.
[9] Zhou, Xiao-Yun, and Guang-Zhong Yang. “Normalization in training U-Net for 2-D biomedical semantic segmentation.” IEEE Robotics and Automation Letters 4.2 (2019): 1792-1799.
[10] Zhou, Zongwei, et al. “Unet++: A nested u-net architecture for medical image segmentation.” Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, 2018. 3-11.
[11] Taghanaki, Saed Asgari, et al. “Deep Semantic Segmentation of Natural and Medical Images: A Review.” arXiv preprint arXiv:1910.07655 (2019).
[12] Jégou, Simon, et al. “The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation.” Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2017.