Getting the Most out of U-Net Architecture for Glacier (Front) Segmentation
Glaciers and ice sheets currently contribute about two thirds of the observed global sea-level rise. Many glaciers in glaciated regions, e.g., Antarctica, have already shown considerable ice mass loss over the last decade. Most of this mass loss is caused by the dynamic adjustment of glaciers, with considerable glacier retreat and elevation change being the major observables. The continuous and precise extraction of glacier calving fronts is hence of paramount importance for monitoring these rapid glacier changes.
This project intends to bridge the gap towards fully automatic, end-to-end deep learning-based glacier (front) segmentation using synthetic aperture radar (SAR) imagery. U-Net, in its simple form, has recently been used for this task and showed promising results [1]. In this thesis, we would like to thoroughly study the fundamentals and incorporate more advanced ideas to improve the segmentation performance of the simple U-Net. In other words, this thesis investigates approaches that enhance the image segmentation performance without deviating from the U-Net’s root architecture. The outcome of this thesis is expected to be a comparative study, similar to [11], on glacier (front) segmentation. To this end, the following ideas are going to be investigated:
1. Pre-processing: So far in the literature, simple denoising/multi-looking algorithms have been used as pre-processing. It is therefore interesting to conduct a more thorough study on the effect of additional pre-processing algorithms:
1.1. Attribute Profiles (APs) [2, 3] have resulted in performance enhancement for very high-resolution remote sensing image classification. They have been used for SAR image segmentation too [4]. Their extension, Feature Attribute Profiles [5], has been shown to outperform APs in most scenarios and has also been used for pixel-wise classification of SAR images [6]. We would like to study the performance of APs and their extension for SAR image segmentation. This task is optional and will be addressed if time allows.
1.2. There are multiple classical denoising algorithms, e.g., the median filter, Gaussian filter, bilateral filter, Lee filter, and Kuan filter. The denoised images may then be processed by contrast enhancement algorithms, e.g., contrast limited adaptive histogram equalization (CLAHE). Different combinations will be studied quantitatively and qualitatively.
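As one example of the classical speckle filters mentioned in 1.2, a minimal NumPy/SciPy sketch of the Lee filter is given below. It is illustrative only; the window size and the crude global noise-variance estimate are simplifying assumptions, not choices made in the thesis.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, size=7, noise_var=None):
    """Classic Lee speckle filter: adaptively blend the local mean and the
    observed pixel, filtering strongly in homogeneous regions."""
    mean = uniform_filter(img, size)
    sq_mean = uniform_filter(img * img, size)
    var = np.maximum(sq_mean - mean * mean, 0.0)
    if noise_var is None:
        noise_var = var.mean()  # crude global noise estimate (assumption)
    weight = var / (var + noise_var + 1e-12)
    return mean + weight * (img - mean)
```

In homogeneous regions the local variance is dominated by noise, so the weight is small and the output approaches the local mean; near edges the weight approaches one and structure is preserved.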
2. Different network architectures in the U-Net’s bottleneck:
2.1. Dilated convolution (atrous convolution): dilated convolution [7] has been shown to introduce multi-scale context into the network without increasing the number of parameters,
2.2. Dilated ResNet [8],
2.3. Pre-trained networks (VGG, ResNet, etc.).
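To illustrate the multi-scale property mentioned in 2.1, the NumPy/SciPy sketch below dilates a 3x3 kernel by inserting zeros between its taps: the receptive field grows while the number of non-zero weights stays at nine. This is a didactic sketch, not the thesis implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between kernel taps ('convolution with holes'):
    same parameter count, larger receptive field."""
    kh, kw = kernel.shape
    out = np.zeros(((kh - 1) * rate + 1, (kw - 1) * rate + 1))
    out[::rate, ::rate] = kernel
    return out

img = np.random.default_rng(0).random((32, 32))
k = np.ones((3, 3)) / 9.0
for rate in (1, 2, 4):
    dk = dilate_kernel(k, rate)
    y = convolve2d(img, dk, mode="same")
    # kernel grows to 3x3, 5x5, 9x9 while keeping exactly 9 non-zero weights
    print(rate, dk.shape, np.count_nonzero(dk))
```

In a deep learning framework the same effect is obtained via the convolution layer's dilation argument rather than by materializing the dilated kernel.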
3. Different normalization algorithms: One common issue in training deep CNNs is the internal covariate shift, which is caused by the changing distribution of input features. It slows down training and degrades performance. As a remedy, multiple normalization techniques have been proposed, such as batch normalization, instance normalization, layer normalization, and group normalization [9]. In this thesis, we will study the effect of these algorithms on the segmentation results of the U-Net, both qualitatively and quantitatively.
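To make the differences concrete, a plain NumPy sketch of group normalization follows: with the group count equal to the channel count it reduces to instance normalization, and with one group to layer normalization, while batch normalization additionally aggregates over the batch axis. This is an illustrative reference implementation only.

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Normalize over (channels-in-group, H, W) per sample; unlike batch
    normalization, the statistics do not depend on the batch size."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    return ((g - mu) / np.sqrt(var + eps)).reshape(n, c, h, w)
```

The independence from batch size is why group/instance normalization is often preferred for segmentation, where GPU memory limits force small batches.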
4. The optimal loss function for this application:
• (Binary) Cross Entropy
• Dice coefficient
• Focal loss
• Weighted combination of the loss functions above
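For reference, minimal NumPy versions of the candidate losses above are sketched below; the weight w in the combination is an illustrative hyperparameter, not a value recommended in the literature.

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """(Binary) cross entropy on probabilities in [0, 1]."""
    p = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def dice_loss(pred, target, eps=1e-7):
    """1 - Dice coefficient; overlap-based, robust to class imbalance."""
    inter = (pred * target).sum()
    return float(1.0 - (2 * inter + eps) / (pred.sum() + target.sum() + eps))

def focal_loss(pred, target, gamma=2.0, eps=1e-7):
    """Focal loss: down-weights well-classified (easy) pixels."""
    p = np.clip(pred, eps, 1 - eps)
    pt = np.where(target == 1, p, 1 - p)  # probability of the true class
    return float(-((1 - pt) ** gamma * np.log(pt)).mean())

def combined_loss(pred, target, w=0.5):
    # weighted combination of BCE and Dice; w is a tunable trade-off
    return w * bce_loss(pred, target) + (1 - w) * dice_loss(pred, target)
```

In the thesis these would of course be implemented as differentiable losses in the chosen deep learning framework.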
5. Effect of dropout and DropConnect: In which layer is dropout most effective? Is using it in all layers the best approach? Is dropout in combination with normalization techniques (e.g., batch normalization) even advantageous?
6. Effect of different data augmentation techniques, e.g., flipping, rotation, random cropping, random transformations, etc., on the segmentation performance.
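The NumPy sketch below illustrates the key requirement for segmentation: identical spatial transforms must be applied to the image and its mask so that labels stay aligned. The 64x64 crop size is an arbitrary illustrative choice.

```python
import numpy as np

def augment(img, mask, rng):
    """Apply the same random flip, rotation, and crop to image and mask."""
    if rng.random() < 0.5:
        img, mask = img[:, ::-1], mask[:, ::-1]        # horizontal flip
    k = int(rng.integers(4))
    img, mask = np.rot90(img, k), np.rot90(mask, k)    # random 90-degree rotation
    ch, cw = 64, 64                                    # crop size (illustrative)
    y = int(rng.integers(img.shape[0] - ch + 1))
    x = int(rng.integers(img.shape[1] - cw + 1))
    return img[y:y + ch, x:x + cw].copy(), mask[y:y + ch, x:x + cw].copy()
```

Elastic or affine transformations would follow the same pattern, with the addition of nearest-neighbor interpolation for the mask.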
7. Effect of transfer learning:
7.1. Is pre-training the decoder, encoder, and bottleneck of the U-Net, separately or all at once, on other datasets beneficial? Is it effective against the limited training data and the class-imbalance problem in the dataset?
7.2. The effect of transfer learning from high-quality images to low-quality ones.
8. Improved architectures of U-Net: For a thorough review of some of these architectures in one place, please refer to Taghanaki et al. [11].
8.1. Feedforward auto-encoder
8.2. FCN
8.3. Seg-Net
8.4. U-Net
8.5. U-Net++ [10]
8.6. Tiramisu Network [12]
References
[1] Zhang et al. “Automatically delineating the calving front of Jakobshavn Isbræ from multitemporal TerraSAR-X images: a deep learning approach.” The Cryosphere 13, no. 6 (2019): 1729-1741.
[2] Dalla Mura, Mauro, et al. “Morphological attribute profiles for the analysis of very high resolution images.” IEEE Transactions on Geoscience and Remote Sensing 48.10 (2010): 3747-3762.
[3] Ghamisi, Pedram, Mauro Dalla Mura, and Jon Atli Benediktsson. “A survey on spectral–spatial classification techniques based on attribute profiles.” IEEE Transactions on Geoscience and Remote Sensing 53.5 (2014): 2335-2353.
[4] Boldt, Markus, et al. “SAR image segmentation using morphological attribute profiles.” The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 40.3 (2014): 39.
[5] Pham, Minh-Tan, Erchan Aptoula, and Sébastien Lefèvre. “Feature profiles from attribute filtering for classification of remote sensing images.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11.1 (2017): 249-256.
[6] Tombak, Ayşe, et al. “Pixel-Based Classification of SAR Images Using Feature Attribute Profiles.” IEEE Geoscience and Remote Sensing Letters 16.4 (2018): 564-567.
[7] Chen, Liang-Chieh, et al. “Rethinking atrous convolution for semantic image segmentation.” arXiv preprint arXiv:1706.05587 (2017).
[8] Zhang, Qiao, et al. “Image segmentation with pyramid dilated convolution based on ResNet and U-Net.” International Conference on Neural Information Processing. Springer, Cham, 2017.
[9] Zhou, Xiao-Yun, and Guang-Zhong Yang. “Normalization in training U-Net for 2-D biomedical semantic segmentation.” IEEE Robotics and Automation Letters 4.2 (2019): 1792-1799.
[10] Zhou, Zongwei, et al. “Unet++: A nested u-net architecture for medical image segmentation.” Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, 2018. 3-11.
[11] Taghanaki, Saed Asgari, et al. “Deep Semantic Segmentation of Natural and Medical Images: A Review.” arXiv preprint arXiv:1910.07655 (2019).
[12] Jégou, Simon, et al. “The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation.” Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2017.
Detecting Defects on Transparent Objects using Polarization Cameras
The classification of images is a well-known task in computer vision. However, transparent or semi-transparent objects have several properties that can make computer vision tasks harder. These objects usually exhibit little texture and sometimes strong reflections. Occasionally, varying backgrounds make it hard to recognize the edges or the shape of an object [1, 2].
To overcome these difficulties, we use polarization cameras in this work. In contrast to ordinary cameras, polarization cameras additionally record information about the polarization of the light rays. Most natural light sources emit unpolarized light. By using a light source that emits polarized light, it is possible to remove reflections or increase the contrast. Furthermore, it is known that the Angle of Linear Polarization (AoLP) provides information about the surface normal [3].
In this work, we will follow a deep learning approach and use Convolutional Neural Networks (CNNs) to explore the following topics:
1. Comparison of different sorts of preprocessing:
• Using only raw data / reshaped raw data
• Using extra features Degree of Linear Polarization (DoLP) and AoLP
2. Influence of different light sources.
3. Comparison of different defect classes.
To evaluate the results, we use different metrics such as accuracy and F1 score, as well as gradient-weighted class activation maps (Grad-CAM) [4].
The implementation should be done in Python.
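The two extra features mentioned in topic 1 can be derived from the four analyzer orientations (0°, 45°, 90°, 135°) of a division-of-focal-plane sensor via the linear Stokes parameters. A minimal NumPy sketch, for illustration:

```python
import numpy as np

def polarization_features(i0, i45, i90, i135):
    """Compute DoLP and AoLP from the four analyzer-angle intensity images."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-9)
    aolp = 0.5 * np.arctan2(s2, s1)      # in radians, range (-pi/2, pi/2]
    return dolp, aolp
```

These two maps could then be stacked with the raw intensities as additional input channels for the CNN.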
References
[1] Agastya Kalra, Vage Taamazyan, Supreeth Krishna Rao, Kartik Venkataraman, Ramesh Raskar, and Achuta
Kadambi. Deep Polarization Cues for Transparent Object Segmentation. In 2020 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), pages 8599–8608, Seattle, WA, USA, June 2020. IEEE.
[2] Ilya Lysenkov, Victor Eruhimov, and Gary Bradski. Recognition and Pose Estimation of Rigid Transparent
Objects with a Kinect Sensor. page 8, 2013.
[3] Francelino Freitas Carvalho, Carlos Augusto de Moraes Cruz, Greicy Costa Marques, and Kayque Martins Cruz Damasceno. Angular Light, Polarization and Stokes Parameters Information in a Hybrid Image Sensor with Division of Focal Plane. Sensors, 20(12):3391, June 2020.
[4] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. International Journal of Computer Vision, 128(2):336–359, February 2020.
Deep Learning-Based Limited Data Glacier Segmentation using Bayesian U-Nets and GANs-based Data Augmentation
The main application this thesis focuses on is the segmentation of glaciers and their calving fronts in synthetic aperture radar (SAR) images. Accurate pixel-wise ground truth for remote sensing images, including SAR images, is scarce and very expensive to generate. On the other hand, depending on the application, the regions of interest that we want to segment may cover only a small part of the image and thus introduce a severe class-imbalance problem into the segmentation pipeline. Supervised learning-based algorithms are well known to suffer from limited training data and class imbalance, the main drawbacks being overfitting and high model uncertainty. In this work, we want to address this issue in a two-fold approach:
1. Data augmentation: Data augmentation is a natural approach to tackle the limited and imbalanced data problem in supervised learning-based systems [1]. Generative adversarial networks (GANs) have come a long way and have shown great potential in generating natural-looking images. Recently, they have been used to augment and populate training sets [2, 3, 4, 5, 6]. In this thesis, we are interested in conducting a thorough study on the effect of different GAN variants for data augmentation and for synthetically populating the limited training data, similar to [3, 7, 8].
2. Bayesian U-Net: As already mentioned, limited training data is a bottleneck for supervised learning-based algorithms. Moreover, the tedious task of manually labeling the images for generating the ground truth may introduce inaccuracies. Both of the aforementioned problems introduce uncertainty into the model. If we can measure this uncertainty, we can use it, e.g., in an active learning loop to improve training. Bayesian algorithms provide a quantitative value for this uncertainty. In the second part of this thesis, we adapt the Bayesian U-Net [10] and/or Bayesian SegNet [11] to our SAR glacier segmentation dataset and measure uncertainty maps for the images.
Finally, we compare our results from parts 1 and 2 above with the state of the art, both quantitatively and qualitatively.
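A common way to obtain such uncertainty maps is Monte-Carlo dropout: keep dropout active at test time and treat the spread over repeated stochastic forward passes as uncertainty. The NumPy sketch below illustrates only the idea, with a hypothetical stand-in `toy_predict` function in place of the trained Bayesian U-Net/SegNet.

```python
import numpy as np

def mc_dropout_uncertainty(predict, x, T=30, p=0.5, rng=None):
    """Run T stochastic forward passes with dropout enabled and return the
    mean prediction and the pixelwise standard deviation (uncertainty map)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    samples = []
    for _ in range(T):
        mask = (rng.random(x.shape) > p) / (1.0 - p)  # inverted dropout
        samples.append(predict(x * mask))
    samples = np.stack(samples)
    return samples.mean(axis=0), samples.std(axis=0)

# hypothetical stand-in for a segmentation network's sigmoid output
toy_predict = lambda z: 1.0 / (1.0 + np.exp(-z))
```

High-uncertainty pixels flagged this way are natural candidates for re-annotation in an active learning loop.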
References:
[1] Davari, AmirAbbas, et al. “Fast and Efficient Limited Data Hyperspectral Remote Sensing Image Classification via GMM-Based Synthetic Samples.” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 7, pp. 2107-2120, July 2019, doi: 10.1109/JSTARS.2019.2916495.
[2] Nejati Hatamian, Faezeh, et al. “The Effect of Data Augmentation on Classification of Atrial Fibrillation in Short Single-Lead ECG Signals Using Deep Neural Networks.” ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1264-1268, doi: 10.1109/ICASSP40776.2020.9053800.
[3] Neff, Thomas, et al. “Generative adversarial network based synthesis for supervised medical image segmentation.” Proc. OAGM and ARW Joint Workshop. 2017.
[4] Neff, Thomas, et al. “Generative Adversarial Networks to Synthetically Augment Data for Deep Learning based Image Segmentation.” Proceedings of the OAGM Workshop 2018: Medical Image Analysis. Verlag der Technischen Universität Graz, 2018.
[5] Caballo, Marco, et al. “Deep learning-based segmentation of breast masses in dedicated breast CT imaging: Radiomic feature stability between radiologists and artificial intelligence.” Computers in Biology and Medicine 118 (2020): 103629.
[6] Qasim, Ahmad B., et al. “Red-GAN: Attacking class imbalance via conditioned generation. Yet another medical imaging perspective.” arXiv preprint arXiv:2004.10734 (2020).
[7] Bailo, Oleksandr, DongShik Ham, and Young Min Shin. “Red blood cell image generation for data augmentation using conditional generative adversarial networks.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2019.
[8] Pollastri, Federico, et al. “Augmenting data with GANs to segment melanoma skin lesions.” Multimedia Tools and Applications (2019): 1-18.
[9] Qin, T., et al. “Automatic Data Augmentation Via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation.” ICASSP 2020 – 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1419-1423, doi: 10.1109/ICASSP40776.2020.9053403.
[10] Hiasa, Yuta, et al. “Automated muscle segmentation from clinical CT using Bayesian U-net for personalized musculoskeletal modeling.” IEEE Transactions on Medical Imaging (2019).
[11] Kendall, Alex, Vijay Badrinarayanan, and Roberto Cipolla. “Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding.” arXiv preprint arXiv:1511.02680 (2015).
Diffeomorphic MRI Image Registration using Deep Learning
State-of-the-art deformable image registration approaches achieve impressive results and are commonly used in diverse image processing applications. However, these approaches are computationally expensive even on GPUs [1], since they solve an optimization problem for each image pair during registration [2]. Most learning-based methods either require labeled data or do not guarantee a diffeomorphic registration, i.e., the reversibility of the deformation field [1]. Dalca et al. presented an unsupervised deep learning framework for diffeomorphic image registration named VoxelMorph [1].
In this thesis, the network described in [1] will be implemented and trained on cardiac magnetic resonance images to build an application for fast diffeomorphic image registration. The results will be compared to state-of-the-art diffeomorphic image registration methods. Additionally, the method will be evaluated by comparing segmented areas as well as landmark locations of co-registered images. Furthermore, the method in [1] will be extended to a one-to-many registration method using the approach in [3], meeting the demand for motion estimation of the anatomy of interest in increasingly available dynamic imaging data [3]. The data used in this thesis will be provided by Siemens Healthineers. The implementation will be done using an open-source framework such as PyTorch [4].
The thesis will include the following points:
• Literature research on state-of-the-art methods for diffeomorphic image registration and one-to-many registration
• Implementing a neural network for diffeomorphic image registration and extending it to one-to-many registration
• Comparison of the results with state-of-the-art image registration methods
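The diffeomorphic property in [1] comes from treating the network output as a stationary velocity field and integrating it by scaling and squaring. A NumPy/SciPy sketch of that integration step (2-D displacement fields, linear interpolation) is given below for illustration; it is not the thesis implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def compose(phi, psi):
    """(phi o psi)(x) = phi(x + psi(x)) + psi(x) for fields of shape (2, H, W)."""
    h, w = phi.shape[1:]
    coords = np.mgrid[0:h, 0:w].astype(float) + psi
    warped = np.stack([map_coordinates(phi[i], coords, order=1, mode="nearest")
                       for i in range(2)])
    return warped + psi

def integrate_velocity(v, steps=6):
    """Scaling and squaring: approximate phi = exp(v) by scaling v down to
    v / 2**steps and repeatedly composing the small deformation with itself."""
    phi = v / (2 ** steps)
    for _ in range(steps):
        phi = compose(phi, phi)
    return phi
```

For sufficiently smooth velocity fields, the resulting map is invertible, and the inverse is obtained by integrating -v.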
[1] Balakrishnan, G., Zhao, A., Sabuncu, M. R., Guttag, J. V. & Dalca, A. V. VoxelMorph: A Learning Framework for
Deformable Medical Image Registration. CoRR abs/1809.05231. arXiv: 1809.05231. http://arxiv.org/abs/1809.05231 (2018).
[2] Ashburner, J. A fast diffeomorphic image registration algorithm. NeuroImage 38, 95 –113. ISSN: 1053-8119. http://www.sciencedirect.com/science/article/pii/S1053811907005848 (2007).
[3] Metz, C., Klein, S., Schaap, M., van Walsum, T. & Niessen, W. Nonrigid registration of dynamic medical imaging data using nD+t B-splines and a groupwise optimization approach. Medical Image Analysis 15, 238 –249. ISSN: 1361-8415. http://www.sciencedirect.com/science/article/pii/S1361841510001155 (2011).
[4] Paszke, A. et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. CoRR abs/1912.01703. arXiv: 1912.01703. http://arxiv.org/abs/1912.01703 (2019).
Absorption Image Correction in X-ray Talbot-Lau Interferometry for Reconstruction
X-ray Phase-Contrast Imaging (PCI) is an imaging technique that measures the refraction of X-rays caused by an object. There are several ways to realize PCI, such as interferometric and analyzer-based methods [3]. In contrast to X-ray absorption imaging, the phase image provides high soft-tissue contrast.
The implementation by a grating-based interferometer enables measuring an X-ray absorption image, a differential phase image, and a dark-field image [2, p. 192-205]. Felsner et al. proposed the integration of a Talbot-Lau Interferometer (TLI) into an existing clinical CT system [1]. Three different gratings are mounted between the X-ray tube and the detector: two in front of the object, one behind it (see Fig. 1). Currently, it is not possible to install gratings with a diameter of more than a few centimeters for various reasons [1]. As a consequence, a phase-contrast image can only be created for a small area.
Nevertheless, the entire size of the detector can be used for capturing the absorption image. However, the absorption image is influenced by the gratings, as they induce an inhomogeneous exposure of the X-ray detector.
Besides that, the intensity values change with each projection. The X-ray tube, detector, and gratings rotate around the object during the scanning process. Depending on their position, parts of the object are covered by grating G1 during part of each rotation, but not throughout.
It is expected that the part of the absorption image covered by the gratings differs from the rest of the image in its intensity values. Also, a sudden change in the intensity values can be detected at the edge of the gratings. This may lead to artifacts in the 3-D reconstruction.
In this work, we will investigate the anticipated artifacts in the reconstruction and implement (at least) one correction algorithm. Furthermore, the reconstruction results with and without a correction algorithm will be evaluated using simulated and/or real data.
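If a reference (air) scan acquired with the gratings in place is available, a first-order correction is a flat-field-style gain division. The NumPy sketch below is only a baseline under that assumption, not the correction algorithm this work will ultimately develop and evaluate.

```python
import numpy as np

def gain_correct(projection, air_scan, eps=1e-6):
    """Divide out the grating-induced shading, assuming a reference (air)
    scan acquired with the gratings in place models the same shading."""
    return projection / np.maximum(air_scan, eps)
```

This simple model breaks down where the shading changes with the rotation angle, which is exactly the regime the anticipated artifacts stem from.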
References:
[1] L. Felsner, M. Berger, S. Kaeppler, J. Bopp, V. Ludwig, T. Weber, G. Pelzer, T. Michel, A. Maier, G. Anton, and C. Riess. Phase-sensitive region-of-interest computed tomography. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 137–144, Cham, 2018. Springer.
[2] A. Maier, S. Steidl, V. Christlein, and J. Hornegger. Medical Imaging Systems: An Introductory Guide, volume 11111. Springer, Cham, 2018.
[3] F. Pfeiffer, T. Weitkamp, O. Bunk, and C. David. Phase retrieval and differential phase-contrast imaging with low-brilliance X-ray sources. Nature Physics, 2(4):258–261, 2006.
Truncation-correction Method for X-ray Dark-field Computed Tomography
Grating-based imaging provides three types of images: an absorption, a differential phase, and a dark-field image. The dark-field image provides structural information about the specimen at the micrometer and sub-micrometer scale. A dark-field image can be measured by an X-ray grating interferometer, for example the Talbot-Lau interferometer, which consists of three gratings. Due to the small size of the gratings, truncation arises in the projection images. This becomes an issue, since it leads to artifacts in the reconstruction.
This Bachelor thesis aims to reduce truncation artifacts in dark-field reconstructions. Inspired by the method proposed by Felsner et al. [1], the truncated dark-field image will be corrected using the information of a complete absorption image. To describe the correlation between the absorption and the dark-field signal, the decomposition by Kaeppler et al. [2] will be used. The dark-field correction algorithm will be implemented in an iterative scheme, and a parameter search and an evaluation of the method will be conducted.
References:
[1] Lina Felsner, Martin Berger, Sebastian Kaeppler, Johannes Bopp, Veronika Ludwig, Thomas Weber, Georg Pelzer, Thilo Michel, Andreas Maier, Gisela Anton, and Christian Riess. Phase-sensitive region-of-interest computed tomography. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, pages 137–144, Cham, 2018. Springer International Publishing.
[2] Sebastian Kaeppler, Florian Bayer, Thomas Weber, Andreas Maier, Gisela Anton, Joachim Hornegger, Matthias Beckmann, Peter A. Fasching, Arndt Hartmann, Felix Heindl, Thilo Michel, Gueluemser Oezguel, Georg Pelzer, Claudia Rauh, Jens Rieger, Ruediger Schulz-Wendtland, Michael Uder, David Wachter, Evelyn Wenkel, and Christian Riess. Signal decomposition for x-ray dark-field imaging. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2014, pages 170–177, Cham, 2014. Springer International Publishing.
Helical CT Reconstruction with Bilateral Sinogram/Volume Domain Denoisers
Helical CT is the most commonly used CT scan protocol in clinical practice today. It applies a cone-beam scan along a spiral trajectory over the object to be scanned. The collected sinograms, and subsequently the reconstructed volumes, contain a certain amount of noise due to fluctuations in the line integrals. Removing this noise is necessary for diagnostic image quality.
In previous research, we developed a reinforcement learning-based method to denoise cone-beam CT. It uses denoisers in both the sinogram and the reconstructed image domain: bilateral filters whose sigma parameters are tuned by a convolutional agent. The reconstruction is carried out by the FDK algorithm in the ASTRA toolbox.
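For reference, a brute-force NumPy sketch of such a bilateral filter is shown below; `sigma_s` and `sigma_r` are the spatial and range widths the agent tunes. For simplicity the sketch uses scalar sigmas and np.roll's wrap-around border handling, both simplifying assumptions.

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.1, radius=3):
    """Edge-preserving smoothing: each neighbor is weighted by both its
    spatial distance (sigma_s) and its intensity difference (sigma_r)."""
    acc = np.zeros_like(img)
    norm = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(img, (dy, dx), axis=(0, 1))
            w = (np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                 * np.exp(-((shifted - img) ** 2) / (2 * sigma_r ** 2)))
            acc += w * shifted
            norm += w
    return acc / norm
```

A small sigma_r preserves edges (large intensity jumps get near-zero weight), while sigma_s controls the overall smoothing radius; this is why per-region tuning of the two parameters is worthwhile.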
Due to time constraints, our previous research was limited to the simpler problem of circular cone-beam CT. In this research internship, we hope to extend our method to denoise helical CT as well. Since helical CT uses cone-beam projections, we hope that our method will work out of the box without any retraining.
The following tasks are to be conducted as part of this research internship:
- Develop methods to reconstruct helical CT from the given sinograms, e.g., ADMM or WFBP
- Formulate and train a reinforcement learning task for denoising helical CT in the sinogram and volume domains
- Investigate ways to train without ground-truth volumes in order to obtain image quality better than currently existing methods
- Train current volume-based neural network solutions (GAN-3D, WGAN-VGG, CPCE3D, QAE, etc.) and compare them against our method.
Requirements:
- Knowledge of CT reconstruction techniques
- Understanding of reinforcement learning
- Experience with PyTorch for developing neural networks
- Experience with image processing. Knowledge of the ASTRA toolbox is a plus.
Investigating augmented filtering approaches towards noise removal in low dose CT
Noise removal in clinical CT is necessary to make images clearer and to enhance their diagnostic quality. There are several deep learning techniques designed to remove noise in CT; however, they have a very large number of parameters, making their behavior difficult to comprehend. We attempt to alleviate this problem by using well-understood classical denoising models to remove the noise.
Due to the non-stationary nature of CT noise, the image naturally requires different noise filtering strengths at different locations. One way to achieve this is to tune the filter parameters at each point in the image. Since no ground truth can be established for pixelwise ideal parameter values, this task can be formulated as a reinforcement learning problem that maximizes image quality. Our previous research established such an approach for the joint bilateral filter.
In this thesis, we aim to complete the following tasks:
- Develop a general reinforcement learning framework for parameter tuning problems in medical imaging.
- Experiment with different denoising models such as non-local means, and block matching 3D.
- Experiment with a parameter selection strategy to choose which parameters to include in the learning process
- Study the impact of parameter tuning on denoising, and of the denoising model on the parameter tuning and the overall image quality.
In this thesis, the AAPM Grand Challenge dataset and the Mayo Clinic TCIA dataset will be used. Quality will be measured using PSNR and SSIM, and possibly IRQM.
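For reproducibility, the two standard metrics can be written in a few lines of NumPy. Note that the SSIM below uses a single global window for brevity; a windowed implementation such as skimage.metrics.structural_similarity should be preferred in practice.

```python
import numpy as np

def psnr(x, y, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((x - y) ** 2)
    return float(10 * np.log10(data_range ** 2 / mse))

def ssim_global(x, y, data_range=1.0):
    """SSIM computed over one global window (simplification of the
    windowed formulation, with the standard K1=0.01, K2=0.03 constants)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float((2 * mx * my + c1) * (2 * cov + c2)
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

These metrics would also serve directly as reward components in the reinforcement learning formulation described above.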
Requirements:
- Some knowledge of image processing. Experience with image processing libraries is a plus
- Good knowledge of PyTorch and C++
- Understanding of CT reconstruction and CT noise
- Experience with deep Q learning