Index
Deep Learning-Based Limited Data Glacier Segmentation using Bayesian U-Nets and GANs-based Data Augmentation
The main application of this thesis is the segmentation of glaciers and their calving fronts in synthetic aperture radar (SAR) images. Accurate pixel-wise ground truth for remote sensing images, including SAR images, is scarce and very expensive to generate. Moreover, depending on the application, the regions of interest to be segmented may cover only a small part of the image, introducing a severe class-imbalance problem into the segmentation pipeline. It is well established that supervised learning algorithms suffer from limited training data and class imbalance, the main drawbacks being overfitting and high model uncertainty. In this work, we address this issue with a two-fold approach:
1. Data Augmentation: Data augmentation is a natural approach to tackle the limited and imbalanced data problem in supervised learning-based systems [1]. Generative adversarial networks (GANs) have come a long way and have shown great potential in generating natural-looking images. Recently, they have been used to augment and populate the training set [2, 3, 4, 5, 6]. In this thesis, we are interested in conducting a thorough study on the effect of different GAN variants for data augmentation and for synthetically populating the limited training data, similar to [3, 7, 8].
2. Bayesian U-Net: As already mentioned, limited training data is a bottleneck for supervised learning algorithms. Moreover, the tedious manual labeling of images required to generate the ground truth may introduce inaccuracies. Both problems introduce uncertainty into the model. If we can measure this uncertainty, we can exploit it, e.g., in an active learning loop, to improve the learning process. Bayesian algorithms provide a quantitative estimate of this uncertainty. In the second part of this thesis, we adapt the Bayesian U-Net [10] and/or Bayesian SegNet [11] to our SAR glacier segmentation dataset and compute uncertainty maps for the images.
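The Bayesian U-Net of [10] approximates Bayesian inference with Monte Carlo dropout: dropout stays active at test time, and the spread over repeated stochastic forward passes yields a per-pixel uncertainty map. A minimal sketch, with a random stub standing in for the actual segmentation network:

```python
import numpy as np

def mc_dropout_uncertainty(predict, image, n_samples=20):
    """Monte Carlo dropout: run `n_samples` stochastic forward passes
    (dropout kept active at test time) and return the mean foreground
    probability map plus a per-pixel predictive-entropy uncertainty map."""
    probs = np.stack([predict(image) for _ in range(n_samples)])
    mean = probs.mean(axis=0)
    eps = 1e-12  # guard against log(0)
    entropy = -(mean * np.log(mean + eps)
                + (1 - mean) * np.log(1 - mean + eps))
    return mean, entropy

# stub standing in for a Bayesian U-Net with dropout enabled at inference
rng = np.random.default_rng(0)
stochastic_net = lambda img: np.clip(
    0.7 + 0.1 * rng.standard_normal(img.shape), 0.01, 0.99)
mean_map, unc_map = mc_dropout_uncertainty(stochastic_net, np.zeros((4, 4)))
```

High-entropy pixels mark regions where the network's repeated predictions disagree; these are natural candidates for an active learning query.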
Finally, we compare our results from the sections above (1 and 2) with the state-of-the-art, both quantitatively and qualitatively.
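The augmentation strategy of item 1 can be sketched as follows; `toy_generator` is only a placeholder for whatever trained GAN variant is under study, and the latent dimension of 64 is an arbitrary choice:

```python
import numpy as np

def augment_with_generator(images, labels, generator, n_synthetic, seed=None):
    """Populate a limited training set with synthetic samples.
    `generator` is assumed to map latent vectors of shape
    (n, latent_dim) to image/label pairs, standing in for any
    trained (conditional) GAN."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_synthetic, 64))   # latent codes
    syn_images, syn_labels = generator(z)        # synthesize new pairs
    return (np.concatenate([images, syn_images]),
            np.concatenate([labels, syn_labels]))

# toy generator standing in for a trained GAN: targets the minority class
def toy_generator(z):
    imgs = z.reshape(len(z), 8, 8)               # fake 8x8 "images"
    return imgs, np.ones(len(z), dtype=int)      # all minority-class labels

X = np.zeros((10, 8, 8))
y = np.zeros(10, dtype=int)
X_aug, y_aug = augment_with_generator(X, y, toy_generator, n_synthetic=30)
```

Generating samples predominantly for the under-represented class, as in the conditional setups of [6, 7], addresses the class imbalance directly.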
References:
[1] Davari, AmirAbbas, et al. "Fast and Efficient Limited Data Hyperspectral Remote Sensing Image Classification via GMM-Based Synthetic Samples." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 7, pp. 2107-2120, July 2019, doi: 10.1109/JSTARS.2019.2916495.
[2] Nejati Hatamian, Faezeh, et al. "The Effect of Data Augmentation on Classification of Atrial Fibrillation in Short Single-Lead ECG Signals Using Deep Neural Networks." ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1264-1268, doi: 10.1109/ICASSP40776.2020.9053800.
[3] Neff, Thomas, et al. "Generative adversarial network based synthesis for supervised medical image segmentation." Proc. OAGM and ARW Joint Workshop, 2017.
[4] Neff, Thomas, et al. "Generative Adversarial Networks to Synthetically Augment Data for Deep Learning based Image Segmentation." Proceedings of the OAGM Workshop 2018: Medical Image Analysis. Verlag der Technischen Universität Graz, 2018.
[5] Caballo, Marco, et al. "Deep learning-based segmentation of breast masses in dedicated breast CT imaging: Radiomic feature stability between radiologists and artificial intelligence." Computers in Biology and Medicine 118 (2020): 103629.
[6] Qasim, Ahmad B., et al. "Red-GAN: Attacking class imbalance via conditioned generation. Yet another medical imaging perspective." arXiv preprint arXiv:2004.10734 (2020).
[7] Bailo, Oleksandr, DongShik Ham, and Young Min Shin. "Red blood cell image generation for data augmentation using conditional generative adversarial networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
[8] Pollastri, Federico, et al. "Augmenting data with GANs to segment melanoma skin lesions." Multimedia Tools and Applications (2019): 1-18.
[9] Qin, T., et al. "Automatic Data Augmentation Via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation." ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2020, pp. 1419-1423, doi: 10.1109/ICASSP40776.2020.9053403.
[10] Hiasa, Yuta, et al. "Automated muscle segmentation from clinical CT using Bayesian U-net for personalized musculoskeletal modeling." IEEE Transactions on Medical Imaging (2019).
[11] Kendall, Alex, Vijay Badrinarayanan, and Roberto Cipolla. "Bayesian SegNet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding." arXiv preprint arXiv:1511.02680 (2015).
Investigating augmented filtering approaches towards noise removal in low dose CT
Noise removal in clinical CT is necessary to make images clearer and to enhance their diagnostic quality. Several deep learning techniques have been designed to remove noise in CT; however, they have thousands of parameters, making their behavior difficult to comprehend. We attempt to alleviate this problem by using well-understood classical denoising models to remove the noise.
Due to the non-stationary nature of CT noise, the image naturally requires different noise filtering strengths at different locations. One way to achieve this is to tune the filter parameters at each point in the image. Since no ground truth can be established for pixel-wise ideal parameter values, this task can be formulated as a reinforcement learning task which maximizes image quality. Our previous research established such an approach for the joint bilateral filter.
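A minimal sketch of this formulation, with a toy denoiser and negative MSE standing in for the actual joint bilateral filter and the (possibly no-reference) quality reward; in the RL setting, a learned Q-function would replace the exhaustive action check below:

```python
import numpy as np

def quality(img, reference):
    # negative MSE as a stand-in for the image-quality reward
    return -np.mean((img - reference) ** 2)

def tune_step(noisy, reference, sigmas, denoise, step=0.1):
    """One greedy tuning step of the per-pixel filter-strength map:
    try decreasing / keeping / increasing the strengths and keep the
    choice with the highest quality reward."""
    best, best_q = sigmas, quality(denoise(noisy, sigmas), reference)
    for action in (-step, step):
        cand = np.clip(sigmas + action, 0.0, 1.0)
        q = quality(denoise(noisy, cand), reference)
        if q > best_q:
            best, best_q = cand, q
    return best, best_q

# toy "denoiser": attenuates the image proportionally to the strength map
toy_denoise = lambda img, s: img * (1.0 - s)
noisy, reference = np.ones((4, 4)), np.zeros((4, 4))
sigmas, q = tune_step(noisy, reference, np.zeros((4, 4)), toy_denoise)
```

The same loop structure applies unchanged when `denoise` is non-local means or BM3D and the action space covers their respective parameters.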
In this thesis, we aim to complete the following tasks:
- Develop a general reinforcement learning framework for parameter tuning problems in medical imaging.
- Experiment with different denoising models such as non-local means, and block matching 3D.
- Experiment with a parameter selection strategy to choose which parameters to include in the learning process.
- Study the impact of parameter tuning on denoising, and of the denoising model on the parameter tuning and the overall image quality.
In this thesis, the AAPM Grand Challenge dataset and the Mayo Clinic TCIA dataset will be used. Image quality will be measured using PSNR and SSIM, and possibly IRQM.
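For reference, PSNR has a closed form, and a simplified single-window SSIM illustrates the metric's structure (library implementations such as skimage.metrics.structural_similarity additionally use local sliding windows):

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    return np.inf if mse == 0 else 10 * np.log10(data_range ** 2 / mse)

def global_ssim(ref, test, data_range=1.0):
    """SSIM computed over a single global window; library versions
    average the same formula over local sliding windows."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_x, mu_y = ref.mean(), test.mean()
    var_x, var_y = ref.var(), test.var()
    cov = ((ref - mu_x) * (test - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

ref = np.linspace(0, 1, 16).reshape(4, 4)
```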
Requirements:
- Some knowledge of image processing. Experience with image processing libraries is a plus
- Good knowledge of PyTorch and C++
- Understanding of CT reconstruction and CT noise
- Experience with deep Q learning
End-to-End Gaze Estimation Network for Driver Monitoring
Modelling of Speech Aspects in Parkinson’s Disease by Multitask Deep Learning
Parkinson’s disease is a progressive neurodegenerative disorder with a variety of motor and non-motor symptoms. Although other functions are also affected by the disease, the current evaluation process relies mostly on motor aspects and is often subjective. While speech deficits can be found in the majority of patients, their analysis is still underrepresented in clinical assessment. To increase objectivity and enable long-term monitoring of a patient’s status, several computational methods have been proposed in the literature. Along with the success of deep learning, multitask techniques have received increasing attention in recent years. Hence, this Master’s thesis proposes a multitask neural network-based approach to assess multiple aspects of Parkinsonian speech. The data set comprises recordings from numerous sessions obtained from 94 Parkinson patients and 87 healthy controls. A defined set of statistical features was extracted for each utterance and used as input to the model. The multitask setting was defined with three tasks: the distinction between diseased and healthy speakers, as well as two common Parkinson rating scales, namely the Movement Disorder Society – Unified Parkinson’s Disease Rating Scale and the modified Frenchay Dysarthria Assessment. These tasks were optimized jointly and compared to individually trained networks. To gain a deeper understanding of the influence of each task and the specific recording settings, several experiments with different focuses were conducted. Additionally, the multitask setting was expanded with four additional tasks to exploit the flexibility of this method. The experimental results demonstrate the classification capabilities with accuracy values of 81.73%, 52.45%, and 43.56% for the three respective tasks based on a per-session evaluation. These results improve on the outcomes of individually trained networks by 3 to 16 percentage points. A further comparison against an AdaBoost baseline does not show a clear improvement; however, the proposed model delivers competitive results, especially compared with other neural network approaches. Thus, this work gives new insights into the application of multitask deep learning to Parkinsonian speech and builds the basis for further research in the field.
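The hard-parameter-sharing architecture described above can be sketched as a shared trunk with one head per task; the shapes and the plain linear heads below are illustrative only:

```python
import numpy as np

def multitask_forward(features, W_shared, heads):
    """Shared trunk + one linear head per task (e.g. diagnosis,
    MDS-UPDRS score, modified FDA score). Hard parameter sharing:
    all tasks reuse the same hidden representation, so their losses
    jointly shape the trunk weights."""
    hidden = np.maximum(0, features @ W_shared)   # ReLU trunk
    return {name: hidden @ W for name, W in heads.items()}

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 30))                  # 5 utterances, 30 features
W = rng.standard_normal((30, 16))                 # shared trunk weights
heads = {"diagnosis": rng.standard_normal((16, 2)),
         "updrs": rng.standard_normal((16, 1)),
         "fda": rng.standard_normal((16, 1))}
outs = multitask_forward(x, W, heads)
```

During training the per-task losses would typically be combined as a weighted sum before backpropagation through the shared trunk.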
Deep-learning-based MR image denoising considering noise maps as supplementary input
Introduction
In magnetic resonance (MR) images, noise is a common issue which can degrade image quality and reduce clinical value. The signal-to-noise ratio (SNR) of an image increases with factors such as the magnetic field strength or the scan acquisition time, but increasing those makes the examination more expensive. Therefore, especially for low-field MR imaging, denoising techniques can be used to improve the SNR and thus increase the diagnostic value of the resulting images. The aim of this thesis is to implement a deep-learning-based denoising approach which operates on reconstructed MR images, using corresponding noise maps as supplementary input.
Methods and data
The data for this work is based on internal sources of Siemens Healthineers. There are around 10,000
2-D slice images of 862 studies available which were acquired with 1.5 T or 3 T MRI scanners.
Corresponding noise maps, i.e. spatially resolved image maps showing the standard deviation of
the underlying image noise, were calculated from the image data. To simulate lower field strengths,
synthetic noise will be added to the available image data.
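The field-strength simulation can be sketched as follows. Since independent noise adds in quadrature, scaling the total noise level by a factor k requires adding noise with standard deviation sqrt(k^2 - 1) times the measured noise map; the function name and the scaling parameter are illustrative:

```python
import numpy as np

def add_synthetic_noise(image, noise_map, noise_scale, seed=None):
    """Simulate a lower field strength by adding Gaussian noise whose
    per-pixel standard deviation follows the measured noise map.
    Independent noise adds in quadrature, so the total noise std
    becomes noise_scale x the original (noise_scale=1 adds nothing)."""
    rng = np.random.default_rng(seed)
    extra_std = noise_map * np.sqrt(noise_scale ** 2 - 1.0)
    return image + rng.standard_normal(image.shape) * extra_std

img = np.zeros((4, 4))
nmap = np.full((4, 4), 0.05)
same = add_synthetic_noise(img, nmap, 1.0, seed=0)     # unchanged
noisier = add_synthetic_noise(img, nmap, 2.0, seed=0)  # SNR roughly halved
```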
In general, supervised deep learning methods are more straightforward than unsupervised methods, but good ground truth data (i.e., noise-free images) is often hard to obtain for medical imaging applications. Metzler et al. [1] proposed using Stein's unbiased risk estimator (SURE) to train convolutional neural networks for image denoising without any ground truth data. They have shown that SURE can be applied to compute the mean-squared error loss associated with an estimate of the noiseless ground truth image under the assumption that the noise is normally distributed. Zhussip et al. [2] applied SURE for unsupervised training of image recovery and simultaneous denoising with undersampled compressed sensing measurements.
The goal of this thesis is to adapt a neural network for denoising using SURE loss and investigate the
benefits of including the noise map of an MR image as supplementary input. This approach will be
compared with standard supervised and unsupervised methods, such as Noise2Void [3] and Noise2Self
[4], which require nothing but the noisy data as input. For the supervised approach, the original 3 T
images might be used as ground truth and the images with added, synthetic noise to simulate lower
field strengths as input data.
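A minimal Monte Carlo SURE sketch under the i.i.d. Gaussian noise assumption of [1], with the divergence term estimated by a single random probe; in practice this would be a differentiable loss on network outputs rather than a NumPy function:

```python
import numpy as np

def sure_loss(denoiser, y, sigma, eps=1e-3, seed=None):
    """Monte Carlo SURE for a denoiser f under i.i.d. Gaussian noise
    of std `sigma`; needs no ground-truth image. The divergence
    div f(y) is estimated with a single random probe b:
    div f(y) ~= b . (f(y + eps*b) - f(y)) / eps."""
    rng = np.random.default_rng(seed)
    n = y.size
    fy = denoiser(y)
    b = rng.standard_normal(y.shape)
    div = (b * (denoiser(y + eps * b) - fy)).sum() / eps
    # SURE = MSE-to-input - sigma^2 + (2 sigma^2 / n) * divergence
    return ((y - fy) ** 2).sum() / n - sigma ** 2 + 2 * sigma ** 2 * div / n

# sanity check with the trivial zero-denoiser on a zero image:
loss = sure_loss(lambda v: np.zeros_like(v), np.zeros((8, 8)), sigma=0.5)
```

Minimizing this quantity in expectation is equivalent to minimizing the MSE against the unknown noiseless image, which is what makes ground-truth-free training possible.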
Evaluation
The following aspects will be evaluated:
- Different neural network architectures will be implemented and compared w.r.t. their denoising performance.
- The SURE-based approach will be compared to other proposed unsupervised or supervised deep learning methods (e.g., Noise2Void, Noise2Self) as well as conventional denoising algorithms.
- An extended evaluation of the network's performance will be conducted on unseen image data.
References
[1] C. Metzler, A. Mousavi, R. Heckel, and R.G. Baraniuk. Unsupervised learning with Stein's unbiased risk estimator. arXiv:1805.10531, 2020.
[2] M. Zhussip, S. Soltanayev, and S.Y. Chun. Training deep learning based image denoisers from undersampled measurements without ground truth and without image prior. CVPR, pages 10255-10264, 2019.
[3] A. Krull, T.O. Buchholz, and F. Jug. Noise2Void - learning denoising from single noisy images. CVPR, pages 2129-2137, 2019.
[4] J. Batson and L. Royer. Noise2Self: Blind denoising by self-supervision. PMLR, 97:524-533, 2019.
Mobile 3D-Shape Estimation in Telemedical Dermatologic Diagnosis and Documentation
Abstract: Imaging techniques for dermatology range from mobile two-dimensional (2D) RGB images to professional clinical imaging systems that measure three-dimensional (3D) information, even beyond the epidermis, using techniques like 'optical coherence tomography' or laser-based scanners. Nevertheless, existing technologies fail to provide a mobile and precise 3D imaging system for structural topographic depth measurements using only the hardware of a smartphone or tablet. Previously, many approaches using 'structure from motion' or 'structured light' have been explored for mobile wound documentation and measurement. However, to perform more comprehensive 3D scanning beyond large chronic wounds, I want to investigate the use of mobile 'phase-measuring deflectometry' for the dermatologic evaluation of skin, in order to find potential biomarkers that accompany a treatment or diagnosis, based on a mobile app for both medical personnel and untrained patients. During my master's thesis I want to work on potential hardware adjustments for dermatologic use, perform experiments for data acquisition, and design algorithms to process and possibly classify the obtained data. Finally, I want to develop a fully functional application to create and process 3D images, build up a database, and enable a communication platform between doctor and patient. My master's thesis will be jointly conducted by my home university, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), and Northwestern University (NU).
Thrombus Detection in Non-Contrast Head CT using Graph Deep Learning
Thesis Description
Stroke is a severe cerebrovascular disease and one of the major causes of death and disability worldwide [1]. For patients suffering from acute stroke, rapid diagnosis and immediate execution of therapeutic measures are crucial for a successful recovery. In clinical routine, Non-Contrast Computed Tomography (NCCT) is typically acquired as a first-line imaging tool to identify the type of the stroke. In case of an acute ischemic infarct, appropriate therapy planning requires an accurate detection and localization of the occluding blood clot. An automated detection system would decrease the probability of missing an obstruction, save time, and improve the overall clinical outcome.
Several methods have been proposed to detect large vessel occlusion (LVO) using enhanced CT data like CT angiography (CTA) [2, 3, 4]. CTA is mainly used in addition to NCCT and enables accurate evaluation of the occlusion [5]. Nevertheless, studies have shown that the thrombus which causes the occlusion can be detected in NCCT images due to its abnormally high-density structure [6]. Classification from NCCT data can be achieved by using Convolutional Neural Networks (CNNs) [7]. However, LVOs account for only 24% to 46% of acute ischemic strokes [8]. Recent approaches for automated intracranial thrombus detection in NCCT are based on Random Forest classification or CNNs [9, 10]. The results are promising, but further improvement is required to ensure utility in clinical routine.
This thesis aims to achieve higher reliability in detecting the thrombus on NCCT data, assuming clot localization in the entire cerebrovascular system. More specifically, the goal is to build and improve upon an existing detection model which applies a 2D U-Net to the slices of a volumetric dataset, consisting of multiple channels that had been extracted from the raw CT dataset. The locations of the 15 local maxima with the highest probability in the resulting prediction map are used as potential candidates for the final prediction of the thrombus location. The model to be developed shall classify each candidate (as clot / no clot) while comprehensively considering all candidates found in the patient as well as corresponding regions on the opposite hemisphere, as this is considered crucial context for the decision. To this end, a region of interest is extracted around each candidate position and its opposite position obtained by mirroring at the brain mid-plane. Each such region is considered a node and connected with others to form a graph that describes all regions of interest in a patient. As such, the problem is formulated as a (partial) node classification task, and graph neural network models will be investigated to solve it.
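The graph construction described above can be sketched as follows; the fully connected topology and the assumption that the mid-sagittal plane lies at half the volume width are illustrative choices, since investigating alternative graph structures is itself one of the thesis work items:

```python
import numpy as np

def build_candidate_graph(candidates, width):
    """Build a graph over detected clot candidates. Each node carries
    its position and the mirrored position (reflection at an assumed
    mid-sagittal plane at x = width / 2); here every candidate is
    connected to every other one (no self-loops)."""
    nodes = [{"pos": (x, y, z), "mirror": (width - 1 - x, y, z)}
             for (x, y, z) in candidates]
    n = len(nodes)
    adj = np.ones((n, n), dtype=int) - np.eye(n, dtype=int)
    return nodes, adj

# e.g. three candidate maxima in a volume of width 64
nodes, adj = build_candidate_graph([(10, 5, 2), (30, 7, 3), (50, 9, 1)],
                                   width=64)
```

A graph neural network would then propagate information along `adj` so that each candidate's clot / no-clot decision is informed by all other candidates and their mirrored regions.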
In summary, this thesis will comprise the following work items:
- Literature research of state-of-the-art methods for automated thrombus detection
- Extraction of suitable regions of interest based on previously detected clot candidates
- Design and implementation of a (graph) neural network architecture for joint classification of all clot candidates in a patient
- Investigation of multiple graph structures and model architectures
- Optimization and evaluation of the deep learning model

(Master's thesis of Antonia Popp)
References
[1] Walter Johnson, Oyere Onuma, Mayowa Owolabi, and Sonal Sachdev. Stroke: a global response is needed. Bulletin of the World Health Organization, 94:634-634A, 2016.
[2] Sunil A. Sheth, Victor Lopez-Rivera, Arko Barman, James C. Grotta, Albert J. Yoo, Songmi Lee, Mehmet E. Inam, Sean I. Savitz, and Luca Giancardo. Machine learning-enabled automated determination of acute ischemic core from computed tomography angiography. Stroke, 50(11):3093-3100, 2019.
[3] Matthew T. Stib, Justin Vasquez, Mary P. Dong, Yun Ho Kim, Sumera S. Subzwari, Harold J. Triedman, Amy Wang, Hsin-Lei Charlene Wang, Anthony D. Yao, Mahesh Jayaraman, Jerrold L. Boxerman, Carsten Eickhoff, Ugur Cetintemel, Grayson L. Baird, and Ryan A. McTaggart. Detecting large vessel occlusion at multiphase CT angiography by using a deep convolutional neural network. Radiology, page 200334, 2020.
[4] Midas Meijs, Frederick J. A. Meijer, Mathias Prokop, Bram van Ginneken, and Rashindra Manniesing.
Image-level detection of arterial occlusions in 4D-CTA of acute stroke patients using deep learning. Medical
image analysis, 66:101810, 2020.
[5] Michael Knauth, Rüdiger von Kummer, Olav Jansen, Stefan Hähnel, Arnd Dörfler, and Klaus Sartor. Potential of CT angiography in acute ischemic stroke. American Journal of Neuroradiology, 18(6):1001-1010, 1997.
[6] G. Gács, A. J. Fox, H. J. Barnett, and F. Viñuela. CT visualization of intracranial arterial thromboembolism. Stroke, 14(5):756-762, 1983.
[7] Manon L. Tolhuisen, Elena Ponomareva, Anne M. M. Boers, Ivo G. H. Jansen, Miou S. Koopman, Renan
Sales Barros, Olvert A. Berkhemer, Wim H. van Zwam, Aad van der Lugt, Charles B. L. M. Majoie,
and Henk A. Marquering. A convolutional neural network for anterior intra-arterial thrombus detection
and segmentation on non-contrast computed tomography of patients with acute ischemic stroke. Applied
Sciences, 10(14):4861, 2020.
[8] Robert C. Rennert, Arvin R. Wali, Jeffrey A. Steinberg, David R. Santiago-Dieppa, Scott E. Olson, J. Scott Pannell, and Alexander A. Khalessi. Epidemiology, natural history, and clinical presentation of large vessel ischemic stroke. Neurosurgery, 85(suppl 1):S4-S8, 2019.
[9] Patrick Lober, Bernhard Stimpel, Christopher Syben, Andreas Maier, Hendrik Ditt, Peter Schramm, Boy
Raczkowski, and Andre Kemmling. Automatic thrombus detection in non-enhanced computed tomography
images in patients with acute ischemic stroke. Visual Computing for Biology and Medicine, 2017.
[10] Aneta Lisowska, Erin Beveridge, Keith Muir, and Ian Poole. Thrombus detection in CT brain scans using a convolutional neural network. In Margarida Silveira, Ana Fred, Hugo Gamboa, and Mário Vaz, editors, Bioimaging, BIOSTEC 2017, pages 24-33. SCITEPRESS - Science and Technology Publications Lda, Setúbal, 2017.
Deep Learning-based motion correction of free-breathing diffusion-weighted imaging in the abdomen
Since diffusion is particularly disturbed in tissues with high cell densities such as tumors, diffusion-weighted imaging (DWI) constitutes an essential tool for the detection and characterization of lesions in modern MRI-based diagnostics. However, despite the great influence and frequent use of DWI, the image quality obtained is still variable, which can lead to false diagnoses or costly follow-up examinations.
A common way to increase the signal-to-noise ratio (SNR) in MR imaging is to repeat the acquisition several times, i.e., to use a higher number of excitations (NEX). The final image is then calculated by ordinary averaging. While the single images are relatively unaffected by bulk motion due to the short acquisition time, relative motion between the excitations and subsequent averaging will lead to motion blurring in the final image. One way to mitigate this is to perform prospective gating (also known as triggering) using a respiratory signal. However, triggered acquisitions come at the cost of significantly increased scan time. Retrospective gating (also known as binning) constitutes an alternative approach in which data is acquired continuously and subsequently assigned to discrete motion states. The drawback of this approach is that there is no guarantee that data is collected for a given slice within the target motion state. In previous works, the mapping of images from other motion states onto the target motion state was achieved by using a motion model derived from an additional navigator acquisition.
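Retrospective gating as described can be sketched with simple amplitude binning of the respiratory signal; the number of bins and the amplitude-based state definition are illustrative:

```python
import numpy as np

def bin_by_respiration(resp_signal, n_bins):
    """Retrospective gating: assign each continuously acquired
    excitation to a discrete motion state by amplitude-binning
    the respiratory signal."""
    edges = np.linspace(resp_signal.min(), resp_signal.max(), n_bins + 1)
    return np.clip(np.digitize(resp_signal, edges) - 1, 0, n_bins - 1)

def average_target_state(images, bins, target):
    """Average only the excitations in the target motion state;
    returns None when no data was collected for that state, which
    is exactly the failure case motivating the registration network."""
    sel = images[bins == target]
    return None if len(sel) == 0 else sel.mean(axis=0)

resp = np.array([0.0, 0.1, 0.5, 0.9, 1.0])   # respiratory amplitude per NEX
bins = bin_by_respiration(resp, n_bins=2)
images = np.arange(10.0).reshape(5, 2)        # toy stand-in for DW images
```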
In recent years, deep learning has shown great potential in the field of MRI in a wide variety of applications. The goal of this thesis is the development of a deep learning-based algorithm which performs navigator-free registration of DW images given a respiratory signal only. Missing data for certain motion states as well as the inherently low SNR of DW images constitute the main challenges of this work. Successful completion of this work promises significant improvements in image quality for diffusion-weighted imaging in motion-sensitive body regions such as the abdomen.
Deep Learning-based Pitch Estimation and Comb Filter Construction
Typically, a clean speech signal consists of two components: a locally periodic component and a stochastic component. If a speech signal contains only a stochastic component, the difference between the clean signal and the enhanced signal obtained by applying the corresponding ideal ratio mask is barely perceivable. However, if the speech contains a strongly periodic component, the enhanced signal obtained with the corresponding ideal ratio mask is still affected by inter-harmonic noise.
A comb filter based on the speech signal's pitch period can attenuate the noise between the pitch harmonics. Thus, a robust pitch estimate is of fundamental importance. In this work, a deep learning-based method for robust pitch estimation in noisy environments will be investigated.
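As a baseline illustration, a classical autocorrelation pitch estimate and the comb filter built from it can be sketched as follows; the frame length, search range, and feed-forward comb gain are illustrative choices:

```python
import numpy as np

def estimate_pitch_period(frame, fs, fmin=60.0, fmax=400.0):
    """Autocorrelation pitch estimate (the classical baseline a learned
    estimator would replace): pick the lag with the strongest
    autocorrelation inside the plausible pitch range."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    return lo + int(np.argmax(ac[lo:hi]))

def comb_filter(x, period, gain=0.8):
    """Feed-forward comb filter y[n] = (x[n] + g*x[n-T0]) / (1+g):
    reinforces the pitch harmonics, attenuates inter-harmonic noise."""
    y = x.copy()
    y[period:] = (x[period:] + gain * x[:-period]) / (1 + gain)
    return y

fs = 8000
t = np.arange(1024) / fs
frame = np.sin(2 * np.pi * 100 * t)        # 100 Hz "voiced" frame
period = estimate_pitch_period(frame, fs)  # expected near fs / 100 = 80
filtered = comb_filter(frame, period)
```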