Fetal Re-Identification: Deep Learning on Pregnancy Ultrasound Images
Project description
Accurate analysis of ultrasound images during pregnancy is important for monitoring fetal development and detecting abnormalities. Artificial intelligence, and deep learning in particular, can improve both the accuracy and the speed of this analysis [1]. However, there is currently less deep learning research on ultrasound imaging than on MRI or CT [2].
Given its non-invasive nature, lower cost, and lower risk to patients compared to other modalities such as MRI or CT, ultrasound imaging is the most commonly used method to assess fetal development and maternal health [3]. Nevertheless, the correct acquisition of fetal ultrasound data is difficult and time-consuming. Deep learning can help to reduce examiner dependence and improve image analysis as well as maternal-fetal medicine in general [1].
Although literature on fetal ultrasound imaging in conjunction with deep learning exists [4-6], little previous work has investigated fetal re-identification. In multiple pregnancies with fetuses of the same sex, or early in pregnancy, the fetuses cannot be distinguished visually. The fetuses are therefore assigned an order at the physician's discretion (usually based on their position in the mother's womb), although it is not clear whether this order is preserved across subsequent examinations. This information is important because the risk of fetal abnormalities is greater in multiple pregnancies than in singleton pregnancies [7]. In addition, a good depiction of each fetus is also important for the parents' emotional connection to their children [8].
Consequently, the aim of this thesis is an early feasibility investigation of re-identification approaches in fetal ultrasound.
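Since this is an early feasibility study, the concrete method is still open. One natural candidate is metric learning, where a CNN maps ultrasound frames to an embedding space in which frames of the same fetus lie close together. Below is a minimal PyTorch sketch under this assumption; the architecture and all names are purely illustrative, not the thesis' final method.

```python
# Minimal metric-learning sketch for fetal re-identification (hypothetical
# setup): a CNN embeds ultrasound frames, and a triplet loss encourages
# frames of the same fetus to lie close together in embedding space.
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    def __init__(self, emb_dim: int = 128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, emb_dim)

    def forward(self, x):
        z = self.features(x).flatten(1)
        z = self.head(z)
        return nn.functional.normalize(z, dim=1)  # unit-length embeddings

model = EmbeddingNet()
loss_fn = nn.TripletMarginLoss(margin=0.2)
# anchor/positive: two frames of the same fetus; negative: the sibling fetus
anchor, positive, negative = (torch.randn(8, 1, 224, 224) for _ in range(3))
loss = loss_fn(model(anchor), model(positive), model(negative))
loss.backward()
```

At test time, re-identification would then reduce to a nearest-neighbor search in the embedding space across examinations.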
References
[1] J. Weichert, A. Rody, and M. Gembicki. Zukünftige Bildanalyse mit Hilfe automatisierter Algorithmen. Springer Medizin Verlag GmbH, 2020.
[2] Xavier P. Burgos-Artizzu, David Coronado-Gutiérrez, Brenda Valenzuela-Alcaraz, Elisenda Bonet-Carne, Elisenda Eixarch, Fatima Crispi, and Eduard Gratacos. Evaluation of deep convolutional neural networks for automatic classification of common maternal fetal ultrasound planes. Scientific Reports, 2020.
[3] D. Selvathi and R. Chandralekha. Fetal biometric based abnormality detection during prenatal development using deep learning techniques. Springer, 2021.
[4] Jan Weichert, Amrei Welp, Jann Lennard Scharf, Christoph Dracopoulos, Achim Rody, and Michael Gembicki. Künstliche Intelligenz in der pränatalen kardialen Diagnostik. 2021.
[5] Christian F. Baumgartner, Konstantinos Kamnitsas, Jacqueline Matthew, Tara P. Fletcher, Sandra Smith, Lisa M. Koch, Bernhard Kainz, and Daniel Rueckert. SonoNet: Real-Time Detection and Localisation of Fetal Standard Scan Planes in Freehand Ultrasound. IEEE Transactions on Medical Imaging, 2017.
[6] Juan C. Prieto, Hina Shah, Alan J. Rosenbaum, Xiaoning Jiang, Patrick Musonda, Joan T. Price, Elizabeth M. Stringer, Bellington Vwalika, David M. Stamilio, and Jeffrey S. A. Stringer. An automated framework for image classification and segmentation of fetal ultrasound images for gestational age estimation. 2021.
[7] R Townsend and A Khalil. Ultrasound surveillance in twin pregnancy: An update for practitioners. Ultrasound, 2018.
[8] Tejal Singh, Srinivas Rao Kudavelly, and Venkata Suryanarayana. Deep Learning Based Fetal Face Detection And Visualization In Prenatal Ultrasound. 2021.
Projection Domain Metal Segmentation with Epipolar Consistency using Known Operator Learning
Evaluation of a Pixel-wise Regression Model Solving a Segmentation Task and a Deep Learning Model with the Matthews Correlation Coefficient as an Early Stopping Criterion
With global sea levels rising, and mass loss of the polar ice sheets as the main cause, it becomes increasingly important to enhance ice dynamics modeling. A fundamental piece of information for this is the calving front position (CFP) of glaciers. Traditionally, the delineation of the CFP has been done manually, which is a subjective, tedious, and expensive task. Consequently, there has been considerable effort to automate this process. Gourmelon et al. [1] introduce the first publicly available benchmark dataset for calving front delineation on synthetic aperture radar (SAR) imagery, dubbed CaFFe. The dataset consists of the SAR imagery and two corresponding labels: one showing the calving front versus the background, and the other showing different landscape regions. However, this thesis will only consider methods using the former. As there are many different approaches to calving front delineation, the question arises which method performs best. Subsequently, the aim of this thesis is to evaluate the code of the following two papers [2], [3] on the CaFFe benchmark dataset and compare their performance with the baselines provided by Gourmelon et al. [1].
- paper 1: Davari et al. [2] reformulate the segmentation problem as a pixel-wise regression task: a Convolutional Neural Network (CNN) is optimized to predict a distance map that contains, for each pixel, its distance to the calving front; a second U-Net then extracts the glacier calving front line from this distance map.
- paper 2: Davari et al. [3] propose a deep learning model with the Matthews Correlation Coefficient as an early stopping criterion to counter the extreme class imbalance of this problem. Moreover, a distance-map-based binary cross-entropy (BCE) loss function is introduced to add context about the regions that are important for segmentation; a short illustrative sketch of both ingredients is given below.
To make a fair and reasonable comparison, the hyperparameters of each model will be optimized on the CaFFe benchmark dataset, and the model weights will be re-trained on CaFFe's train set. The evaluation will be conducted on the provided test set, and the metrics introduced in Gourmelon et al. [1] will be used for the comparison.
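The following sketch is my own illustration of the two ingredients, not the authors' code: the Matthews Correlation Coefficient on a binary front mask, and a BCE loss weighted by a distance map like the regression target of [2]. The decay scale of the weighting is a hypothetical choice, and scipy is assumed for the distance transform.

```python
# Illustrative sketch (not the papers' code): MCC for a binary front mask,
# and a distance-map-weighted binary cross-entropy.
import numpy as np
from scipy.ndimage import distance_transform_edt

def mcc(pred: np.ndarray, target: np.ndarray) -> float:
    """MCC of two binary masks; robust to the front/background imbalance."""
    tp = np.sum((pred == 1) & (target == 1), dtype=np.float64)
    tn = np.sum((pred == 0) & (target == 0), dtype=np.float64)
    fp = np.sum((pred == 1) & (target == 0), dtype=np.float64)
    fn = np.sum((pred == 0) & (target == 1), dtype=np.float64)
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return float(tp * tn - fp * fn) / denom if denom > 0 else 0.0

def weighted_bce(prob: np.ndarray, target: np.ndarray) -> float:
    """BCE where pixels near the calving front receive higher weight."""
    dist = distance_transform_edt(target == 0)   # distance to the front
    weight = np.exp(-dist / 10.0)                # hypothetical decay scale
    eps = 1e-7
    bce = -(target * np.log(prob + eps) + (1 - target) * np.log(1 - prob + eps))
    return float(np.mean(weight * bce))

# Early stopping: keep the checkpoint with the best validation MCC and stop
# once it has not improved for a fixed number of epochs (patience).
```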
References
[1] Gourmelon, N.; Seehaus, T.; Braun, M.; Maier, A.; and Christlein, V.: Calving Fronts and Where to Find Them: A Benchmark Dataset and Methodology for Automatic Glacier Calving Front Extraction from SAR Imagery, Earth Syst. Sci. Data Discuss. [preprint]. 2022, https://doi.org/10.5194/essd-2022-139, in review.
[2] A. Davari, C. Baller, T. Seehaus, M. Braun, A. Maier and V. Christlein, “Pixelwise Distance Regression for Glacier Calving Front Detection and Segmentation,” in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-10, 2022, Art no. 5224610, doi: 10.1109/TGRS.2022.3158591.
[3] A. Davari et al., “On Mathews Correlation Coefficient and Improved Distance Map Loss for Automatic Glacier Calving Front Segmentation in SAR Imagery,” in IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-12, 2022, Art no. 5213212, doi: 10.1109/TGRS.2021.3115883.
Tomographic Projection Selection with Quantum Annealing
Introduction
The object of interest in computed tomography (CT) is exposed to X-rays from multiple angles. The radiation intensity measured by the detector opposite the radiation source then depends on the object's density. Volumetric information about the object can be reconstructed from many such projection images. Another way to obtain projection data for reconstruction is single-photon emission computed tomography (SPECT), where a radioisotope is injected into the object and the gamma rays emitted by its radioactive decay are measured. Both methods have extensive applications in radiology, but their use is restricted by the harmful ionizing radiation involved, which can damage cells in the human body [1].
There is a strong interest in performing the reconstruction task with a small number of projection images to limit the patient's radiation exposure. Determining an optimal set of angles for projection data acquisition is referred to as projection selection. This set should contain as few angles as possible while still allowing a satisfactory reconstruction of the original object. If some a priori information is available, e.g., in discrete tomography, where the object is known to consist of only a few materials with known densities, it can be used to improve projection selection algorithms. Some of these algorithms are compared in [2]. In particular, simulated annealing (SA) was proposed as a possible method for projection selection.
SA is a minimization method that accepts a worsening of the current solution with some probability based on the slowly decreasing temperature of the system. The annealing process mimics the cooling of a material, which terminates in its lowest-energy state. Because a worsening of the current solution can be accepted, the search is less likely to become "trapped" in a sub-optimal local minimum, as can happen with gradient-descent methods. Quantum annealing (QA) is a related technique, realized for instance with superconducting qubits, in which quantum effects such as superposition, entanglement, and tunneling help traverse the barriers between local minima.
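To make the acceptance rule concrete, here is a generic SA loop in Python. It is an illustration only: for projection selection, the energy function would score a candidate angle set, e.g., via the reconstruction error, following [2].

```python
# Generic simulated-annealing loop (illustrative; the projection-selection
# energy would score the reconstruction quality of an angle set [2]).
import math
import random

def simulated_annealing(init, energy, neighbor, t0=1.0, alpha=0.99, steps=5000):
    x, e = init, energy(init)
    best, best_e = x, e
    t = t0
    for _ in range(steps):
        cand = neighbor(x)
        e_cand = energy(cand)
        # Always accept improvements; accept a worsening with probability
        # exp(-(delta E) / T), which shrinks as the temperature T decreases.
        if e_cand < e or random.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best, best_e = x, e
        t *= alpha  # geometric cooling schedule
    return best, best_e
```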
Methods
Starting from the SA formulation of the projection selection problem proposed in [2], a mathematical formulation as a quadratic unconstrained binary optimization (QUBO) problem will be given. The QUBO formulation can then be used to develop a program for the D-Wave quantum annealer, which will be run using simulation software. The discrete algebraic reconstruction technique (DART) will be used to reconstruct the image from the selected projections. Using the images reconstructed by DART, projection selection with QA can be compared to other projection selection algorithms.
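A QUBO problem asks for a binary vector x minimizing the quadratic form x^T Q x. The toy sketch below shows how such a problem could be handed to a sampler, assuming D-Wave's Ocean package neal; the matrix Q shown here is purely illustrative, since deriving a Q that encodes reconstruction quality is precisely the subject of this thesis.

```python
# Toy QUBO sketch (assumes the D-Wave Ocean package "neal"; the entries of
# Q are made up, not the thesis' actual formulation).
import neal

# x_i = 1 means "acquire a projection at angle i". Diagonal terms reward
# angles individually; off-diagonal terms penalize redundant angle pairs.
Q = {
    (0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0,
    (0, 1):  2.0, (1, 2):  2.0,            # hypothetical redundancy penalty
}

sampler = neal.SimulatedAnnealingSampler()
result = sampler.sample_qubo(Q, num_reads=100)
print(result.first.sample, result.first.energy)
```

On D-Wave hardware, the same QUBO would be submitted through a hardware sampler instead of the simulated one used here.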
Expected Results
Various iterative reconstruction methods will be reviewed. In particular, a Python implementation of the DART algorithm will be provided, as DART can perform an accurate reconstruction even from a small number of projections [3]. Furthermore, the projection selection problem in discrete tomography will be formulated as a QUBO problem. This formulation will be used to evaluate the feasibility of solving the projection selection problem with simulation software and on a D-Wave quantum annealer.
References
[1] A. Maier, S. Steidl, V. Christlein, and J. Hornegger, "Medical imaging systems: An introductory guide," 2018.
[2] L. Varga, P. Balázs, and A. Nagy, "Projection selection algorithms for discrete tomography," in Advanced Concepts for Intelligent Vision Systems (J. Blanc-Talon, D. Bone, W. Philips, D. Popescu, and P. Scheunders, eds.), (Berlin, Heidelberg), pp. 390–401, Springer Berlin Heidelberg, 2010.
[3] K. J. Batenburg and J. Sijbers, "DART: A practical reconstruction algorithm for discrete tomography," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2542–2553, 2011.
Convolutional LSTM for Multi-organ Segmentation on CT and MR Images in Abdominal Region
Automatic Pathological Speech Intelligibility Assessment Using Speech Disentanglement Without Bottleneck Information
Optimizing the Preprocessing Pipeline for “virtual Dynamic Contrast Enhancement” in Breast MRI
Semi-supervised learning for multi-modal bone segmentation
Since AlexNet won the ImageNet Challenge by a wide margin in 2012, the popularity of deep learning has been steadily increasing. A technique that has been especially popular in recent years is semantic segmentation, as it is used in self-driving cars and medical image analysis. A big challenge when training neural networks (NNs) for this task is the acquisition of adequate segmentation masks, because the labeling often has to be performed by domain experts and is very time-consuming. As a result, solutions circumventing this problem had to be found. A popular one is semi-supervised learning, where only part of the data is annotated. This approach has the obvious advantage of reducing the time needed for data acquisition, but NNs trained this way still tend to perform worse than fully supervised ones.
A common disease, affecting one in three women and one in twelve men, is osteoporosis. Its symptoms include low bone mass and a deterioration of bone tissue, leading to an increased fracture risk. The disease especially affects elderly people, and providing diagnostic tools and suitable treatments for their protection is important [1]. Structures that can be found in the bone include lacunae containing osteocytes and trans-cortical vessels (TCVs). The murine and human tibia consists of two parts: the inner trabecular bone and the outer cortical bone, where TCVs can be found. To study them and their importance for the development of osteoporosis, we are trying to automatically segment the cortical bone from the surrounding tissue. Additionally, we will attempt to build an NN for the detection of TCVs and lacunae.
We want to achieve this using a model based on convolutional neural networks (CNNs) for semantic segmentation. Similar tasks have already been performed [2], but our approach differs in that we try to use as few labels as possible for the training process. Methods we want to incorporate are pre-training and the use of image transformations to make the most out of a limited number of segmentation masks. If those approaches do not yield the desired results, we will also try to incorporate techniques of weakly- and self-supervised learning; a small sketch of one common semi-supervised technique follows the outline below.
In detail, the thesis will consist of the following parts:
• implementation of multiple CNN-based architectures [3][4] to find a suitable model for our task,
• optimization of this model using different approaches,
• evaluation of the usefulness of pre-training and different semi-supervised learning techniques,
• integration of different techniques to increase the accuracy.
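As referenced above, here is a minimal sketch of pseudo-labeling, one common semi-supervised technique. It is an illustration under simplified assumptions, not our final training loop; the confidence threshold and loss weight are hypothetical.

```python
# Minimal pseudo-labeling sketch (one common semi-supervised technique;
# an illustration, not the thesis' actual training loop).
import torch
import torch.nn.functional as F

def semi_supervised_step(model, labeled, masks, unlabeled, optimizer,
                         threshold=0.9, unlabeled_weight=0.5):
    model.train()
    logits = model(labeled)                      # (B, C, H, W)
    loss = F.cross_entropy(logits, masks)        # supervised part

    with torch.no_grad():
        probs = torch.softmax(model(unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)          # confidence + pseudo-mask
    logits_u = model(unlabeled)
    # Only confidently pseudo-labeled pixels contribute to the loss.
    loss_u = F.cross_entropy(logits_u, pseudo, reduction="none")
    loss_u = (loss_u * (conf > threshold)).mean()

    total = loss + unlabeled_weight * loss_u
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()
```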
References
[1] S P Tuck and R M Francis. Osteoporosis. Postgraduate Medical Journal, 78(923):526–532, 2002.
[2] Oliver Aust, Mareike Thies, Daniela Weidner, Fabian Wagner, Sabrina Pechmann, Leonid Mill, Darja Andreev, Ippei Miyagawa, Gerhard Krönke, Silke Christiansen, Stefan Uderhardt, Andreas Maier, and Anika Grüneboom. Tibia cortical bone segmentation in micro-CT and X-ray microscopy data using a single neural network. In Klaus Maier-Hein, Thomas M. Deserno, Heinz Handels, Andreas Maier, Christoph Palm, and Thomas Tolxdorff, editors, Bildverarbeitung für die Medizin 2022, pages 333–338, Wiesbaden, 2022. Springer Fachmedien Wiesbaden.
[3] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. CoRR, abs/1505.04597, 2015.
[4] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. CoRR, abs/1411.4038, 2014.
[5] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019.
Evaluation of a Modified U-Net with Dropout and a Multi-Task Model for Glacier Calving Front Segmentation
With global temperatures rising, the tracking and prediction of glacier changes become more and more relevant. Part of these efforts is the development of neural network algorithms to automatically detect the calving fronts of marine-terminating glaciers. Gourmelon et al. [1] introduce the first publicly available benchmark dataset for calving front delineation on synthetic aperture radar (SAR) imagery, dubbed CaFFe. The dataset consists of the SAR imagery and two corresponding labels: one showing the calving front versus the background, and the other showing different landscape regions. Moreover, the paper provides two deep learning models as baselines, one for each label. As there are many different approaches to calving front delineation, the question arises which method performs best. Subsequently, the aim of this thesis is to evaluate the code of the following two papers [2], [3] on the CaFFe benchmark dataset and compare their performance with the baselines provided by Gourmelon et al. [1].
- paper 1:
Mohajerani et al. [2] employ a Convolutional Neural Network (CNN) with a modified U-Net architecture that incorporates additional dropout layers. In contrast to Gourmelon et al. [1], the CNN uses optical imagery as its input.
- paper 2:
Heidler et al. [3] introduce a deep learning model for coastline detection that combines the two tasks of water/land segmentation and binary coastline delineation into one cohesive multi-task deep learning model.
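To make the multi-task idea concrete, the following is a rough sketch of a shared backbone with two task heads whose losses are simply summed. This is my own illustration; the actual HED-UNet architecture and loss weighting in [3] are more elaborate.

```python
# Illustrative multi-task setup in the spirit of [3] (not the HED-UNet
# code): one shared feature extractor feeds a water/land segmentation head
# and a binary coastline (edge) head; the two losses are summed here.
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    def __init__(self, backbone: nn.Module, channels: int):
        super().__init__()
        self.backbone = backbone                    # shared feature extractor
        self.seg_head = nn.Conv2d(channels, 2, 1)   # water vs. land
        self.edge_head = nn.Conv2d(channels, 1, 1)  # binary coastline

    def forward(self, x):
        feats = self.backbone(x)
        return self.seg_head(feats), self.edge_head(feats)

def multitask_loss(seg_logits, edge_logits, seg_target, edge_target):
    seg_loss = F.cross_entropy(seg_logits, seg_target)
    edge_loss = F.binary_cross_entropy_with_logits(edge_logits, edge_target)
    return seg_loss + edge_loss
```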
To make a fair and reasonable comparison, the hyperparameters of each model will be optimized on the CaFFe benchmark dataset and the model weights will be re-trained on CaFFe’s train set. The evaluation will be conducted on the provided test set and the metrics introduced in Gourmelon et al. [1] will be used for the comparison.
References
[1] Gourmelon, N.; Seehaus, T.; Braun, M.; Maier, A.; and Christlein, V.: Calving Fronts and Where to Find Them: A Benchmark Dataset and Methodology for Automatic Glacier Calving Front Extraction from SAR Imagery, Earth Syst. Sci. Data Discuss. [preprint]. 2022, https://doi.org/10.5194/essd-2022-139, in review.
[2] Mohajerani, Y.; Wood, M.; Velicogna, I.; and Rignot, E.: Detection of Glacier Calving Margins with Convolutional Neural Networks: A Case Study, Remote Sens. 2019, 11, 74. https://doi.org/10.3390/rs11010074
[3] Heidler, K.; Mou, L.; Baumhoer, C.; Dietz A.; and Zhu, X.: HED-UNet: Combined Segmentation and Edge Detection for Monitoring the Antarctic Coastline, IEEE Transactions on Geoscience and Remote Sensing. 2022, vol. 60, 1-14, Art no. 4300514, doi: 10.1109/TGRS.2021.3064606.
Animal-Independent Signal Enhancement Using Deep Learning
Examining and segmenting bioacoustic signals is an essential part of biology. For example, by analysing orca calls it is possible to draw several conclusions regarding the animals' communication and social behavior [1]. However, to avoid having to manually go through hours of audio material to detect those calls, the so-called ORCA-SPOT toolkit was developed, which uses deep learning to separate relevant signals from pure ambient sounds [2]. These may nevertheless still contain background noise, which makes the examination rather difficult. To remove this background noise, ORCA-CLEAN was developed. Again using a deep learning approach, by adapting the Noise2Noise concept and using machine-generated binary masks as an additional attention mechanism, the orca calls are denoised as well as possible without requiring clean data as a foundation [3].
But, as mentioned, this toolkit is optimized for the denoising of orca calls. Marine biologists are of course not the only ones who require clean audio signals for their research. Ornithologists alone deal with a great variety of noise: a researcher studying urban bird species wants city sounds filtered from their audio samples, whereas one working with tropical birds wants their recordings free of forest noise. One could argue that almost every biologist who analyses recordings of animal calls would have use for a denoising toolkit.
Another task where audio denoising is of great relevance is the interpretation and processing of human speech. It can be used to improve the sound quality of a phone call or a video conference, to preprocess a voice command to a virtual assistant on a smartphone, to improve voice recognition software, and in many other cases. Even in medicine it can help, for instance when analysing pulmonary auscultative signals, which are key to detecting and evaluating respiratory dysfunctions [4].
It therefore makes sense to generalize ORCA-CLEAN and make it trainable for other animal sounds, perhaps even human speech or body sounds. One would then have a generalized version of ORCA-CLEAN that can be trained according to the desired purpose. The goal of this thesis will be to describe and explain the respective changes in the code, as well as to evaluate how differently trained models perform on audio recordings of different animals. The transfer from a model specialized on orcas to one specialized on another animal species will be demonstrated using recordings of hyraxes. The data used contains tapes of 34 hyrax individuals. For each individual, multiple tapes are available, and for each tape there is a corresponding table with information such as the exact location, the length, the peak frequency, and the call type of each call on the tape.
The hyrax is a small hoofed mammal of the family Procaviidae [5, 6]. Hyraxes usually weigh 4 to 5 kg, are about 30 to 50 cm long, and are mostly herbivorous [5]. Their calls, especially the advertisement calls, are helpful for distinguishing different hyrax species and for analysing the animals' behaviour [6].
Here is a rough outline of how I would realize this thesis. I would begin by modifying the ORCA-CLEAN code. Since orca calls differ considerably from hyrax calls in frequency range as well as in length, the preprocessing of the audio tapes would have to be modified. I would also like to add some more input/output spectrogram variants to the training process.
One could, for example, use pairs of noisy and denoised human speech samples, or a pure noise spectrogram versus a completely empty one. The probability with which each of these variants is chosen could additionally be made configurable.
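A small sketch of how such variant sampling could look is shown below. It is hypothetical: all variant names and keys are made up, and the actual ORCA-CLEAN pipeline differs.

```python
# Hypothetical sketch of sampling training pairs from several input/output
# spectrogram variants with configurable probabilities (all names made up;
# not the actual ORCA-CLEAN code).
import random

def noisy_vs_denoised(sample):   # e.g., human speech pairs
    return sample["noisy"], sample["denoised"]

def noise_vs_empty(sample):      # pure noise mapped to an empty target
    return sample["noise"], sample["empty"]

def orig_vs_masked(sample):      # ORCA-CLEAN-style machine-generated mask
    return sample["spectrogram"], sample["masked"]

VARIANTS = [noisy_vs_denoised, noise_vs_empty, orig_vs_masked]
WEIGHTS = [0.25, 0.25, 0.50]     # made configurable per experiment

def make_training_pair(sample):
    variant = random.choices(VARIANTS, weights=WEIGHTS, k=1)[0]
    return variant(sample)
```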
After that, I would train different models with the hyrax audio tapes, including the original ORCA-CLEAN as well as the newly created adaptations, and evaluate their performance. Since the provided hyrax tapes aren't all equally noisy, they can be sorted by their signal-to-noise ratio (SNR). One can then compare these values before and after denoising, e.g., by correlating them, and check whether the files were denoised correctly or whether relevant parts were removed.
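In the simplest case, the SNR estimate could look like the following sketch. It is illustrative only; how signal and noise regions are identified on real tapes (here, via the annotated call regions from the tables) is itself part of the evaluation.

```python
# Simple SNR estimate (sketch): ratio of signal power within annotated call
# regions to noise power outside them, in dB.
import numpy as np

def snr_db(audio: np.ndarray, call_mask: np.ndarray) -> float:
    signal_power = np.mean(audio[call_mask] ** 2)
    noise_power = np.mean(audio[~call_mask] ** 2)
    return 10.0 * np.log10(signal_power / noise_power)

# Compare SNR before and after denoising, e.g., via their correlation
# across files, to check that calls survive while noise is removed.
```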
With the help of these results, further alterations can be made, for example by changing the probabilities of the training variants or by adapting the hyperparameters of the deep network, until, hopefully, the result is a suitable network that does not require a huge amount of data.
I hope I was able to give some insight into what I imagine the subject to be, and how I would roughly execute it.
Sources
[1] https://lme.tf.fau.de/person/bergler/#collapse_0
[2] C. Bergler, H. Schröter, R. X. Cheng, V. Barth, M. Weber, E. Nöth, H. Hofer, and A. Maier, "ORCA-SPOT: An Automatic Killer Whale Sound Detection Toolkit Using Deep Learning," Scientific Reports, vol. 9, 12 2019.
[3] C. Bergler, M. Schmitt, A. Maier, S. Smeele, V. Barth, and E. Nöth, "ORCA-CLEAN: A Deep Denoising Toolkit for Killer Whale Communication," in Interspeech 2020, pp. 1136–1140. International Speech Communication Association.
[4] F. Jin and F. Sattar, "Enhancement of Recorded Respiratory Sound Using Signal Processing Techniques," in A. Cartelli and M. Palma (Eds.), Encyclopedia of Information Communication Technology, pp. 291–300, 2009.
[5] https://www.britannica.com/animal/hyrax
[6] https://www.wildsolutions.nl/vocal-profiles/hyrax-vocalizations/