Index
Cephalometric Landmark Re-annotation and Automatic Detection
Solution to Extend the Field of View of Computed Tomography Using Deep Learning Approaches
Incorporating Time Series Information into Glacier Segmentation and Front Detection using U-Nets in Combination with LSTMs and Multi-Task Learning
This thesis aims at integrating time series information into the static segmentation of glaciers and their
calving fronts in synthetic aperture radar (SAR) image sequences. U-Nets have recently been shown
to provide promising results for glacier (front) segmentation using SAR
imagery [1]. However, this approach only incorporates the spatial information of a single image. The
temporal information of complete image sequences, each showing one glacier at different points in time,
has not been addressed thus far. To fill this gap, two approaches shall be worked on:
- Approach 1: using Long Short-Term Memory (LSTM) layers in the U-Net architecture.
Recurrent Neural Networks such as LSTMs are designed so that information from previous
inputs in a sequence can be stored in a memory and used to improve the prediction for
the current input. The combination of structured LSTMs and Fully Convolutional Networks
(FCNs) showed promising results for joint 4D segmentation of longitudinal MRI [2]. In [3], a
U-Net was successfully combined with a bi-directional convolutional LSTM for aortic image
sequence segmentation, outperforming a plain U-Net in segmentation accuracy. In this thesis,
the combination of LSTMs and U-Nets will be tested for glacier segmentation and calving
front detection in SAR image sequences. Moreover, the use of simple Recurrent layers (RNNs), Gated
Recurrent Units (GRUs), and bi-directional LSTMs instead of plain LSTMs shall be investigated as well.
- Approach 2: Multi-Task Learning (MTL). As the region to be segmented for calving front
detection is only a small part of the image, this task exhibits a severe class imbalance. To improve its
performance, an MTL approach shall be implemented that jointly trains glacier segmentation and
calving front detection. Performance enhancements of U-Nets have been observed using stacking
[4] and shared encoding networks [5, 6]. In this thesis, both MTL techniques shall be tested
using U-Nets in combination with LSTMs (see Approach 1).
The resulting models will be compared quantitatively and qualitatively with the state-of-the-art and
shall be implemented in Keras.
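The memory mechanism that Approach 1 relies on can be illustrated with a minimal NumPy LSTM cell; this is a sketch with hypothetical dimensions, not the Keras layers the thesis itself will use:

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates are computed from the input x and the previous hidden state h."""
    z = W @ x + U @ h + b                     # stacked pre-activations for all four gates
    n = h.size
    i = 1 / (1 + np.exp(-z[:n]))              # input gate
    f = 1 / (1 + np.exp(-z[n:2 * n]))         # forget gate
    o = 1 / (1 + np.exp(-z[2 * n:3 * n]))     # output gate
    g = np.tanh(z[3 * n:])                    # candidate cell update
    c = f * c + i * g                         # memory carries info from earlier inputs
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid, seq_len = 8, 4, 5                # hypothetical sizes
W = rng.normal(size=(4 * n_hid, n_in)) * 0.1
U = rng.normal(size=(4 * n_hid, n_hid)) * 0.1
b = np.zeros(4 * n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(seq_len):                      # feed a whole sequence; state carries over
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape)  # (4,)
```

In the thesis, such a recurrence would sit inside the U-Net (e.g. as a convolutional LSTM at the bottleneck) so that each prediction can exploit the previous SAR acquisitions of the same glacier.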
[1] Zhang et al. “Automatically delineating the calving front of Jakobshavn Isbræ from multitemporal
TerraSAR-X images: a deep learning approach.” The Cryosphere 13, no. 6 (2019): 1729-1741.
[2] Gao et al. “Fully convolutional structured LSTM networks for joint 4D medical image segmentation.”
In: IEEE 15th International Symposium on Biomedical Imaging, Washington, DC, 2018, IEEE, pp.
1104-1108.
[3] Bai et al. “Recurrent Neural Networks for Aortic Image Sequence Segmentation with Sparse Annotations.”
In Alejandro F. Frangi, Julia A. Schnabel, Christos Davatzikos, Carlos Alberola-López, Gabor Fichtinger
(Eds.): Medical Image Computing and Computer Assisted Intervention – MICCAI, 2018, pp. 586-594.
[4] Sun et al. “Stacked U-Nets with Multi-Output for Road Extraction.” In: CVPR Workshops, Salt Lake
City, 2018, pp. 202-206.
[5] Ke et al. “Learning to segment microscopy images with lazy labels.” In: ECCV Workshop on BioImage
Computing, 2020.
[6] Lee et al. “Multi-Task Learning U-Net for Single-Channel Speech Enhancement and Mask-Based Voice
Activity Detection.” Applied Sciences 10, no. 9 (2020): p. 3230.
torchsense – a PyTorch-based Compressed Sensing reconstruction framework for dynamic MRI
In this master thesis, a novel deep learning-based reconstruction method specifically tailored to cardiac radial cine MRI image sequences is investigated. Despite the many advantages of state-of-the-art unrolled networks, their applicability is limited by the integration of the forward operator into the scheme, which poses a computational challenge within the scope of dynamic non-Cartesian MRI. The novelty of our algorithm lies in decoupling regularization and data-consistency enforcement into two separate steps that can be combined into an end-to-end reconstruction scheme, which reduces the number of forward-operator evaluations and thereby offers more flexibility. In contrast to unrolled networks, the regularization step will be realized by a lightweight denoising CNN, in some cases leading to a closed-form solution of the data-consistency step.
Utilizing this flexibility (e.g., a variable network length at test time), we will seek to increase the undersampling ratio of the k-space, thereby allowing a higher temporal resolution with an existing acquisition scheme.
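For the Cartesian single-coil case (a deliberate simplification of the radial setting targeted here), the alternation of a denoising step and a closed-form data-consistency step can be sketched in NumPy; the CNN is replaced by a placeholder smoothing step, and all sizes are hypothetical:

```python
import numpy as np

def data_consistency(z, y, mask, lam):
    """Closed-form data-consistency step for a binary Cartesian sampling mask:
    per k-space frequency, solves min_X |mask*X - y|^2 + lam*|X - F(z)|^2."""
    Z = np.fft.fft2(z)
    X = (mask * y + lam * Z) / (mask + lam)   # blend measurement and denoised estimate
    return np.fft.ifft2(X)

def toy_denoiser(x):
    """Placeholder for the lightweight denoising CNN: simple neighbor averaging."""
    return 0.25 * (np.roll(x, 1, 0) + np.roll(x, -1, 0)
                   + np.roll(x, 1, 1) + np.roll(x, -1, 1))

rng = np.random.default_rng(0)
img = rng.normal(size=(32, 32))                    # hypothetical ground-truth frame
mask = (rng.random((32, 32)) < 0.3).astype(float)  # 30% k-space undersampling
y = mask * np.fft.fft2(img)                        # undersampled measurements

x = np.fft.ifft2(y).real                           # zero-filled initialization
for _ in range(10):                                # denoise / enforce consistency
    x = data_consistency(toy_denoiser(x), y, mask, lam=0.1).real
```

Note how the forward operator (here a masked FFT) only enters the data-consistency step, so the number of iterations can be varied at test time without retraining the denoiser.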
Automatic segmentation of whole heart
Congenital diseases (CD) are defects that are present in newborn babies. Neural tube defects, craniofacial
anomalies, and congenital heart diseases (CHD) are some of them; among these, congenital heart
diseases are the most common type of anomaly, affecting 4 to 50 per 1000 infants depending on
differences in demographic characteristics and study conditions [1].
Medical image segmentation is one of the most important parts of planning the treatment
of patients with CHD. Image segmentation techniques aim to detect boundaries within a 2D or 3D
image and partition the image into meaningful parts based on pixel-level information, e.g. intensity
values, and spatial information, e.g. anatomical knowledge [3]. However, the segmentation of a single 3D
medical image may take several hours. In addition, the complexity of the images and the fact that
understanding them requires medical expertise make them costly to annotate, which makes an
automatic segmentation framework crucial.
Previously, an interactive segmentation method was proposed for this purpose [2]. This master
thesis aims to reduce the manual interaction of the users by investigating different machine learning
approaches to find a highly accurate model that could potentially replace the interactive solution.
The thesis has to comprise the following work items:
• Literature overview of state-of-the-art segmentation methods, particularly deep learning methods,
for 3D medical images.
• Implementation and training of different deep learning segmentation models.
• Evaluation of the trained models based on the Dice score and comparison to the previous interactive
approaches.
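The Dice score used for evaluation measures the overlap between a predicted and a ground-truth mask; a minimal NumPy version (function name hypothetical) could look like this:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient: 2|P ∩ T| / (|P| + |T|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# perfect overlap yields 1.0, disjoint masks yield ~0.0
a = np.zeros((4, 4), dtype=bool)
a[1:3, 1:3] = True
print(round(dice_score(a, a), 4))  # 1.0
```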
References
[1] Manuel Giraldo-Grueso, Ignacio Zarante, Alejandro Mejía-Grueso, and Gloria Gracia. Risk factors
for congenital heart disease: A case-control study. Revista Colombiana de Cardiología, 27(4):324–
329, 2020.
[2] Danielle F Pace. Image segmentation for highly variable anatomy: applications to congenital heart
disease. PhD thesis, Massachusetts Institute of Technology, 2020.
[3] Felix Renard, Soulaimane Guedria, Noel De Palma, and Nicolas Vuillerme. Variability and repro-
ducibility in deep learning for medical image segmentation. Scientific Reports, 10(1):1–16, 2020.
Automatic Bird Individual Recognition in Multi-Channel Recording Scenarios
Problem background:
At the Max Planck Institute for Ornithology in Radolfzell, several birds are equipped with
backpacks to record their calls. However, not only the sound of the equipped bird is recorded,
but also that of the birds in its surroundings; as a result, the scientists receive several non-synchronous
audio tracks with bird calls. The biologists have to match the calls to the individual
birds manually, which is time-consuming and error-prone.
Goal of the thesis:
The goal of this thesis is to implement a Python framework that can assign the calls to the
corresponding birds.
Since the intensity of a call falls off with the square of the distance, the loudest call can be
matched to the bird carrying that recorder. Also, the call of that bird appears earlier on
its own recording device than on the other devices.
To assign the remaining calls to the other birds, the audio tracks must be compared by
overlaying the audio signals. For this purpose, the audio signals have to be modified first:
since different devices are used for capturing the data and because the recordings cannot be started
at the same time, a linear time offset between the recordings occurs. In addition, a linear distortion
appears because the devices record at slightly different sampling frequencies.
To remove these inconsistencies, similar characteristics must be found in the audio signals, and
the audio tracks then have to be shifted and processed until these characteristics align.
There are several methods to extract such characteristics, whereby the most precise
methods require human assistance [1]. However, there are also automated approaches in which
the audio track is scanned for periodic signal parameters such as pitch or spectral flatness.
Effective features are essential for the removal of the distortion, as is a good ability of the
algorithm to distinguish between minor similarities of the characteristics [2].
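A common starting point for the shifting step is estimating the constant time offset between two tracks from the peak of their cross-correlation; a NumPy sketch with synthetic signals (all names and sizes hypothetical):

```python
import numpy as np

def estimate_offset(a, b):
    """Estimate the lag (in samples) by which track b trails track a,
    using the peak of the full cross-correlation."""
    corr = np.correlate(b, a, mode="full")
    return int(np.argmax(corr)) - (len(a) - 1)

rng = np.random.default_rng(0)
call = rng.normal(size=200)                    # a shared bird call
track_a = np.concatenate([call, np.zeros(300)])
track_b = np.concatenate([np.zeros(120), call, np.zeros(180)])  # same call, 120 samples later

lag = estimate_offset(track_a, track_b)
print(lag)  # 120
```

In practice, the correlation would be computed on extracted features (e.g. spectral flatness over time) rather than raw samples, and the sampling-rate mismatch would have to be corrected first.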
The framework will be implemented in Python. It should process the given audio tracks and
recognize and reject disturbed channels.
References:
[1] Brett G. Crockett and Michael J. Smithers. Method for time aligning audio signals using
characterizations based on auditory events, 2002.
[2] Jürgen Herre, Eric Allamanche, and Oliver Hellmuth. Robust matching of audio signals using
spectral flatness features, 2002.
Height Estimation for Patient Tables from Computed Tomography Data
Image Segmentation via Transformers
The recent surge of Transformers started after they outperformed previously known state-of-
the-art approaches such as long short-term memory and gated recurrent neural networks in sequence
modelling and transduction problems such as language modelling and machine translation. Transformers
avoid recurrence and instead rely entirely on an attention mechanism to draw global dependencies
between input and output [1]. Furthermore, Transformers are now being incorporated and tested
in computer vision tasks such as classification [2], detection [3], segmentation [4], and
generative adversarial networks (GANs) [5] by treating image patches as a sequence.
The Transformer architecture was successfully used for object detection, which made it possible to drop
many hand-designed components, such as non-maximum suppression or anchor generation,
that explicitly encode prior knowledge about the task. Subsequently, it was extended to panoptic
segmentation. However, Transformers used for segmentation have not exploited this sequence potential
alone, but typically still used some form of Convolutional Neural Network (CNN) along with it.
Jiang et al. have proposed a pure Transformer-based model in a GAN setting (TransGAN)
for image generation, demonstrating the possibility of dropping CNNs from GANs [5].
In this work, the idea of using image patches as a sequence input to a Transformer model without
CNNs is carried out for segmentation tasks.
The thesis consists of the following milestones:
- Modifying the TransGAN discriminator and generator to serve as encoder and decoder, respectively, for
segmentation.
- Evaluating the performance on the Cityscapes dataset [6].
- Further experiments and improvements regarding learning and network architecture.
The implementation should be done in PyTorch Lightning.
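The core idea of treating image patches as a sequence can be sketched in a few lines of NumPy (patch size and image dimensions are hypothetical; the thesis itself will work in PyTorch Lightning):

```python
import numpy as np

def image_to_patch_sequence(img, patch):
    """Split an H×W×C image into a sequence of flattened non-overlapping patches,
    as in ViT-style models: (H/p · W/p) tokens of length p·p·C."""
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0
    img = img.reshape(h // patch, patch, w // patch, patch, c)
    img = img.transpose(0, 2, 1, 3, 4)           # group each patch's pixels together
    return img.reshape(-1, patch * patch * c)    # one row per patch token

img = np.arange(32 * 32 * 3, dtype=float).reshape(32, 32, 3)
tokens = image_to_patch_sequence(img, patch=8)
print(tokens.shape)  # (16, 192)
```

Each token would then be linearly projected and fed, together with positional encodings, into the Transformer encoder; the decoder's output tokens are reassembled into a dense segmentation map by the inverse operation.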
[1] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser,
and Illia Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on
Neural Information Processing Systems, pages 6000–6010, 2017.
[2] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner,
Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16×16
words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[3] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey
Zagoruyko. End-to-end object detection with transformers. In European Conference on Computer Vision,
pages 213–229. Springer, 2020.
[4] Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng
Feng, Tao Xiang, Philip HS Torr, et al. Rethinking semantic segmentation from a sequence-to-sequence
perspective with transformers. arXiv preprint arXiv:2012.15840, 2020.
[5] Yifan Jiang, Shiyu Chang, and Zhangyang Wang. Transgan: Two transformers can make one strong gan.
arXiv preprint arXiv:2102.07074, 2021.
[6] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson,
Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for semantic urban scene understanding.
In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Manifold Forests
Random Forests for Manifold Learning
Description: There are many different methods for manifold learning, such as Locally Linear Embedding, MDS, ISOMAP, or Laplacian Eigenmaps. All of them use some type of local neighborhood to approximate the relationship of the data locally, and then try to find a lower-dimensional representation that preserves this local relationship. One method to learn a partitioning of the feature space is to train a density forest on the data [1]. In this project, the goal is to implement a Manifold Forest algorithm that finds a 1-D signal of length N in a series of N input images by learning a density forest on the data and afterwards applying Laplacian Eigenmaps. For this, existing frameworks such as [2], [3], or [4] can be used as the forest implementation. The Laplacian Eigenmaps algorithm is already implemented and can be integrated.
The concept of Manifold Forests is also introduced in the FAU lecture Pattern Analysis by Christian Riess; candidates who have already attended this lecture are preferred.
This project is intended for students wanting to do a 5-ECTS module such as a research internship, starting now or as soon as possible. The project will be implemented in Python.
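The Laplacian Eigenmaps step on top of a learned affinity matrix can be sketched with NumPy; here the affinity comes from a simple Gaussian kernel rather than the trained density forest the project will use:

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=1, sigma=1.0):
    """Embed points X (n_samples x n_features) via the graph Laplacian:
    eigenvectors of L = D - W belonging to the smallest non-zero eigenvalues."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))    # affinity (a density forest would supply this)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W             # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:1 + n_components]    # skip the constant eigenvector

# points along a noisy 1-D curve embedded in 2-D
t = np.linspace(0, 3, 30)
X = np.stack([t, np.sin(t)], axis=1)
emb = laplacian_eigenmaps(X)
print(emb.shape)  # (30, 1)
```

In the project, W would instead count how often two samples fall into the same leaf across the trees of the density forest, and the 1-D embedding would serve as the sought signal over the N images.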
References:
[1]: Criminisi, A., Shotton, J., & Konukoglu, E. (2012). Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning. Foundations and Trends® in Computer Graphics and Vision, 7(2–3), 81–227. ; https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CriminisiForests_FoundTrends_2011.pdf
[2]: https://github.com/CyrilWendl/SIE-Master
[3]: https://github.com/ksanjeevan/randomforest-density-python
[4]: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomTreesEmbedding.html#sklearn.ensemble.RandomTreesEmbedding
Learning Multi-Catheter Reconstructions for Interstitial Breast Brachytherapy
Thesis Description
Female breast cancer accounted for 355,000 new cancer cases in the EU-27 countries in 2020. In Germany alone, approximately 69,000 new cases are diagnosed each year [1]. During the past four decades, breast conserving surgery (BCS) after lumpectomy in combination with radiotherapy (RT) has become widely accepted, as this treatment technique reduces both a patient’s emotional and psychological trauma due to its superior aesthetic outcome [2]. The standard technique of giving RT after BCS is whole breast irradiation (WBI), where a patient’s entire breast is irradiated up to a total dose of 40 to 50 Gray (Gy). BCS with adjuvant WBI has been shown to be equivalent in terms of local tumor control to mastectomy, where the entire breast is amputated. However, approximately 50% of early breast cancer patients still undergo mastectomy in order to avoid either RT altogether or 5 to 7 weeks of treatment time [3]. In contrast to external breast irradiation, accelerated partial breast irradiation (APBI) is an emerging standalone post-operative alternative treatment option in brachytherapy [4]. One valid strategy of applying APBI is multi-catheter interstitial brachytherapy (iBT). Here, up to 30 highly flexible plastic catheters are implanted into a patient’s breast in order to precisely and locally damage the tumor by guiding a radioactive source through the tissue. In iBT, the radioactive dose is delivered using a high dose rate (HDR) technique, where the prescribed dose is administered at a rate of 12 Gy/h by a single source within minutes [5]. This is performed by an afterloading system connected to the catheters via transfer tubes [4, 6]. Sole APBI is not only intended to drastically reduce treatment times to only 4 to 5 days, but also to decrease the radiation exposure of adjacent organs at risk (OAR) such as the lung, the skin, and, in particular, the heart [7].
After implantation, catheter traces are manually reconstructed based on an acquired computed tomography (CT) image for treatment planning and for determining the implant geometry. In the acquired CT of the patient’s breast, physicians precisely define the target volume depending on the tumor’s size and location [6]. During treatment planning, the implanted plastic catheters are manually reconstructed slice by slice, which takes approximately 45% of the whole treatment time [8]. Along each catheter trajectory, dwell positions (DPs) connecting the points in the slices as well as dwell times (DTs) are defined. DPs determine positions where the radioactive source stops for a certain DT, thus irradiating the surrounding tumor tissue. Active DPs and DTs are defined at the location of the target volume to optimally deliver the prescribed radioactive dose [9]. As treatment plan dosimetry and DP positioning are directly related, accurate and fast catheter trace reconstructions are crucial [4].
However, the manual reconstruction of up to 30 catheter tubes is a time-consuming process. Kallis et al. state that manual reconstructions on average take up to 139 ± 47 seconds (s) per catheter. They also observed an interobserver variability of 0.6 ± 0.35 millimeters (mm) in terms of mean Euclidean distance between two experienced medical physicists and the autoreconstruction approach proposed by [8], thus yielding reproducible and reliable reconstructions [6]. Similar findings were reported by Milickovic et al. in 2001 [10]. The insufficient amount of ground-truth catheter trace positions as well as blurry CT image quality make it hard to reliably and accurately reconstruct DPs. Hence, this motivates further research into automated reconstruction approaches [10].
In the last 20 years, mainly two different catheter auto-reconstruction approaches have been proposed. Both techniques aim to minimize the error of the implant geometry and thus improve dose coverage as well as drastically reduce reconstruction times. Milickovic et al. developed an automated catheter reconstruction algorithm based on analyzing post-implant CT data [8, 10]. However, as stated by Kallis et al., CT-based treatment planning in multi-catheter iBT highly depends on image quality. Due to patient movement, artifacts, and acquisition noise, automatically extracted DPs have to be corrected by manual intervention, which increases reconstruction times [6]. As introduced by Zhou et al. in 2013, electromagnetic tracking (EMT) became a promising alternative to CT-based auto-reconstruction [11]. Further analysis has shown that EMT is applicable to iBT, as this technique of localizing dwell positions offers sparse, precise, and sufficiently accurate dose calculations [12]. Reducing uncertainties, including measurement noise, has been investigated by post-processing the sensor data with particle filters; in that work, a mean error of 2.3 mm between the clinically approved plan and the reconstructed DPs was reported [13]. Although tracking multi-catheter positions in iBT based on EMT offers fast results independent of imaging artifacts, the performance of EMT systems depends heavily on the system configuration, e.g. the distance between the CT table and the patient bed. The error drastically increases from approximately 1 to 4 mm when decreasing the table/bed distance [12].
In recent years, deep learning (DL) has proven to be a powerful technique for tackling a variety of computer vision tasks, including medical image analysis. DL-based approaches offer highly competitive results in terms of accuracy and efficiency [14, 15]. Deep neural network (DNN) architectures are able to represent high-dimensional non-linear spaces and are thus well suited for the task of automatically reconstructing multi-catheter traces in iBT. Built upon an elegant way of designing DNN architectures, so-called Fully Convolutional Networks (FCNs) [16], the U-Net architecture has proven well suited for image segmentation tasks, as this specific model structure’s output has the same shape as its input [17]. Çiçek et al. developed an extended version of the U-Net in which all 2D operations are replaced with corresponding 3D ones; this topological modification enables volumetric semantic segmentation [18]. In this Master’s thesis, a deep learning-based multi-catheter reconstruction method for iBT is presented, investigated, and evaluated using real-world breast cancer data from the radiation clinic in Erlangen, Germany. To the best of our knowledge, this is the first approach to introduce an artificial intelligence-based multi-catheter reconstruction algorithm in breast brachytherapy.
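A simple post-processing step that such a pipeline might use, turning a predicted 3D binary catheter mask into an ordered trace of candidate dwell positions via per-slice centroids, can be sketched as follows (purely illustrative; not the method developed in this thesis):

```python
import numpy as np

def trace_from_mask(mask, spacing=(1.0, 1.0, 1.0)):
    """Extract a catheter trace from a binary 3-D mask (slices x H x W):
    the per-slice centroid of the predicted voxels, scaled by voxel spacing (mm)."""
    points = []
    for z in range(mask.shape[0]):
        ys, xs = np.nonzero(mask[z])
        if len(ys) == 0:
            continue                              # slice without catheter voxels
        points.append([z * spacing[0],
                       ys.mean() * spacing[1],
                       xs.mean() * spacing[2]])
    return np.array(points)

# synthetic mask: one catheter running diagonally through 10 slices
mask = np.zeros((10, 32, 32), dtype=bool)
for z in range(10):
    mask[z, 10 + z, 12 + z] = True
trace = trace_from_mask(mask, spacing=(2.0, 1.0, 1.0))
print(trace.shape)  # (10, 3)
```

The actual thesis would additionally have to separate up to 30 catheters within one volume and handle noisy, discontinuous predictions, which is where the 3D U-Net segmentation comes in.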
References
[1] Jacques Ferlay et al. Global cancer observatory: Cancer today. https://gco.iarc.fr/today. Accessed: 2021-03-22.
[2] Csaba Polgár et al. High-dose-rate brachytherapy alone versus whole breast radiotherapy with or without tumor bed boost after breast-conserving surgery: Seven-year results of a comparative study. International Journal of Radiation Oncology, Biology, Physics, 60:1173–81, 12 2004.
[3] Vratislav Strnad et al. 5-year results of accelerated partial breast irradiation using sole interstitial multicatheter brachytherapy versus whole-breast irradiation with boost after breast-conserving surgery for low-risk invasive and in-situ carcinoma of the female breast: a randomised, phase 3, non-inferiority trial. The Lancet, 387(10015):229–238, 2016.
[4] Vratislav Strnad, R. Pötter, G. Kovács, and T. Block. Practical Handbook of Brachytherapy. UNI-MED Science. UNI-MED-Verlag, 2010.
[5] Daniela Kauer-Dorner and Daniel Berger. The role of brachytherapy in the treatment of breast cancer. Breast Care, 13, 05 2018.
[6] Karoline Kallis et al. Impact of inter- and intra-observer variabilities of catheter reconstruction on multicatheter interstitial brachytherapy of breast cancer patients. Radiotherapy and Oncology, 135:25–32, 06 2019.
[7] Vratislav Strnad et al. ESTRO-ACROP guideline: Interstitial multi-catheter breast brachytherapy as accelerated partial breast irradiation alone or as boost – GEC-ESTRO breast cancer working group practical recommendations. Radiotherapy and Oncology, 128, 04 2018.
[8] Milickovic et al. Catheter autoreconstruction in computed tomography based brachytherapy treatment planning. Medical Physics, 27(5):1047–1057, 2000.
[9] Cheng B. Saw, Leroy J. Korb, Brenda Darnell, K.V. Krishna, and Dennis Ulewicz. Independent technique of verifying high-dose rate (HDR) brachytherapy treatment plans. International Journal of Radiation Oncology*Biology*Physics, 40(3):747–750, 1998.
[10] Natasa Milickovic, Dimos Baltas, and Nikolaos Zamboglou. Automatic reconstruction of catheters in CT based brachytherapy treatment planning. In ISPA 2001. Proceedings of the 2nd International Symposium on Image and Signal Processing and Analysis, in conjunction with the 23rd International Conference on Information Technology Interfaces, pages 202–206, 2001.
[11] Jun Zhou, Evelyn Sebastian, Victor Mangona, and Di Yan. Real-time catheter tracking for high-dose-rate prostate brachytherapy using an electromagnetic 3D-guidance device: A preliminary performance study. Medical Physics, 40(2):021716, 2013.
[12] Markus Kellermeier, Jens Herbolzheimer, Stephan Kreppner, Michael Lotter, Vratislav Strnad, and Christoph Bert. Electromagnetic tracking (EMT) technology for improved treatment quality assurance in interstitial brachytherapy. Journal of Applied Clinical Medical Physics, 18:211–222, 01 2017.
[13] Theresa Ida Götz et al. A tool to automatically analyze electromagnetic tracking data from high dose rate brachytherapy of breast cancer patients. PLOS ONE, 12(9):1–31, 09 2017.
[14] Florian Kordon et al. Multi-task localization and segmentation for x-ray guided planning in knee surgery. In Dinggang Shen, Tianming Liu, Terry M. Peters, Lawrence H. Staib, Caroline Essert, Sean Zhou, Pew-Thian Yap, and Ali Khan, editors, Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, pages 622–630, Cham, 2019. Springer International Publishing.
[15] Florian Kordon, Ruxandra Lasowski, Benedict Swartman, Jochen Franke, Peter Fischer, and Holger Kunze. Improved x-ray bone segmentation by normalization and augmentation strategies. In Heinz Handels, Thomas M. Deserno, Andreas Maier, Klaus Hermann Maier-Hein, Christoph Palm, and Thomas Tolxdorff, editors, Bildverarbeitung für die Medizin 2019, pages 104–109, Wiesbaden, 2019. Springer Fachmedien Wiesbaden.
[16] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. CoRR, abs/1411.4038, 2014.
[17] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Nassir Navab, Joachim Hornegger, William M. Wells, and Alejandro F. Frangi, editors, Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pages 234–241, Cham, 2015. Springer International Publishing.
[18] Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. CoRR, abs/1606.06650, 2016.