Index

Quantum Machine Learning Techniques in Medical Image Classification: Simulation and Hardware

Data Encoding, Parameterization and Generalization of Quantum Machine Learning for Medical Imaging

Binary Mask Generation for Killer Whale Vocalizations

Geometry-Aware Key-Point / Object Detection and Pose-Estimation

A wide range of emerging applications creates an increasing demand for reliable and accurate object detection and pose estimation using machine-learning-based systems. This is particularly the case for autonomous systems such as autonomous vehicles and robots, but also in the context of augmented reality [1]. These applications require detecting and locating objects in real time and in various environments, including cluttered scenes and scenes containing objects with similar appearances.

However, traditional object detection and pose estimation methods often detect and locate objects only partially in these challenging situations, leading to inaccurate and unreliable results [2]. Geometry-Aware Key-Point / Object Detection and Pose-Estimation addresses this problem by explicitly incorporating additional geometric information into these tasks to improve their accuracy and robustness. In object detection, the goal is to identify the presence and location of objects within an image or video. Pose estimation, on the other hand, refers to estimating the position and orientation of objects in 3D space based on 2D images. By including human domain knowledge in the form of geometric constraints, we would like to utilize the knowledge of domain experts to create more robust and accurate solutions while simultaneously reducing the labeling effort associated with training data-driven solutions for novel applications.
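As a rough illustration of how a geometric constraint can tie 2D key-points to a 3D pose, the following sketch projects known 3D key-points with a pinhole camera model and measures the reprojection error of a pose hypothesis. The function names, the pallet face dimensions, the camera intrinsics, and the restriction to a rotation about the vertical axis are assumptions made for this example and are not taken from the proposed work.

```python
import math

def project(points_3d, yaw, translation, focal, center):
    """Project 3D object points into the image with a pinhole camera.

    The pose is a rotation about the camera's vertical (y) axis by `yaw`
    radians followed by a translation (tx, ty, tz).
    """
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    pixels = []
    for x, y, z in points_3d:
        # Rotate about the y axis, then translate into the camera frame.
        xc = c * x + s * z + tx
        yc = y + ty
        zc = -s * x + c * z + tz
        # Perspective division followed by the intrinsic mapping.
        pixels.append((focal * xc / zc + center[0], focal * yc / zc + center[1]))
    return pixels

def reprojection_error(observed, predicted):
    """Mean Euclidean distance (in pixels) between two key-point sets."""
    return sum(math.dist(o, p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical key-points: the four corners of a pallet front face (metres).
corners = [(-0.6, -0.072, 0.0), (0.6, -0.072, 0.0),
           (0.6, 0.072, 0.0), (-0.6, 0.072, 0.0)]
camera = dict(focal=800.0, center=(320.0, 240.0))  # assumed intrinsics
observed = project(corners, yaw=0.3, translation=(0.1, 0.0, 3.0), **camera)

# The correct pose explains the observed key-points; a wrong yaw does not.
error_true = reprojection_error(
    observed, project(corners, 0.3, (0.1, 0.0, 3.0), **camera))
error_wrong = reprojection_error(
    observed, project(corners, 0.0, (0.1, 0.0, 3.0), **camera))
```

A pose estimator could minimize such an error over the pose parameters; because the known object geometry supplies the constraint, fewer manually labeled poses might be needed.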

There are various approaches to incorporating geometric information into object detection and pose estimation tasks. One common approach is to use geometry-aware convolutional neural networks (Geo-CNNs) [4], which are designed to incorporate geometric information explicitly into the model architecture. Another approach is geometry-aware scene graph generation [5], which uses a graph-based representation to model the geometric relationships between objects in a scene. However, the most suitable approach depends on the task at hand, the variability in object shape and orientation, and the scene complexity. An assessment of existing methods according to those requirements is part of the literature review corresponding to the proposed work. Afterwards, the central task of the work is the adaptation of an existing method or the design and implementation of a novel approach, together with the corresponding evaluation.

The evaluation will be performed on an industrial object detection use case with high requirements on robustness and performance. The use case considered for evaluating the proposed method is the detection of pallets in the context of an autonomous pallet unloading application. For this work, it is planned to start from a public dataset [7] and afterwards to transfer the results to our use case and data, which have already been collected at least in part. The thesis shall be carried out within a period of six months, including the literature review.

[1] Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information
[2] Viewpoint-Independent Object Class Detection using 3D Feature Maps
[3] Unsupervised 3D Pose Estimation With Geometric Self-Supervision
[4] Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
[5] A Comprehensive Survey of Scene Graphs: Generation and Application
[6] From Points to Parts
[7] GitHub – tum-fml/loco: Home of LOCO, the first scene understanding dataset for logistics.
[8] Nothing But Geometric Constraints
[9] DeepIM: Deep Iterative Matching for 6D Pose Estimation

Cross-Dataset Phonological Speech Analysis of Children with Cleft Lip and Palate

Classification of industrial parts using synthetic data from CAD models

Deep Learning-based Balloon Marker Detection from Angiography Data

Thesis Description
Coronary Artery Disease (CAD) is one of the most predominant contributors to cardiovascular disease,
which stands as the major cause of death globally [1]. Usually, it manifests itself as narrowed or blocked
arteries caused by plaque buildup, a condition known as atherosclerosis. Percutaneous Coronary
Intervention (PCI) is a frequently used treatment for CAD in which the narrowed arteries are widened.
A typical type of PCI is revascularization using angioplasty with a stent [2]. Generally, as part of this
treatment, a thin flexible tube is inserted into the femoral artery through the groin. Once the tip is
properly positioned in the blockage site, a balloon which is surrounded by a stent graft is inflated to
compress the plaque against the arterial walls. After the procedure is completed and the balloon is
removed, the stent keeps the artery open and supports the blood flow.
Real-time 2D X-ray projections serve as the guidance method of choice for catheter-based interventions
like PCI to help the physicians visually determine the position and extent of the stents [3]. However,
visualizing stents in conventional X-ray images is challenging because of their low radio-opacity. Hence,
digital stent enhancement (DSE) methods have been developed to enhance the stent visibility in X-ray
image sequences [4]. Angioplasty balloons often incorporate two highly radio-opaque markers [5]. De-
tecting and tracking these markers enables DSE methods by registering all frames within the sequence
followed by a mean intensity projection [6]. Therefore, an accurate and robust detection of the stent
markers is a crucial component of all DSE methods. This task is usually performed automatically
using machine learning (ML). An additional challenge in this domain is that due to the high frame
rates of up to 30 frames per second the marker detection needs high computational efficiency.
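
The registration and mean intensity projection described above can be sketched as follows. This is a minimal illustration and not the method of any cited DSE system: it assumes that each frame comes with one marker pair in consistent order, that a 2D similarity transform (expressed with complex numbers) is sufficient for alignment, and that nearest-neighbour resampling is acceptable. All function names are hypothetical.

```python
def similarity_from_markers(src, dst):
    """Return (a, b) such that z -> a * z + b maps the src marker pair onto dst.

    Points are (x, y) tuples. Multiplying by the complex number `a` encodes
    rotation and isotropic scaling; adding `b` encodes translation.
    """
    p1, p2 = (complex(*p) for p in src)
    q1, q2 = (complex(*q) for q in dst)
    a = (q2 - q1) / (p2 - p1)
    b = q1 - a * p1
    return a, b

def warp_to_reference(frame, a, b):
    """Resample `frame` on the reference grid (nearest neighbour).

    (a, b) maps reference coordinates to coordinates in `frame`.
    """
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            z = a * complex(x, y) + b
            sx, sy = round(z.real), round(z.imag)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = frame[sy][sx]
    return out

def enhance(frames, markers):
    """Register every frame to the first one, then average the intensities."""
    reference = markers[0]
    height, width = len(frames[0]), len(frames[0][0])
    mean = [[0.0] * width for _ in range(height)]
    for frame, pair in zip(frames, markers):
        a, b = similarity_from_markers(reference, pair)
        warped = warp_to_reference(frame, a, b)
        for y in range(height):
            for x in range(width):
                mean[y][x] += warped[y][x] / len(frames)
    return mean

def make_frame(points, size=8):
    """Toy frame: zero background with unit intensity at the marker pixels."""
    frame = [[0.0] * size for _ in range(size)]
    for x, y in points:
        frame[y][x] = 1.0
    return frame

# Two toy frames whose marker pair is shifted by (2, 1) pixels.
markers = [[(2, 3), (5, 3)], [(4, 4), (7, 4)]]
frames = [make_frame(pair) for pair in markers]
enhanced = enhance(frames, markers)
```

After registration, both marker pairs coincide with the positions in the first frame, so the averaged image keeps full intensity at the markers. In real sequences, averaging aligned frames would additionally suppress noise and moving background.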
Possible ML techniques for this task include landmark detection and object detection approaches.
These play a crucial role in the computer vision area, particularly in the field of medical image pro-
cessing [7]. They facilitate the identification of anatomical features, recognition and precise localization
of pathological conditions, and the accurate delineation of structures of interest within medical im-
ages [8]. A few of these methodologies have been employed for balloon marker detection and stent
localization [9–15]. Some approaches rely on conventional methods such as template-based
matching [14] and adaptive thresholding [9]. On the other hand, the continuous progress in deep
learning offers promising potential for further advancements in stent localization. U-net, a widely
adopted CNN architecture in medical image segmentation, has been employed as the backbone of
several state-of-the-art models to segment the catheter shaft [12] or to generate marker heatmaps by
treating each landmark as a 2D Gaussian distribution [15]. Although these approaches have improved
stent visualization significantly, they primarily focus on detecting and tracking a single balloon marker
pair in each frame. To address the challenge of stabilizing multiple stents, some researchers have
employed object detection methods through guidewire endpoint localization employing an extended
variant of the Faster R-CNN model [10, 11]. However, detecting objects with common architectures
like R-CNN family models is computationally demanding. Since real-time performance is crucial for
providing instant information about the position of stents to physicians, faster models like YOLO [16]
are needed to accelerate the process. To this end, a model based on YOLOv3 has been proposed that meets
the requirement for real-time guidewire detection and endpoint localization [13].
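
To make the heatmap formulation concrete, the following sketch builds a regression target in which each landmark is represented by a 2D Gaussian and then recovers the landmark positions as local maxima. It is an illustrative example only: the value of sigma, the threshold, the image size, and the function names are assumptions, and the cited models may encode and decode their heatmaps differently.

```python
import math

def gaussian_heatmap(width, height, landmarks, sigma=1.5):
    """Render one heatmap with a 2D Gaussian centred on every landmark."""
    heatmap = [[0.0] * width for _ in range(height)]
    for cx, cy in landmarks:
        for y in range(height):
            for x in range(width):
                squared_distance = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-squared_distance / (2 * sigma ** 2))
                # Keep the maximum so that nearby landmarks stay separable.
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap

def decode_peaks(heatmap, threshold=0.5):
    """Return (x, y) of strict local maxima above `threshold` (8-neighbourhood)."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            value = heatmap[y][x]
            if value < threshold:
                continue
            neighbours = [
                heatmap[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
                if (i, j) != (x, y)
            ]
            if all(value > n for n in neighbours):
                peaks.append((x, y))
    return peaks

# Hypothetical marker pair in a 32 x 32 image.
target = gaussian_heatmap(32, 32, [(10, 12), (20, 14)])
peaks = decode_peaks(target)
```

During training, a network such as a U-net would be asked to regress `target` from the input image; at inference time, the same peak decoding can be applied to the predicted heatmap, which also allows more than one marker pair per frame.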

This thesis aims to investigate a set of research questions with respect to real-time models for detecting
multiple balloon markers in fluoroscopic images. Firstly, the most promising approaches from the
literature need to be compared, including the most recent developments in the field. These
comprise the object detection networks of the YOLO family [17] as well as heatmap-based point
regression approaches.
Secondly, the pre-processing of the fluoroscopic images can vary to a large extent, as multiple algorithms
with a plethora of parameters can be used. Therefore, it is of high interest to evaluate what influence
different pre-processing parameterizations have on the performance of a marker detector.
Finally, the training of a robust network requires a large collection of data covering a large variation
of potential physical influences. To mitigate the need to collect a large amount of clinical data, an
evaluation of whether the use of suitable phantom data is sufficient shall be conducted.
The thesis will comprise the following work items:
1. Literature overview of state-of-the-art automated landmark and object detection approaches
2. Balloon marker detection
   - Data annotation and preprocessing
   - Train a network of the YOLO family on phantom data
   - Train a U-net model for heatmap regression
   - Possibly: post-processing
3. Analysis of the Deep Learning models
   - Evaluate the effect of various pre-processing parameterizations on the marker detector performance
   - Evaluate and compare the performance of the implemented algorithms on both phantom and clinical data
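
One possible way to compare detectors in the evaluation step is to match predicted marker positions to annotated positions within a pixel tolerance and to report precision and recall. The sketch below is only an assumption about how such a comparison could be set up: the greedy matching strategy, the tolerance of 3 pixels, the example coordinates, and the function names are hypothetical and not prescribed by the thesis description.

```python
import math

def match_markers(predicted, ground_truth, tolerance=3.0):
    """Greedily match predictions to annotations, closest pairs first.

    Returns (true_positives, false_positives, false_negatives).
    """
    candidates = sorted(
        (math.dist(p, g), i, j)
        for i, p in enumerate(predicted)
        for j, g in enumerate(ground_truth)
    )
    used_predictions, used_annotations = set(), set()
    for distance, i, j in candidates:
        if distance > tolerance:
            break  # all remaining candidate pairs are even farther apart
        if i not in used_predictions and j not in used_annotations:
            used_predictions.add(i)
            used_annotations.add(j)
    tp = len(used_predictions)
    return tp, len(predicted) - tp, len(ground_truth) - tp

def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical annotations and detections for a single frame (pixels).
ground_truth = [(10, 12), (20, 14), (40, 40)]
predicted = [(11, 12), (20, 16), (5, 5)]
tp, fp, fn = match_markers(predicted, ground_truth)
precision, recall = precision_recall(tp, fp, fn)
```

Computing the same counts separately on phantom and clinical data, and for each pre-processing parameterization, would give directly comparable numbers across the experiments listed above.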

References
[1] Gregory A Roth, George A Mensah, Catherine O Johnson, Giovanni Addolorato, Enrico Ammi-
rati, et al. Global burden of cardiovascular diseases and risk factors, 1990–2019: update from the
GBD 2019 study. Journal of the American College of Cardiology, 76(25):2982–3021, 2020.
[2] Javaid Iqbal, Julian Gunn, and Patrick W Serruys. Coronary stents: historical development,
current status and future directions. British Medical Bulletin, 106(1), 2013.
[3] Ardit Ramadani, Mai Bui, Thomas Wendler, Heribert Schunkert, Peter Ewert, et al. A survey of
catheter tracking concepts and methodologies. Medical Image Analysis, page 102584, 2022.
[4] Vincent Bismuth, Régis Vaillant, François Funck, Niels Guillard, and Laurent Najman. A comprehensive
study of stent visualization enhancement in X-ray images by image processing means.
Medical Image Analysis, 15(4):565–576, 2011.
[5] Robert A Close, Craig K Abbey, and James Stuart Whiting. Improved image guidance of coronary
stent deployment. In Medical Imaging 2000: Image Display and Visualization, volume 3976, pages
301–304. SPIE, 2000.
[6] Kyle McBeath, Krishnaraj Rathod, Matthew Cadd, Anne-Marie Beirne, Oliver Guttmann, et al.
Use of enhanced stent visualisation compared to angiography alone to guide percutaneous coro-
nary intervention. International Journal of Cardiology, 321:24–29, 2020.
[7] Zhuoling Li, Minghui Dong, Shiping Wen, Xiang Hu, Pan Zhou, et al. CLU-CNNs: Object
detection for medical images. Neurocomputing, 350:53–59, 2019.
[8] Andreas Maier, Christopher Syben, Tobias Lasser, and Christian Riess. A gentle introduction
to deep learning in medical image processing. Zeitschrift für Medizinische Physik, 29(2):86–101,
2019.
[9] Negar Chabi, Oliver Beuing, Bernhard Preim, and Sylvia Saalfeld. Automatic stent and catheter
marker detection in X-ray fluoroscopy using adaptive thresholding and classification. In Current
Directions in Biomedical Engineering, volume 6. De Gruyter, 2020.
[10] Xiaolu Jiang, Yanqiu Zeng, Shixiao Xiao, Shaojie He, Caizhi Ye, et al. Automatic detection of
coronary metallic stent struts based on YOLOv3 and R-FCN. Computational and Mathematical
Methods in Medicine, 2020, 2020.
[11] Rui-Qi Li, Xiao-Liang Xie, Xiao-Hu Zhou, Shi-Qi Liu, Zhen-Liang Ni, et al. A unified framework
for multi-guidewire endpoint localization in fluoroscopy images. IEEE Transactions on Biomedical
Engineering, 69(4):1406–1416, 2021.
[12] Ina Vernikouskaya, Dagmar Bertsche, Tillman Dahme, and Volker Rasche. Cryo-balloon catheter
localization in X-ray fluoroscopy using U-Net. International Journal of Computer Assisted Radi-
ology and Surgery, 16:1255–1262, 2021.
[13] Rui-Qi Li, Xiao-Liang Xie, Xiao-Hu Zhou, Shi-Qi Liu, Zhen-Liang Ni, et al. Real-time multi-
guidewire endpoint localization in fluoroscopy images. IEEE Transactions on Medical Imaging,
40(8):2002–2014, 2021.
[14] Ahmed G Kotb, Ahmed M Mahmoud, and Muhammad A Rushdi. Template-based balloon-
marker and guidewire detection for coronary stents in cardiac fluoroscopy. In 2022 44th Annual
International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages
2199–2202. IEEE, 2022.
[15] Luojie Huang, Yikang Liu, Li Chen, Eric Z Chen, Xiao Chen, et al. Robust landmark-based
stent tracking in X-ray fluoroscopy. In European Conference on Computer Vision, pages 201–216.
Springer, 2022.
[16] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified,
real-time object detection. In Proceedings of the IEEE Conference On Computer Vision and
Pattern Recognition, pages 779–788. IEEE, 2016.
[17] Chien-Yao Wang, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. YOLOv7: Trainable bag-of-
freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF
Conference on Computer Vision and Pattern Recognition, pages 7464–7475. IEEE, 2023.

Generalizable X-Ray View Synthesis

Machine Learning Based Optimization of Material Decomposition in Multi-Spectral Computed Tomography

Creation of a Workflow Troubleshoot Companion for Magnetic Resonance Imaging Systems using Large Language Models