Index

Unsupervised Super Resolution in X-ray Microscopy Using a Cycle-Consistent Generative Model

Automatic Rotation of Spinal X-Ray Images

Guidance in orthopedic and trauma surgery increasingly relies on intraoperative fluoroscopy with a mobile C-arm. Mobile fluoroscopy is also used to assess the success of fracture reduction, implant position, and the overall outcome [1], which reduces the number of necessary revision surgeries [2]. Accurate, standardized image rotation is essential to improve reading performance and interpretation. Since alignment of the patient with the imaging system is not always achievable, the images currently have to be rotated manually by radiographers [3].

As user interaction with the imaging system should be minimized, the goal of this thesis is to develop an automatic procedure that determines the orientation of the acquired images and rotates them into a standard position for viewing by radiologists. The focus of this work is on regressing the rotation angle of anterior-posterior (AP) and lateral radiographs of the spine, since these are the most frequently acquired views and the most relevant for spine procedures.
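
To make the regression setup concrete, the short sketch below shows one way such a model could look in PyTorch. The ResNet-18 backbone, the (sin, cos) encoding of the rotation angle, and all names are illustrative assumptions rather than choices made in the proposal.

# Illustrative sketch (assumption): regress the in-plane rotation of a radiograph with a CNN.
# Predicting (sin, cos) instead of the raw angle avoids the discontinuity at +/-180 degrees.
import torch
import torch.nn as nn
import torchvision.models as models

class RotationRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet18(weights=None)
        # single-channel (grayscale) radiographs instead of RGB input
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        # two outputs: (sin, cos) of the rotation angle
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 2)

    def forward(self, x):
        out = self.backbone(x)
        # project onto the unit circle so the pair stays a valid (sin, cos)
        return out / out.norm(dim=1, keepdim=True).clamp(min=1e-8)

def angle_in_degrees(pred):
    # recover the angle from the predicted (sin, cos) pair
    return torch.rad2deg(torch.atan2(pred[:, 0], pred[:, 1]))

model = RotationRegressor()
images = torch.randn(4, 1, 256, 256)      # four dummy grayscale radiographs
print(angle_in_degrees(model(images)))    # predicted rotation angles in degrees

During training, the predicted pair could be compared against (sin, cos) of the ground-truth angle with a mean squared error loss; the predicted angle would then determine how far to rotate the image back into the standard position.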

 

[1] Lisa Kausch, Sarina Thomas, Holger Kunze, Maxim Privalov, Sven Vetter, Jochen Franke, Andreas H. Mahnken, Lena Maier-Hein, and Klaus Maier-Hein. Toward automatic C-arm positioning for standard projections in orthopedic surgery. Int. J. Comput. Assist. Radiol. Surg., 15(7):1095–1105, Jul 2020.

[2] Celia Martín Vicario, Florian Kordon, Felix Denzinger, Markus Weiten, Sarina Thomas, Lisa Kausch, Jochen Franke, Holger Keil, Andreas Maier, and Holger Kunze. Automatic plane adjustment of orthopedic intraoperative flat panel detector CT-volumes. In Proc. MICCAI, Part II, volume 12262, pages 486–495, 2020.

[3] Ivo M. Baltruschat, Axel Saalbach, Mattias P. Heinrich, Hannes Nickisch, and Sascha Jockel. Orientation regression in hand radiographs: a transfer learning approach. In Proc. SPIE Medical Imaging, volume 10574, pages 473–480, 2018.

Writer Verification/Identification using SuperPoint and SuperGlue

Evaluation of an Attention U-Net for Glacier Segmentation

Evaluation of an Optimized U-Net for Glacier Segmentation

Evaluation of a Bayesian U-Net for Glacier Segmentation

Guided Attention Mechanism for Weakly-Supervised Breast Calcification Analysis

Thesis_Proposal_Akshat_Submitted

Self-supervised learning for pathology classification

Motivation
Self-supervised learning is a promising approach in the field of speech processing. The ability to learn representations from unlabelled data with minimal feature-engineering effort reduces the dependence on labelled data. This is particularly relevant in the pathological speech domain, where the amount of labelled data is limited. However, as most research focuses on healthy speech, the effect of self-supervised learning on pathological speech data remains under-researched. This motivates the current work, as pathological speech processing could potentially benefit from a self-supervised learning approach.
Proposed Method
Self-supervised learning makes it possible to exploit unlabelled data when training a model. In this work, wav2vec 2.0 will be used, an algorithm that learns speech representations almost exclusively from raw, unlabelled audio [1, 2]. These representations can serve as input features in place of traditional approaches such as Mel-Frequency Cepstral Coefficients or log-mel filterbanks for numerous downstream tasks. To evaluate the quality of the learned representations, it will be examined how well they perform on a binary classification task in which the model predicts whether or not the input speech is pathological.

A novel database of German audio recordings collected with the PEAKS software [3] will be used. It contains recordings of patients whose speech is affected by disorders and conditions such as dementia, cleft lip, and Alzheimer's disease, performing two different speech tasks: picture reading in the "Psycho-Linguistische Analyse Kindlicher Sprechstörungen" (PLAKSS) test and "The North Wind and the Sun" (Northwind) [3]. As the database is still being revised, some pre-processing of the data must be performed, for example removing the voice of the (healthy) therapist from otherwise pathological recordings. After pre-processing, the data will be fed into the wav2vec 2.0 framework for self-supervised learning, and the resulting model will be used as a pre-trained model for the pathology classification task.
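
As a rough illustration of the intended pipeline, the sketch below extracts wav2vec 2.0 representations with the Hugging Face transformers library and places a small binary classifier on top. The checkpoint name, the mean pooling over time, and the classifier head are assumptions made for this example, not details fixed by the proposal.

# Illustrative sketch (assumptions: checkpoint, mean pooling, small classifier head).
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# assumed checkpoint: the self-supervised base model without any fine-tuning
extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
encoder.eval()

def embed(waveform_16khz):
    # waveform_16khz: 1-D float array sampled at 16 kHz
    inputs = extractor(waveform_16khz, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # shape (1, frames, 768)
    return hidden.mean(dim=1).squeeze(0)               # mean-pool over time -> (768,)

# minimal classification head on top of the pooled embedding
classifier = torch.nn.Sequential(
    torch.nn.Linear(768, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 1),        # single logit: pathological vs. healthy
)
loss_fn = torch.nn.BCEWithLogitsLoss()

# dummy usage: one second of random "audio" with an assumed label (1 = pathological)
waveform = np.random.randn(16000).astype(np.float32)
logit = classifier(embed(waveform))
loss = loss_fn(logit, torch.tensor([1.0]))
print(loss.item())

In the thesis, the encoder would additionally be pre-trained or adapted on the PEAKS recordings before the classification stage, whereas the baseline would train the same classifier on MFCC or log-mel features instead.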
Hypothesis
Given the benefits of acquiring learned representations without labelled data, the hypothesis is that the classification model pre-trained with self-supervision will outperform the approach without self-supervision. The results of the pathological speech detection downstream task are expected to show the positive effect of the pre-trained representations obtained by self-supervised learning.
Furthermore, the model is expected to enable automatic self-assessment for patients using minimally invasive methods and to assist therapists by providing objective measures for their diagnoses.
Supervisors
Professor Dr. Andreas Maier, Professor Dr. Seung Hee Yang, M. Sc. Tobias Weise
References
[1] S. Schneider, A. Baevski, R. Collobert, and M. Auli. wav2vec: Unsupervised pre-training for speech recognition. In Proc. Interspeech 2019, pages 3465–3469, 2019.
[2] A. Baevski, Y. Zhou, A. Mohamed, and M. Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. In Advances in Neural Information Processing Systems, volume 33, pages 12449–12460. Curran Associates, Inc., 2020.
[3] A. Maier, T. Haderlein, U. Eysholdt, F. Rosanowski, A. Batliner, M. Schuster, and E. Nöth. PEAKS – A system for the automatic evaluation of voice and speech disorders. Speech Communication, 2009.

Automatic identification of unremarkable Medical Images

Human interpretable Writer Retrieval and Verification