Index
Frequency Domain Hierarchical Vision Transformer-based Perceptual Loss
This project focuses on improving image processing tasks, such as super-resolution or image restoration, by employing a novel feature comparison method. It leverages a Hierarchical Vision Transformer to extract multi-scale feature representations from images. These features capture both local and global information at various levels of abstraction. Crucially, these extracted features are then transformed into the frequency domain, likely via a Fast Fourier Transform (FFT) or similar method. The comparison between the generated image and the target image occurs in this frequency space. By analyzing differences in magnitude and/or phase across different frequency bands, the model can better understand and rectify discrepancies in texture, detail, and overall structure. This approach aims to produce perceptually superior results by guiding the model to reconstruct images that are more aligned with the frequency characteristics of the target, leading to improved visual quality, especially in terms of sharpness and fine-grained details.
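The comparison in frequency space can be illustrated with a minimal sketch. It assumes that each hierarchy level has already been flattened into a 1D list of feature values and uses a naive DFT instead of an FFT library; the function names, the phase weight, and the toy inputs are all hypothetical and are not taken from the project.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def spectral_terms(pred_feat, target_feat, eps=1e-8):
    """Return (mean magnitude difference, mean wrapped phase difference)."""
    if len(pred_feat) != len(target_feat):
        raise ValueError("feature vectors must have the same length")
    pred_spec, target_spec = dft(pred_feat), dft(target_feat)
    n = len(pred_spec)
    magnitude = sum(abs(abs(p) - abs(t)) for p, t in zip(pred_spec, target_spec)) / n
    # The phase of p * conj(t) is the phase difference wrapped to (-pi, pi].
    # Bins with near-zero magnitude are skipped because their phase is undefined.
    phase = sum(
        abs(cmath.phase(p * t.conjugate()))
        for p, t in zip(pred_spec, target_spec)
        if abs(p) > eps and abs(t) > eps
    ) / n
    return magnitude, phase


def frequency_perceptual_loss(pred_levels, target_levels, phase_weight=0.5):
    """Sum the spectral terms over all hierarchy levels (assumed equal weights)."""
    total = 0.0
    for pred_feat, target_feat in zip(pred_levels, target_levels):
        magnitude, phase = spectral_terms(pred_feat, target_feat)
        total += magnitude + phase_weight * phase
    return total


# Toy example: the second level of the prediction is a shifted copy of the target.
target = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
pred = [[0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
loss = frequency_perceptual_loss(pred, target)
```

In the toy example the shifted level has the same magnitude spectrum as the target, so only the phase term contributes to the loss. This is one reason a loss of this kind may compare both magnitude and phase.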
Generative Modeling for Glottal Signals Synthesis
Improved Deep Learning Dose Prediction for Automated Head & Neck Radiotherapy Treatment Planning
Radiation therapy is one of the most important local cancer treatment modalities, enabling the non-invasive delivery of a spatially varying dose distribution within a patient’s body with high precision. Radiotherapy treatment planning aims to identify the optimal treatment plan that maximally spares surrounding normal tissue structures while delivering a high dose to the target volume. Deep learning dose prediction is a novel technique that can automate the treatment planning process, improve standardization and reduce the treatment planning time to enable novel and improved therapies like same-day treatment and adaptive radiotherapy [1]. Deep learning dose prediction uses deep learning models like 3D U-Nets [2] to predict spatial dose distributions based on planning CT datasets, target volume, and organs at risk (OAR) segmentations [1, 3] (Figure 1). Subsequently, dose mimicking algorithms are used to translate these 3D dose predictions into deliverable treatment plans [1, 3–5]. An additional resource for dose prediction research is provided by the Open Knowledge-Based Planning Challenge (OpenKBP) [6], which offers a publicly available dataset and a standardized evaluation framework. The aim of this master's thesis project is to improve an existing 3D U-Net dose prediction model and evaluate its performance using a cohort of more than 450 Head & Neck cancer datasets from the Department of Radiation Oncology, University Hospital Erlangen.

The thesis will include the following points:
• Literature review on deep learning techniques for radiotherapy dose prediction and automated treatment planning.
• Data preprocessing for the private Head & Neck treatment plan dataset including conversion of DICOM (CT, RT DOSE, RT STRUCT) files into CSV files and label maps. Automatic cleaning and splitting of the datasets into subgroups according to plan parameters using custom Python scripts (Pydicom library).
• Training, inference and testing of a pre-existing 3D U-Net (Figure 2) dose prediction model (Keras, PyTorch) on the cohort of 450 Head & Neck cancer datasets.
• Refactoring of the pre-existing dose prediction model to enable multi-GPU training with PyTorch. Increasing the resolution of the predicted voxel grid using multi-GPU training, splitting into subvolumes and/or further strategies to decrease memory consumption (e.g., Automatic Mixed Precision). Optimization of training time and GPU usage by moving pre-processing steps ahead of the training phase.
• Extension of the pre-existing dose prediction model by implementing an on-line augmentation pipeline tailored to the task of dose prediction. Hyperparameter optimization on the validation dataset.
• Optional: Exploration of additional deep learning architectures for radiotherapy dose prediction including novel techniques like diffusion models [7, 8] and transformer architectures [9, 10].
• Optional: Adaptation of the developed dose prediction pipeline to lung tumors with a dataset of 130 patient plans.
• Detailed evaluation of the deep learning dose prediction performance on the test dataset comparing automated treatment plans to manually created treatment plans. Comparison of the different deep learning approaches with regard to accuracy and inference time. Five training and inference repeats to enable statistical analysis.
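The subvolume splitting mentioned in the points above could be approached roughly as follows. This is a minimal sketch that only computes the index ranges of overlapping subvolumes; the grid shape, patch size, and stride are made-up values, and the function names are hypothetical rather than part of the existing model code.

```python
from itertools import product


def axis_starts(size, patch, stride):
    """Start indices of windows of length `patch` that cover [0, size)."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    if patch >= size:
        return [0]  # a single window already covers the whole axis
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # extra window flush with the far border
    return starts


def subvolume_slices(shape, patch_shape, stride_shape):
    """Return one tuple of slices per subvolume of a 3D voxel grid."""
    per_axis = [
        axis_starts(size, patch, stride)
        for size, patch, stride in zip(shape, patch_shape, stride_shape)
    ]
    return [
        tuple(
            slice(start, min(start + patch, size))
            for start, patch, size in zip(corner, patch_shape, shape)
        )
        for corner in product(*per_axis)
    ]


# Hypothetical grid of 10 x 10 x 5 voxels, tiled with 4 x 4 x 8 patches.
slices = subvolume_slices((10, 10, 5), (4, 4, 8), (4, 4, 8))
```

Each tuple can be used to index a 3D array so that a model processes one subvolume at a time. How overlapping predictions are merged afterwards (for example by averaging) is a separate design choice that this sketch does not cover.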

If you are interested in the project, please send your request to: johann.brand@uk-erlangen.de
Having prior experience with building neural networks in Python, especially using frameworks such as PyTorch or TensorFlow, will greatly help to develop the project.
Link Prediction on Utility Networks Using Graph Neural Networks
Abstract:
A utility network is a collection of physical infrastructure components such as pipes, valves, pumps, etc. that supply utilities like heat, water, electricity, and gas throughout a city. Structurally, the network of interconnected components can be replicated digitally with the help of GIS measurements. An accurate digital representation is crucial to maintain data integrity and ensure operational reliability. However, when physical components are digitally represented, measurement inaccuracies are introduced, diminishing the reliability of the digital model and impeding the process of deriving meaningful information. These inaccuracies often appear as missing pipes or disconnected networks due to translational errors. This thesis aims to formulate and tackle this problem as a graph-based link (edge) prediction task using Deep Learning (DL), through Graph Neural Networks (GNNs). We apply the theory and methodology of SEAL (learning from Subgraphs, Embeddings, and Attributes for Link prediction) [1] to the domain of utility networks. SEAL extracts local enclosing subgraphs around a link to learn patterns and predict the existence of links. Additionally, we compare pre-trained node2vec embeddings with embeddings learned simultaneously with the GNN model, while experimenting with two different graph structures: homogeneous and heterogeneous-bipartite representations. We applied the methodology to real-world heat and water networks from Denmark. Overall, the pre-trained node2vec embeddings consistently outperformed those learned simultaneously with the GNN model. The optimal choice of graph structure varied between the heat and water networks. Our experiments on the heat network show that the heterogeneous-bipartite representation yielded better results, with an AUC score of 98% on the test set. In the case of the water networks, the heterogeneous-bipartite and the homogeneous representations produced comparable results, with an AUC score of 95%.
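The enclosing-subgraph extraction that SEAL relies on can be sketched in a few lines. The example below assumes an undirected graph stored as an adjacency dictionary and a toy five-node pipe network; the node names and function names are hypothetical, and the sketch omits the node labeling and the GNN itself.

```python
from collections import deque


def k_hop_nodes(adj, source, hops):
    """Return all nodes within `hops` edges of `source` (breadth-first search)."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adj.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


def enclosing_subgraph(adj, u, v, hops=1):
    """Nodes and edges of the h-hop enclosing subgraph around candidate link (u, v)."""
    nodes = k_hop_nodes(adj, u, hops) | k_hop_nodes(adj, v, hops)
    edges = set()
    for a in nodes:
        for b in adj.get(a, ()):
            # The candidate link itself is left out so it cannot leak the label.
            if b in nodes and {a, b} != {u, v}:
                edges.add(tuple(sorted((a, b))))
    return nodes, edges


# Toy network: a chain of five components A - B - C - D - E.
adj = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
nodes, edges = enclosing_subgraph(adj, "B", "C", hops=1)
```

For the candidate link between B and C, the 1-hop subgraph contains the nodes A, B, C and D and the edges A-B and C-D. A model would then classify this subgraph to decide whether the link B-C is likely to exist.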
This thesis is part of the “UtilityTwin” project.
Emotion Recognition Project
Parkinson’s Disease Assessment Using Gait Analysis
Aphasia Assessment Using Deep Learning
Investigation of appropriate measures for the automated preparation and processing of modifications in diagnostic data for commissioning in vehicle production
Metal Artifact Reduction in Computed Tomography Using Deep Learning Method
Building Knowledge Graphs from Legal Texts: Enhancing Decision Support with Applications in Formula 1
Legal documents, such as the FIA rulebook, are complex and difficult to navigate. Understanding these texts is time-consuming and prone to error. This thesis proposes using Natural Language Processing (NLP) and Knowledge Graphs (KGs) to transform legal texts into queryable, visual formats that simplify decision-making. Formula 1 will serve as a case study.
Objectives:
• Develop a pipeline to convert legal texts into navigable knowledge graphs.
• Create a queryable system for understanding relationships, exceptions, and dependencies.
• Detect inconsistencies and ambiguities in legal texts.
• Generalize the framework to apply to multiple legal domains.
Approach:
• Theoretical: Study legal text syntax/semantics, NLP techniques (e.g., BERT, GPT), and KG principles for modeling legal complexities.
• Practical:
o Build the KG using tools like Neo4j and visualize relationships between entities.
o Use ML algorithms to flag ambiguities or conflicts.
o Develop a natural language query interface for user-friendly interaction.
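As a rough illustration of the pipeline idea, the sketch below builds a small cross-reference graph from rule texts and runs two simple queries on it. It uses a plain dictionary instead of Neo4j and a regular expression instead of an NLP model. The rule texts are invented for the example and are not taken from any real rulebook, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches references such as "Article 3.2" and captures the article number.
ARTICLE_REF = re.compile(r"Article\s+(\d+(?:\.\d+)*)")


def build_reference_graph(rules):
    """Map each article id to the set of article ids its text refers to."""
    graph = defaultdict(set)
    for article_id, text in rules.items():
        for target in ARTICLE_REF.findall(text):
            if target != article_id:  # ignore self-references
                graph[article_id].add(target)
    return dict(graph)


def referenced_by(graph, article_id):
    """Return the articles whose text refers to `article_id`."""
    return sorted(src for src, targets in graph.items() if article_id in targets)


def dangling_references(graph, rules):
    """Return (source, target) pairs where the target article is not defined."""
    return sorted(
        (src, target)
        for src, targets in graph.items()
        for target in targets
        if target not in rules
    )


# Invented rule texts, for illustration only.
rules = {
    "1.1": "A car must comply with Article 2.1 unless Article 3.2 applies.",
    "2.1": "The reference condition is defined here.",
    "3.2": "Exceptions to Article 2.1 are listed in Article 9.9.",
}
graph = build_reference_graph(rules)
dependents = referenced_by(graph, "2.1")
problems = dangling_references(graph, rules)
```

Here `dependents` lists the articles that depend on Article 2.1, and `problems` flags the reference to the undefined Article 9.9. A dangling reference is one simple kind of inconsistency that such a graph could make visible.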