Index

Autoregressive Model Based on the Hilbert Curve for CT Artifact Removal

Leakage Detection and Localization in Water Distribution Networks Using Hybrid Modeling and Data-Driven Techniques

Leakage in Water Distribution Networks (WDNs) remains a persistent challenge for utilities, causing significant Non‑Revenue Water (NRW) losses and reducing operational efficiency. Contemporary reviews emphasize the importance of combining hydraulic modelling with data‑driven methods to better support detection and localization of leaks in operational systems [1], [2].
This thesis presents an integrated hybrid framework for leakage detection and localization within a District Metered Area (DMA). A calibrated hydraulic model, developed using widely adopted tools such as EPANET and WNTR, forms the foundation for understanding system behaviour under both normal and leak conditions [3], [4].
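As a minimal sketch of how such a model can be exercised in WNTR, the snippet below simulates a baseline run and a hypothetical leak and compares the resulting pressures; the file name, junction ID, and leak parameters are illustrative placeholders, not the thesis's calibrated DMA model:

import wntr

# Load the calibrated EPANET model of the DMA (placeholder file name).
wn = wntr.network.WaterNetworkModel("dma_calibrated.inp")

# Baseline simulation under normal demand conditions.
baseline = wntr.sim.WNTRSimulator(wn).run_sim()
baseline_pressure = baseline.node["pressure"]

# Reload the model, add a hypothetical leak at one junction, and simulate again.
wn = wntr.network.WaterNetworkModel("dma_calibrated.inp")
leak_node = wn.get_node("J-17")                      # placeholder junction ID
leak_node.add_leak(wn, area=0.0005,                  # orifice area in m^2
                   start_time=2 * 3600, end_time=10 * 3600)
leak = wntr.sim.WNTRSimulator(wn).run_sim()
leak_pressure = leak.node["pressure"]

# Pressure residuals between the two runs are the raw material for the
# detection indicators and zone signatures described below.
residuals = baseline_pressure - leak_pressure
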
For leak detection, SCADA pressure and flow measurements are processed into cleaned pressure signals from which statistical indicators are derived. These indicators are used to identify periods in which leakage is likely to be present, thereby establishing when a leak may have occurred in the network.
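One concrete form such an indicator can take is a one-sided CUSUM on deviations of measured pressure from a rolling baseline; the sketch below is an illustration under stated assumptions (window length, drift, and threshold are placeholder values, and the sensor column in the usage comment is hypothetical):

import numpy as np
import pandas as pd

def cusum_leak_indicator(pressure: pd.Series, drift: float = 0.05,
                         threshold: float = 1.5) -> pd.Series:
    """Flag time steps where a one-sided CUSUM of sustained pressure drops
    exceeds a threshold, marking periods where leakage is likely present."""
    # Deviation of each sample from a rolling median baseline (24-sample window).
    baseline = pressure.rolling(window=24, min_periods=12).median()
    deviation = (baseline - pressure).fillna(0.0)

    # Accumulate only sustained negative pressure deviations.
    s = np.zeros(len(deviation))
    for t in range(1, len(deviation)):
        s[t] = max(0.0, s[t - 1] + deviation.iloc[t] - drift)
    return pd.Series(s > threshold, index=pressure.index, name="leak_suspected")

# Example usage on an hourly SCADA series (hypothetical sensor column):
# alarms = cusum_leak_indicator(scada["pressure_inlet"])
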
For leak localization, the hydraulic model is used to simulate representative leak scenarios. The DMA is partitioned into hydraulically coherent zones using established graph‑based clustering approaches [5]. Simulated responses are then used to characterize each zone’s leak
behaviour. A machine‑learning‑based zone classifier provides an estimate of the most likely affected zone, after which a prototype‑based similarity comparison is applied to determine a prioritized set of pipes for investigation [6], aided by rank‑aggregation principles that support consistent prioritization across multiple indicators [7].
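A minimal sketch of this final ranking step, assuming each pipe's simulated leak scenarios have been condensed into a prototype vector of pressure residuals at the sensors and using cosine similarity as one possible comparison measure (the exact scoring used in the thesis may differ):

import numpy as np

def rank_pipes_by_prototype_similarity(observed_signature: np.ndarray,
                                       pipe_prototypes: dict) -> list:
    """Rank candidate pipes by cosine similarity between the observed
    pressure-residual signature and each pipe's simulated leak prototype."""
    def cosine(a, b):
        return float(np.dot(a, b) /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = {pipe: cosine(observed_signature, proto)
              for pipe, proto in pipe_prototypes.items()}
    # Highest similarity first: the top of the list forms the prioritized
    # candidate set handed over for field inspection.
    return sorted(scores, key=scores.get, reverse=True)
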
The following aspects are covered within the scope of this Master's thesis:

  • Reviewing literature on leakage management and hybrid modelling approaches for detection and localization in WDNs [1], [2].
  • Developing and calibrating a hydraulic model using EPANET and WNTR tools, incorporating real-world data from a Danish utility.
  • Implementing a detection workflow based on SCADA pressure signals and statistical indicators to identify periods of likely leakage.
  • Partitioning the DMA into hydraulic zones using spectral clustering to support structured and interpretable localization [5] (a sketch follows this list).
  • Generating simulated leak scenarios to produce zone‑level behavioural signatures for localization.
  • Applying a machine‑learning‑based zone classifier and a prototype‑based pipe ranking method, supported by rank‑aggregation concepts to produce a list of candidate leak locations [6], [7].
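
For the spectral-clustering step referenced above, a minimal sketch of how the DMA could be partitioned using the pipe-connectivity graph and scikit-learn; the number of zones and the unweighted adjacency are illustrative simplifications:

import networkx as nx
import numpy as np
from sklearn.cluster import SpectralClustering

def partition_dma_into_zones(wn, n_zones: int = 5) -> dict:
    """Assign each node of a WNTR WaterNetworkModel to one of n_zones
    hydraulically coherent zones via spectral clustering on connectivity."""
    # Build an undirected pipe-connectivity graph from the network model.
    G = nx.Graph()
    for _, pipe in wn.pipes():
        G.add_edge(pipe.start_node_name, pipe.end_node_name)

    nodes = list(G.nodes())
    adjacency = nx.to_numpy_array(G, nodelist=nodes)
    labels = SpectralClustering(n_clusters=n_zones, affinity="precomputed",
                                assign_labels="kmeans",
                                random_state=0).fit_predict(adjacency)
    return dict(zip(nodes, labels))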

Together, this two‑stage hybrid method unifies statistical detection, physics‑based simulation, and machine learning to determine both when a leak occurs and where it is likely located, with deliverables that are directly actionable for field inspections.

References
[1] Puust, R., Kapelan, Z., Savić, D. A., & Koppel, T. (2010). A review of methods for leakage management in pipe networks. Urban Water Journal.
[2] Adedeji, K. B., Hamam, Y., Abe, B. T., & Abu‑Mahfouz, A. M. (2017). Towards achieving a reliable leakage detection and localization algorithm for application in water piping networks: An overview. IEEE Access.
[3] Rossman, L. A. (2000). EPANET 2 Users Manual. U.S. EPA.
[4] Klise, K. A., Murray, R., & Haxton, T. (2020). Water Network Tool for Resilience (WNTR). U.S. EPA.
[5] Von Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and Computing.
[6] Snell, J., Swersky, K., & Zemel, R. (2017). Prototypical Networks for Few‑Shot Learning. NeurIPS.
[7] Pihur, V., Datta, S., & Datta, S. (2009). RankAggreg, an R package for weighted rank aggregation. BMC Bioinformatics.

Learning Slide Level Representations for Inflammatory Skin Disease Classification with Pathology Foundation Models and Graph Neural Networks

Enhancing explainability of Time Series Forecasting in Smart Infrastructures

Background:

Time-series forecasting guides decisions in finance, energy, supply chains, and healthcare. As automated systems spread, organizations need not only accurate predictions but also uncertainty quantification, i.e., a measure of model confidence, to enable risk-aware choices. Understanding forecast uncertainty builds on time-series fundamentals, the distinction between aleatoric and epistemic components, and probabilistic modeling; Bayesian frameworks, for instance, capture both data and parameter uncertainty. Clear explanations of this uncertainty help non-technical stakeholders trust forecasts, support adoption, and take risk-aware actions, which motivates both estimating uncertainty and communicating it effectively.

Observed Gap and Motivation: Despite rapid progress in probabilistic time series forecasting, much of the existing work focuses either on post-hoc explanations of forecasts or on generating quantile-based outputs that conflate aleatoric and epistemic uncertainty without explicitly modeling the underlying sources of uncertainty. This restricts their ability to disentangle model uncertainty from data-related variability and to explain why certain forecasts are more uncertain than others. This lack of interpretability has practical consequences, and addressing it is therefore critical. A method that can quantify uncertainty while attributing it to meaningful drivers would enable users to understand not only how uncertain a prediction is but also why that uncertainty arises.

Research Objectives: Based on these gaps, this thesis focuses on two core objectives:
1. Investigate whether traditionally deterministic models, such as N-HiTS and TimesNet, can be adapted with quantile-based loss functions to provide uncertainty estimates without compromising predictive performance.
2. Develop an approach for explaining forecast uncertainty by analyzing how covariates influence the aleatoric component of the predictive distribution, while keeping the epistemic component intact.
Together, these objectives aim to advance uncertainty estimation from purely descriptive intervals toward explanations that reveal the factors driving uncertainty, enabling more interpretable and
trustworthy forecasting systems.
Outcomes: (i) A framework for the practical use of traditionally deterministic models for uncertainty estimation and calibration. (ii) A novel approach for inherently explaining the uncertainty of a time series forecast based on its covariates.
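As a minimal sketch of the quantile-loss adaptation referred to in the first objective, the pinball loss below can be attached to an otherwise deterministic forecaster whose output layer is widened to emit one value per quantile level; the quantile levels and output layout are illustrative assumptions:

import torch

def pinball_loss(y_pred: torch.Tensor, y_true: torch.Tensor,
                 quantiles=(0.1, 0.5, 0.9)) -> torch.Tensor:
    """Quantile (pinball) loss; y_pred's last dimension holds one
    prediction per requested quantile level."""
    losses = []
    for i, q in enumerate(quantiles):
        err = y_true - y_pred[..., i]
        losses.append(torch.max(q * err, (q - 1.0) * err))
    return torch.mean(torch.stack(losses))

# Training against this loss turns the point forecaster into a quantile
# forecaster; the spread between, e.g., the 0.1 and 0.9 quantiles then serves
# as a data-driven uncertainty interval whose drivers can be explained further.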


This thesis is part of the “UtilityTwin” project. The proposed work will be conducted in close collaboration with Siemens AG (Smart Infrastructure), ensuring both academic relevance and industrial applicability.

Robust Tampered Text Detection in Document Images Using Multimodal Deep Learning

The goal of this thesis is to develop a high-accuracy deep learning model for detecting tampered text in document images. This includes manipulations such as word replacement, copy-paste edits, and layout-based alterations. The focus is on building a multimodal architecture that combines visual layout features
and semantic textual content to improve detection accuracy and robustness across diverse document types and manipulation styles.
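As a minimal sketch of one way such a multimodal combination could be wired up, the late-fusion head below concatenates visual features (e.g. from a CNN over the document image) with text embeddings (e.g. from OCR tokens encoded by a language model); the feature dimensions and fusion strategy are illustrative assumptions, not the thesis's final architecture:

import torch
import torch.nn as nn

class LateFusionTamperHead(nn.Module):
    """Illustrative late-fusion classifier: concatenates per-region visual
    and textual feature vectors and predicts a tamper probability."""
    def __init__(self, visual_dim: int = 512, text_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(visual_dim + text_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, visual_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([visual_feats, text_feats], dim=-1)
        return torch.sigmoid(self.head(fused)).squeeze(-1)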

Synthetic Data Generation and Deep Learning-Based Object Detection and Segmentation for Interventional Devices in Cardiac and Neurovascular Fluoroscopy

Seminar Herculaneum Papyri

1. The Context

In 79 AD, the eruption of Mount Vesuvius buried the Herculaneum library, carbonizing hundreds of papyrus scrolls. For centuries, they were unreadable; opening them would turn them to dust. In 2023, the Vesuvius Challenge changed history. By combining high-resolution CT scans with advanced Computer Vision and Machine Learning, a global community of researchers successfully virtually unwrapped and read parts of these scrolls for the first time.

2. The Problem

While the challenge produced winning results, it also produced “Competition Code.” Competition code is written to win, not to be read. It is often highly optimized and experimental but lacks documentation, theoretical explanations, and clean structure. Crucially, there are no academic papers accompanying these repositories. We have the solution, but we are missing the explanation of the methodology and the mathematical foundations.

3. Your Task

Your goal is to bridge the gap between raw code and a reproducible scientific baseline. You will select a specific component of the challenge (e.g., Segmentation, Ink Detection, Flattening), dissect the code, and transform it into a well-understood, documented research tool. You are not just running scripts; you are performing digital archaeology on the software itself.

4. Organization & Logistics

This is a specialized Computer Vision seminar designed for students who want to deep-dive into applied machine learning and software reverse-engineering.

  • Format & Timeline: This is a Block Seminar taking place between February and May.
  • ECTS Credits: We offer both 5 ECTS and 10 ECTS versions of this seminar depending on the depth of the project and your study requirements.
  • Target Audience: We welcome students from the following study courses:
    • Computer Science
    • Artificial Intelligence
    • Medical Engineering

5. How to Apply

To apply for this seminar, please send an email to linda-sophie.schneider@fau.de and thomas.gorges@fau.de with the following:

  • Your Transcript of Records.
  • A short motivation statement (3–4 sentences) explaining why you want to work on this specific topic.
  • ECTS Preference: Please explicitly state whether you require the 5 ECTS or 10 ECTS version for your studies.

Important: We value brevity, so please keep your email short. Be aware that very long mails might be ignored.

AI-Driven Structured Reporting for Breast MRI Radiological Reports: Leveraging LLMs for Automated Label Extraction

Analyzing Methods for Efficient Language Model Adaptation with Domain-Specific Selective Layer Expansion

MetaMorph: A Unified Framework with Modular Designs for Joint Affine and Deformable Medical Image Registration