Index
Diffusion Transformer for CT artifact compensation
Computed Tomography (CT) is one of the most important modalities in modern medical imaging, providing invaluable cross-sectional anatomical information crucial for diagnosis, treatment planning, and disease monitoring. Despite its widespread utility, the quality of CT images can be significantly degraded by various artifacts arising from physical limitations, patient-related factors, or system imperfections. These artifacts, manifesting as streaks, blurs, or distortions, can obscure critical diagnostic details, potentially leading to misinterpretations and compromising patient care. While traditional iterative reconstruction and early deep learning methods have offered partial solutions, they often struggle with complex artifact patterns or may introduce new inconsistencies. Recently, diffusion models have emerged as a powerful generative paradigm, demonstrating remarkable success in image synthesis and restoration tasks by progressively denoising an image from a pure noise distribution. Concurrently, Transformer architectures, with their inherent ability to capture long-range dependencies via self-attention mechanisms, have shown promise in various vision tasks. This thesis investigates the potential of Diffusion Transformers for comprehensive CT artifact compensation. By combining the iterative refinement capabilities of diffusion models with the global contextual understanding of Transformers, this work aims to develop a robust framework capable of effectively mitigating a wide range of CT artifacts, thereby enhancing image quality and improving diagnostic reliability. The research covers the design, implementation, and rigorous evaluation of such a model, comparing its performance against existing state-of-the-art techniques.
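To make the intended approach concrete, below is a minimal sketch (in PyTorch) of a DiT-style denoiser conditioned on an artifact-corrupted CT slice. All module names, sizes, and the simple channel-concatenation conditioning scheme are assumptions for illustration, not the thesis architecture.

# Minimal, illustrative sketch of a DiT-style denoiser conditioned on an
# artifact-corrupted CT slice (all sizes and names are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDiTDenoiser(nn.Module):
    def __init__(self, img_size=64, patch=8, dim=256, depth=4, heads=8):
        super().__init__()
        self.patch = patch
        n_patches = (img_size // patch) ** 2
        # Two input channels: noisy slice x_t and the corrupted CT as condition.
        self.embed = nn.Conv2d(2, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        self.t_embed = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, patch * patch)  # per-patch noise prediction

    def forward(self, x_t, cond, t):
        b, _, h, w = x_t.shape
        tokens = self.embed(torch.cat([x_t, cond], dim=1))     # (B, dim, h/p, w/p)
        tokens = tokens.flatten(2).transpose(1, 2) + self.pos  # (B, N, dim)
        tokens = tokens + self.t_embed(t.view(-1, 1).float())[:, None, :]
        eps = self.head(self.blocks(tokens)).transpose(1, 2)   # (B, p*p, N)
        # Reassemble the non-overlapping patches into a full noise map.
        return F.fold(eps, (h, w), kernel_size=self.patch, stride=self.patch)

# Illustrative forward pass: predict the noise in a 64x64 slice at timestep t.
model = TinyDiTDenoiser()
x_t = torch.randn(2, 1, 64, 64)    # noisy slice
cond = torch.randn(2, 1, 64, 64)   # artifact-corrupted CT slice (condition)
t = torch.randint(0, 1000, (2,))
eps_hat = model(x_t, cond, t)      # (2, 1, 64, 64)

In a full reverse-diffusion loop, this predicted noise would be used to step x_t toward a cleaned slice, while the conditioning channel keeps the sampling anchored to the corrupted input.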
From Prompt to Command: Adaptation of LLMs for Robotic Task Execution in Manufacturing
Fast heart sound detection using audio fingerprinting
Style-based Handwriting Generation with LCM Diffusion Transformer
Depth-Aware Detector Localization in Freehand X-Ray Imaging
Surrogate Model for Physics-Informed Lifetime Prediction in Power Electronics Based on Mission Profiles
Vision-Language Models for Pathology Report Generation from Gigapixel Whole-Slide Images
Context-Aware Emotion Recognition from Pictures using Frozen CLIP
Dynamic Gap Closure Forecasting in the DAX Index
Evaluation and optimization of an implicit neural representation framework for markerless tumor tracking during radiotherapy
In the radiotherapy of tumors, a precise definition of the tumor volume is essential in order to keep the radiation exposure of the surrounding tissue as low as possible. For this purpose, a planning CT scan is acquired before treatment and used to define the area to be irradiated, also known as the planning target volume (PTV). The PTV is always chosen larger than the actual tumor volume to ensure that a sufficiently high dose is applied despite uncertainties such as positioning errors or movement of the tumor volume due to respiration [1]. Especially in the thorax and abdomen, intrafractional motion due to respiration and physiological changes is considerable. To compensate for it, the respiratory motion can be measured using external surrogates and imaging techniques, and its extent can be restricted using special breathing techniques. However, these methods only allow indirect inference of the tumor position. Although the tumor motion can be measured using implanted markers, this is an additional invasive procedure that carries corresponding risks and delays the start of treatment [2].

For the most accurate description of tumor motion, it would be advantageous to automatically segment the tumor on the fluoroscopic x-ray images of the linear accelerator and track its position in real time. Since the low soft-tissue contrast in the fluoroscopy projection images makes it difficult to distinguish the tumor from surrounding structures, tracking the tumor in a synthesized 3D scan volume and then projecting its location onto the 2D x-ray image could improve the segmentation quality. Shao et al. [3] recently presented such an approach with the dynamic reconstruction and motion estimation (DREME) framework. During training, they divide the 3D tracking into two separate tasks: first, motion estimation, consisting of a CNN encoder and a B-spline-based interpolant; and second, reconstruction of a reference CBCT scan from the pre-treatment dynamic CBCT projections using implicit neural representations (INR). During inference, the network receives the x-ray projections as input, estimates the motion, and deforms the reference CBCT volume to synthesize the current real-time CBCT [3].

The goal of this thesis is to re-implement the DREME framework and evaluate its performance on our own dataset of abdominal tumors, since Shao et al. [3] report results only on a digital phantom and a lung dataset. Furthermore, their reported training time of four hours is not feasible in the current clinical workflow; this thesis therefore aims to explore optimization techniques, e.g. pre-training on the planning CT, to reduce the training time.
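As a rough illustration of the INR-plus-deformation idea behind DREME, the sketch below represents the reference volume as a coordinate network queried at motion-warped points. Everything here (names, shapes, the linear motion model, and the omission of the temporal B-spline interpolant) is an assumption for illustration; see Shao et al. [3] for the actual architecture.

# Illustrative sketch: an implicit neural representation (INR) of a reference
# CBCT volume, sampled at coordinates displaced by a low-dimensional motion
# model. All names and sizes are assumptions, not the DREME code.
import torch
import torch.nn as nn

class ReferenceINR(nn.Module):
    """Maps 3D coordinates in [-1, 1]^3 to attenuation values."""
    def __init__(self, hidden=256, layers=4):
        super().__init__()
        mods, in_dim = [], 3
        for _ in range(layers):
            mods += [nn.Linear(in_dim, hidden), nn.ReLU()]
            in_dim = hidden
        mods.append(nn.Linear(hidden, 1))
        self.net = nn.Sequential(*mods)

    def forward(self, xyz):                     # (N, 3) -> (N, 1)
        return self.net(xyz)

def warped_intensity(inr, xyz, coeffs, basis):
    """Deform query points with a low-dimensional motion model, then sample the INR.

    coeffs: (K,) motion coefficients, e.g. regressed by a CNN encoder from the
            current x-ray projection (the temporal B-spline part is omitted).
    basis:  (K, N, 3) displacement basis fields evaluated at the query points.
    """
    displacement = torch.einsum('k,knd->nd', coeffs, basis)
    return inr(xyz + displacement)

# Usage (illustrative): sample the deformed volume at 4096 random points.
inr = ReferenceINR()
xyz = torch.rand(4096, 3) * 2 - 1       # query coordinates in [-1, 1]^3
coeffs = torch.randn(3)                 # K = 3 motion components
basis = torch.randn(3, 4096, 3) * 0.01  # small displacement fields
values = warped_intensity(inr, xyz, coeffs, basis)  # (4096, 1)

In this framing, real-time inference only requires regressing a handful of motion coefficients per frame, since the expensive part, the reference volume, is frozen in the INR weights.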
The thesis will include the following points:
- Literature review on INR-based deep learning methods;
- Implementation of the DREME framework for real-time motion tracking;
- Performance evaluation on our fluoroscopy dataset;
- Exploration of different strategies to reduce the training time (e.g. pre-training on the planning CT or other patients, higher parallelization, selective loss computation, etc.).
If you are interested in the project, please send your request to: pluvio.stephan@uk-erlangen.de
Prior experience in Python, deep learning, and PyTorch is required.