Index
Deep Learning Computed Tomography based on the Defrise and Clack Algorithm for Specific CBCT Orbits
The RoboCT system enables the exploration of questions that are not feasible with traditional rotating-table or gantry setups. Because the source and detector can be placed freely around the object, trajectories beyond the standard ones (e.g., circle or helix) become possible, which is essential for complex objects, region-of-interest (ROI) reconstructions, limited-angle imaging, and similar scenarios. However, reconstructing data acquired along such trajectories with an FBP-based algorithm like FDK is not straightforward. Currently, deviations from the circular trajectory are reconstructed using an algebraic reconstruction technique (ART). The parameterization of ART significantly affects reconstruction quality, especially under challenging conditions such as limited-angle acquisition, sparse sampling, or truncation artefacts. ART also has practical drawbacks: it is computationally more expensive than FBP, and the iterative process can only begin once data acquisition is complete.
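ART-style reconstruction can be sketched as a Kaczmarz iteration over the linear system A x = b, where A is the system matrix of the scan geometry. The sketch below is illustrative only; the function name, relaxation scheme, and stopping rule are assumptions, not the implementation used in practice:

```python
import numpy as np

def kaczmarz_art(A, b, n_iters=50, relax=1.0):
    """Kaczmarz-style ART iteration for the linear system A x = b.

    A is the (rays x voxels) system matrix, b the measured projections.
    Each sweep projects the current estimate onto every ray's hyperplane.
    The relaxation factor `relax` is one of the parameters whose choice
    strongly affects reconstruction quality, as noted above.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        for i in range(A.shape[0]):
            a_i = A[i]
            # move x onto the hyperplane a_i . x = b[i]
            x += relax * (b[i] - a_i @ x) / (a_i @ a_i) * a_i
    return x
```

Note that every update needs the full measurement vector b, which is why the iteration can only start once acquisition is complete.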
Theoretical descriptions of FBP-based reconstruction for general trajectories exist [1, 2]. However, the filtering step cannot be performed with a shift-invariant filter kernel as in FDK; instead, trajectory-specific filter kernels must be derived and determined, making the process complex and time-consuming.
This master’s thesis aims to investigate whether shift-invariant or shift-variant, trajectory-specific filters can be learned using known operator learning [3]. Building on previous work on filter learning [4, 5], the implementation will be carried out in the PYRO-NN framework [6]. The study will also explore whether these filters can be learned purely from simulated data using a specially designed phantom, and how well they generalize to real data with different objects under the specified trajectory, following previous research [5].
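The core idea can be sketched in a few lines: in filtered backprojection, each projection is filtered in the frequency domain with the ramp kernel; in a known-operator setting, the fixed frequency weights become trainable parameters while projection and backprojection remain fixed, known operators. The sketch below (numpy, names and shapes illustrative, not the PYRO-NN API) contrasts the shift-invariant case with a shift-variant variant that applies one learned weight vector per view:

```python
import numpy as np

def ramp_filter_rows(sinogram):
    """Filter each detector row with the shift-invariant ramp (Ram-Lak) kernel.

    In known operator learning, the fixed `ramp` weights below would be
    replaced by trainable parameters initialized to this analytic filter.
    """
    n_det = sinogram.shape[1]
    freqs = np.fft.fftfreq(n_det)   # detector frequency axis
    ramp = np.abs(freqs)            # |w|: same kernel for every projection
    return np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

def shift_variant_filter(sinogram, weights):
    """Trajectory-specific filtering: one learned frequency response per view.

    weights: array of shape (n_angles, n_det), e.g. learned per projection
    angle of a non-standard trajectory.
    """
    return np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * weights, axis=1))
```

In the shift-invariant case a single weight vector is broadcast over all views; the shift-variant case lets the learned filter depend on the position along the trajectory.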
[1] Defrise, Michel, and Rolf Clack. “A cone-beam reconstruction algorithm using shift-variant filtering and cone-beam backprojection.” IEEE Transactions on Medical Imaging 13.1 (1994): 186-195.
[2] Oeckl, Steven. “Rekonstruktionsverfahren mit der approximativen Inversen und einer neuen Formel zur Inversion der Röntgen-Transformation.” (2014).
[3] Maier, Andreas K., et al. “Learning with known operators reduces maximum error bounds.” Nature Machine Intelligence 1.8 (2019): 373-380.
[4] Syben, Christopher, et al. “Precision learning: reconstruction filter kernel discretization.” arXiv preprint arXiv:1710.06287 (2017).
[5] Syben, Christopher, et al. “Known operator learning enables constrained projection geometry conversion: Parallel to cone-beam for hybrid MR/X-ray imaging.” IEEE Transactions on Medical Imaging 39.11 (2020): 3488-3498.
[6] Syben, Christopher, et al. “PYRO-NN: Python reconstruction operators in neural networks.” Medical Physics 46.11 (2019): 5110-5115.
Robot Movement Planning for Obstacle Avoidance using Reinforcement Learning
Obstacle avoidance for robotic arms is an important issue in robot control. Constrained by factors such as equipment, cost, and labor, some application scenarios require the robot to plan its own motion to reach a goal position or state.
In real working environments, where obstacles vary widely in their properties, traditional search algorithms struggle to cope with large-scale spaces and continuous-action requirements. The artificial potential field (APF) method is a widely used approach to obstacle-avoidance path planning, but it also has shortcomings and can become trapped in local optima in certain situations. Building on the APF method, reinforcement learning (RL) can in theory achieve optimization in continuous spaces. Combining RL with modifications to the traditional APF method, we define the states and actions and design the reward function, a central component of reinforcement learning, to form a motion-planning agent in the 3D world, so that the robot end-effector reaches the goal position while the entire robot arm avoids collisions with obstacles.
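One natural way to combine the two ideas is to derive a dense RL reward from the classic APF potentials: an attractive term pulling the end-effector toward the goal and a repulsive term active within an influence radius of each obstacle. The sketch below is illustrative; the constants, names, and exact reward shaping are assumptions, not the thesis's formulation:

```python
import numpy as np

def apf_reward(ee_pos, goal, obstacles, d0=0.5, k_att=1.0, k_rep=0.1):
    """Dense reward built from artificial-potential-field terms.

    The attractive potential grows quadratically with distance to the
    goal; the repulsive potential of each obstacle is active only within
    its influence radius d0. Negating the total potential turns lower
    potential (closer to goal, farther from obstacles) into higher reward.
    """
    u_att = 0.5 * k_att * np.sum((ee_pos - goal) ** 2)
    u_rep = 0.0
    for obs in obstacles:
        d = np.linalg.norm(ee_pos - obs)
        if d < d0:
            u_rep += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2
    return -(u_att + u_rep)
```

A pure APF controller follows the negative gradient of this potential and can get stuck in local optima; using the potential only as a reward signal lets the RL agent learn to escape such configurations.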
Sinogram Analysis Using Attention U-Net: A Methodological Approach to Defect Detection and Localization in Parallel Beam Computed Tomography
The emergence of deep learning has transformed image processing, notably in computed tomography (CT). Nevertheless, most image-processing algorithms operate on processed or reconstructed images and overlook the raw sensor data. This thesis instead focuses on unprocessed CT data, the sinogram. Within this framework, we present a three-step deep learning algorithm based on an Attention U-Net architecture that identifies and analyzes defects within objects without resorting to image reconstruction. The first step is sinogram segmentation, which extracts defect masks from the sinogram. Next, instance segmentation separates these masks so that each defect is individualized. Finally, the isolated masks undergo detailed defect analysis. Our experiments are conducted on both simulated datasets and real-world data.
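A short worked example shows why defect detection directly in the sinogram is plausible: under parallel-beam geometry, a point-like defect at (x0, y0) projects to detector position s(θ) = x0·cos θ + y0·sin θ, i.e., it traces a sinusoid across views (hence the name "sinogram"), a structured pattern a segmentation network can learn. The function name is illustrative:

```python
import numpy as np

def point_trace(x0, y0, thetas):
    """Detector position of a point (x0, y0) in a parallel-beam sinogram.

    For each projection angle theta, the point maps to
    s = x0*cos(theta) + y0*sin(theta): a sinusoid across views whose
    amplitude equals the point's distance from the rotation center.
    """
    return x0 * np.cos(thetas) + y0 * np.sin(thetas)
```

Localizing such a trace in the sinogram therefore determines the defect's position in the object without ever reconstructing an image.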
Fine-tune large language models for radiation oncology
Optimization and Evaluation of Deformable Image Registration Accuracy for Computed Tomography in Radiation Therapy
Dilemma Zone Prediction with Floating Car Data by Using Machine Learning Approaches
Multipath detection in GNSS signals measured in a position sensor using a pattern recognition approach with neural networks
Comparative Analysis of Different Deep Learning Models for Whole Body Segmentation
Large Language Model for Generation of Structured Medical Report from X-ray Transcriptions
Motivation
Large language models (LLMs) are widely applied in natural language processing. In recent years they have made significant advances in abstractive question answering: they can understand a question in context, much like humans, and generate contextually appropriate answers rather than relying solely on exact word matches. This potential extends to medicine, where LLMs can play a crucial role in generating well-structured medical reports; achieving this goal requires careful fine-tuning. Abstractive question answering, often referred to as generative question answering, generates answers (subject to constraints on word count) using techniques such as beam search. Ideally, the language model should possess few-shot learning capabilities for downstream tasks. The goal of this thesis is to generate a structured medical report from the medical diagnosis of X-ray images.
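Beam search, mentioned above, keeps the k highest-scoring partial sequences at each decoding step instead of committing to the single most likely token. A minimal, toy sketch (the scoring callback stands in for the decoder's softmax output; all names are illustrative):

```python
import math

def beam_search(start, step_logprobs, beam_width=3, max_len=5, eos="</s>"):
    """Minimal beam search over token sequences (toy illustration).

    step_logprobs(seq) must return a dict {token: log_prob} for the next
    token given the sequence so far; in a real LLM this would come from
    the decoder's output distribution.
    """
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:            # finished hypotheses are kept as-is
                candidates.append((seq, score))
                continue
            for tok, lp in step_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        # keep only the beam_width highest-scoring hypotheses
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams[0]
```

In practice, libraries such as Hugging Face Transformers expose beam search through generation parameters rather than requiring a hand-written loop; the sketch only makes the mechanism explicit.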
Background
The dataset comprises two columns: standard reports and structured reports. The model’s objective is to generate structured reports from the standard-report context. Leading transformer models, such as RoBERTa [1], BART [2], XLNet [3], and T5 [4], excel in generative (abstractive) question answering across multiple languages. These models come in various configurations with different parameter counts, each with unique strengths; some excel in downstream tasks through zero-shot or few-shot learning. For instance, models such as FLAN-T5 are fine-tuned on more than 1,000 additional downstream tasks. Fine-tuning these models on a specialized sinusitis dataset is therefore essential. The core pipeline for processing sentences within a transformer model includes positional encoding, multi-head attention for calculating attention scores with respect to other parts of the sentence, residual connections, normalization layers, and feed-forward layers. Practical implementations of these models and their tokenizers are readily accessible through the Hugging Face hub. Model accuracy can also be improved using ensemble methods.
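Of the pipeline components listed above, positional encoding is the easiest to show concretely. The sketch below implements the sinusoidal encoding from the original Transformer (assuming an even model dimension); models such as RoBERTa and BART instead learn this table as a trainable embedding matrix:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding table of shape (seq_len, d_model).

    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    Assumes d_model is even. The table is added to the token embeddings
    so that attention can distinguish token positions.
    """
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]     # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dimensions
    pe[:, 1::2] = np.cos(angles)             # odd dimensions
    return pe
```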
Research Objective
In summary, this research aims to automatically convert medical diagnoses from X-ray transcriptions into structured reports using LLMs. The aims of this project are:
- Use data augmentation techniques to fine-tune pre-trained LLMs with low-resource data.
- Investigate the suitability of different LLMs, e.g., T5, to create structured medical reports.
- Evaluate the proposed approach with open-source radiology reports.
References
[1] S. Ravichandiran, Getting Started with Google BERT: Build and train state-of-the-art natural language processing models using BERT. Packt Publishing Ltd., 2021.
[2] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, “BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,” arXiv preprint arXiv:1910.13461, 2019.
[3] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, “XLNet: Generalized autoregressive pretraining for language understanding,” arXiv preprint arXiv:1906.08237, 2019.
[4] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” arXiv preprint arXiv:1910.10683, 2019.