Index

Fine-tune large language models for radiation oncology

Optimization and Evaluation of Deformable Image Registration Accuracy for Computed Tomography in Radiation Therapy

Dilemma Zone Prediction with Floating Car Data by Using Machine Learning Approaches

Multipath detection in GNSS signals measured in a position sensor using a pattern recognition approach with neural networks

Comparative Analysis of Different Deep Learning Models for Whole Body Segmentation

Large Language Model for Generation of Structured Medical Report from X-ray Transcriptions

Motivation

Large language models (LLMs) have found wide application in natural language processing. In recent years, LLMs have made significant advances in abstractive question answering: rather than relying solely on exact word matches, they can understand a question in context, akin to humans, and generate a contextually appropriate answer. This potential extends to medicine, where LLMs can play a crucial role in generating well-structured medical reports; achieving this goal requires meticulous fine-tuning. Abstractive question answering, often referred to as generative question answering, produces answers of bounded length using decoding techniques such as beam search. Ideally, the language model should also possess few-shot learning capabilities for downstream tasks. The goal of this project is to generate a structured medical report based on the medical diagnosis from X-ray images.
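
For illustration, below is a minimal sketch of generative answering with beam search using the Hugging Face transformers library; the t5-small checkpoint and the transcription snippet are placeholders rather than the model or data of this project.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint; any seq2seq model from the Hugging Face hub works here.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Illustrative transcription snippet, not taken from the project dataset.
prompt = ("summarize: The frontal sinuses are clear. Mucosal thickening is noted "
          "in the left maxillary sinus.")
inputs = tokenizer(prompt, return_tensors="pt")

# Beam search with a bound on the answer length, as described above.
outputs = model.generate(**inputs, num_beams=4, max_new_tokens=64, early_stopping=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```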

Background

The dataset comprises two columns: standard reports and structured reports. The model's objective is to generate a structured report from the standard report as context. Leading transformer models such as RoBERTa [1], BART [2], XLNet [3], and T5 [4] excel at generative (abstractive) question answering across multiple languages. These models come in various configurations with different parameter counts, each with its own strengths; some handle downstream tasks well through zero-shot or few-shot learning. For instance, instruction-tuned models such as Flan-T5 can handle more than 1,000 additional downstream tasks. Fine-tuning these models on the specialized sinusitis dataset is therefore essential. The core pipeline for processing a sentence within a transformer includes positional encoding, multi-head attention that computes attention scores with respect to the other parts of the sentence, residual connections, normalization layers, and feed-forward layers. Practical implementations of these models and their tokenizers are readily available on the Hugging Face hub, and model accuracy can be further improved with ensemble methods.
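
As a concrete illustration of that pipeline, the following is a minimal PyTorch sketch of one transformer block; positional encoding is assumed to have been added to the input embeddings beforehand, and the layer sizes are placeholder defaults.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Multi-head attention + residual connections + normalization + feed-forward."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, attn_mask=None):
        # Multi-head self-attention with a residual connection and normalization.
        attn_out, _ = self.attn(x, x, x, attn_mask=attn_mask)
        x = self.norm1(x + self.dropout(attn_out))
        # Position-wise feed-forward layer, again with residual + normalization.
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x

x = torch.randn(2, 16, 512)          # (batch, sequence length, embedding size)
print(TransformerBlock()(x).shape)   # torch.Size([2, 16, 512])
```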

Research Objective

In summary, this research aims to automatically convert medical diagnoses from X-ray transcriptions into structured reports using LLMs. The objectives of this project are:

  • Use data augmentation techniques to fine-tune pre-trained LLMs with low-resource data.
  • Investigate the suitability of different LLMs, e.g., T5, for creating structured medical reports.
  • Evaluate the proposed approach with open-source radiology reports (see the evaluation sketch after this list).
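
As an illustration of the evaluation step, the sketch below scores generated structured reports against reference reports with ROUGE via the Hugging Face evaluate package; the report texts are placeholders.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["Findings: mucosal thickening of the left maxillary sinus."]
references = ["Findings: left maxillary sinus shows mucosal thickening."]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```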

References

[1] S. Ravichandiran, Getting Started with Google BERT: Build and train state-of-the-art natural language processing models using BERT. Packt Publishing Ltd., 2021.
[2] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy, V. Stoyanov, and L. Zettlemoyer, “Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,” arXiv preprint arXiv:1910.13461, 2019.
[3] Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. Salakhutdinov, and Q. V. Le, “Xlnet: Generalized autoregressive pretraining for language understanding,” arXiv preprint arXiv:1906.08237, 2019.
[4] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” arXiv preprint arXiv:1910.10683, 2019.

Style Transfer of High-resolution Photos to Artworks

Investigating the benefits of combining CNNs and transformer architecture for rail domain perception task

Thesis Description

Eye Tracking and Pupillometry for Cognitive Load Estimation in Tele-Robotic Surgery

Inferring the cognitive load of a surgeon during robotic surgery is important to ensure safe and effective outcomes for patients, as high cognitive load can lead to errors and impaired performance in robot command. Information about cognitive load can also be used in training to improve user skill.

One approach to estimating cognitive load is to utilize eye gaze and pupillometry measurements, which have already been demonstrated as a potential solution to this problem: the pupil diameter has been shown to be related to task difficulty [1–3].
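
As a simple illustration of how pupillometry can feed into a load estimate, the sketch below computes a baseline-normalized, smoothed pupil diameter; this is only a placeholder proxy, not the index-of-pupillary-activity measures from [1, 2].

```python
import numpy as np

def relative_pupil_dilation(diameter_mm, baseline_mm, window=30):
    """Moving-average pupil diameter relative to a resting baseline."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(diameter_mm, kernel, mode="valid")
    return (smoothed - baseline_mm) / baseline_mm

samples = 3.5 + 0.2 * np.random.randn(300)   # illustrative pupil-diameter samples (mm)
print(relative_pupil_dilation(samples, baseline_mm=3.4)[:5])
```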

In the scope of this work, eye gaze and pupillometry measurements, together with tool information, will be used to infer user skill and proficiency in robot command. To this end, the eye tracker must be calibrated to the da Vinci robot vision pipeline with a SPAAM-type calibration [4, 5], and tool-tracking methods for robotic surgery must be developed.
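
In essence, SPAAM [5] collects 3D–2D alignment correspondences (the user repeatedly aligns an on-screen point with a tracked 3D point) and solves for a 3×4 projection matrix up to scale. Below is a minimal sketch of the underlying Direct Linear Transform, assuming at least six correspondences are available; all coordinates are illustrative placeholders.

```python
import numpy as np

def dlt_projection(world_pts, screen_pts):
    """Estimate a 3x4 projection matrix from >= 6 3D-2D correspondences with
    the Direct Linear Transform, the linear solve behind SPAAM-style calibration."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, screen_pts):
        Xh = [X, Y, Z, 1.0]
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
        rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])
    A = np.asarray(rows)
    # The projection matrix spans the null-space direction of A
    # (right singular vector of the smallest singular value); scale is arbitrary.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

# Illustrative alignment data: 3D points in the robot/camera frame and the 2D
# screen positions the user aligned them with (placeholder values).
world = [(0.10, 0.20, 1.0), (0.30, -0.10, 1.2), (-0.20, 0.05, 0.9),
         (0.00, 0.30, 1.5), (0.25, 0.15, 1.1), (-0.10, -0.20, 1.3)]
screen = [(320, 240), (450, 200), (210, 260), (330, 150), (420, 230), (250, 300)]
print(dlt_projection(world, screen))
```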

 

References:

[1] Andrew T. Duchowski, Krzysztof Krejtz, Nina A. Gehrer, Tanya Bafna, and
Per Bækgaard. The low/high index of pupillary activity. In Proceedings of
the 2020 CHI Conference on Human Factors in Computing Systems, CHI
’20, page 1–12, New York, NY, USA, 2020. Association for Computing Machinery.

[2] Andrew T. Duchowski, Krzysztof Krejtz, Izabela Krejtz, Cezary Biele, Anna
Niedzielska, Peter Kiefer, Martin Raubal, and Ioannis Giannopoulos. The
index of pupillary activity: Measuring cognitive load vis-à-vis task difficulty
with pupil oscillation. In Proceedings of the 2018 CHI Conference on Human
Factors in Computing Systems, CHI ’18, page 1–13, New York, NY, USA,
2018. Association for Computing Machinery.

[3] Krzysztof Krejtz, Andrew T. Duchowski, Anna Niedzielska, Cezary Biele,
and Izabela Krejtz. Eye tracking cognitive load using pupil diameter and
microsaccades with fixed gaze. PLOS ONE, 13(9):1–23, 09 2018.

[4] Kenneth R. Moser, Mohammed Safayet Arefin, and J. Edward Swan. Impact
of alignment point distance and posture on SPAAM calibration of optical see-through
head-mounted displays. In 2018 IEEE International Symposium on
Mixed and Augmented Reality (ISMAR), pages 21–30, 2018.

[5] Mihran Tuceryan, Yakup Genc, and Nassir Navab. Single-Point Active
Alignment Method (SPAAM) for Optical See-Through HMD Calibration
for Augmented Reality. Presence: Teleoperators and Virtual Environments,
11(3):259–276, 06 2002.

Human motor intention decoding from neuroimaging data with explainable feature importance maps