Index

Benchmarking Automatic Speaker Anonymization Methods for Healthy Speech

Stammering Identification using Large Language Models

Master Thesis – Annotation by Speech in Radiology

This thesis explores using speech as a direct annotation modality for medical image analysis.



Diffusion Model-Based Compensation of T2-induced Blurring in Ultrashort TE MRI

How Broken a Coil Must Be?

Investigating Liquidity Forecasting with Point-Based and Probabilistic Models to Enhance Financial Business Operations

Enhancing SBOM Creation with Large Language Models

Producing Synthetic Data for Better Defect Detection

Can the training workflow and performance of CNNs for defect detection be improved by augmenting the training database with synthetic images of defective parts generated using GANs?

In the manufacture of high-end cinematographic lenses, the anodized aluminum lens housings must be carefully inspected for surface scratches before they are delivered to customers. Traditional image processing algorithms are of limited use for automating this inspection, because the anodized aluminum surface appears granular on close observation. To obtain a solution that is both more general and robust, a convolutional neural network (CNN) is used to carry out the inspection. However, training CNNs is challenging because few defect images are available.

To overcome this challenge, the proposed master’s thesis will investigate the use of Generative Adversarial Networks (GANs) to generate synthetic data for the minority classes for CNN training, as suggested in the original GAN paper [1]. The synthetic images will then be used to train a CNN for defect detection, and the performance of the newly trained CNN will be measured by its classification accuracy on real-world samples.

The proposed research will involve three stages:

  1. Finalizing optimizations to the hardware of the laboratory setup

  2. Performing basic tests to decide which of the following GAN approaches will be pursued

    1. Transparent defect image

In this approach, scratches will be generated by GANs. The synthetic scratches will be overlaid on different real scratch-free surface images to enlarge the training dataset for the defect detection CNN.

    2. Patch-based/inpainting approach

In this approach, a GAN that can generate patches of surfaces with scratches will be trained. The generated patches must integrate seamlessly into a real surface. The approach is similar to the idea of inpainting, with the masked area being filled by synthetic scratches instead of visually realistic pixels.

    3. Image-to-image translation

In this approach, a GAN will either turn a defect map (a 2D array indicating the desired locations of scratches) into a synthetic surface with scratches, or turn a real surface with scratches into a defect map. The second variant resembles the semantic segmentation task in [3].

  3. Training a CNN on a GAN-generated dataset and evaluating the trained CNN on data it has not seen, acquired from a variety of aluminum housing geometries and with different settings of the laboratory setup.
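The transparent defect image approach from stage 2 amounts to alpha compositing, which can be sketched as follows. This is a minimal illustration only: it assumes grayscale images stored as floating-point NumPy arrays in [0, 1] and a GAN output consisting of a scratch intensity patch plus a per-pixel opacity map. The function name `overlay_scratch`, the output format, and the binary mask label are assumptions made for this example and are not part of the proposal.

```python
import numpy as np


def overlay_scratch(surface, scratch, alpha, top, left):
    """Alpha-blend a synthetic scratch patch onto a scratch-free surface.

    surface: 2D float array in [0, 1], real scratch-free image.
    scratch: 2D float array in [0, 1], synthetic scratch intensities.
    alpha:   2D float array in [0, 1], same shape as scratch; 0 = transparent.
    top, left: position of the patch's upper-left corner in the surface.
    Returns the composited image and a binary defect mask
    (the mask is a hypothetical label format).
    """
    h, w = scratch.shape
    if top < 0 or left < 0 or top + h > surface.shape[0] or left + w > surface.shape[1]:
        raise ValueError("scratch patch must lie fully inside the surface")
    out = surface.copy()
    region = out[top:top + h, left:left + w]
    # Standard "over" compositing: result = alpha * fg + (1 - alpha) * bg
    out[top:top + h, left:left + w] = alpha * scratch + (1.0 - alpha) * region
    mask = np.zeros(surface.shape, dtype=bool)
    mask[top:top + h, left:left + w] = alpha > 0
    return out, mask


# Toy data standing in for a real surface image and a GAN output.
surface = np.full((4, 6), 0.5)
scratch = np.ones((2, 3))
alpha = np.array([[0.0, 0.5, 1.0],
                  [0.0, 0.0, 0.0]])
augmented, defect_mask = overlay_scratch(surface, scratch, alpha, top=1, left=2)
```

Because the same scratch patch can be placed at many positions on many scratch-free surfaces, one generated scratch yields several training images, and the mask gives the defect location for free.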

[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza et al., "Generative Adversarial Networks," Communications of the ACM, vol. 63, no. 11, pp. 139-144, 2020.

[2] Pathak et al., "Context Encoders: Feature Learning by Inpainting," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[3] P. Isola, J.-Y. Zhu, T. Zhou and A. Efros, "Image-to-Image Translation with Conditional Adversarial Networks," 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

Optimizing Remote Scanning Workflow with Automated Data Extraction from Streamed Content

Remote scanning allows medical technologists to operate scanning systems from a remote location. Scanners are connected via KVM (Keyboard, Video, Mouse) switches, as no other interface standard exists. However, KVM streams only video and control signals and does not transmit structured data such as scan progress or scanner warnings. This limitation is a significant challenge for remote operators, who may miss important system events or updates, which can lead to miscommunication with onsite staff. The result can be delays in patient care and a dependence on the onsite team to monitor patient conditions effectively during scans.

Final Goals:
Prototype methods to capture data from scanning console video streams in real time, such as:
a) Error or warning popup messages in different languages and from different interfaces
b) Process progression calculated from the progress bar
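Goal b) can be illustrated with a simple thresholding sketch. It assumes that the progress bar has already been located and cropped from the video frame, that the crop is a grayscale NumPy array in [0, 1], and that the filled part is brighter than the empty part and grows from left to right. The function name `progress_fraction` and the threshold value are hypothetical; real scanner interfaces may need a different colour model or bar detection step.

```python
import numpy as np


def progress_fraction(bar_region, threshold=0.5):
    """Estimate progress from a cropped grayscale image of a progress bar.

    bar_region: 2D float array in [0, 1] covering only the bar
                (rows = bar height, columns = bar length).
    threshold:  hypothetical brightness cut-off separating filled from empty.
    Assumes the filled part is brighter and grows from left to right.
    """
    # A column counts as filled when its mean brightness exceeds the threshold.
    filled = bar_region.mean(axis=0) > threshold
    if not filled.any():
        return 0.0
    # Use the right-most filled column so small gaps (e.g. text drawn
    # over the bar) do not reduce the estimate.
    last_filled = np.flatnonzero(filled)[-1]
    return float(last_filled + 1) / bar_region.shape[1]


# Toy frame crop: a 3 x 10 bar whose first 7 columns are filled.
bar = np.zeros((3, 10))
bar[:, :7] = 0.9
progress = progress_fraction(bar)
```

Applied to every streamed frame, such an estimate turns the video-only KVM signal into a structured progress value that can be logged or shown to the remote operator.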

Sign Language Recognition Using Transformer and Comparison with Traditional Techniques

This thesis is about creating a system to recognize sign language using transformer networks and
comparing it with older methods. The aim is to build a system that is both effective and accurate by
using transformer models, which are good at handling sequences of data, to understand and interpret
sign language. The study will include collecting data, preparing it, training models, evaluating them, and
comparing the results with traditional methods like CNNs.

The main idea of this thesis is to use transformer networks for recognizing sign language. Unlike
traditional models that process data step-by-step, transformers can handle entire sequences at once,
which improves understanding and accuracy. The system will use different types of data (e.g., video) to
be more robust and accurate. This research will compare transformers with traditional methods like
CNNs to show the benefits and possible improvements of transformers in sign language recognition.
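
The statement that transformers handle entire sequences at once refers to self-attention, in which every frame is compared with every other frame in a single matrix operation. The sketch below shows single-head scaled dot-product self-attention with NumPy. It is a generic illustration, not the model of this thesis: the frame features and projection matrices are random stand-ins, and positional encoding, multiple heads, and training are omitted.

```python
import numpy as np


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a whole sequence.

    x: array of shape (T, d), one feature vector per video frame.
    w_q, w_k, w_v: projection matrices of shape (d, d_k).
    Returns the attended features (T, d_k) and the attention weights (T, T).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 8))  # 5 frames, 8 features each (random stand-ins)
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
attended, attn = self_attention(frames, w_q, w_k, w_v)
```

Row i of `attn` holds the weights with which frame i draws on all frames of the sequence, so each row sums to 1.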