Index

Text Embeddings in Pathological Speech

Gen AI: Speech Emotion Recognition

Apply Generative AI strategies in the field of Speech Emotion Recognition

Prerequisites:

  • Pattern Analysis (Mandatory)
  • Deep Learning (Mandatory)
  • Advanced Deep Learning (Optional)
  • Speech and Language Understanding (Optional)
  • Seminar on pathological speech (Optional)

Please send your grades to paula.andrea.perez@fau.de

Universal Image Artifact Reduction via Heterogeneous Mixture of Experts

Abstract:
This master's thesis proposes a novel unified framework for addressing various types of image artifacts through a heterogeneous Mixture of Experts (MoE) architecture. Unlike traditional approaches that tackle specific artifacts individually, our model leverages specialized expert networks, each designed to handle a distinct degradation pattern, while maintaining efficiency. The heterogeneous nature of the experts allows diverse artifact types, from compression artifacts to motion blur, to be handled within a single unified model.
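
As a rough illustration of the routing idea described in the abstract, the following minimal sketch blends the outputs of differently structured experts using softmax gate weights. It is not the thesis architecture: the three expert functions, the hand-set gate scores, and all names are hypothetical stand-ins that operate on a flat list of pixel values. In a real model the gate scores would come from a learned network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical experts with different internal structure ("heterogeneous").
def smoothing_expert(pixels):
    """3-tap moving average with edge replication (stand-in for a denoiser)."""
    padded = [pixels[0]] + list(pixels) + [pixels[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(pixels))]


def clipping_expert(pixels):
    """Clamp values into [0, 1] (stand-in for an overshoot corrector)."""
    return [min(1.0, max(0.0, p)) for p in pixels]


def identity_expert(pixels):
    """Pass the input through unchanged."""
    return list(pixels)


EXPERTS = [smoothing_expert, clipping_expert, identity_expert]


def mixture_of_experts(pixels, gate_scores, top_k=2):
    """Route the input to the top_k experts and blend their outputs."""
    ranked = sorted(range(len(EXPERTS)),
                    key=lambda i: gate_scores[i], reverse=True)[:top_k]
    weights = softmax([gate_scores[i] for i in ranked])
    output = [0.0] * len(pixels)
    for weight, idx in zip(weights, ranked):
        expert_out = EXPERTS[idx](pixels)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, ranked
```

Evaluating only the top_k experts is what keeps the cost of a forward pass roughly constant as more experts are added; the value of top_k here is an arbitrary choice for the example.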

Utilizing LLMs for medication data annotation in German medical texts

Deep Learning-Based Long-Tailed Multi-Label Disease Classification

As with other diagnostic medical tests, chest radiography yields clinical findings with a long-tailed distribution: a small group of disorders is encountered often, while the majority are very infrequent [1]. This challenges standard deep learning techniques, which are biased towards the most common classes at the expense of the clinically significant but uncommon tail classes [2]. Several existing approaches address this kind of imbalance [3], but long-tailed medical image recognition [4] has only recently received attention. Chest X-ray (CXR) diagnosis is also a multi-label problem, since patients frequently present with several disease findings at the same time. Nevertheless, very few studies incorporate label co-occurrence into the learning process [5]. The long-tailed, multi-label nature of tasks such as disease diagnosis on CXRs thus poses class imbalance and co-occurrence problems that many standard deep learning methods are unable to handle, because most large-scale image classification benchmarks contain single-label images with a mostly balanced label distribution [2].
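
To make the two difficulties named above concrete, the sketch below counts per-label frequency and pairwise label co-occurrence on a tiny multi-label set. The records and label names are invented for illustration and are not drawn from MIMIC-CXR or any other real dataset; the helper names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy multi-label annotations (hypothetical): one set of findings per image.
studies = [
    {"effusion", "atelectasis"},
    {"effusion"},
    {"effusion", "atelectasis"},
    {"atelectasis"},
    {"effusion", "pneumothorax"},
    {"effusion"},
]


def label_frequencies(records):
    """Count how many images carry each label."""
    counts = Counter()
    for labels in records:
        counts.update(labels)
    return counts


def cooccurrence(records):
    """Count how often each unordered pair of labels appears together."""
    pairs = Counter()
    for labels in records:
        for a, b in combinations(sorted(labels), 2):
            pairs[(a, b)] += 1
    return pairs


def imbalance_ratio(counts):
    """Ratio between the most and least frequent label (head vs. tail)."""
    return max(counts.values()) / min(counts.values())


freqs = label_frequencies(studies)
pairs = cooccurrence(studies)
print(freqs.most_common())     # head classes first, tail classes last
print(imbalance_ratio(freqs))  # larger values mean a heavier tail
print(pairs.most_common())     # label pairs that tend to appear together
```

In this toy set the rarest label appears only alongside the most common one, which is the kind of interaction a single-label, balanced benchmark never exhibits.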

Compared to supervised models, the CLIP [6] model is more robust to data imbalance [7], and the ease with which it lets one build custom classifiers makes it a natural baseline for this task. In this thesis, we focus on developing a CLIP-based model for our CXR classification problem.

Initially, existing neural network architectures such as ResNets [8] and BERT [9] will be used as baseline encoders for the CLIP model to benchmark performance. We will then propose our own algorithm for long-tailed multi-label disease classification on the MIMIC-CXR chest X-ray dataset.

Previously, CLIP-based algorithms such as MedCLIP [10] and CheXzero [11] addressed the issue with zero-shot methods. We aim to build on these approaches to create a better algorithm for the classification of long-tailed diseases.
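
One way zero-shot multi-label scoring with a CLIP-style model can work is sketched below: each label gets a positive and a negative text prompt, and the image embedding is compared with both via cosine similarity followed by a two-way softmax. This is only an illustration under stated assumptions. The embeddings are hand-made toy vectors rather than outputs of a real encoder, and the function names, prompt pairing, temperature, and threshold are hypothetical and not claimed to match MedCLIP or CheXzero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_probability(image_emb, pos_emb, neg_emb, temperature=0.1):
    """Two-way softmax over similarity to the positive vs. negative prompt."""
    pos = cosine(image_emb, pos_emb) / temperature
    neg = cosine(image_emb, neg_emb) / temperature
    peak = max(pos, neg)  # subtract the max for numerical stability
    e_pos, e_neg = math.exp(pos - peak), math.exp(neg - peak)
    return e_pos / (e_pos + e_neg)


def predict_labels(image_emb, prompt_embs, threshold=0.5):
    """Score every label independently, so several can be present at once."""
    probs = {
        label: zero_shot_probability(image_emb, pos, neg)
        for label, (pos, neg) in prompt_embs.items()
    }
    predicted = {label for label, p in probs.items() if p > threshold}
    return predicted, probs


# Toy 2-D embeddings (assumption): (positive prompt, negative prompt) per label.
prompt_embs = {
    "effusion": ([1.0, 0.0], [0.0, 1.0]),
    "pneumothorax": ([0.0, 1.0], [1.0, 0.0]),
}
predicted, probs = predict_labels([1.0, 0.0], prompt_embs)
print(predicted, probs)
```

Because each label is scored against its own prompt pair, no labeled examples of a tail class are needed at inference time, which is what makes this family of methods attractive for rare findings.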

 

References
[1] S. Kevin Zhou, Hayit Greenspan, Christos Davatzikos, James S. Duncan, Bram Van Ginneken, Anant Madabhushi, Jerry L. Prince, Daniel Rueckert, and Ronald M. Summers. A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises. Proceedings of the IEEE, 109(5):820–838, May 2021.
[2] Gregory Holste, Song Wang, Ziyu Jiang, Thomas C. Shen, George Shih, Ronald M. Summers, Yifan Peng, and Zhangyang Wang. Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study, pages 22–32. Springer Nature Switzerland, 2022.
[3] Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan, and Jiashi Feng. Deep long-tailed learning: A survey, 2023.
[4] Lie Ju, Xin Wang, Lin Wang, Tongliang Liu, Xin Zhao, Tom Drummond, Dwarikanath Mahapatra, and Zongyuan Ge. Relational subsets knowledge distillation for long-tailed retinal diseases recognition, 2021.
[5] Guoli Wang, Pingping Wang, Jinyu Cong, Kunmeng Liu, and Benzheng Wei. BB-GCN: A bi-modal bridged graph convolutional network for multi-label chest x-ray recognition, 2023.
[6] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision, 2021.
[7] Xin Wen, Bingchen Zhao, Yilun Chen, Jiangmiao Pang, and Xiaojuan Qi. Generalization beyond data imbalance: A controlled study on CLIP for transferable insights, 2024.
[8] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition, 2015.
[9] Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, 36(4):1234–1240, September 2019.
[10] Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, and Jimeng Sun. MedCLIP: Contrastive learning from unpaired medical images and text, 2022.
[11] Ekin Tiu, Ellie Talius, Pujan Patel, Curtis P. Langlotz, Andrew Y. Ng, and Pranav Rajpurkar. Expert-level detection of pathologies from unannotated chest x-ray images via self-supervised learning, 2022.

Investigating Word Class Representation in LLMs Using “Probes”

Evaluation of Quantum Annealing based Projection Selection for Emission Tomography

Differential privacy for securing speech-based deep learning models against gradient inversion attacks

Detection of Birds and Marine Mammals in Aerial Image Sequences using Artificial Intelligence Methods

Detecting birds and marine mammals in aerial images makes it possible to monitor how their populations evolve over time. Because this task is tedious when done manually, reliable automatic methods based on artificial intelligence are highly desirable. The task differs from many standard object detection settings because of the high image resolution (18 megapixels for the considered dataset) and the small size of the animals (some cover less than 50 square pixels). In addition, changing waves and reflections on the water make the task more difficult.

This thesis will focus on two main points. The first is to train, evaluate, and compare some standard object detection methods, such as Faster R-CNN. The second is to replicate the method presented in “POLO – Point-based, multi-class animal detection” and evaluate its performance on the considered dataset. The evaluations will also analyze possible links between accuracy and image quality (e.g., image luminosity or amount of waves). If time allows, tracking animals over multiple frames will be attempted.

Wind Power Forecasting through Probabilistic Machine Learning Models

Wind power is a clean, renewable energy source that is gaining popularity for electricity generation. However, because wind speed fluctuates, integrating large amounts of wind power into electrical grids introduces uncertainty and can threaten grid stability. This project addresses the problem with models that predict a range of possible outcomes rather than a single value. The primary goal is to develop and evaluate various ML models for forecasting wind power generation over different time frames. Using weather data, including wind speed, together with power output from wind farms, the project seeks to identify the features needed for both short-term and long-term forecasts.

Objectives
● Train different machine learning models that predict a range of possible outcomes for wind power.
● Perform data analysis and identify the features that are important for wind power forecasting.
● Evaluate the ML models to determine which provide the best wind power forecasts.
● Forecast wind power generation over short-term and long-term horizons.
● Compare short-term and long-term forecasting and investigate how features are weighted at each horizon.
● Investigate to what extent the forecasting task influences the effectiveness of different ML techniques on various data sources.

Dataset: https://data.open-power-system-data.org/time_series/
● Data Collection: Collect historical weather data such as wind speed and direction, along with the power produced by wind farms.
● Data Preprocessing: The data will be cleaned to handle missing values and outliers, and then normalised.
● Model Development:
1. Start with techniques such as neural networks to build initial models.
2. Use Long Short-Term Memory (LSTM) and Temporal Fusion Transformer (TFT) models, which are well suited to forecasting tasks such as probabilistic wind power and climate prediction over short-term horizons.
3. Combine several models to get better predictions.
● Model Training and Validation: Train and validate the models on wind power time-series data.
● Performance Evaluation: Assess forecast quality with metrics that quantify prediction accuracy, e.g., Root Mean Square Error (RMSE) and Continuous Ranked Probability Score (CRPS), combined with cross-validation.
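
The two scores named in the evaluation step can be sketched as follows. RMSE measures the error of point forecasts, while CRPS scores a whole predictive distribution; here CRPS is estimated from a finite ensemble of forecast samples using the identity CRPS = E|X - y| - 0.5 * E|X - X'|. The function names and the example numbers are assumptions made for illustration and are not taken from the project or its dataset.

```python
import math


def rmse(predictions, observations):
    """Root Mean Square Error between point forecasts and observations."""
    if len(predictions) != len(observations):
        raise ValueError("predictions and observations must have equal length")
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared) / len(squared))


def crps_ensemble(members, observation):
    """Empirical CRPS of an ensemble forecast for one observed value.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with both expectations
    replaced by averages over the ensemble members.
    """
    m = len(members)
    accuracy = sum(abs(x - observation) for x in members) / m
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return accuracy - spread


def mean_crps(ensembles, observations):
    """Average CRPS over a series of forecast time steps."""
    scores = [crps_ensemble(e, o) for e, o in zip(ensembles, observations)]
    return sum(scores) / len(scores)


# Hypothetical wind power values in MW: three time steps, three members each.
ensembles = [[10.0, 12.0, 14.0], [20.0, 21.0, 25.0], [5.0, 5.5, 6.0]]
observed = [12.5, 22.0, 5.2]
point_forecast = [sum(e) / len(e) for e in ensembles]  # ensemble mean

print(rmse(point_forecast, observed))
print(mean_crps(ensembles, observed))
```

Both scores are expressed in the units of the target and lower is better. For a single-member ensemble the CRPS reduces to the absolute error, which makes it directly comparable with deterministic forecasts.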