Index
Reinforcement Learning in Optimum Order Execution
Empathetic Deep Learning to the Rescue: Speech Emotion Recognition from Adults to Children
Emotional states strongly influence human choices, activities, and desires. They can be assessed from facial expressions, self-reports and, as this thesis focuses on, speech. Although speech emotion recognition has been studied before, deep learning approaches remain comparatively underexplored in it, owing to the field’s recency and to only recent improvements in computation and optimization. In addition, the difficulty of collecting spontaneous data, as opposed to recordings of professional adult actors, persists in the state-of-the-art literature. The goal of this thesis is therefore to explore speech emotion recognition in children by testing the predominant neural network approaches based on temporal prosody as well as the rapidly expanding Transformer methods. We investigate the potential of transferring knowledge from adults’ to children’s data as a way of dealing with scarce data. From the outcomes, we observe that knowledge transfer improves when gender and cultural aspects are included in the classification of emotions. Emotionally intelligent systems built on the experiments described in the thesis can benefit remote monitoring and telemedicine for psychologists and pediatricians, teaching emotional intelligence to autistic children, and improving children’s health diagnostics and scanning procedures.
Classical Acoustic Markers for Depression in Parkinson’s Disease
Parkinson’s disease (PD) patients are commonly recognized by their tremors, although PD has a wide range of other symptoms. It is a progressive neurological condition in which patients do not have enough dopamine in the substantia nigra, which plays a role in motor control, mood, and cognitive functions. An underestimated group of PD symptoms is mental and behavioral issues, which can manifest as depression, fatigue, or dementia. Clinical depression is a psychiatric mood disorder, caused by an individual’s difficulty in coping with stressful life events, that presents with persistent feelings of sadness, negativity, and difficulty managing everyday responsibilities. It can be triggered by the lack of dopamine in PD, by the upsetting and stressful experience of the Parkinson’s diagnosis, and by the loneliness and isolation that the Parkinson’s symptoms can cause.
The goal of this work is to find the most suitable acoustic features for discriminating depression in Parkinson’s patients. These features will be based on classical and interpretable acoustic descriptors.
Detection of Arterial Occlusion on MRI Angiography of the Lower Limbs using Deep Learning
Proposal Tri Nguyen
Automated detection and defect recognition of photovoltaic modules in photoluminescence videos
Automatic Detection of Microorganisms on Microscopic Images of Fluid Samples using Machine Learning
The objective of this thesis is to apply machine learning tools to rapidly analyze large datasets of microscopic images to identify and classify microbial infections.
Microorganisms can cause a wide range of diseases, e.g. tuberculosis, and, left untreated, infections can quickly become fatal [1]. Fortunately, the discovery of penicillin and the subsequent invention of other antibiotics have significantly decreased the lethality of these diseases. This early success has led to an era of frequent use of antibiotics [2]. However, overprescription of these drugs has caused bacteria to develop mechanisms that confer resistance against particular drugs. The emergence of multi-drug-resistant strains poses an extreme risk [2]. These developments necessitate a more targeted application of antibiotics, which in turn requires the classification and characterization of bacteria found in patient samples. Existing methods for classifying microorganisms can be categorized into chemical, physical, molecular biological, and morphological methods [1]. While the last of these is considered the most direct and cost-effective of the four, it requires a large amount of manual work and is therefore laborious and time-consuming [1].
The Weiss group develops and applies technologies for rapidly imaging and analyzing biological samples using high-resolution fluorescence microscopy and microfluidics. In collaboration with the Pattern Recognition Lab of the Friedrich-Alexander-Universität Erlangen-Nürnberg, this toolkit is extended by computer vision components to analyze the image data.
Computer vision tools are well suited to image classification problems and have been widely applied to microscopy. On relatively pure laboratory samples these tools can perform astonishingly well [1]. However, a single droplet of fluid from a patient sample is significantly more challenging, as it can contain many different types of objects in very large quantities, which complicates the detection and classification of single objects. Under the assumption that high-quality, high-resolution microscopy images are provided, the key challenges are thus twofold: first, to find the objects of interest, and second, to classify them.
In general, various approaches to solving this problem can be considered:
– Pursuing a supervised approach by establishing a large database annotated with bounding boxes or segments to train a deep neural net that can process unseen data and extract the location and class of objects.
– Following semi-supervised techniques by establishing only a small annotated database to train the classifier and using uniform or random segments of the input data.
– Processing uniform or random segments in an unsupervised manner by clustering them into multiple distinct classes.
In this thesis, we will investigate these strategies to address the problem of automatic object detection in microscopic images of fluid samples.
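As a rough illustration of the third strategy listed above, the following sketch cuts an image into uniform patches and groups them with a plain k-means. It is a minimal sketch under stated assumptions, not the method of the thesis: the function names (`extract_patches`, `kmeans`), the use of raw pixel values as features, and the farthest-point initialisation are all choices made only for this example.

```python
def extract_patches(image, size, stride):
    """Cut a 2-D image (list of rows) into uniform square patches, each flattened."""
    patches = []
    for r in range(0, len(image) - size + 1, stride):
        for c in range(0, len(image[0]) - size + 1, stride):
            patches.append([image[r + i][c + j]
                            for i in range(size) for j in range(size)])
    return patches


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster label per point and the cluster centres."""
    # Farthest-point initialisation (an assumption of this sketch):
    # deterministic, and it avoids picking identical starting centres.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each patch goes to its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy 8x8 "image": dark left half, bright right half.
toy_image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
toy_patches = extract_patches(toy_image, size=4, stride=4)
toy_labels, _ = kmeans(toy_patches, k=2)
print(toy_labels)  # [0, 1, 0, 1]
```

In practice the raw pixels would likely be replaced by learned or hand-crafted features, and the number of clusters would have to be chosen from the data.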
The thesis comprises the following tasks:
– Literature review concerning sources and state-of-the-art approaches to constructing classifiers with limited data available
– Implementation of one or more solutions to address the classification problem
– Evaluation of the proposed method and emerging challenges
– Documentation and presentation of the findings, documentation of code
– Discussion of progress in weekly meetings with mentors Dr. Lucien Weiss and Frauke Wilm
[1] Zhang, Jinghua, Chen Li, and Marcin Grzegorzek. “Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to Popular Convolutional Neural Network and Potential Visual Transformer.” arXiv preprint arXiv:2108.00358 (2021).
[2] Casadevall, Arturo. “Crisis in infectious diseases: time for a new paradigm?.” Clinical infectious diseases 23.4 (1996): 790-794.
Synthesizing Art Historical datasets with Pixel-wise Annotations
Analysis of EEG data with machine learning
Heat Demand Forecasting with Multi-Resolutional Representation of Heterogeneous Temporal Ensemble
Accurate forecasting of heat consumption plays an important role in the effective management of a heating utility, for example in unit commitment, short-term maintenance, and optimization of the network’s power flow. Inaccurate forecasts may increase operating costs: over-forecasting leads to unnecessary reserve costs and excess supply, while under-forecasting results in high expenditures for the peaking unit. Hence it is important for a utility, such as a district heating network, to forecast heat consumption accurately. Heat consumption patterns can vary with the external temperature and the day of usage, such as a holiday or weekend.
The consumption pattern also varies with consumer type, operational reasons, and consumer activities. This motivates capturing the load-profile features of a given consumer in order to tackle model variance when using ML-based forecasters. That is, the forecasting model should capture information (features) about a user’s consumption pattern along three dimensions: time, frequency, and magnitude. The time series of heat consumption contain several discontinuities or abrupt jumps, which may carry important information. A highly accurate prediction of an end-user’s heat consumption could therefore be obtained by incorporating these discontinuities through the approximation of functional non-linearity. Moreover, to ensure that the model generalizes across different types of end-users, it should capture features of the consumption patterns of each of these types. This project also investigates the generalizability of the model by evaluating its performance quantitatively and qualitatively. The thesis consists of the following aspects:
- Literature review of heat consumption classification in district heating network.
- Analysis and understanding of the heat consumption data from a utility.
- Development of a sophisticated heat consumption forecaster.
- Comprehensive evaluation of the forecasting performance by comparing with existing forecasting models.
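To make the idea of a multi-resolution representation more concrete, the sketch below decomposes a short load series into a coarse approximation and detail bands using an unnormalised Haar transform. This is a minimal sketch: the choice of the Haar wavelet, the function names (`haar_step`, `haar_decompose`), and the toy load values are assumptions for illustration, and the proposal does not state which transform will actually be used.

```python
def haar_step(signal):
    """One Haar level: pairwise means (approximation) and half-differences (detail).

    If the signal has odd length, the last sample is ignored.
    """
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return the coarsest approximation and the detail bands, finest band first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to split
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Toy hourly load with an abrupt jump between the third and fourth sample.
load = [10, 10, 10, 30, 30, 30, 30, 30]
approx, details = haar_decompose(load, levels=2)
print(approx)   # [15.0, 30.0]
print(details)  # [[0.0, -10.0, 0.0, 0.0], [-5.0, 0.0]]
```

The jump shows up as the only non-zero coefficients in the detail bands, while the approximation keeps the overall magnitude, so a forecaster fed with these bands sees the discontinuity explicitly at each resolution.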
References:
[1] Y. Zhao, Y. Shen, Y. Zhu and J. Yao, “Forecasting Wavelet Transformed Time Series with Attentive Neural Networks,” 2018 IEEE International Conference on Data Mining (ICDM), 2018, pp. 1452-1457, doi: 10.1109/ICDM.2018.00201.
[2] S. Chatterjee, S. Bayer and A. K. Maier, “Prediction of Household-level Heat-Consumption using PSO enhanced SVR Model,” NeurIPS 2021 Workshop on Tackling Climate Change with Machine Learning, 2021, https://www.climatechange.ai/papers/neurips2021/42
[3] S. Kováč, G. Michaľčonok, I. Halenár and P. Važan, “Comparison of Heat Demand Prediction Using Wavelet Analysis and Neural Network for a District Heating Network,” Special Issue “Artificial Intelligence in the Energy Industry”, https://www.mdpi.com/1996-1073/14/6/1545
Design and Evaluation of Machine Learning Applications for Space Systems
This thesis aims at designing and evaluating two open source, representative machine learning applications for on-board data processing in space systems, as part of the OBPMark-ML benchmarking suite. Interest in adopting machine learning and artificial intelligence methods for on-board processing has recently increased, as demonstrated by the European Space Agency’s (ESA) Phi-Sat-1 In-Orbit-Demonstration (IOD) mission launched in 2020 [1,2]. In addition, future missions are expected to rely on machine learning and deep learning methods to offer increased autonomy.
However, it is not clear which hardware architectures should be employed in such future systems. Currently, space systems use simple processors specifically designed for space, which cannot provide the required performance. Therefore, several alternatives are currently being investigated as future candidates for space, such as embedded GPUs, FPGAs, and custom AI accelerators.
However, different devices have significantly different properties, e.g. in terms of the number of computations per second, numerical format (integer or floating point), data width (1, 8, 16, 32 or 64 bits), and memory requirements, which makes it difficult to compare them and to assess the trade-offs between their computational performance and accuracy.
In addition, there is a lack of representative application cases that can be used to perform such a comparison. MLPerf, the de facto benchmarking suite for machine learning, is not representative of the type of processing required in space. The only available space-related software is the open source benchmarking suite GPU4S_Bench [3,4] (also known as OBPMark Kernels) and its evolution OBPMark (On-Board Data Processing Benchmarks) Applications, developed at the Barcelona Supercomputing Center (BSC) as part of the ESA-funded project GPU4S (GPU for Space) [5]. However, GPU4S_Bench only provides software kernels for algorithmic building blocks used in deep learning, such as matrix multiplication and convolution, as well as a simple inference chain targeting CIFAR-10; it neither evaluates trade-offs between performance and accuracy nor uses real space data sets for training or evaluation. In contrast, the OBPMark suite implements a set of computational performance benchmarks developed specifically for spacecraft on-board data processing applications, such as radar processing and data compression [6]. OBPMark-ML is going to be a third variant of OBPMark, which will include realistic space applications covering multiple types of machine learning and deep learning processing.
Two of these applications are going to be designed and implemented in this Master’s Thesis, which will be performed during a research visit at the Barcelona Supercomputing Center, within the GPU4S ESA project. The applications will cover two different types of imaging tasks and will be trained with real space data. The design and implementation will cover both the training and the implementation parts of the applications in standard machine learning frameworks, both in Python and C. It will also include the production of the trained models in various formats, as well as the material necessary to reproduce the training for potentially new architectures that are not covered by the generated pre-trained models. In particular:
- Instance Segmentation: Cloud Screening: The first task uses instance segmentation for cloud detection on an open source data set called Cloud95 [7]. U-Nets have become a standard approach for segmentation tasks and have also been shown to be effective for cloud screening [8]. However, their large number of parameters and high computational cost constrain their use in on-board processing. A good trade-off between computational complexity and memory footprint on the one hand and prediction accuracy on the other has to be found. For this, the number of parameters needs to be scaled down, which can be achieved by reducing the depth and the number of filters in the convolution layers. Another method is to adapt techniques from the MobileNetV2 architecture to the U-Net architecture [9]. Both methods will be evaluated.
- Object Detection: Ship Detection: The second application will be an object detection task, such as ship detection in satellite images [10]. The same restrictions as in the segmentation case apply here. State-of-the-art architectures for object detection are single shot detectors (SSD) [11] and “You only look once” (YOLO) networks [12]. These architectures can use different backbone models. To reduce the number of parameters and the computational complexity, a MobileNet will be used and compared with a heavy network such as ResNet [13].
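The effect of the two scaling options named above (fewer levels and filters, or MobileNet-style convolutions) on model size can be estimated with simple arithmetic. The following is a minimal sketch, not the architecture of the thesis: it counts only the weights of a U-Net-style encoder with two 3x3 convolutions per level and filters doubling at each level, and it models the MobileNet idea only through the depthwise separable factorisation, without the inverted residual blocks of MobileNetV2. All function names and the example channel counts are assumptions.

```python
def conv_params(c_in, c_out, k=3, bias=True):
    """Weights of a standard k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(c_in, c_out, k=3, bias=True):
    """Weights of a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def encoder_params(depth, base_filters, in_channels=3, conv=conv_params):
    """Weights of a simplified encoder: two convs per level, filters double per level.

    in_channels=3 is an illustrative default, not a property of any data set.
    """
    total, c_in = 0, in_channels
    for level in range(depth):
        c_out = base_filters * 2 ** level
        total += conv(c_in, c_out) + conv(c_out, c_out)
        c_in = c_out
    return total


print(conv_params(64, 128))            # 73856
print(separable_conv_params(64, 128))  # 8960
print(encoder_params(5, 64))           # full-size encoder
print(encoder_params(4, 32))           # shallower, narrower encoder
print(encoder_params(4, 32, conv=separable_conv_params))
```

For a single 64-to-128 channel layer the separable variant needs roughly an eighth of the weights, which is why both reductions are natural candidates when memory footprint matters; the resulting loss in accuracy still has to be measured.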
The models will be trained in TensorFlow/Keras and PyTorch and converted to the ONNX format for portability and reproducibility. As some hardware, such as FPGAs, requires fixed-point arithmetic, all models will be trained and provided in different precisions ranging from floating point (double/full/half) to integer (int8/int16). BSC will provide access to supercomputing resources for the training process. In terms of accuracy, the models are to be compared qualitatively with results from the literature. The computational performance will be tested on processors of interest, such as the Myriad VPU, a Xilinx FPGA leveraging Vitis AI, or embedded GPUs which have been identified as candidates for use in GPU4S, depending on the availability of these development boards. The thesis is developed together with the Barcelona Supercomputing Center (BSC) in the GPU4S program co-funded by the European Space Agency, and all code and models will be published open source on GitHub under the current ESA-PL license.
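The accuracy cost of moving from floating point to integer precision can be illustrated on a handful of weights. The sketch below uses symmetric linear quantisation, which is only one common scheme and an assumption here; the text does not specify how the integer models will be produced, and the function names and example weights are likewise illustrative.

```python
def quantize(values, bits=8):
    """Symmetric linear quantisation of floats to signed integers of the given width."""
    qmax = 2 ** (bits - 1) - 1              # 127 for int8, 32767 for int16
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer codes back to approximate float values."""
    return [x * scale for x in q]


def max_error(values, bits):
    """Largest absolute round-trip error introduced by quantisation."""
    q, scale = quantize(values, bits)
    return max(abs(v - r) for v, r in zip(values, dequantize(q, scale)))


weights = [-1.0, -0.5, 0.0, 0.25, 0.9]
for bits in (8, 16):
    print(bits, max_error(weights, bits))
```

The round-trip error is bounded by half a quantisation step, so int16 is far closer to the float weights than int8; whether that difference matters for the final prediction accuracy is what the planned evaluation has to show.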
[1] Jan-Gerd Meß, Frank Dannemann and Fabian Greif. Techniques of Artificial Intelligence for Space Applications – A Survey. In European Workshop on On-Board Data Processing (OBDP), 2019.
[2] https://www.esa.int/Applications/Observing_the_Earth/Ph-sat
[3] Ivan Rodriguez, Leonidas Kosmidis, Jerome Lachaize, Olivier Notebaert, David Steenari, GPU4S Bench: Design and Implementation of an Open GPU Benchmarking Suite for Space On-board Processing, Technical Report UPC-DAC-RR-CAP-2019-1, [online] Available: https://www.ac.upc.edu/app/research-reports/public/html/research_center_index-CAP-2019,en.html
[4] Leonidas Kosmidis, Iván Rodriguez, Alvaro Jover-Alvarez, Sergi Alcaide, Jérôme Lachaize, Olivier Notebaert, Antoine Certain, David Steenari. GPU4S: Major Project Outcomes, Lessons Learnt and Way Forward. Design Automation and Test in Europe Conference (DATE) 2021
[5] David Steenari, Leonidas Kosmidis, Ivan Rodriguez, Alvaro Jover, and Kyra Förster. OBPMark (On-Board Processing Benchmarks) – Open Source Computational Performance Benchmarks for Space Applications. In European Workshop on On-Board Data Processing (OBDP), 2021. [online] Available: https://zenodo.org/record/5638577
[6] OBPMark and GPU4S_Bench open source repositories. [online] Available: https://obpmark.github.io
[7] https://www.kaggle.com/sorour/95cloud-cloud-segmentation-on-satellite-images (accessed on 1.2.2022)
[8] Johannes Drönner, Nikolaus Korfhage, Sebastian Egli, Markus Mühling, Boris Thies, Jörg Bendix, Bernd Freisleben and Bernhard Seeger. Fast Cloud Segmentation Using Convolutional Neural Networks. Remote Sens. 2018
[9] Junfeng Jing, Zhen Wang, Matthias Rätsch and Huanhuan Zhang. Mobile-Unet: An efficient convolutional neural network for fabric defect detection. Textile Research Journal, 2020.
[10] https://www.kaggle.com/c/airbus-ship-detection/overview
[11] Liu W. et al. SSD: Single Shot MultiBox Detector. Computer Vision – ECCV 2016, 2016.
[12] Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, Ali Farhadi. You Only Look Once: Unified, Real-Time Object Detection. CoRR 2015, 2015.
[13] MobileNetV2: The Next Generation of On-Device Computer Vision Networks, [online] Available: https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html.