Paula Andrea Pérez-Toro
Acoustic and Linguistic Analysis in Neurological and Psychiatric Disorders
This thesis investigates the application of speech and language analysis to the clinical evaluation and monitoring of neurological and psychiatric disorders, specifically Major Depressive Disorder (MDD), Alzheimer’s Disease (AD), and Parkinson’s Disease (PD). Given the symptomatic overlap and diagnostic challenges these conditions present, the thesis uses acoustic and linguistic descriptors together with machine learning models and assesses how well they can disentangle the symptoms. The research also aims to improve diagnosis, therapy outcomes, and patient monitoring by differentiating and tracking the progression of symptoms in these disorders. For MDD, the effectiveness of therapy sessions is assessed by analyzing speech dynamics. Three key areas are explored: (1) the influence of speech descriptors on therapy dynamics, (2) changes in emotional and speech patterns over time, and (3) the predictive power of neural embeddings, trained with contrastive learning, for monitoring changes in depression levels. For AD, the thesis examines automatic assessment of the disease through multiple speech and language approaches. Three tasks are explored: (1) AD classification using machine learning models that combine acoustic, emotional, and linguistic features, (2) prediction of cognitive state and its correlation with clinical assessment, and (3) pre-clinical state detection related to the PSEN1 mutation, which is associated with AD and is prevalent in the north of Antioquia, Colombia. For PD, the focus is on the use of speech analysis for the classification and prediction of neurological and motor states. The study includes: (1) classification of PD and prediction of disease severity using color-transformed spectrograms and representation learning, and (2) examination of emotional speech descriptors to detect depression in PD patients.
Additionally, the thesis addresses potential biases introduced during data collection via interviews and the transferability of speech and language features across different languages, emphasizing the need for robust, multilingual models in clinical diagnostics. Overall, the findings from this thesis demonstrate the potential of speech and language analysis in enhancing diagnostic performance and monitoring treatment progress across several disorders.