Juan Camilo Vasquez Correa
Multimodal Assessment of Parkinson’s Disease Patients using Information from Speech, Handwriting, and Gait
The automatic analysis of different bio-signals from patients with Parkinson's disease is a highly relevant topic that has been addressed by the research community for several years. Identifying bio-markers for early and differential diagnosis, severity assessment, and response to therapy is a primary goal of current research on Parkinson's disease. Important contributions have been made on these topics considering each bio-signal individually; however, multimodal analyses, i.e., those considering information from different sensors, have not been extensively studied. Although many improvements have been reported in several tasks, there is still no multimodal system able to deliver an accurate prediction of the disease severity and to monitor the disease progression. The aim of this thesis is to develop robust models for the accurate diagnosis of Parkinson's disease and to evaluate the disease severity of patients using different bio-signals such as speech, online handwriting, gait (captured with inertial sensors), and signals collected with smartphones. The proposed models are evaluated in three application scenarios: (1) The automatic classification of healthy subjects and Parkinson's patients. (2) The evaluation of the disease severity of the patients based on a clinical scale, including both the motor-state severity and the dysarthria level of the subjects. (3) The classification of patients into groups according to their disease severity, e.g., mild, moderate, and severe. The experiments cover both traditional pattern recognition and novel deep learning models. Three approaches are introduced to model the speech of Parkinson's patients: (1) phonological analysis of speech, which is more interpretable for clinicians because it directly models information about the mode and manner of articulation.
(2) Representation learning strategies based on recurrent autoencoders, which have the potential to extract more abstract and robust features than those traditionally computed. Finally, (3) convolutional neural networks trained to process time-frequency representations of the patients' speech. Regarding handwriting analysis, the proposed approach involves the computation of traditional kinematic features, combined with novel approaches based on geometric and in-air features. Deep learning models based on convolutional neural networks are also proposed to evaluate both the raw online handwriting data and the reconstructed offline images created by the patients. The proposed approaches for gait analysis involve the computation of traditional kinematic and spectral features, combined with novel approaches based on non-linear dynamics. A deep learning approach combining convolutional and recurrent neural networks is also introduced to model the gait signals of the patients. Finally, this thesis covers a multimodal analysis of the speech, handwriting, and gait signals collected from the patients. The experiments are carried out using both early and late fusion strategies. The proposed methods are evaluated in two scenarios: (1) high-quality sensors, which may be available in medical centers for the assessment of patients, and (2) data collected with smartphones, which can be used for continuous monitoring of patients at home. The results indicate that the combined models outperformed those based on each bio-signal separately, both for the automatic classification of the disease and for the evaluation of the disease severity. In addition, the proposed models are robust enough to be applied to signals collected with both high-quality sensors and smartphones.
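To make the late-fusion idea concrete, the following sketch combines per-modality classifier outputs by weighted averaging. This is an illustrative assumption, not the implementation used in the thesis: the function name, the uniform weights, and the example posterior scores are all hypothetical.

```python
# Illustrative late-fusion sketch (hypothetical, not the thesis implementation).
# Assumption: each modality-specific classifier (speech, handwriting, gait)
# outputs a posterior probability that the subject is a Parkinson's patient.

def late_fusion(posteriors, weights=None):
    """Fuse per-modality posteriors (dict: modality -> P(PD)) into one score
    via a weighted average."""
    if weights is None:
        weights = {m: 1.0 for m in posteriors}  # uniform weighting by default
    total = sum(weights[m] for m in posteriors)
    return sum(weights[m] * p for m, p in posteriors.items()) / total

# Hypothetical per-modality outputs for one subject
scores = {"speech": 0.82, "handwriting": 0.64, "gait": 0.71}
fused = late_fusion(scores)
decision = "PD" if fused >= 0.5 else "HC"  # threshold the fused score
```

In early fusion, by contrast, the per-modality feature vectors would be concatenated before a single classifier is trained; late fusion as above keeps one classifier per bio-signal and merges only their decisions.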