Tomás Arias-Vergara
Analysis of Pathological Speech Signals
The present thesis addresses the automatic analysis of speech disorders resulting from Parkinson’s disease and hearing loss.
For Parkinson’s disease, the progression of speech symptoms is evaluated using speech recordings captured over the short term (4 months) and the long term (5 years).
Machine learning methods are used to perform three tasks: (1) automatic classification of patients vs. healthy speakers, (2) regression analysis to predict the dysarthria level and neurological state, and (3) speaker embeddings to analyze the progression of the speech symptoms over time.
For hearing loss, automatic acoustic analysis is performed to evaluate whether the duration and onset of deafness (before or after speech acquisition) influence the speech production of cochlear implant users.
Additionally, articulation, prosody, and phonemic analyses show that cochlear implant users present altered speech production even after hearing rehabilitation.
Automatic acoustic analysis is performed considering phonation, articulation, prosody, and phonemic features.
Phoneme precision is characterized using the posterior probabilities obtained from recurrent neural networks trained on German and Spanish speech. The phonemic analysis considers three main dimensions: manner of articulation, place of articulation, and voicing.
This thesis also proposes a methodology for automatically detecting voice onset time in voiceless stop consonants.
Furthermore, this thesis studies the acoustic cues that reflect changes in elderly people due to the aging process. Regression analysis is performed to estimate a person’s age using phonation, articulation, prosody, and phonemic features.
Additionally, the use of smartphones for healthcare applications is considered here.