

Our research focuses on modeling speech and language patterns using machine learning and deep learning methods. We develop spoken dialogue systems, enhance speech, and handle out-of-vocabulary words. We analyze prosodic features such as accents and phrase boundaries, and automatically recognize emotion-related states using multi-modal data, including facial expressions, gestures, and physiological parameters. We also recognize user focus in human-machine interaction and analyze pathological speech from children with cleft lip and palate and from patients with speech and language disorders.
Additionally, our work extends to analyzing animal speech (e.g., that of orcas), aiming to interpret communication patterns in zoos and in the wild.
In natural language processing, we develop and apply methods such as large language models (LLMs), topic modeling, and part-of-speech tagging, with applications in both medical and industrial domains. We also leverage LLMs and deep learning for advanced speech and language understanding, addressing ethical AI, text summarization, and question answering systems.
- DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
INTERSPEECH (Dublin, Ireland, August 20, 2023 - August 24, 2023)
In: INTERSPEECH 2023
Open Access: https://arxiv.org/abs/2305.08227
- Automatic Assessment of Alzheimer's across Three Languages Using Speech and Language Features
24th Annual Conference of the International Speech Communication Association, Interspeech 2023 (Dublin, Ireland, August 20, 2023 - August 24, 2023)
In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2023
DOI: 10.21437/Interspeech.2023-2079
- ORCA-SPY: Killer Whale Sound Source Simulation and Detection, Classification and Localization in PAMGuard Utilizing Integrated Deep Learning Based Segmentation
In: Scientific Reports (2023), under review
ISSN: 2045-2322