For speech intelligibility, consonants have a fundamental importance. Unfortunately, when reducing noise in speech, consonants are often also degraded while vocals are easier to preserve/enhance. To improve the detection and enhancement of consonants, we want to use multi-task learning to reduce the noise in the signal and furthermore detect phonemes (smallest acoustic unit in speech).
Preliminary knowledge in deep learning helpful. Also, basic signal processing concepts like sampling theorem and FFT should be present.