Deep Learning-based Pitch Estimation and Comb Filter Construction

Type MA thesis

Status finished

Duration Nov 2, 2020 – Apr 30, 2021

Primary supervisors Hendrik Schröter, Andreas Maier

Student Jinwei Sun Information und Kommunikationstechnik (IuK)

Clean speech typically consists of two components: a locally periodic component and a stochastic component. If a speech signal has only a stochastic component, the difference between the clean speech signal and the signal enhanced with the corresponding ideal ratio mask is barely perceptible. If, however, the speech has a perfectly periodic component, the signal enhanced with the corresponding ideal ratio mask is affected by inter-harmonic noise.
A comb filter based on the pitch period of the speech signal can attenuate the noise between the pitch harmonics. A robust pitch estimate is therefore of fundamental importance. In this work, a deep learning-based method for robust pitch estimation in noisy environments will be investigated.
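
The effect of a pitch-based comb filter can be illustrated with a minimal sketch. It is not the filter construction used in the thesis: it assumes a simple feed-forward comb filter, a pitch period that is known and constant, and illustrative values for the sampling rate and pitch (16 kHz and 200 Hz). The names `comb_filter`, `rms`, `harmonic_gain`, and `inter_gain` are hypothetical.

```python
import numpy as np


def comb_filter(x, pitch_period):
    """Feed-forward comb filter: average x with a copy delayed by one pitch period.

    pitch_period is a positive integer number of samples (assumed known here;
    in practice it would come from a pitch estimator and vary over time).
    """
    x = np.asarray(x, dtype=float)
    delayed = np.zeros_like(x)
    delayed[pitch_period:] = x[:-pitch_period]
    return 0.5 * (x + delayed)


def rms(signal):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(signal ** 2)))


fs = 16000                      # assumed sampling rate in Hz
f0 = 200.0                      # assumed constant pitch in Hz
period = int(round(fs / f0))    # pitch period in samples
t = np.arange(fs) / fs          # one second of samples

# One tone on a pitch harmonic (2 * f0) and one halfway between harmonics (2.5 * f0).
harmonic = np.sin(2 * np.pi * 2.0 * f0 * t)
inter_harmonic = np.sin(2 * np.pi * 2.5 * f0 * t)

# Skip the first pitch period, where the delayed copy is still zero.
steady = slice(period, None)
harmonic_gain = rms(comb_filter(harmonic, period)[steady]) / rms(harmonic[steady])
inter_gain = rms(comb_filter(inter_harmonic, period)[steady]) / rms(inter_harmonic[steady])

print(f"gain on harmonic:       {harmonic_gain:.3f}")
print(f"gain between harmonics: {inter_gain:.3f}")
```

In this idealized setting, the tone on the harmonic passes essentially unchanged, while the tone halfway between two harmonics is cancelled. The same delay applied with a wrong pitch period would place the filter's notches on the harmonics instead, which is why the filter depends on an accurate pitch estimate.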