Voice Activity Detector

Master's Thesis, Nikola Horníková

Supervisor

RNDr. Marek Nagy, PhD.


Annotation

A voice activity detector (VAD) has many uses. Given an audio signal, it identifies which parts of the recording contain voice and which contain unwanted noise (silence). This can, for example, reduce the amount of data transferred in audio-conferencing applications. It also benefits a speech recognizer, which receives shorter segments of the recording to recognize and therefore makes fewer errors.


Goal

Create an algorithm for detecting voice in an audio recording, implemented in Octave (Matlab). The design must take into account that the VAD will be used in real time. Then rewrite the algorithm efficiently in JavaScript so that it can be used as a module in a web application with audio-conferencing support.
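
To make the real-time requirement concrete, here is a minimal sketch of a frame-by-frame detector. It is not the algorithm developed in the thesis: it swaps in a plain energy threshold over a running noise-floor estimate, and it is written in Python for brevity rather than in Octave or JavaScript. The names `EnergyVad`, `margin_db`, `noise_adapt` and `speech_segments` are hypothetical, and the sketch assumes the stream starts with a non-speech frame.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB (floored to avoid log of zero)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


class EnergyVad:
    """Energy-threshold detector that sees one frame at a time.

    Each decision uses only the current frame and a running noise-floor
    estimate, so the detector can run on a live stream.
    """

    def __init__(self, margin_db=10.0, noise_adapt=0.05):
        self.margin_db = margin_db      # hypothetical margin above the noise floor
        self.noise_adapt = noise_adapt  # hypothetical smoothing factor
        self.noise_db = None            # running noise-floor estimate in dB

    def is_speech(self, frame):
        level = frame_energy_db(frame)
        if self.noise_db is None:
            # Assumption: the first frame of the stream contains no speech.
            self.noise_db = level
            return False
        speech = level > self.noise_db + self.margin_db
        if not speech:
            # Adapt the noise floor only on frames judged to be noise.
            self.noise_db += self.noise_adapt * (level - self.noise_db)
        return speech


def speech_segments(decisions):
    """Turn per-frame booleans into (first_frame, last_frame) index pairs."""
    segments, start = [], None
    for i, is_voiced in enumerate(decisions):
        if is_voiced and start is None:
            start = i
        elif not is_voiced and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(decisions) - 1))
    return segments


# Synthetic stream: two quiet frames, three loud frames, one quiet frame.
quiet = [0.001 * math.sin(0.3 * n) for n in range(160)]
loud = [0.5 * math.sin(0.3 * n) for n in range(160)]
frames = [quiet, quiet, loud, loud, loud, quiet]

vad = EnergyVad()
decisions = [vad.is_speech(f) for f in frames]
print(speech_segments(decisions))  # prints [(2, 4)]
```

Only the segments reported by `speech_segments` would need to be transmitted or passed to a speech recognizer. A plain energy threshold breaks down in loud or non-stationary noise, which is the situation the formant-based and noise-estimation articles listed under Resources address.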


Main Chapters


Time schedule

October 2020 : Digital Signal Processing

November 2020 : Prepare and study resources

December 2020 : Implement the first resource

January 2021 : Research datasets

February 2021 : Implement mixing of recording

March 2021 : Run experiments

April 2021 : Implement results evaluation

May 2021 : Evaluate experiments

10 May 2021 : Create website and presentation

November 2021 : Implement second resource

November 2021 : Run experiments

December 2021 : Prepare for presentation

January 2022 : Evaluate experiments

February 2022 : Propose, choose a solution

March 2022 : Implement the solution

April 2022 : Evaluate the solution

May 2022 : Final touches, presentation

Resources

Type | Title | Author(s)
Article | Formant-Based Robust Voice Activity Detection | I. Yoo, H. Lim, D. Yook
Article | Vowel formants compared with resonances of the vocal tract | Daniel Aalto, Antti Huhtala, A. Kivelä, Jarmo Malinen, Pertti Palo, Jani Saunavaara, Martti Vainio
Dataset | English multi-speaker corpus for CSTR voice cloning toolkit | Junichi Yamagishi
Dataset | MUSAN: A Music, Speech, and Noise Corpus | David Snyder, Guoguo Chen, Daniel Povey
Article | Signal-to-noise ratio (SNR) as a measure of reproducibility: Design, estimation, and application | Naser Elkum, Mohamed Shoukri
Article | Speech enhancement for non-stationary noise environments | Israel Cohen, Baruch Berdugo
Article | Noise power spectral density estimation based on optimal smoothing and minimum statistics | R. Martin
Article | Computationally Efficient Speech Enhancement By Spectral Minima Tracking In Subbands | Gerhard Doblinger
Article | Assessing local noise level estimation methods: Application to noise robust ASR | Christophe Ris, Stéphane Dupont
Article | Speech enhancement using a minimum-mean square error short-time spectral amplitude estimator | Y. Ephraim, D. Malah
Article | Speech enhancement using a soft-decision noise suppression filter | R. McAulay, M. Malpass


Current Version


Theory

Check it out here!

Application

Code is here! (Datasets are not present)

Presentation

Presentation can be viewed here!