Voice activity detection in presence of transient noise using spectral clustering and diffusion kernels

Oren Rosen, Saman Mousazadeh, Israel Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we introduce a voice activity detection (VAD) algorithm based on spectral clustering and diffusion kernels. The proposed algorithm is a supervised learning algorithm comprising learning and testing stages. A sample cloud is produced for every signal frame using a moving window. Mel-frequency cepstral coefficients (MFCCs) are then calculated for every sample in the cloud to produce an MFCC matrix, and subsequently a covariance matrix, for every frame. From the covariance matrix, we calculate a similarity matrix using spectral clustering and diffusion kernel methods. Using the similarity matrix, we cluster the data and transform it to a new space where each point is labeled as speech or nonspeech. We then use a Gaussian mixture model (GMM) to build a statistical model for labeling data as speech or nonspeech. Simulation results demonstrate the advantages of the proposed algorithm over a recent VAD algorithm.
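
The embedding step described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel on Euclidean distances between toy feature vectors; the abstract does not specify the kernel or the distance, and the paper builds its similarity from per-frame MFCC covariance matrices rather than from raw vectors. The GMM stage is replaced here by a simple sign split on the first diffusion coordinate. The function names, the `epsilon` width, and the synthetic data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def diffusion_embedding(features, epsilon, n_coords=1):
    """Return the leading nontrivial diffusion coordinates of each row."""
    features = np.asarray(features, dtype=float)
    # Pairwise squared Euclidean distances between feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian affinity (assumed kernel; epsilon sets its width).
    affinity = np.exp(-sq_dist / epsilon)
    degree = affinity.sum(axis=1)
    # Symmetric matrix that shares eigenvalues with the Markov matrix D^-1 W.
    sym = affinity / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(sym)  # eigenvalues in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    # Convert to right eigenvectors of the Markov matrix.
    psi = eigvecs / np.sqrt(degree)[:, None]
    # Drop the trivial constant eigenvector and scale by the eigenvalues.
    return psi[:, 1:n_coords + 1] * eigvals[1:n_coords + 1]


def two_way_labels(embedding):
    """Split points by the sign of the first diffusion coordinate.

    This stands in for the GMM stage; which group receives label 1 is
    arbitrary because eigenvector signs are arbitrary.
    """
    return (embedding[:, 0] > 0).astype(int)


# Synthetic stand-ins for two kinds of frames (not real speech features).
rng = np.random.default_rng(0)
quiet = rng.normal(0.0, 0.3, size=(20, 3))
loud = rng.normal(4.0, 0.3, size=(20, 3))
labels = two_way_labels(diffusion_embedding(np.vstack([quiet, loud]), epsilon=8.0))
print(labels)
```

In this toy setting the two groups are far apart relative to the kernel width, so the first nontrivial diffusion coordinate is close to constant within each group and changes sign between them. Real frames contaminated by transient noise would not separate this cleanly, which is presumably why the paper fits a statistical model in the embedded space instead of thresholding.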

Original language: English
Title of host publication: 2014 IEEE 28th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014
ISBN (Electronic): 9781479959877
State: Published - 2014
Event: 2014 28th IEEE Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014 - Eilat, Israel
Duration: 3 Dec 2014 → 5 Dec 2014

Publication series

Name: 2014 IEEE 28th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014

Conference

Conference: 2014 28th IEEE Convention of Electrical and Electronics Engineers in Israel, IEEEI 2014
Country/Territory: Israel
City: Eilat
Period: 3/12/14 → 5/12/14

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
