Voice activity detection based on statistical likelihood ratio with adaptive thresholding

Xiaofei Li, Radu Horaud, Laurent Girin, Sharon Gannot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The statistical likelihood ratio test is a widely used voice activity detection (VAD) method, in which the likelihood ratio of the current temporal frame is compared with a threshold. A fixed threshold is typically used, but a single fixed value does not suit all types of noise. In this paper, an adaptive threshold is proposed as a function of the local statistics of the likelihood ratio. This threshold represents the upper bound of the likelihood ratio for non-speech frames, while generally remaining lower than the likelihood ratio for speech frames. As a result, a high non-speech hit rate can be achieved while keeping the speech hit rate as high as possible.
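
The thresholding idea described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' formulation: the abstract does not give the threshold formula, so the local statistic used here (trailing-window median plus a multiple of the median absolute deviation) is an assumed stand-in, and the names `adaptive_threshold`, `detect_speech`, `window`, and `k` are hypothetical. The per-frame likelihood ratios are assumed to be computed elsewhere.

```python
from statistics import median


def adaptive_threshold(lr, window=20, k=3.0):
    """Return one threshold per frame from trailing-window statistics of lr.

    ASSUMPTION: median + k * MAD is a stand-in for the paper's (unstated)
    function of the local likelihood-ratio statistics.
    """
    thresholds = []
    for t in range(len(lr)):
        # Trailing window of at most `window` frames, ending at frame t.
        recent = lr[max(0, t - window + 1): t + 1]
        med = median(recent)
        # Median absolute deviation: a spread estimate that is robust to
        # a few large (speech-like) values inside the window.
        mad = median([abs(x - med) for x in recent])
        thresholds.append(med + k * mad)
    return thresholds


def detect_speech(lr, window=20, k=3.0):
    """Flag a frame as speech when its likelihood ratio exceeds the threshold."""
    thresholds = adaptive_threshold(lr, window, k)
    return [x > th for x, th in zip(lr, thresholds)]
```

For example, `detect_speech([0.1, 0.2, 0.1, 0.2, 0.1, 0.2, 5.0, 0.1], window=6)` flags only the frame with value 5.0. Because the threshold follows the local level of the likelihood ratio, it sits just above the noise-only frames instead of at one fixed value, which is the behaviour the abstract attributes to the proposed threshold.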

Original language: English
Title of host publication: 2016 International Workshop on Acoustic Signal Enhancement, IWAENC 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509020072
State: Published - 19 Oct 2016
Event: 15th International Workshop on Acoustic Signal Enhancement, IWAENC 2016 - Xi'an, China
Duration: 13 Sep 2016 to 16 Sep 2016

Publication series

Name: 2016 International Workshop on Acoustic Signal Enhancement, IWAENC 2016

Conference

Conference: 15th International Workshop on Acoustic Signal Enhancement, IWAENC 2016
Country/Territory: China
City: Xi'an
Period: 13/09/16 to 16/09/16

Keywords

  • Adaptive threshold
  • High non-speech hit rate
  • Likelihood ratio test
  • Voice activity detection

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Acoustics and Ultrasonics
