Kernel method for speech source activity detection in multi-modal signals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We consider a setup in which a desired speech source is measured by a microphone and by a video camera in an interfering environment. We assume that the interfering sources in the audio signal are independent of those in the video signal (e.g., the video signal does not capture the interfering speakers). Our objective is to detect the activity of the desired source. To this end, we take a kernel-based geometric approach to obtain a representation of the measured signal in which the effect of the interfering sources is reduced. Based on this representation, we devise a measure of the activity of the desired source. Experimental results demonstrate that it outperforms competing methods in detecting speech in the presence of several challenging types of interference, including interfering speakers in the audio signal.
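The abstract does not specify how the kernel is built, so the following is only an illustrative sketch, not the authors' method. It assumes one common way to exploit the independence assumption: compute a row-normalized Gaussian affinity kernel per modality over time frames and multiply the two, so that structure present in only one modality (the interference) is averaged out while structure common to both (the desired source) survives. The function names, the median bandwidth heuristic, and the use of the second eigenvector as an activity score are all assumptions made for illustration.

```python
import numpy as np


def row_stochastic_kernel(feats, scale=1.0):
    """Gaussian affinity between frames, normalized so each row sums to 1.

    feats: array of shape (n_frames, n_features) for one modality.
    The median-distance bandwidth is a heuristic assumed for this sketch.
    """
    sq = np.sum(feats ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T, 0.0)
    eps = scale * np.median(d2) + 1e-12
    k = np.exp(-d2 / eps)
    return k / k.sum(axis=1, keepdims=True)


def fused_activity_score(audio_feats, video_feats):
    """Hypothetical activity score from the product of per-modality kernels.

    Returns the eigenvector of the fused kernel with the second-largest
    eigenvalue; its sign is arbitrary, so it must be oriented (or thresholded
    on magnitude) before use as a detector.
    """
    fused = row_stochastic_kernel(audio_feats) @ row_stochastic_kernel(video_feats)
    vals, vecs = np.linalg.eig(fused)
    order = np.argsort(-vals.real)
    # order[0] is the trivial constant eigenvector (eigenvalue 1).
    return vecs[:, order[1]].real
```

In this sketch both inputs must be frame-aligned, that is, row `i` of the audio features and row `i` of the video features describe the same time frame.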

Original language: English
Title of host publication: 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016
ISBN (Electronic): 9781509021529
State: Published - 4 Jan 2017
Event: 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016 - Eilat, Israel
Duration: 16 Nov 2016 to 18 Nov 2016

Publication series

Name: 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016

Conference

Conference: 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016
Country/Territory: Israel
City: Eilat
Period: 16/11/16 to 18/11/16

Keywords

  • Multi-modal signal processing
  • audio-visual speech activity detection
  • kernel methods

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Hardware and Architecture
  • Artificial Intelligence
  • Computer Networks and Communications
  • Electrical and Electronic Engineering
