TY - GEN
T1 - Kernel method for speech source activity detection in multi-modal signals
AU - Dov, David
AU - Talmon, Ronen
AU - Cohen, Israel
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/1/4
Y1 - 2017/1/4
N2 - We consider a problem setup in which a desired speech source is measured by a microphone and by a video camera in an interfering environment. We assume that the interfering sources in the audio signal are independent of the interfering sources in the video signal (e.g., the video signal does not capture the interfering speakers). Our objective in this paper is to detect the activity of the desired source. To address this problem, we take a kernel-based geometric approach for obtaining a representation of the measured signal in which the effect of the interfering sources is reduced. Based on this representation, we devise a measure for the activity of the desired source; experimental results demonstrate its superiority compared to competing methods in the detection of speech signals in the presence of different challenging types of interference, including interfering speakers in the audio signal.
KW - Multi-modal signal processing
KW - audio-visual speech activity detection
KW - kernel methods
UR - http://www.scopus.com/inward/record.url?scp=85014253575&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/ICSEE.2016.7806062
DO - https://doi.org/10.1109/ICSEE.2016.7806062
M3 - Conference contribution
T3 - 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016
BT - 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016
T2 - 2016 IEEE International Conference on the Science of Electrical Engineering, ICSEE 2016
Y2 - 16 November 2016 through 18 November 2016
ER -