Abstract

The problem of speaker tracking in noisy and reverberant enclosures is addressed in this paper. We present a hybrid algorithm that combines traditional tracking schemes with a new learning-based approach. A state-space representation, consisting of propagation and observation models, is learned from signals measured by several distributed microphone pairs. The proposed representation is based on two data modalities: high-dimensional acoustic features representing the full reverberant acoustic channels, and low-dimensional time difference of arrival (TDOA) estimates. The state-space representation is accompanied by a statistical model based on a Gaussian process, which relates the variations of the acoustic channels to the physical variations of the associated source positions, thereby forming a data-driven propagation model for the source movement. In the observation model, the source positions are nonlinearly mapped to the associated TDOA readings. The obtained propagation and observation models establish the basis for employing an extended Kalman filter. The simulation results demonstrate the robustness of the proposed method in noisy and reverberant conditions.
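The nonlinear observation model and the extended Kalman filter (EKF) update mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 2-D microphone geometry, the speed of sound, the noise variances, the function names, and the choice to process the TDOA readings one scalar at a time. In particular, `ekf_predict` uses a plain random-walk model as a stand-in for the Gaussian-process-based, data-driven propagation model, which the abstract does not specify in enough detail to reproduce.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def tdoa(pos, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Observation model: TDOA in seconds between mic_a and mic_b."""
    return (math.dist(pos, mic_a) - math.dist(pos, mic_b)) / c


def tdoa_jacobian(pos, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Gradient of the TDOA with respect to the source position."""
    da, db = math.dist(pos, mic_a), math.dist(pos, mic_b)
    return [((p - a) / da - (p - b) / db) / c
            for p, a, b in zip(pos, mic_a, mic_b)]


def ekf_predict(x, P, q_var):
    """Random-walk propagation (a stand-in, NOT the learned model)."""
    n = len(x)
    P_pred = [[P[i][j] + (q_var if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    return list(x), P_pred


def ekf_update(x, P, z, mic_pairs, r_var):
    """EKF measurement update, one scalar TDOA reading at a time."""
    n = len(x)
    for z_k, (mic_a, mic_b) in zip(z, mic_pairs):
        H = tdoa_jacobian(x, mic_a, mic_b)  # linearize at the current x
        PHt = [sum(P[i][j] * H[j] for j in range(n)) for i in range(n)]
        S = sum(H[i] * PHt[i] for i in range(n)) + r_var  # innovation variance
        K = [v / S for v in PHt]  # Kalman gain
        innov = z_k - tdoa(x, mic_a, mic_b)
        x = [x[i] + K[i] * innov for i in range(n)]
        # P <- P - K H P, using H P = (P H^T)^T for symmetric P
        P = [[P[i][j] - K[i] * PHt[j] for j in range(n)] for i in range(n)]
    return x, P


# Hypothetical 2-D setup: three microphone pairs, positions in metres.
mic_pairs = [((0.0, 0.0), (0.2, 0.0)),
             ((4.0, 0.0), (4.0, 0.2)),
             ((0.0, 3.0), (0.2, 3.0))]
true_pos = [2.5, 1.0]
z = [tdoa(true_pos, a, b) for a, b in mic_pairs]  # noiseless readings

x, P = [2.3, 1.2], [[0.5, 0.0], [0.0, 0.5]]
for _ in range(10):
    x, P = ekf_predict(x, P, q_var=0.01)
    x, P = ekf_update(x, P, z, mic_pairs, r_var=1e-9)
print("estimated position:", x)
```

Processing the readings sequentially avoids a matrix inversion and is valid when the measurement noise is uncorrelated across microphone pairs, which is assumed here. A batch update with a full innovation covariance would be an equally reasonable choice.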

Original language: English
Article number: 8248766
Pages (from-to): 725-735
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio Speech and Language Processing
Volume: 26
Issue number: 4
State: Published - Apr 2018

Keywords

  • Gaussian process
  • Speaker tracking
  • extended Kalman filter (EKF)
  • relative transfer function (RTF)
  • time difference of arrival (TDOA)

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Title: A Hybrid Approach for Speaker Tracking Based on TDOA and Data-Driven Models
