Array Configuration Mismatch in Deep DOA Estimation: Towards Robust Training

Ayal Schwartz, Elior Hadad, Sharon Gannot, Shlomo E. Chazan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep direction-of-arrival (DOA) models commonly require a perfect match between the array configurations in the training and test stages and consequently cannot be applied to unfamiliar microphone array constellations. In this paper, we present a deep DOA estimation method that circumvents this requirement. In our approach, we first cast DOA estimation as a classification problem in each time-frequency (TF) bin, thus facilitating the localization of multiple concurrent speakers. As the input feature, we use a high-resolution spatial image based on a narrow-band variant of the steered response power phase transform (SRP-PHAT) processor. The model is trained with simulated data using a single microphone array configuration in various acoustic conditions. In the test stage, the algorithm is applied to unfamiliar microphone array constellations, namely arrays with a different number of microphones and different inter-microphone distances. An extensive experimental study with real-life room impulse response (RIR) recordings demonstrates the effectiveness of the proposed input feature and the training scheme. Our approach achieves comparable results in familiar microphone array constellations and, more importantly, can accurately estimate the DOA of multiple concurrent speakers even with unfamiliar microphone arrays.
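
As an illustration of the kind of input feature the abstract describes, the following minimal sketch computes a narrow-band SRP-PHAT score over candidate directions for a single TF bin. It is not the authors' implementation: the far-field planar propagation model, the speed of sound, the normalization by the number of microphone pairs, and the names `narrowband_srp_phat` and `simulate_plane_wave` are all assumptions made for this example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def _advances(mic_xy, angle_deg):
    """Far-field arrival-time advance (s) of each microphone for one direction."""
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    return [(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mic_xy]


def simulate_plane_wave(mic_xy, freq_hz, doa_deg):
    """Hypothetical helper: one noiseless STFT bin of a unit plane wave."""
    adv = _advances(mic_xy, doa_deg)
    return [cmath.exp(2j * math.pi * freq_hz * a) for a in adv]


def narrowband_srp_phat(stft_bin, mic_xy, freq_hz, angles_deg):
    """Score each candidate angle for a single TF bin.

    stft_bin: one complex STFT coefficient per microphone.
    mic_xy:   (x, y) microphone positions in metres.
    Returns one score per candidate angle, normalized by the number of
    microphone pairs (an assumption) so that the range is [-1, 1].
    """
    n_mics = len(stft_bin)
    n_pairs = n_mics * (n_mics - 1) // 2
    scores = []
    for angle in angles_deg:
        adv = _advances(mic_xy, angle)
        total = 0.0
        for m in range(n_mics):
            for n in range(m + 1, n_mics):
                cross = stft_bin[m] * stft_bin[n].conjugate()
                mag = abs(cross)
                if mag == 0.0:
                    continue  # no phase information in this pair
                phat = cross / mag  # PHAT weighting: keep phase only
                # Undo the phase difference expected for this direction.
                steer = cmath.exp(-2j * math.pi * freq_hz * (adv[m] - adv[n]))
                total += (phat * steer).real
        scores.append(total / n_pairs)
    return scores
```

Because the output is indexed by candidate angle rather than by microphone, the same function can be called with arrays that differ in microphone count and spacing, which is consistent with the array-agnostic input the abstract argues for. In this sketch the score vector for one bin would play the role of one column of the spatial image; how the paper assembles and classifies the full image is not specified here.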

Original language: English
Title of host publication: Proceedings of the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350323726
State: Published - 2023
Event: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023 - New Paltz, United States
Duration: 22 Oct 2023 → 25 Oct 2023

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Volume: 2023-October

Conference

Conference: 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2023
Country/Territory: United States
City: New Paltz
Period: 22/10/23 → 25/10/23

Keywords

  • Deep DOA
  • SRP-PHAT

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering
  • Computer Science Applications
