Direction Of Arrival Estimation For Reverberant Speech Based On Neural Networks And The Direct-Path Dominance Test

Orel Ben Zaken, Boaz Rafaely, Anurag Kumar, Vladimir Tourbabin

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

In reverberant environments, which are typical of real-world scenarios, direction of arrival (DOA) estimation for speech sources is a challenging problem in audio signal processing. An effective way of overcoming this challenge is to perform a direct-path dominance (DPD) test. The DPD test identifies time-frequency bins that are dominated by the direct sound and therefore hold accurate DOA information. In recent years, methods based on neural networks (NN) have also been developed to estimate DOA. Building on the NN approach, this work proposes an NN-based method for spherical arrays that generalizes the original DPD test and aims to improve its performance by exploiting additional information in the data while preserving its advantages. The paper presents results of the proposed method for a single speaker in a room and, by evaluating performance on simulated data, analyzes which features contain useful information about the direct sound.
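For readers unfamiliar with the DPD test mentioned in the abstract, the following is a minimal sketch of the idea as it is commonly described: a local spatial correlation matrix is averaged over a small time-frequency neighbourhood, and the bin is accepted when that matrix is close to rank one, i.e. when its largest singular value dominates the second largest. This is not the NN-based method proposed in the paper. The function name `dpd_mask`, its parameters, and the default threshold are hypothetical, and the input is assumed to already be in a representation where averaging across adjacent frequencies is meaningful (for example, steering-normalized spherical harmonics coefficients).

```python
import numpy as np


def dpd_mask(coeffs, time_avg=3, freq_avg=3, threshold=4.0):
    """Flag time-frequency neighbourhoods dominated by a single source.

    coeffs: complex array of shape (channels, frames, bins).
    Returns a boolean array of shape
    (frames - time_avg + 1, bins - freq_avg + 1).
    All names and defaults here are illustrative assumptions.
    """
    n_ch, n_frames, n_bins = coeffs.shape
    out_t = n_frames - time_avg + 1
    out_f = n_bins - freq_avg + 1
    mask = np.zeros((out_t, out_f), dtype=bool)
    for t in range(out_t):
        for f in range(out_f):
            # Collect the neighbourhood's bins as columns of a matrix.
            block = coeffs[:, t:t + time_avg, f:f + freq_avg].reshape(n_ch, -1)
            # Local spatial correlation matrix, averaged over the neighbourhood.
            corr = block @ block.conj().T / block.shape[1]
            # Singular values are returned in descending order.
            s = np.linalg.svd(corr, compute_uv=False)
            # Near rank one: the first singular value dominates the second.
            mask[t, f] = s[0] > threshold * max(s[1], 1e-12)
    return mask
```

In this sketch, a neighbourhood containing only one coherent wavefront yields a rank-one correlation matrix and passes the test, whereas a neighbourhood in which a second, uncorrelated wavefront of similar power is present does not.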

Original language: American English
Title of host publication: International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings
ISBN (Electronic): 9781665468671
State: Published - 1 Jan 2022
Event: 17th International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Bamberg, Germany
Duration: 5 Sep 2022 to 8 Sep 2022

Publication series

Name: International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings

Conference

Conference: 17th International Workshop on Acoustic Signal Enhancement, IWAENC 2022
Country/Territory: Germany
City: Bamberg
Period: 5/09/22 to 8/09/22

Keywords

  • Speaker localization
  • machine learning
  • spherical arrays

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Acoustics and Ultrasonics
