TY - GEN
T1 - Direction Of Arrival Estimation For Reverberant Speech Based On Neural Networks And The Direct-Path Dominance Test
AU - Zaken, Orel Ben
AU - Rafaely, Boaz
AU - Kumar, Anurag
AU - Tourbabin, Vladimir
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022/1/1
Y1 - 2022/1/1
N2 - In reverberant environments, typical of real-world scenarios, direction of arrival (DOA) estimation for speech sources is a challenging problem in audio signal processing. An effective way of overcoming this challenge is to perform a direct-path dominance (DPD) test. The DPD test identifies time-frequency bins that are dominated by the direct sound and therefore hold accurate DOA data. In recent years, methods based on neural networks (NN) have been developed to estimate DOA. Building on the NN approach, this work proposes an NN-based method for spherical arrays that generalizes the original DPD test and aims to improve its performance by utilizing new information in the data while preserving its advantages. This article presents the results of the proposed method for a single speaker in a room, and analyzes which features contain useful information about the direct sound by evaluating performance on simulated data.
KW - Speaker localization
KW - machine learning
KW - spherical arrays
UR - http://www.scopus.com/inward/record.url?scp=85141342184&partnerID=8YFLogxK
U2 - 10.1109/IWAENC53105.2022.9914696
DO - 10.1109/IWAENC53105.2022.9914696
M3 - Conference contribution
T3 - International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings
BT - International Workshop on Acoustic Signal Enhancement, IWAENC 2022 - Proceedings
T2 - 17th International Workshop on Acoustic Signal Enhancement, IWAENC 2022
Y2 - 5 September 2022 through 8 September 2022
ER -