Switching Kronecker Product Linear Filtering for Multispeaker Adaptive Speech Dereverberation

Gongping Huang, Jacob Benesty, Israel Cohen, Emil Winebrand, Jingdong Chen, Walter Kellermann

Research output: Contribution to journal › Conference article › peer-review

Abstract

Dereverberation, the process of mitigating or eliminating reverberation, plays an important role in hands-free speech communication and human-machine interfaces. Considerable effort has been devoted to this problem, and various methods have been developed over the last three decades. These methods generally assume that there is only a single speaker in the acoustic environment and, consequently, suffer significant performance degradation when multiple speakers participate in the conversation. How to deal with reverberation in multiple-speaker scenarios remains a challenging problem, which is studied in this work. We present a switching multichannel linear prediction filtering method, which designs multiple linear filters, each tracking one speaker. When a speaker is active, the corresponding filter and weighted cross-correlation matrix are updated while the other filters are kept unchanged. To further improve performance and reduce complexity, we apply the Kronecker product to decompose every linear prediction filter into a Kronecker product of two shorter filters: one time-invariant and the other time-varying. The former is estimated with a batch method (using only a few seconds of speech from when the corresponding speaker first starts to talk in the conversation), while a recursive least-squares algorithm is derived for identifying the time-varying set of Kronecker filters.
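
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the names `kron_filter_output` and `SwitchingRLS` are hypothetical, the signals are real-valued, and the update is a generic textbook recursive least-squares step rather than the weighted, Kronecker-structured recursion derived in the paper. The first function shows why a Kronecker decomposition is cheaper: a filter of length m·n is represented by m + n coefficients, and its output can be computed without forming the long filter. The class shows the switching idea, where only the active speaker's filter and its matrix are updated.

```python
import numpy as np


def kron_filter_output(h1, h2, x):
    """Return kron(h1, h2) @ x without building the long filter.

    Uses the identity kron(h1, h2) @ x == h1 @ X @ h2, where X is x
    reshaped (row-major) to shape (len(h1), len(h2)).
    """
    X = x.reshape(h1.size, h2.size)  # row i is the block weighted by h1[i]
    return h1 @ X @ h2


class SwitchingRLS:
    """Bank of generic RLS filters, one per speaker (illustrative only)."""

    def __init__(self, num_speakers, length, forgetting=0.99, delta=100.0):
        self.lam = forgetting  # forgetting factor, assumed value
        self.w = [np.zeros(length) for _ in range(num_speakers)]
        # One inverse-correlation matrix per speaker (assumed initialization).
        self.P = [delta * np.eye(length) for _ in range(num_speakers)]

    def update(self, active, x, d):
        """Update only the filter of the active speaker; return the a priori error."""
        w, P = self.w[active], self.P[active]
        Px = P @ x
        k = Px / (self.lam + x @ Px)   # gain vector
        e = d - w @ x                  # a priori prediction error
        self.w[active] = w + k * e
        self.P[active] = (P - np.outer(k, Px)) / self.lam
        return e
```

In this sketch the speaker index `active` is assumed to be given; how speaker activity is detected is outside what the abstract describes.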

Keywords

  • Dereverberation
  • Kronecker product
  • linear prediction
  • switching filter
  • weighted-prediction-error

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
