Abstract
This paper addresses blind audio source separation (BASS) in noisy and reverberant conditions with a novel approach, termed Global and LOcal Simplex Separation (GLOSS), which integrates full-band and narrow-band simplex representations. We show that the eigenvectors of the correlation matrix between time frames in a given frequency band form a simplex that organizes the frames according to the speaker activities in that band. We propose to build two simplex representations: a global one based on a broad frequency band and a local one based on a narrow band. The two representations are then combined to determine the dominant speaker in each time-frequency (TF) bin. From the identified dominant speakers, a spectral mask is computed and used to extract each speaker by spatial beamforming followed by spectral postfiltering. The performance of the proposed algorithm is demonstrated using real-life recordings in various noisy and reverberant conditions.
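The simplex property stated in the abstract can be illustrated with a toy sketch. This is not the GLOSS implementation. It assumes an idealized model, not taken from the abstract, in which the correlation between two frames equals the inner product of their speaker-activity probability vectors. The names `simplex_embedding` and `toy_demo`, the Dirichlet-distributed activities, and all parameter values are hypothetical. Under that assumption the leading eigenvectors of the frame-correlation matrix are a linear image of the activity probabilities, so the embedded frames lie on a simplex whose vertices correspond to single-speaker frames.

```python
import numpy as np


def simplex_embedding(W, num_speakers):
    """Return the eigenvectors of the symmetric matrix W that belong to
    its `num_speakers` largest eigenvalues, one row per time frame."""
    eigvals, eigvecs = np.linalg.eigh(W)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_speakers]
    return eigvecs[:, order]


def toy_demo(num_frames=200, num_speakers=3, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical per-frame speaker-activity probabilities (rows sum to 1).
    P = rng.dirichlet(alpha=np.full(num_speakers, 0.3), size=num_frames)
    # Idealized assumption: frame correlation = inner product of activities.
    W = P @ P.T
    U = simplex_embedding(W, num_speakers)
    # If U is a linear image of P, least squares maps P onto U with
    # a residual that is zero up to floating-point error.
    G, *_ = np.linalg.lstsq(P, U, rcond=None)
    residual = np.linalg.norm(P @ G - U)
    return P, U, residual


if __name__ == "__main__":
    P, U, residual = toy_demo()
    print("embedding shape:", U.shape)
    print("residual of linear fit:", residual)
```

With real recordings the correlation matrix is only approximately of this form, so the embedded frames would scatter around the simplex rather than lie exactly on it.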
Original language | English |
---|---|
Article number | 9004553 |
Pages (from-to) | 914-928 |
Number of pages | 15 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 28 |
State | Published - 2020 |
Keywords
- Blind audio source separation (BASS)
- beamformer
- relative transfer function (RTF)
- simplex
- spectral mask
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
- Acoustics and Ultrasonics
- Computational Mathematics
- Electrical and Electronic Engineering