Abstract
Blind source separation is addressed using a novel data-driven approach based on a well-established probabilistic model. The proposed method is specifically designed for separation of multichannel audio mixtures. The algorithm relies on spectral decomposition of the correlation matrix between different time frames. The probabilistic model implies that the column space of the correlation matrix is spanned by the probabilities of the various speakers across time. The number of speakers is recovered from the eigenvalue decay, and the eigenvectors form a simplex of the speakers' probabilities. Time frames dominated by each of the speakers are identified by applying convex geometry tools to the recovered simplex. The mixing acoustic channels are estimated from the identified sets of frames, and linear unmixing is performed to extract the individual speakers. The derived simplexes are visually demonstrated for mixtures of two, three, and four speakers. We also conduct a comprehensive experimental study, showing high separation capabilities in various reverberation conditions.
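The counting and frame-identification steps described in the abstract can be illustrated with a toy sketch on synthetic data. This is not the authors' implementation: the frame correlation matrix is simulated directly as `P @ P.T` from made-up speaker probabilities `P`, the relative eigenvalue threshold `rel_threshold` is a hypothetical stand-in for inspecting the eigenvalue decay, and a generic successive-projection search stands in for the convex geometry tools mentioned above. The function name `count_and_find_vertices` is likewise an assumption.

```python
import numpy as np


def count_and_find_vertices(W, rel_threshold=0.01):
    """Estimate the speaker count and one dominated frame per speaker.

    W is a symmetric frame-by-frame correlation matrix. rel_threshold is a
    hypothetical cutoff, not a value taken from the paper.
    """
    eigvals, eigvecs = np.linalg.eigh(W)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumption: count the eigenvalues that are not negligible relative
    # to the largest one, as a crude proxy for the eigenvalue decay.
    num_speakers = int(np.sum(eigvals > rel_threshold * eigvals[0]))

    # Each frame becomes a point whose coordinates are its entries in the
    # leading eigenvectors; these points lie (approximately) on a simplex.
    residual = eigvecs[:, :num_speakers].copy()

    # Successive projection: take the point of largest norm as a vertex,
    # project its direction out of all points, and repeat.
    vertex_frames = []
    for _ in range(num_speakers):
        idx = int(np.argmax(np.linalg.norm(residual, axis=1)))
        vertex_frames.append(idx)
        direction = residual[idx] / np.linalg.norm(residual[idx])
        residual = residual - np.outer(residual @ direction, direction)
    return num_speakers, vertex_frames


# Synthetic data: each row of P holds one frame's speaker probabilities.
rng = np.random.default_rng(0)
num_frames, true_speakers = 300, 3
P = rng.dirichlet(0.3 * np.ones(true_speakers), size=num_frames)
P[:true_speakers] = np.eye(true_speakers)  # plant one pure frame per speaker

# Simulated correlation matrix: rank-J structure plus small symmetric noise.
noise = 1e-4 * rng.standard_normal((num_frames, num_frames))
W = P @ P.T + (noise + noise.T) / 2

num_speakers, vertex_frames = count_and_find_vertices(W)
dominant = [int(np.argmax(P[i])) for i in vertex_frames]
print(num_speakers, vertex_frames, dominant)
```

In this sketch each returned frame index points at a frame dominated by a different speaker. In a real pipeline, frames close to each vertex would then be grouped and used to estimate the acoustic channels, a step this sketch does not attempt.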
Original language | English |
---|---|
Article number | 8493325 |
Pages (from-to) | 6458-6473 |
Number of pages | 16 |
Journal | IEEE Transactions on Signal Processing |
Volume | 66 |
Issue number | 24 |
State | Published - 15 Dec 2018 |
Keywords
- Blind audio source separation (BASS)
- relative transfer function (RTF)
- simplex
- spectral decomposition
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering