Abstract
This paper addresses the problem of speech separation and enhancement from multichannel convolutive and noisy mixtures, assuming known mixing filters. We propose to perform speech separation and enhancement in the short-time Fourier transform domain using the convolutive transfer function (CTF) approximation. Compared to time-domain filters, the CTF has far fewer taps. Consequently, it incurs a lower computational cost and is sometimes more robust against filter perturbations. We propose three methods: 1) for the multisource case, the multichannel inverse filtering method, i.e., the multiple input/output inverse theorem (MINT), is exploited in the CTF domain; 2) a beamforming-like multichannel inverse filtering method that applies the single-source MINT and uses power minimization, which is suitable whenever the source CTFs are not all known; and 3) a basis pursuit method, where the sources are recovered by minimizing their ℓ1-norm to impose spectral sparsity, while the ℓ2-norm fitting cost between the microphone signals and the mixing model is constrained to be lower than a tolerance. The noise can be reduced by setting this tolerance at the noise power level. Experiments under various acoustic conditions are carried out to evaluate and compare the three proposed methods. Comparison with four baseline methods (beamforming-based, two time-domain inverse filters, and time-domain Lasso) shows the applicability of the proposed methods.
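As a rough illustration of the MINT idea mentioned in the abstract, the following sketch computes multichannel inverse filters for a single source by least squares. It is a real-valued, time-domain toy example and not the paper's CTF-domain method; the helper names `conv_matrix` and `mint_inverse`, the random filters, and the chosen lengths are all assumptions made for illustration.

```python
import numpy as np

def conv_matrix(h, n_in):
    """Return the (len(h) + n_in - 1, n_in) matrix C with C @ g == np.convolve(h, g)."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_in - 1, n_in))
    for j in range(n_in):
        C[j:j + len(h), j] = h  # column j holds h shifted down by j samples
    return C

def mint_inverse(filters, inv_len, delay=0):
    """Least-squares inverse filters g_i such that sum_i h_i * g_i approximates a delayed impulse."""
    # Stack the per-channel convolution matrices side by side.
    H = np.hstack([conv_matrix(h, inv_len) for h in filters])
    target = np.zeros(H.shape[0])
    target[delay] = 1.0
    # Minimum-norm least-squares solution of H @ g = target.
    g, *_ = np.linalg.lstsq(H, target, rcond=None)
    return np.split(g, len(filters)), target

# Hypothetical toy data: three channels with random 8-tap mixing filters.
rng = np.random.default_rng(0)
filters = [rng.standard_normal(8) for _ in range(3)]
inverse, target = mint_inverse(filters, inv_len=7)

# Filter each channel with its inverse filter and sum: ideally a unit impulse.
equalized = sum(np.convolve(h, g) for h, g in zip(filters, inverse))
residual = float(np.max(np.abs(equalized - target)))
print(f"max deviation from impulse: {residual:.2e}")
```

With more than one channel and filters that share no common zeros, the stacked system can be solved exactly, which is why the summed output is expected to be close to an impulse here. The abstract's first two methods apply this kind of inversion to CTFs in the short-time Fourier transform domain rather than to time-domain filters.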
Original language | English |
---|---|
Article number | 8610134 |
Pages (from-to) | 645-659 |
Number of pages | 15 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 27 |
Issue number | 3 |
State | Published - Mar 2019 |
Keywords
- Audio source separation
- Lasso optimization
- MINT
- convolutive transfer function
- short-time Fourier transform
- speech enhancement
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
- Acoustics and Ultrasonics
- Computational Mathematics
- Electrical and Electronic Engineering