Domain Adaptation Using Suitable Pseudo Labels for Speech Enhancement and Dereverberation

Lior Frenkel, Shlomo E. Chazan, Jacob Goldberger

Research output: Contribution to journal › Article › peer-review

Abstract

Neural-network-based speech enhancement and dereverberation approaches learn a transformation from noisy to clean speech using supervised learning. However, networks trained in this way may fail to handle languages, noise types, or acoustic environments that were not included in the training data. To tackle this issue, this study centers on unsupervised domain adaptation, specifically on scenarios with a substantial domain gap, in which noisy speech from the new domain is available but the corresponding clean speech is not. We propose an adaptation method based on domain-adversarial training followed by iterative self-training, in which the estimated speech is used as pseudo labels and target samples are gradually introduced to the network according to their similarity to the source domain. The self-training also utilizes labeled source-domain samples that are similar to the target domain. Experimental results show that our method effectively mitigates the domain mismatch between the training and test sets and thus outperforms current baselines.
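
The self-training stage described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the names (self_train_round, similarity, threshold) are hypothetical, PyTorch is assumed as the framework, and the paper's actual similarity measure, loss, and schedule are given in the full text, not here.

    import torch

    def self_train_round(model, optimizer, source_pairs, target_noisy,
                         similarity, threshold,
                         loss_fn=torch.nn.functional.l1_loss):
        # Hypothetical sketch of one self-training round: target samples
        # whose similarity to the source domain exceeds the current
        # threshold are pseudo-labeled by the model itself, then the
        # model is fine-tuned on them together with labeled source pairs.
        accepted = [x for x, s in zip(target_noisy, similarity) if s >= threshold]

        # Pseudo labels: the model's own estimate of the clean speech.
        model.eval()
        with torch.no_grad():
            pseudo_clean = [model(x) for x in accepted]

        model.train()
        # Fine-tune on the pseudo-labeled target samples ...
        for x, y in zip(accepted, pseudo_clean):
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        # ... and on labeled (noisy, clean) pairs from the source domain.
        for x, y in source_pairs:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()

Lowering the threshold between rounds gradually admits more target samples, matching the curriculum described in the abstract; the domain-adversarial pre-adaptation step precedes this loop and is omitted here.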

Original language: English
Pages (from-to): 1226-1236
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 32
DOIs
State: Published - 2024

Keywords

  • Unsupervised domain adaptation
  • dereverberation
  • pseudo labels
  • self-training
  • speech enhancement

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Computational Mathematics
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics
