ATTENUATION OF ACOUSTIC EARLY REFLECTIONS IN TELEVISION STUDIOS USING PRETRAINED SPEECH SYNTHESIS NEURAL NETWORK

Tomer Rosenbaum, Israel Cohen, Emil Winebrand

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Machine learning and digital signal processing have been extensively used to enhance speech. However, methods to reduce early reflections in studio settings are usually related to the physical characteristics of the room. In this paper, we address the problem of early acoustic reflections in television studios and control rooms, and propose a two-stage method that exploits the knowledge of a pretrained speech synthesis generator. First, given a degraded speech signal that includes the direct sound and early reflections, a U-Net convolutional neural network is used to attenuate the early reflections in the spectral domain. Then, a pretrained speech synthesis generator reconstructs the phase to predict an enhanced speech signal in the time domain. Qualitative and quantitative experimental results demonstrate excellent studio quality of speech enhancement.
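The degraded input described in the abstract, a direct sound plus early reflections, can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `add_early_reflections` and the delay and gain values are hypothetical, and each reflection is assumed to be a pure delayed and scaled copy of the direct sound, whereas real studio surfaces would also filter it.

```python
def add_early_reflections(direct, reflections):
    """Mix a direct-sound signal with delayed, attenuated copies of itself.

    direct: list of float samples (the direct sound).
    reflections: list of (delay_in_samples, gain) pairs; illustrative values only.
    """
    # Extend the output so the latest reflection still fits.
    max_delay = max((delay for delay, _ in reflections), default=0)
    out = list(direct) + [0.0] * max_delay
    for delay, gain in reflections:
        for n, sample in enumerate(direct):
            out[n + delay] += gain * sample
    return out


# Feeding a unit impulse returns the assumed early impulse response itself.
impulse = [1.0, 0.0, 0.0, 0.0]
degraded = add_early_reflections(impulse, [(2, 0.5), (5, 0.25)])
print(degraded)
# → [1.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.0]
```

In this toy model, attenuating early reflections amounts to recovering `direct` from `degraded`. The paper's approach does this in the spectral domain with a U-Net and then reconstructs the phase with a pretrained speech synthesis generator; neither stage is reproduced here.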

Original language: English
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Pages: 7422-7426
Number of pages: 5
ISBN (Electronic): 9781665405409
State: Published - 2022
Event: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore
Duration: 23 May 2022 to 27 May 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May

Conference

Conference: 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Virtual, Online
Period: 23/05/22 to 27/05/22

Keywords

  • Acoustic early reflections
  • generative adversarial networks
  • speech dereverberation
  • speech synthesis

All Science Journal Classification (ASJC) codes

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
