Skip to main navigation Skip to search Skip to main content

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example

Gal Greshler, Tamar Rott Shaham, Tomer Michaeli

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Models for audio generation are typically trained on hours of recordings. Here, we illustrate that capturing the essence of an audio source is typically possible from as little as a few tens of seconds from a single training signal. Specifically, we present a GAN-based generative model that can be trained on one short audio signal from any domain (e.g. speech, music, etc.) and does not require pre-training or any other form of external supervision. Once trained, our model can generate random samples of arbitrary duration that maintain semantic similarity to the training waveform, yet exhibit new compositions of its audio primitives. This enables a long line of interesting applications, including generating new jazz improvisations or new a-cappella rap variants based on a single short example, producing coherent modifications to famous songs (e.g. adding a new verse to a Beatles song based solely on the original recording), filling-in of missing parts (inpainting), extending the bandwidth of a speech signal (super-resolution), and enhancing old recordings without access to any clean training example. We show that in most cases, no more than 20 seconds of training audio suffice for our model to achieve state-of-the-art results. This is despite its complete lack of prior knowledge about the nature of audio signals in general.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
EditorsMarc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Pages20916-20928
Number of pages13
ISBN (Electronic)9781713845393
StatePublished - 2021
Event35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: 6 Dec 202114 Dec 2021

Publication series

NameAdvances in Neural Information Processing Systems
Volume25

Conference

Conference35th Conference on Neural Information Processing Systems, NeurIPS 2021
CityVirtual, Online
Period6/12/2114/12/21

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Catch-A-Waveform: Learning to Generate Audio from a Single Short Example'. Together they form a unique fingerprint.

Cite this