Supervised non-euclidean sparse NMF via bilevel optimization with applications to speech enhancement

Pablo Sprechmann, Alex M. Bronstein, Guillermo Sapiro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned, and a testing stage, in which the pre-learned model is used in a high-level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method that adapts the basis spectra learned in the first stage to improve performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program that can be efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors on the activations and to allow non-Euclidean data terms such as β-divergences. The framework is evaluated on single-channel speech enhancement tasks.
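
As a rough illustration of the ingredients named in the abstract, the sketch below computes a β-divergence and infers sparse activations for a fixed dictionary. This corresponds only to an inner (lower-level) problem of the kind a bilevel program would wrap; it does not implement the task-supervised dictionary adaptation itself. The abstract does not give the formulation, so the L1 penalty weight `lam`, the multiplicative update rule (a commonly used heuristic for sparse β-NMF), and all function names are assumptions made for illustration, not the authors' method.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def beta_divergence(V, V_hat, beta):
    """Sum of element-wise beta-divergences d_beta(V | V_hat)."""
    V = np.maximum(V, EPS)
    V_hat = np.maximum(V_hat, EPS)
    if beta == 1:  # generalized Kullback-Leibler divergence
        return np.sum(V * np.log(V / V_hat) - V + V_hat)
    if beta == 0:  # Itakura-Saito divergence
        ratio = V / V_hat
        return np.sum(ratio - np.log(ratio) - 1.0)
    # General case; beta == 2 reduces to half the squared Euclidean distance.
    num = V**beta + (beta - 1) * V_hat**beta - beta * V * V_hat ** (beta - 1)
    return np.sum(num) / (beta * (beta - 1))


def objective(V, W, H, beta, lam):
    """Assumed lower-level cost: data term plus L1 sparsity penalty on H."""
    return beta_divergence(V, W @ H, beta) + lam * np.sum(H)


def sparse_activations(V, W, beta=1.0, lam=0.1, n_iter=200, seed=0):
    """Infer non-negative activations H for a fixed dictionary W.

    Uses heuristic multiplicative updates with the L1 weight added to the
    denominator (an assumption, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        V_hat = np.maximum(W @ H, EPS)
        numer = W.T @ (V * V_hat ** (beta - 2))
        denom = W.T @ (V_hat ** (beta - 1)) + lam
        H *= numer / np.maximum(denom, EPS)
    return H
```

In a task-supervised setting, the dictionary `W` would then be adjusted so that the activations returned by such an inner solver perform well on the downstream task; how the gradient through the inner problem is obtained is specific to the paper and is not reproduced here.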

Original language: English
Title of host publication: 2014 4th Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, HSCMA 2014
Pages: 11-15
Number of pages: 5
State: Published - 2014
Externally published: Yes
Event: 2014 4th Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, HSCMA 2014 - Villers-les-Nancy, France
Duration: 12 May 2014 to 14 May 2014

Publication series

Name: 2014 4th Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, HSCMA 2014

Conference

Conference: 2014 4th Joint Workshop on Hands-Free Speech Communication and Microphone Arrays, HSCMA 2014
Country/Territory: France
City: Villers-les-Nancy
Period: 12/05/14 to 14/05/14

Keywords

  • NMF
  • Supervised learning
  • bilevel
  • speech enhancement
  • task-specific learning

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
