TY - GEN
T1 - Voice separation with an unknown number of multiple speakers
AU - Nachmani, Eliya
AU - Adi, Yossi
AU - Wolf, Lior
N1 - Publisher Copyright: © 2020 37th International Conference on Machine Learning, ICML 2020. All rights reserved.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - We present a new method for separating a mixed audio sequence, in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
UR - http://www.scopus.com/inward/record.url?scp=85105253554&partnerID=8YFLogxK
M3 - Conference contribution
T3 - 37th International Conference on Machine Learning, ICML 2020
SP - 7121
EP - 7132
BT - 37th International Conference on Machine Learning, ICML 2020
A2 - Daumé, Hal, III
A2 - Singh, Aarti
T2 - 37th International Conference on Machine Learning, ICML 2020
Y2 - 13 July 2020 through 18 July 2020
ER -