TY - GEN
T1 - Discriminative articulatory models for spoken term detection in low-resource conversational settings
AU - Prabhavalkar, Rohit
AU - Livescu, Karen
AU - Fosler-Lussier, Eric
AU - Keshet, Joseph
PY - 2013/10/18
Y1 - 2013/10/18
N2 - We study spoken term detection (STD) - the task of determining whether and where a given word or phrase appears in a given segment of speech - using articulatory feature-based pronunciation models. The models are motivated by the requirements of STD in low-resource settings, in which it may not be feasible to train a large-vocabulary continuous speech recognition system, as well as by the need to address pronunciation variation in conversational speech. Our STD system is trained to maximize the expected area under the receiver operating characteristic curve, often used to evaluate STD performance. In experimental evaluations on the Switchboard corpus, we find that our approach outperforms a baseline HMM-based system across a number of training set sizes, as well as a discriminative phone-based model in some settings.
KW - AUC
KW - articulatory features
KW - discriminative training
KW - spoken term detection
KW - structural SVM
UR - http://www.scopus.com/inward/record.url?scp=84890471529&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/ICASSP.2013.6639281
DO - https://doi.org/10.1109/ICASSP.2013.6639281
M3 - Conference contribution
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 8287
EP - 8291
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -