Abstract
Measuring how well human listeners recognize speech under varying environmental conditions (speech intelligibility) is a challenge for theoretical, technological, and clinical approaches to speech communication. The current gold standard, human transcription, is time- and resource-intensive. Recent advances in automatic speech recognition (ASR) systems raise the possibility of automating intelligibility measurement. This study tested four state-of-the-art ASR systems with second language speech-in-noise and found that one, whisper, performed at or above human listener accuracy. However, the content of whisper's responses diverged substantially from human responses, especially at lower signal-to-noise ratios, suggesting both opportunities and limitations for ASR-based speech intelligibility modeling.
| Original language | English |
|---|---|
| Article number | 025204 |
| Journal | JASA Express Letters |
| Volume | 4 |
| Issue number | 2 |
| State | Published - 1 Feb 2024 |
All Science Journal Classification (ASJC) codes
- Acoustics and Ultrasonics
- Music
- Arts and Humanities (miscellaneous)
Article title: Automatic recognition of second language speech-in-noise