Abstract
Sign language translation (SLT) is often decomposed into video-to-gloss recognition and gloss-to-text translation, where a gloss is a sequence of transcribed spoken-language words in the order in which they are signed. We focus here on gloss-to-text translation, which we treat as a low-resource neural machine translation (NMT) problem. However, gloss-to-text translation differs from traditional low-resource NMT because gloss-text pairs often have higher lexical overlap and lower syntactic overlap than pairs of spoken languages. We exploit this lexical overlap and handle syntactic divergence by proposing two rule-based heuristics that generate pseudo-parallel gloss-text pairs from monolingual spoken-language text. By pre-training on this synthetic data, we improve translation from American Sign Language (ASL) to English and German Sign Language (DGS) to German by up to 3.14 and 2.20 BLEU, respectively.
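To make the idea of pseudo-parallel data concrete, the following is a minimal sketch of one way a rule-based text-to-gloss heuristic could look. It is not the heuristics proposed in the paper, which the abstract does not spell out. The function names, the small `FUNCTION_WORDS` list, and the `window` reordering parameter are all assumptions made for illustration: function words are dropped, the remaining words are upper-cased in the usual gloss style (high lexical overlap), and words may be shuffled within small windows to imitate syntactic divergence.

```python
import random

# Hypothetical function-word list; a real system would more likely rely on
# part-of-speech tags and lemmatization than on a fixed word list.
FUNCTION_WORDS = {"a", "an", "the", "is", "are", "was", "were",
                  "to", "of", "in", "on", "at"}


def text_to_pseudo_gloss(sentence, window=1, seed=0):
    """Turn a spoken-language sentence into a rough pseudo-gloss string.

    window=1 keeps the original word order; larger values shuffle the
    content words inside consecutive windows of that size.
    """
    rng = random.Random(seed)
    # Lower-case the tokens and strip surrounding punctuation.
    tokens = [t.strip(".,!?;:").lower() for t in sentence.split()]
    # Keep only non-empty tokens that are not function words.
    content = [t for t in tokens if t and t not in FUNCTION_WORDS]
    glosses = []
    for i in range(0, len(content), window):
        chunk = content[i:i + window]
        rng.shuffle(chunk)  # no effect when the chunk has a single word
        glosses.extend(chunk)
    return " ".join(g.upper() for g in glosses)


def make_pairs(sentences, window=1, seed=0):
    """Build (pseudo-gloss, text) pairs from monolingual sentences."""
    return [(text_to_pseudo_gloss(s, window, seed), s) for s in sentences]
```

Under these assumptions, `make_pairs(["The weather is cold in the north."])` yields one pair whose source side is a gloss-like string and whose target side is the original sentence. Pairs of this kind could then serve as synthetic pre-training data before fine-tuning on real gloss-text pairs.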
| Original language | English |
|---|---|
| Pages | 1-11 |
| Number of pages | 11 |
| State | Published - 1 Jan 2021 |
| Event | 1st International Workshop on Automatic Translation for Signed and Spoken Languages, AT4SSL 2021 - Virtual, Online, United States. Duration: 16 Aug 2021 → 20 Aug 2021 |
Conference
| Conference | 1st International Workshop on Automatic Translation for Signed and Spoken Languages, AT4SSL 2021 |
|---|---|
| Country/Territory | United States |
| City | Virtual, Online |
| Period | 16/08/21 → 20/08/21 |
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Artificial Intelligence
- Software
- Linguistics and Language