TY - GEN
T1 - Arabic diacritization with recurrent neural networks
AU - Belinkov, Yonatan
AU - Glass, James
N1 - Publisher Copyright: © 2015 Association for Computational Linguistics.
PY - 2015
Y1 - 2015
N2 - Arabic, Hebrew, and similar languages are typically written without diacritics, leading to ambiguity and posing a major challenge for core language processing tasks like speech recognition. Previous approaches to automatic diacritization employed a variety of machine learning techniques. However, they typically rely on existing tools like morphological analyzers and therefore cannot be easily extended to new genres and languages. We develop a recurrent neural network with long short-term memory layers for predicting diacritics in Arabic text. Our language-independent approach is trained solely from diacritized text without relying on external tools. We show experimentally that our model can rival state-of-the-art methods that have access to additional resources.
UR - http://www.scopus.com/inward/record.url?scp=84959886759&partnerID=8YFLogxK
U2 - 10.18653/v1/d15-1274
DO - 10.18653/v1/d15-1274
M3 - Conference contribution
T3 - Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing
SP - 2281
EP - 2285
BT - Conference Proceedings - EMNLP 2015
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Y2 - 17 September 2015 through 21 September 2015
ER -