TY - GEN
T1 - Semantics-aware Attention Improves Neural Machine Translation
AU - Slobodkin, Aviv
AU - Choshen, Leshem
AU - Abend, Omri
N1 - Publisher Copyright: © 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - The integration of syntactic structures into Transformer machine translation has shown positive results, but to our knowledge, no work has attempted to do so with semantic structures. In this work, we propose two novel parameter-free methods for injecting semantic information into Transformers, both of which rely on semantics-aware masking of (some of) the attention heads. One method operates on the encoder, through a Scene-Aware Self-Attention (SASA) head; the other operates on the decoder, through a Scene-Aware Cross-Attention (SACrA) head. We show a consistent improvement over the vanilla Transformer and syntax-aware models for four language pairs. We further show an additional gain when using both semantic and syntactic structures in some language pairs.
UR - http://www.scopus.com/inward/record.url?scp=85139097947&partnerID=8YFLogxK
U2 - 10.18653/v1/2022.starsem-1.3
DO - 10.18653/v1/2022.starsem-1.3
M3 - Conference contribution
T3 - *SEM 2022 - 11th Joint Conference on Lexical and Computational Semantics, Proceedings of the Conference
SP - 28
EP - 43
BT - *SEM 2022 - 11th Joint Conference on Lexical and Computational Semantics, Proceedings of the Conference
A2 - Nastase, Vivi
A2 - Pavlick, Ellie
A2 - Pilehvar, Mohammad Taher
A2 - Camacho-Collados, Jose
A2 - Raganato, Alessandro
PB - Association for Computational Linguistics (ACL)
T2 - 11th Joint Conference on Lexical and Computational Semantics, *SEM 2022
Y2 - 14 July 2022 through 15 July 2022
ER -