Value-aware Approximate Attention

Ankit Gupta, Jonathan Berant

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Following the success of dot-product attention in Transformers, numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length. However, all approximations thus far have ignored the contribution of the value vectors to the quality of approximation. In this work, we argue that research efforts should be directed towards approximating the true output of the attention sub-layer, which includes the value vectors. We propose a value-aware objective, and show theoretically and empirically that an optimal approximation of a value-aware objective substantially outperforms an optimal approximation that ignores values, in the context of language modeling. Moreover, we show that the choice of kernel function for computing attention similarity can substantially affect the quality of sparse approximations, where kernel functions that are less skewed are more affected by the value vectors.
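To make the abstract's central distinction concrete, below is a minimal NumPy sketch (not from the paper): it contrasts a value-oblivious error, measured on the attention matrix alone, with a value-aware error, measured on the sub-layer output that includes the value vectors. The top-k sparsification, the Frobenius norm, and all variable names are illustrative assumptions, not the paper's method.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    n, d = 128, 64  # sequence length, head dimension (illustrative)
    Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

    # True attention matrix and true sub-layer output.
    A = softmax(Q @ K.T / np.sqrt(d))
    out = A @ V

    # Toy sparse approximation: keep the top-k entries per row, renormalize.
    # This stands in for the sparse-attention schemes the abstract refers to.
    k = 16
    thresh = np.sort(A, axis=-1)[:, [-k]]
    A_hat = np.where(A >= thresh, A, 0.0)
    A_hat /= A_hat.sum(axis=-1, keepdims=True)

    # Value-oblivious error: distance between attention matrices, ignoring V.
    err_matrix = np.linalg.norm(A - A_hat)
    # Value-aware error: distance between sub-layer outputs, which include V;
    # this is the quantity the paper argues approximations should target.
    err_output = np.linalg.norm(out - A_hat @ V)
    print(f"matrix error: {err_matrix:.4f}, output error: {err_output:.4f}")

Under this view, two approximations with the same matrix error can differ sharply in output error, depending on how the dropped attention entries interact with the value vectors.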

Original language: English
Title of host publication: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 9567-9574
Number of pages: 8
ISBN (Electronic): 9781955917094
Publication status: Published - 2021
Event: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 - Virtual, Punta Cana, Dominican Republic
Duration: 7 Nov 2021 → 11 Nov 2021

Publication series

Name: EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Country/Territory: Dominican Republic
City: Virtual, Punta Cana
Period: 7/11/21 → 11/11/21

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
