Context-Aware Self-Attention Networks

Abstract

Self-attention models have shown their flexibility in parallel computation and their effectiveness in modeling both long- and short-term dependencies. However, they calculate the dependencies between representations without considering contextual information, which has proven useful for modeling dependencies among neural representations in various natural language processing tasks. In this work, we focus on improving self-attention networks by capturing the richness of context. To maintain the simplicity and flexibility of self-attention networks, we propose to contextualize the transformations of the query and key layers, which are used to calculate the relevance between elements. Specifically, we leverage internal representations that embed both global and deep contexts, thus avoiding reliance on external resources. Experimental results on WMT14 English⇒German and WMT17 Chinese⇒English translation tasks demonstrate the effectiveness and universality of the proposed methods. Furthermore, we conduct extensive analyses to quantify how the context vectors participate in the self-attention model.
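
To make the idea concrete, below is a minimal, hypothetical sketch (PyTorch, not the authors' released code) of a single-head self-attention layer whose query and key projections are mixed with a global context vector, taken here as the mean over positions, through learned gates. The layer names, dimensions, and exact gating form are illustrative assumptions; the parameterization in the paper may differ.

# Hedged sketch: self-attention with contextualized query/key transforms.
# The global context is a mean over positions, fused into Q and K via sigmoid gates.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextAwareSelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model)   # standard query transform
        self.w_k = nn.Linear(d_model, d_model)   # standard key transform
        self.w_v = nn.Linear(d_model, d_model)   # value transform (unchanged)
        # context transforms and gates for queries and keys (assumed parameterization)
        self.w_cq = nn.Linear(d_model, d_model)
        self.w_ck = nn.Linear(d_model, d_model)
        self.gate_q = nn.Linear(2 * d_model, 1)
        self.gate_k = nn.Linear(2 * d_model, 1)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)

        # Global context: mean over the sequence, broadcast back to every position.
        c = x.mean(dim=1, keepdim=True).expand_as(x)

        # Gates deciding how much context flows into each query/key (values in (0, 1)).
        lam_q = torch.sigmoid(self.gate_q(torch.cat([q, c], dim=-1)))
        lam_k = torch.sigmoid(self.gate_k(torch.cat([k, c], dim=-1)))

        # Contextualized query/key: convex mix of the original projection and
        # a projection of the global context.
        q = (1 - lam_q) * q + lam_q * self.w_cq(c)
        k = (1 - lam_k) * k + lam_k * self.w_ck(c)

        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v

if __name__ == "__main__":
    layer = ContextAwareSelfAttention(d_model=8)
    out = layer(torch.randn(2, 5, 8))
    print(out.shape)  # torch.Size([2, 5, 8])

The same gating idea would extend to deep contexts (representations from lower layers) by replacing or augmenting the mean-pooled vector, which is how the abstract describes avoiding external resources.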

Association for the Advancement of Artificial Intelligence: AAAI Publications