Issue #98 – Unified and Multi-encoders for Context-aware Neural MT
10 Sep 2020
Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic
Introduction
Context-aware Neural MT uses context information to perform document-level translation or domain adaptation. The context of surrounding sentences allows the model to capture discourse phenomena such as anaphora and lexical cohesion, while the context of similar sentences can be used to dynamically adapt the translation to a domain. In this post, we take a look at two papers that compare uni-encoder and multi-encoder Transformer architectures for context-aware Neural MT, with a focus on document-level MT.
Uni- and Dual-encoder architectures
The following figure (after Ma et al. 2020) illustrates uni-encoder and dual-encoder architectures for context-aware translation with Transformer models.
In dual-encoder Transformer models, a second encoder is added to encode the context information, and the encoder for source sentences is conditioned on this context encoder. The self-attention layers lie within each encoder and thus cannot fully capture the interaction between the contexts and the source sentences. However, in some architectures, the decoder's attention can attend to the outputs of both encoders, recovering part of this interaction at decoding time.
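To make the dual-encoder setup more concrete, below is a minimal PyTorch sketch. It is an illustration only: the class names, hyper-parameters, and the particular way the source encoder is conditioned on the context encoding (a cross-attention sub-layer after self-attention) are assumptions of this sketch, and the exact mechanism varies across the papers discussed.

```python
import torch
import torch.nn as nn

class DualEncoderLayer(nn.Module):
    """Source-encoder layer that also attends to a separate context encoding.

    Hypothetical sketch: sizes and the post-norm residual layout are
    illustrative, not the exact setup of any one paper.
    """
    def __init__(self, d_model=512, nhead=8, d_ff=2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ctx_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, src, ctx_memory):
        # Self-attention stays within the current source sentence ...
        x = self.norm1(src + self.self_attn(src, src, src)[0])
        # ... while a separate cross-attention conditions on the context encoder,
        # so context/source interaction happens only through this sub-layer.
        x = self.norm2(x + self.ctx_attn(x, ctx_memory, ctx_memory)[0])
        return self.norm3(x + self.ffn(x))

class DualEncoderModel(nn.Module):
    def __init__(self, vocab=32000, d_model=512, nhead=8, nlayers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        # Dedicated encoder for the context sentences.
        ctx_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.ctx_encoder = nn.TransformerEncoder(ctx_layer, nlayers)
        # Source encoder layers conditioned on the context encoding.
        self.src_layers = nn.ModuleList(
            [DualEncoderLayer(d_model, nhead) for _ in range(nlayers)])
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, nlayers)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, ctx_ids, src_ids, tgt_ids):
        ctx_memory = self.ctx_encoder(self.embed(ctx_ids))
        x = self.embed(src_ids)
        for layer in self.src_layers:
            x = layer(x, ctx_memory)
        # Here the decoder attends only to the source encoding; some variants
        # also let it attend to ctx_memory. Causal mask omitted for brevity.
        h = self.decoder(self.embed(tgt_ids), x)
        return self.out(h)

# Toy usage with random token ids (batch of 2):
model = DualEncoderModel()
ctx = torch.randint(0, 32000, (2, 40))   # surrounding/context sentences
src = torch.randint(0, 32000, (2, 20))   # current source sentence
tgt = torch.randint(0, 32000, (2, 20))   # target prefix (teacher forcing)
logits = model(ctx, src, tgt)            # shape (2, 20, 32000)
```

By contrast, in the uni-encoder alternative the context and source tokens are simply concatenated and fed to a single encoder, so self-attention can span both directly.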