Issue #98 – Unified and Multi-encoders for Context-aware Neural MT

10 Sep 2020

Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Introduction

Context-aware Neural MT uses context information to perform document-level translation or domain adaptation. Context from surrounding sentences allows the model to capture discourse phenomena, while context from similar sentences can be used to dynamically adapt the translation to a domain. In this post, we take a look at two papers that compare uni-encoder and multi-encoder Transformer architectures for context-aware Neural MT – with a focus on document-level MT.

Uni- and Dual-encoder architectures

The following figure (after Ma et al., 2020) illustrates uni-encoder and dual-encoder architectures for context-aware translation with Transformer models.

[Figure: uni-encoder and dual-encoder architectures for context-aware translation (after Ma et al., 2020)]
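On the uni-encoder side, the context and the source sentence are concatenated into a single sequence and fed to one standard encoder, so ordinary self-attention can span both. Below is a minimal sketch of this input construction; the helper name and the <sep> marker are illustrative assumptions, not taken from the papers' code.

```python
# Hypothetical helper; the <sep> marker and token lists are illustrative.
def build_unified_input(context_tokens, source_tokens, sep_token="<sep>"):
    """Concatenate context and source into one sequence for a single encoder."""
    return context_tokens + [sep_token] + source_tokens

# The previous sentence becomes part of the encoder input, so self-attention
# can relate context and source tokens directly.
tokens = build_unified_input(["he", "put", "it", "down", "."],
                             ["it", "was", "heavy", "."])
# ['he', 'put', 'it', 'down', '.', '<sep>', 'it', 'was', 'heavy', '.']
```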

In dual-encoder Transformer models, a second encoder is added to encode the context information, and the encoder for source sentences is conditioned on this context encoder (a sketch of this conditioning follows below). The self-attention layers lie within each encoder and thus cannot fully capture the interaction between the context and the source sentences. However, in some architectures, the decoder’s attention can…
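To make that conditioning concrete, here is a minimal PyTorch sketch of a source-encoder layer with an added cross-attention sub-layer over the context encoder's output. The class name, dimensions, and post-norm layout are assumptions for illustration, not the implementation from either paper.

```python
import torch
import torch.nn as nn

class ContextConditionedEncoderLayer(nn.Module):
    """Source-encoder layer with extra cross-attention over an encoded context."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # The added sub-layer: source states attend to the context encoder's output.
        self.ctx_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.ReLU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, src: torch.Tensor, ctx: torch.Tensor) -> torch.Tensor:
        # Self-attention stays within the source sentence...
        src = self.norm1(src + self.self_attn(src, src, src)[0])
        # ...while this cross-attention conditions the source representation
        # on the separately encoded context.
        src = self.norm2(src + self.ctx_attn(src, ctx, ctx)[0])
        return self.norm3(src + self.ffn(src))

# Usage: ctx would come from a separate, standard Transformer encoder run
# over the context sentences (shapes: batch x seq_len x d_model).
layer = ContextConditionedEncoderLayer()
src = torch.randn(2, 10, 512)   # source sentence states
ctx = torch.randn(2, 25, 512)   # encoded context sentences
out = layer(src, ctx)           # (2, 10, 512)
```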

To finish reading, please visit the source site.