Issue #85 – Applying Terminology Constraints in Neural MT
11 Jun 2020
Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic
Introduction
Maintaining consistency of terminology translation in Neural Machine Translation (NMT) is a more challenging task than in Statistical MT (SMT). In this post, we review a method proposed by Dinu et al. (2019) to train NMT to use custom terminology.
Translation with Terminology Constraints
Applying terminology constraints to translation may appear to be an easy task. It is common practice in the (human) translation industry, often with the help of translation memory (TM) tools. The idea is that many domains already have a set of established term translations for a given language pair, and we don’t want these comparatively fixed translations to change when translating text in that domain.
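To make the idea of a terminology constraint concrete, here is a minimal sketch in Python. It is not the method of Dinu et al. (2019). It only illustrates what "keeping fixed translations" means as a check on MT output. The term base entries, the function name `find_violations`, and the English-Spanish example sentences are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical term base for one language pair and domain.
# The entries are illustrative assumptions, not taken from a real glossary.
TERM_BASE = {
    "neural network": "red neuronal",
    "training data": "datos de entrenamiento",
}


def find_violations(source, translation, term_base):
    """Return (source_term, required_target_term) pairs for which the source
    term occurs in `source` but the required target term does not occur in
    `translation`."""
    violations = []
    src = source.lower()
    tgt = translation.lower()
    for src_term, tgt_term in term_base.items():
        # Whole-word, case-insensitive match on the source side.
        if re.search(r"\b" + re.escape(src_term) + r"\b", src):
            # Plain substring match on the target side (no morphology handling).
            if tgt_term.lower() not in tgt:
                violations.append((src_term, tgt_term))
    return violations


source = "The neural network requires additional training data."
good = "La red neuronal requiere datos de entrenamiento adicionales."
bad = "La red neural requiere datos de entrenamiento adicionales."

print(find_violations(source, good, TERM_BASE))
print(find_violations(source, bad, TERM_BASE))
```

The first call reports no violations, while the second flags the "neural network" entry because the required target term is missing. Exact string matching is a deliberate simplification here: a real check would also need to handle inflected forms of the target term, which this sketch ignores.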
In SMT, the phrase table can be edited after the MT models are trained, so it is easier to maintain these constraints. In NMT, however, the models are trained directly on the parallel corpus, where the words in the aligned sentences are further “segmented” into subword units (e.g. using BPE). There is no phrase table to edit as the information is