Issue #26 – Context and Copying in Neural MT
21 Feb 2019
Author: Raj Patel, Machine Translation Scientist @ Iconic
When translating from one language to another, certain words and tokens need to be copied, rather than translated, into the target sentence. This includes things like proper nouns, names, numbers, and ‘unknown’ tokens. We want these to appear in the translation exactly as they were in the original text. Neural MT systems with subword vocabularies are capable of copying or translating these (unknown) tokens. Studies suggest that a neural model also learns to copy a “copy-prone” source token even when it has learned to translate it. In this post we will try to understand the copying behaviour of Neural MT and see whether context plays any role in deciding whether to copy or translate.
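As a small illustration of why subword models can handle such tokens at all, the sketch below segments a sentence with SentencePiece; the model file path and the exact segmentation shown in the comment are hypothetical, not taken from the post.

```python
# Minimal sketch: segmenting a sentence into subwords with SentencePiece.
# "bpe.model" is a placeholder for any trained subword model.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="bpe.model")

sentence = "Raj Patel joined Iconic in 2018."
pieces = sp.encode(sentence, out_type=str)
print(pieces)
# e.g. ['▁Raj', '▁Pat', 'el', '▁joined', '▁Ic', 'onic', '▁in', '▁2018', '.']
# Even if "Iconic" never appears as a whole word in the vocabulary, its
# subword pieces do, so the decoder can reproduce (copy) it verbatim.
```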
Copying in Neural MT
In Neural MT, copying is more of a challenge than in Statistical MT, because subword vocabularies and soft attention replace the hard word alignments that SMT relied on. In the NMT literature, copying is generally handled by post-processing and/or by modifying the network architecture.
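To make the post-processing route concrete, here is a minimal sketch of one common variant, attention-based unknown-word replacement: each <unk> in the output is replaced by the source token that received the highest attention weight. The tokens, names, and attention values are purely illustrative.

```python
# Sketch of post-processing copy: replace <unk> tokens in the NMT output with
# the source token that received the highest attention weight.
from typing import List

def replace_unknowns(src_tokens: List[str],
                     hyp_tokens: List[str],
                     attention: List[List[float]]) -> List[str]:
    """attention[i][j] = weight on source token j when producing target token i."""
    output = []
    for i, tok in enumerate(hyp_tokens):
        if tok == "<unk>":
            # Copy the most-attended source token instead of the unknown symbol.
            j = max(range(len(src_tokens)), key=lambda k: attention[i][k])
            output.append(src_tokens[j])
        else:
            output.append(tok)
    return output

# Toy example: the name "Varanasi" is out of vocabulary and comes out as <unk>;
# attention points back to the source position holding it.
src = ["Er", "besuchte", "Varanasi", "."]
hyp = ["He", "visited", "<unk>", "."]
att = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.80, 0.05, 0.05],
    [0.05, 0.10, 0.80, 0.05],
    [0.02, 0.03, 0.05, 0.90],
]
print(replace_unknowns(src, hyp, att))  # ['He', 'visited', 'Varanasi', '.']
```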
Neural MT models that use a subword vocabulary to perform open-vocabulary translation can often translate or copy tokens even when the full word was never seen during training.