Issue #110 – Better Out of Vocabulary Translation with Bilingual Terminology Mining
03 Dec 2020
Author: Akshai Ramesh, Machine Translation Scientist @ Iconic

Introduction

A significant weakness of conventional neural machine translation (NMT) systems is their inability to correctly translate out-of-vocabulary (OOV) words. Because of memory limitations, end-to-end NMT systems tend to have relatively small vocabularies, with a single “unknown” token (usually abbreviated in MT slang as “unk”) standing in for every possible OOV word. In NMT, byte-pair encoding can […]