Issue #12 – Character-based Neural MT
04 Oct 2018
Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic
Most flavours of Machine Translation naturally use the word as the basic unit for learning models. Early work on Neural MT that followed this approach had to limit the vocabulary for practical reasons, which created problems when dealing with out-of-vocabulary words. One approach explored to solve this problem was character-based Neural MT. With the emergence of subword approaches, which largely solve the out-of-vocabulary issue, interest in character-based models declined. However, there has been renewed interest recently, with some papers showing that character-based NMT may be a promising research avenue, especially in low-resource conditions.
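To make the out-of-vocabulary problem concrete, here is a minimal Python sketch; the toy training corpus and the unseen sentence are invented for illustration and are not taken from any of the work discussed here. A word-level vocabulary must map unseen surface forms to an <unk> token, while a character-level vocabulary built from the same two sentences covers the new words entirely.

    # Toy corpus (illustrative only): two training sentences.
    train_sentences = [
        "the cat sat on the mat",
        "the dog sat on the rug",
    ]
    # "cats" and "rugs" never occur in training.
    unseen = "the cats sat on the rugs"

    # Word-level vocabulary: unseen surface forms become <unk>.
    word_vocab = {w for s in train_sentences for w in s.split()}
    word_tokens = [w if w in word_vocab else "<unk>" for w in unseen.split()]
    print(word_tokens)   # ['the', '<unk>', 'sat', 'on', 'the', '<unk>']

    # Character-level vocabulary: a tiny inventory that covers any new
    # word composed of characters already seen in training.
    char_vocab = {c for s in train_sentences for c in s}
    char_tokens = [c if c in char_vocab else "<unk>" for c in unseen]
    print(len(word_vocab), len(char_vocab))   # 7 word types vs 14 characters
    print("<unk>" in char_tokens)             # False: no OOV at character level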
Characters and Sub-words
The results obtained by Cherry et al. (2018) are straightforward to apply in any NMT setting because, unlike most character-based models, they used the same engine as for (sub)word-based NMT, without adapting the architecture to the character-based scenario. Translating characters instead of subwords improves generalisation and simplifies the model through a dramatic reduction of the vocabulary. However, it also means dealing with much longer sequences, which poses significant modelling and computational challenges for sequence-to-sequence neural models.
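The scale of that trade-off can be sketched with a back-of-the-envelope example; the sentence below is invented, and whitespace-separated words stand in for subword units (an optimistic stand-in, since subword units are usually shorter than words), so the numbers are only indicative.

    # Invented example sentence; whitespace words stand in for subword units.
    sentence = "character based models improve generalisation"

    subword_seq = sentence.split()   # stand-in for a subword segmentation
    char_seq = list(sentence)        # character segmentation, spaces kept as tokens

    print(len(subword_seq), len(char_seq))   # 5 vs 45 time steps
    # Attention in a seq2seq model costs roughly O(n^2) in sequence length,
    # so the character model processes about 81x more attention cells here.
    print((len(char_seq) ** 2) / (len(subword_seq) ** 2))   # 81.0

Even though the per-step vocabulary is tiny, the encoder and decoder must unroll over far more time steps, which is where the modelling and computational challenges mentioned above come from.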