Issue #89 – Norm-Based Curriculum Learning for Neural Machine Translation
09 Jul 2020
Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic
Introduction
Neural machine translation (NMT) models benefit from large amounts of data. However, in high-resource conditions, training these models is computationally expensive. In this post we take a look at a paper by Liu et al. (2020) that aims to improve training efficiency by introducing a curriculum learning method based on the norm of word embeddings. The results show that the resulting model is also more accurate in terms of BLEU score.
Curriculum Learning (CL)
The standard training process consists of randomly sampling sentence pairs to form mini-batches until all the training data has been sampled. This process, called an epoch, is repeated until the training algorithm converges. Each mini-batch is used to compute the loss and update the model parameters.
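As an illustration, here is a minimal sketch of that standard loop. The `update` callback is a hypothetical stand-in for whatever an NMT toolkit actually does to compute the loss on a mini-batch and apply a gradient step.

```python
import random
from typing import Callable, List, Tuple

def train(pairs: List[Tuple[str, str]],
          update: Callable[[List[Tuple[str, str]]], float],
          batch_size: int = 64,
          epochs: int = 5) -> None:
    """Standard NMT training: each epoch shuffles the corpus and consumes
    it in mini-batches; `update` is a hypothetical callback that computes
    the loss on one mini-batch and updates the model parameters."""
    for _ in range(epochs):
        random.shuffle(pairs)                   # random sampling of sentence pairs
        for i in range(0, len(pairs), batch_size):
            update(pairs[i:i + batch_size])     # loss + parameter update on one mini-batch
```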
The idea of curriculum learning is to present the training examples in different learning stages, from easier to more difficult. To do this, we need a criterion to determine whether a sentence pair is easy or difficult to learn from. Typical criteria proposed in the literature for sentence difficulty are linguistically motivated, such as sentence length and word rarity.
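The paper instead scores difficulty with the norm of word embeddings. The sketch below is only an illustration of that idea, not the authors' implementation: it assumes a hypothetical `embeddings` table mapping each source token to a vector (e.g. from a simple word-embedding model trained on the source side of the corpus), scores a sentence by the sum of its word-vector norms, and sorts the corpus from easy to difficult.

```python
import numpy as np

def sentence_difficulty(tokens: list, embeddings: dict) -> float:
    """Illustrative norm-based difficulty: the sum of the L2 norms of the
    sentence's word vectors (a larger total norm is assumed harder)."""
    return sum(float(np.linalg.norm(embeddings[tok])) for tok in tokens)

def curriculum_order(pairs: list, embeddings: dict) -> list:
    """Sort (source, target) pairs from easy to difficult, scoring each
    pair on its source sentence."""
    return sorted(pairs, key=lambda p: sentence_difficulty(p[0].split(), embeddings))
```

In a real curriculum the sorted data would not simply be consumed in order: easier examples dominate the early training stages, and harder ones are mixed in as training progresses.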