Machine Translation Weekly 53: Code-Switching Pre-training for NMT

After a short break, MT Weekly is back, and today I will talk about the paper
“CSP: Code-Switching Pre-training for Neural Machine Translation”, which will
appear at this year’s virtual EMNLP. The paper proposes a new and surprisingly
elegant way of monolingual pre-training for both supervised and unsupervised
neural machine translation.

The idea is quite simple. The model is the standard Transformer; all the magic
is in how it is trained. First, it is pre-trained on synthetic
“half-translations” created from monolingual data. It is then trained either on
parallel data in the supervised setup, or by iterative back-translation on
monolingual data in the unsupervised setup. What I just called
“half-translations” are sentences in which some of the words are replaced by
their dictionary translations (see the sketch below). The authors rather
self-confidently call this preparation of the synthetic data code switching
(hence the paper title), but in real code switching, the
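
To make the data preparation concrete, here is a minimal sketch of how such
“half-translations” could be generated from a monolingual sentence and a
bilingual dictionary. The toy dictionary, the function name, and the
replacement ratio are my own illustrative assumptions, not details from the
paper.

```python
import random

# Toy bilingual dictionary (English -> German), purely for illustration.
TOY_DICT = {
    "house": "Haus",
    "small": "klein",
    "garden": "Garten",
}


def make_half_translation(sentence, dictionary, replace_ratio=0.25, seed=None):
    """Replace a fraction of the dictionary-covered words in `sentence`
    with their dictionary translations, yielding a synthetic
    code-switched ("half-translated") sentence."""
    rng = random.Random(seed)
    tokens = sentence.split()
    # Only words that the dictionary covers can be swapped out.
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in dictionary]
    n_replace = min(len(candidates), max(1, round(len(candidates) * replace_ratio)))
    for i in rng.sample(candidates, n_replace):
        tokens[i] = dictionary[tokens[i].lower()]
    return " ".join(tokens)


if __name__ == "__main__":
    print(make_half_translation("the house has a small garden", TOY_DICT, seed=0))
    # e.g. "the Haus has a small garden"
```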

