Issue #40 – Consistency by Agreement in Zero-shot Neural MT
06 Jun 2019
Author: Raj Patel, Machine Translation Scientist @ Iconic
In two of our earlier posts (Issues #6 and #37), we discussed the zero-shot approach to Neural MT – learning to translate from source to target without seeing even a single example of the language pair directly. In Neural MT, zero-shot training is achieved using a multilingual architecture (Johnson et al. 2017) – a single NMT engine that can translate between multiple languages. The multilingual model is trained on several language directions at once by concatenating the parallel data of the various language pairs into one training set.
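To make the data setup concrete, here is a minimal Python sketch of how such a multilingual training set can be assembled in the style of Johnson et al. (2017): a token indicating the desired target language is prepended to each source sentence, and the tagged pairs from all supervised directions are concatenated. The corpora, language codes and token format below are illustrative assumptions, not any particular production pipeline.

```python
# Minimal sketch: building a multilingual NMT training set by prepending a
# target-language token and concatenating several language pairs.
# The corpora and the '<2xx>' token format are illustrative assumptions.

def tag_pair(src_sentences, tgt_sentences, tgt_lang):
    """Prepend a target-language token (e.g. '<2fr>') to each source sentence."""
    token = f"<2{tgt_lang}>"
    return [(f"{token} {s}", t) for s, t in zip(src_sentences, tgt_sentences)]

# Hypothetical parallel corpora for the supervised directions only.
corpora = {
    ("en", "fr"): (["How are you?"], ["Comment allez-vous ?"]),
    ("fr", "en"): (["Comment allez-vous ?"], ["How are you?"]),
    ("en", "de"): (["How are you?"], ["Wie geht es Ihnen?"]),
    ("de", "en"): (["Wie geht es Ihnen?"], ["How are you?"]),
}

# One concatenated training set for a single multilingual model.
training_data = []
for (src_lang, tgt_lang), (src, tgt) in corpora.items():
    training_data.extend(tag_pair(src, tgt, tgt_lang))

for pair in training_data:
    print(pair)
```

At inference time, prepending `<2de>` to a French input requests the fr→de direction, which never appears in the training data above – this is the zero-shot case.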
In this post, we focus on the generalisation issue of zero-shot neural MT and discuss a new training method proposed by Al-Shedivat and Parikh (2019).
Zero-shot consistency
A neural MT engine is said to be ‘zero-shot consistent’ if low error on the supervised tasks implies low error on the zero-shot tasks, i.e. the system generalises. In general, it is desirable to have a translation system that exhibits zero-shot generalisation, as access to parallel data is always limited and training is computationally expensive.
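One way to state this schematically (our paraphrase of the idea, not the exact definition used in the paper) is that the zero-shot error should be controlled by the supervised error:

```latex
% Schematic statement of zero-shot consistency (a paraphrase).
% \varepsilon_{\mathrm{sup}}: expected error on the supervised directions
% \varepsilon_{\mathrm{zs}}:  expected error on the zero-shot directions
\varepsilon_{\mathrm{zs}}(\theta) \le f\!\left(\varepsilon_{\mathrm{sup}}(\theta)\right),
\qquad \text{with } f(\varepsilon) \to 0 \text{ as } \varepsilon \to 0 .
```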
To achieve zero-shot consistency in Neural MT, Al-Shedivat and Parikh (2019) propose agreement-based training: for each parallel sentence pair in a supervised direction, the model is additionally encouraged to produce translations into a third (auxiliary) language that agree with each other, so that the zero-shot directions also receive a training signal.
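Roughly speaking, for a parallel pair (x, y) and an auxiliary language into which sentences z are translated, an agreement term of the following schematic form is added to the usual multilingual log-likelihood (our notation; see the paper for the exact objective and for how the expectations are approximated in practice):

```latex
\mathcal{L}_{\mathrm{agree}}(x, y) =
  - \mathbb{E}_{z \sim p_\theta(\cdot \mid x)}\big[\log p_\theta(z \mid y)\big]
  - \mathbb{E}_{z \sim p_\theta(\cdot \mid y)}\big[\log p_\theta(z \mid x)\big],
\qquad
\mathcal{L} = \mathcal{L}_{\mathrm{sup}} + \lambda\, \mathcal{L}_{\mathrm{agree}} .
```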