Issue #91 – Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation

23 Jul 2020

Author: Dr. Chao-Hong Liu, Machine Translation Scientist @ Iconic

Introduction

Unsupervised Machine Translation (MT) is the technology we use to train MT engines when parallel data is not available, or at least not used directly. We have discussed some interesting approaches to unsupervised MT in several previous posts (Issues #11 and #28), along with some related topics (Issues #6, #25 and #66). Training MT engines typically requires parallel data, and these approaches focus on how to train engines with little or none of it. In this post, we review a two-step approach proposed by Pourdamghani et al. (2019), called translating “translationese”.

Translating with Translationese

The idea of the two-step approach is simple. First, a dictionary is used to turn an input sentence into “translationese”, a word-by-word pseudo-translation. Second, an MT engine is trained on parallel data whose source side has also been converted into translationese, so that it learns to turn noisy translationese into fluent output. For the dictionary, rather than using existing ones, Pourdamghani et al. (2019) decided to use automatically built dictionaries, following a dictionary-induction approach proposed in prior work.
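To make the first step concrete, here is a minimal sketch (not the authors' code) of converting a sentence into translationese via word-by-word dictionary lookup. The tiny Spanish–English dictionary and the helper name `to_translationese` are illustrative assumptions; Pourdamghani et al. (2019) induce their dictionaries automatically rather than writing them by hand.

```python
def to_translationese(sentence: str, dictionary: dict) -> str:
    """Replace each source token with its dictionary gloss.

    Tokens missing from the dictionary are copied through unchanged,
    so the output is a noisy, source-ordered pseudo-translation that
    a downstream MT engine (step 2) learns to turn into fluent text.
    """
    tokens = sentence.lower().split()
    return " ".join(dictionary.get(tok, tok) for tok in tokens)


# Hypothetical example dictionary and input, for illustration only.
es_en = {"el": "the", "gato": "cat", "come": "eats", "pescado": "fish"}
print(to_translationese("el gato come pescado", es_en))
# -> "the cat eats fish"
```

Note that unknown words (names, rare terms) simply pass through, which is one reason the second step, training an engine on translationese-to-target parallel data, is needed to produce fluent output.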