Machine Translation Weekly 52: Human Parity in Machine Translation

This week I am going to have a look at a paper by my former colleagues from
Prague, “Transforming machine translation: a deep learning system reaches news
translation quality comparable to human professionals”, which was published in
Nature Communications. The paper systematically studies machine
translation quality compared to human translation quality with the main
criterion being the human judgment about the translations.

Already in 2016, Google announced almost
reaching human parity on their internal test sets. However, these results were
achieved on proprietary tests only and public evaluation campaigns at WMT did
not really confirm the results. A slightly bigger surprise was that in WMT18,
the annual competition in machine translation quality, the Czech-English system
from Charles University scored better than human translations made by a
professional translation agency. This week’s paper tries to understand this
highly suspicious result and tell if it is an artifact of how