Machine Translation Weekly 66: Means against ends of sentences

This week, I am going to revisit the mystery of decoding in neural machine
translation one more time. It has been more than a year since Felix Stahlberg
and Bill Byrne discovered a very disturbing property of neural machine
translation models: the most probable target sentence is an empty sequence,
and it is a sort of luck that we decode good translations from the models at
all (MT Weekly 20). The paper disproved the narrative of NMT as a relatively
accurately trained model that knows well what the most probable target
sentence looks like, while we only have an approximate algorithm that can get
us a highly probable, but not the most probable, target sentence.
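
To make the finding more tangible, here is a minimal toy sketch (not the
paper's actual experiment; all numbers are made up) of why a locally
normalized model can assign a higher score to the empty output than to a full
translation: every emitted token multiplies the hypothesis score by another
probability smaller than one, so ending the sentence immediately can be the
cheapest option.

```python
import math

# Toy illustration (not the experiment from the paper): under a locally
# normalized NMT model, every emitted token multiplies the hypothesis score
# by another probability < 1, so a very short hypothesis can outscore a
# longer, correct one. All numbers below are made up.

step_probs = [0.6, 0.5, 0.55]  # model probability of each correct word, steps 1-3
p_eos_first = 0.25             # probability of </s> right at the first step
p_eos_last = 0.9               # probability of </s> after the last correct word

# Score of the empty translation: emit </s> immediately.
logp_empty = math.log(p_eos_first)

# Score of the full reference translation: all words plus the final </s>.
logp_reference = sum(math.log(p) for p in step_probs) + math.log(p_eos_last)

print(f"log P(empty)     = {logp_empty:.3f}")      # ~ -1.386
print(f"log P(reference) = {logp_reference:.3f}")  # ~ -1.907
# The empty hypothesis wins, even though the model scores every single word
# of the reference quite highly -- exactly the pathology that exact search exposes.
```

In practice, beam search usually stops at a locally good longer hypothesis, so
this modeling error tends to be masked by a convenient search error.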

Recently, two papers reacted to this finding: one suggested