Machine Translation Weekly 63: Maximum a Posteriori vs. Minimum Bayes Risk decoding

This week, I will have a look at the best paper from this year’s COLING, which
offers an interesting view on inference in NMT models. The paper is titled
“Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine
Translation”, and its authors are from the University of Amsterdam.

NMT models learn the conditional probability of the next word in a target
sentence given the source sentence and the previous words in the target
sentence. Using the chain rule, we can multiply those probabilities to get the
probability of the whole target sentence given the source sentence. During
training, we optimize the model so that the target sentences from the
training data get as high a probability as possible. At inference time, we
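
The chain-rule factorization described above can be illustrated with a small sketch in Python. The per-token probabilities below are made-up numbers, not the output of any real model, and the function name `sentence_log_prob` is hypothetical. The sketch only shows how per-token conditional probabilities combine into a sentence probability, and how the negative log-likelihood that training minimizes follows from it.

```python
import math


def sentence_log_prob(token_probs):
    """Chain rule in log space: sum the log of each p(y_t | x, y_<t)."""
    return sum(math.log(p) for p in token_probs)


# Hypothetical conditional probabilities of three target tokens,
# each conditioned on the source sentence and the preceding target tokens.
token_probs = [0.5, 0.8, 0.25]

log_p = sentence_log_prob(token_probs)

# Probability of the whole target sentence given the source sentence.
sentence_prob = math.exp(log_p)

# Training pushes this value down, i.e. pushes the sentence probability up.
negative_log_likelihood = -log_p
```

Summing logarithms instead of multiplying raw probabilities is a common choice because the product of many small numbers quickly underflows for long sentences.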

 

 
