Highlights from Machine Translation and Multilinguality in February 2024

With a new month, here are a few papers that I noticed on arXiv in February.

Linear-time Minimum Bayes Risk Decoding with Reference Aggregation

A preprint from the University of Zurich proposes a linear-time version of
Minimum Bayes Risk (MBR) decoding for machine translation. Instead of
generating the most probable sequence under the model, this decoding algorithm
aims to generate the most typical one. In practice, this is done by sampling
dozens of candidate output sentences and selecting the one that is most
similar to the other candidates, which requires quadratically many pairwise
comparisons. Moreover, the best results are achieved with trained similarity
metrics (such as COMET), which are slow to compute. The preprint suggests a
linear-time version of the algorithm: instead of comparing all pairs of output
candidates, it aggregates the pseudo-references into a single representation
and compares each candidate against that aggregate only once.
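
The contrast between the two variants is easy to see in code. The following is
a minimal sketch, not the preprint's implementation: the similarity and
embedding functions are hypothetical placeholders, and plain cosine similarity
over averaged sentence embeddings stands in for a trained metric such as COMET.

import numpy as np

def mbr_pairwise(candidates, similarity):
    # Standard sampling-based MBR: every candidate is scored against all
    # other candidates used as pseudo-references, i.e. O(n^2) metric calls.
    scores = []
    for cand in candidates:
        scores.append(
            np.mean([similarity(cand, ref) for ref in candidates if ref is not cand])
        )
    return candidates[int(np.argmax(scores))]

def mbr_reference_aggregation(candidates, embed):
    # Linear-time variant in the spirit of reference aggregation: the
    # pseudo-reference representations are averaged into one vector, and
    # each candidate is scored against that aggregate only once. Cosine
    # similarity over sentence embeddings is a stand-in for a trained
    # metric such as COMET; `embed` is a hypothetical sentence encoder.
    embs = np.stack([embed(cand) for cand in candidates])
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    aggregate = embs.mean(axis=0)
    aggregate = aggregate / np.linalg.norm(aggregate)
    scores = embs @ aggregate  # one comparison per candidate
    return candidates[int(np.argmax(scores))]

With 100 samples, the pairwise loop needs on the order of 10,000 metric calls,
whereas the aggregated variant needs 100 embedding passes and 100 comparisons.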
