Machine Translation Weekly 100: IGLUE as cool as igloo, multilingual and multimodal benchmark

This week I would like to feature a new multimodal-multilingual benchmark called IGLUE, presented in a pre-print that went out last Friday. The authors are from many places around the world: University of Copenhagen, Mila – Quebec Artificial Intelligence Institute, University of Cambridge, TU Darmstadt, New York University, and McGill University. Following the best practices from established multilingual benchmarks, the new multimodal and multilingual benchmark evaluates zero-shot cross-lingual transfer on multimodal tasks. Zero-shot cross-lingual transfer means a task-specific model […]
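The setup behind that truncated definition is perhaps easiest to see in code. Below is a minimal sketch of zero-shot cross-lingual transfer, assuming a Hugging Face multilingual encoder and an NLI-style classification task; the model name, the task, and the `predict` helper are my illustrative choices, not anything prescribed by the IGLUE paper.

```python
# Minimal sketch of zero-shot cross-lingual transfer with a multilingual encoder.
# The model name and the NLI-style task are assumptions for illustration; the point
# is only that the task head is trained on English data and applied unchanged
# to other languages.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "xlm-roberta-base"  # any pretrained multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

def predict(premise: str, hypothesis: str) -> int:
    """Classify a premise-hypothesis pair; works for any language the encoder saw."""
    batch = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**batch).logits
    return int(logits.argmax(dim=-1))

# 1) Fine-tune `model` on English task data only (standard supervised training,
#    omitted here for brevity).
# 2) Evaluate zero-shot on another language -- no target-language labels are used:
print(predict("Der Hund schläft auf dem Sofa.", "Ein Tier ruht sich aus."))
```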

Read more

Machine Translation Weekly 99: Multilingual language models can sometimes also have problems

Multilingual language models, and the technologies built on top of them, play a major role in making tools accessible that until recently were available only to speakers of major languages in the wealthier part of the planet. They make it possible (to some extent) to represent text in different languages in a unified way. Machine learning models trained in one language then also work in other languages for which we have either no training data at all or only very little. Pretrained multilingual language models also significantly improve the quality of machine translation between languages for which there is not enough […]

Read more

Machine Translation Weekly 99: Multilingual models can also be evil

In a report published on arXiv in December, DeepMind tries to categorize the major ethical and societal issues connected to large language models. The report probably does not say anything that was not known before, but I like the way they categorize the issues they talk about. Because the report mostly talks about monolingual language models, in this post I will go over some of the issues they discuss and speculate on how they are relevant for machine […]

Read more

Machine Translation Weekly 98: XGLM: GPT-3 for 30 languages

At the end of last year, Meta AI (previously Facebook AI) published a pre-print introducing a multilingual version of GPT-3 called XGLM. As its title – Few-shot Learning with Multilingual Language Models – suggests, it explores its few-shot learning capabilities. The main takeaways are: Good news: it is indeed possible to train such a model, and it works to some extent. Bad news 1: cross-lingual transfer of few-shot-learned tasks is not as good as I would expect. Bad news 2: huge […]

Read more

Machine Translation Weekly 97: Multilingual and Non-autoregressive MT at the same time

Multilingual machine translation models look very promising, especially for low-resource languages that can benefit from patterns shared with similar languages. A new preprint with authors from the University of Maryland and Google Research studies how these results transfer to non-autoregressive machine translation models. The title of the paper is Can Multilinguality benefit Non-autoregressive Machine Translation?. Spoiler: it is not as good as it might seem. The paper tries to answer two questions: First, is it better to use a multilingual […]

Read more

Machine Translation Weekly 96: On Evaluation of Non-Autoregressive MT Systems

I often review papers on non-autoregressive machine translation and tend to repeat the same things in my reviews. The papers often compare non-comparable things to show the non-autoregressive models in a better light. Apart from the usual flaws in MT evaluation, non-autoregressive papers often (with honorable exceptions) get lost in the knowledge distillation setup. In general, papers tend to motivate non-autoregressive MT by the potential speedup. Although it is an important motivation, it is not the main one for me. By […]

Read more

Machine Translation Weekly 95: Minimum Bayes Risk Decoding – the Cooler the Metric, the Cooler it gets

This week I am returning to a topic that I follow with fascination (cf. MT Weekly #20, #61, #63, and #66) without actually doing any research myself – decoding in machine learning models. The preprint I will discuss today comes from Google Research and has the title Minimum Bayes Risk Decoding with Neural Metrics of Translation Quality. It shows that Minimum Bayes Risk (MBR) decoding can outperform beam search when done properly and that there might be some serious problems […]
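Since the post is about MBR decoding itself, a minimal sketch may help: sample a set of candidate translations, score every candidate against all the others with a utility metric (a neural metric such as BLEURT in the paper, a toy unigram F1 below), and output the candidate with the highest expected utility. Everything in the snippet is an illustrative assumption, not code from the paper.

```python
# Minimal sketch of Minimum Bayes Risk (MBR) decoding over sampled candidates.
# The other candidates serve as pseudo-references when scoring each hypothesis.

def mbr_decode(candidates, utility):
    """Return the candidate with the highest average utility against all
    other candidates (a Monte Carlo estimate of expected utility)."""
    best, best_score = None, float("-inf")
    for hyp in candidates:
        score = sum(utility(hyp, ref) for ref in candidates if ref is not hyp)
        score /= max(len(candidates) - 1, 1)
        if score > best_score:
            best, best_score = hyp, score
    return best

def unigram_f1(hyp, ref):
    """Toy utility standing in for a neural metric such as BLEURT."""
    h, r = set(hyp.split()), set(ref.split())
    if not h or not r:
        return 0.0
    overlap = len(h & r)
    p, rec = overlap / len(h), overlap / len(r)
    return 2 * p * rec / (p + rec) if (p + rec) else 0.0

# Candidates would normally be sampled from the translation model:
candidates = ["the cat sat on the mat", "a cat sits on a mat", "the cat is on the mat"]
print(mbr_decode(candidates, unigram_f1))
```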

Read more

Machine Translation Weekly 94: Notes from WMT 2021

After the notes from EMNLP 2021, here is also an unsorted list of some observations from the Conference on Machine Translation. Facebook AI won in many translation directions (though by no means in all of them) in the news task with a multilingual system. At the panel discussion about MT evaluation, Hermann Ney expressed a controversial opinion: it does not matter what metric we use, the history of MT would be the same with any metric (that at least slightly correlates […]

Read more

Machine Translation Weekly 93: Notes from EMNLP 2021

Another big NLP conference is over, and here are my notes about the papers that I liked the most. My general impression was somewhat similar to what I got from ACL this year. It seems to me that the field is progressing towards some behavioral understanding of what the neural models do, which allows for some cool tricks that were hardly imaginable only a few years ago. Excellent examples are tricks with adapters or non-parametric […]

Read more

Machine Translation Weekly 92: Multilingual Machine Translation with Plug-and-Play Embeddings

Deep learning models are prone to so-called catastrophic forgetting when fine-tuned on slightly different data than they were originally trained on. They also often generalize badly when confronted with data that does not look exactly like the data they were trained on. On the other hand, there are more and more tricks for simply reusing a part of a model that was trained for something else, and it just works. I could hardly believe it when Tom Kocmi and Ondřej […]

Read more