Highlights from Machine Translation and Multilinguality in March 2024

Did Translation Models Get More Robust Without Anyone Even Noticing?
Folks from Lisbon study how robust the newest MT systems are to source-side noise. Machine translation with large models, whether translation-specific ones such as NLLB or LLMs such as Tower or GPT-3.5, turns out to be much more robust than previous generations of systems, both to synthetic noise (whose nice feature is that you can measure translation quality at controlled noise levels) and to real-world noisy data from social networks.
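To make the synthetic-noise setup concrete, here is a minimal sketch of how one might perturb source sentences at controlled noise levels and track how translation quality degrades. The `translate` function is a hypothetical stand-in for whatever MT system is being evaluated, the example sentences are made up, and sacreBLEU is just one possible quality metric; this is not the paper's actual code.

```python
import random
import sacrebleu

def add_char_noise(text: str, noise_level: float) -> str:
    """Randomly delete, duplicate, or swap characters with probability noise_level."""
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        if random.random() < noise_level:
            op = random.choice(["delete", "duplicate", "swap"])
            if op == "delete":
                i += 1
                continue
            elif op == "duplicate":
                out.append(chars[i])  # falls through and appends the char again
            elif op == "swap" and i + 1 < len(chars):
                out.append(chars[i + 1])
                out.append(chars[i])
                i += 2
                continue
        out.append(chars[i])
        i += 1
    return "".join(out)

def translate(sentences):
    # Hypothetical stand-in: plug in the MT system under test (e.g., NLLB or an LLM).
    raise NotImplementedError

sources = ["The new models handle noisy input surprisingly well."]
references = ["Die neuen Modelle kommen mit verrauschten Eingaben erstaunlich gut zurecht."]

for level in [0.0, 0.05, 0.1, 0.2]:
    noisy = [add_char_noise(s, level) for s in sources]
    hyps = translate(noisy)
    bleu = sacrebleu.corpus_bleu(hyps, [references]).score
    print(f"noise={level:.2f}  BLEU={bleu:.1f}")
```

The appeal of this kind of synthetic setup is exactly what the paragraph above mentions: because the noise level is a knob you control, you get a quality-vs-noise curve rather than a single number.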
Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge
In a recent EACL paper, folks from the University of Tokyo analyze the consistency of factual knowledge in mBERT and XLM-R. They used the mLAMA dataset (a machine-translated version of the LAMA dataset, which is basically a knowledge graph formulated as fill-in-the-blank natural-language statements).
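Conceptually, this kind of probing can be sketched as a masked-LM fill-in: render a knowledge-graph triple as a cloze statement in several languages and check whether the model predicts the same entity in each. The snippet below only illustrates that idea with Hugging Face's fill-mask pipeline and hand-written templates; it is not the paper's evaluation code, and the templates and the example fact are my own, not taken from mLAMA.

```python
from transformers import pipeline

# Mask-filling with mBERT; XLM-R works the same way with its <mask> token.
fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

# A cloze query for the triple (France, capital-of, Paris),
# phrased in two languages to check whether the predictions agree.
templates = {
    "en": "The capital of France is [MASK].",
    "de": "Die Hauptstadt von Frankreich ist [MASK].",
}

predictions = {}
for lang, prompt in templates.items():
    top = fill(prompt, top_k=1)[0]
    predictions[lang] = top["token_str"]
    print(f"{lang}: {top['token_str']} ({top['score']:.3f})")

# Consistent factual knowledge: both languages should surface the same entity.
print("consistent:", len(set(predictions.values())) == 1)
```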