Machine Translation Weekly 90: The Surprising Multilinguality of Large Language Models
This week, I am going to share my amazement and doubts about what could be
called the surprising multilinguality of large language models. By large
language models, I mean the really large ones that I can hardly run myself,
trained on huge, barely curated data and thus harbouring the worst societal
demons, but also having many fascinating properties. Here, I would like to
feature three papers that make me think about the properties of the models.
1. Finetuning to other languages. This paper from the University of Groningen,
published in the Findings of ACL 2021, shows a simple way of finetuning the
English GPT-2 model for other languages.
They do it in two steps: first, they create a vocabulary for the new language
and only learn the embeddings for this new vocabulary while the rest of the
model stays frozen; second, they finetune the complete model on the
new-language data. A sketch of the recipe follows below.
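To make the recipe concrete, here is a minimal sketch of what the two steps
could look like with the Hugging Face Transformers library. This is my
reconstruction, not the authors' code: the corpus file name
(`dutch_corpus.txt`) is a placeholder, the vocabulary size simply copies
GPT-2's, and the actual training loops are omitted.

```python
from transformers import AutoTokenizer, GPT2LMHeadModel

# Train a BPE vocabulary for the new language, reusing the settings of the
# original English GPT-2 tokenizer.
english_tokenizer = AutoTokenizer.from_pretrained("gpt2")

def corpus_lines():
    # "dutch_corpus.txt" is a hypothetical new-language corpus.
    with open("dutch_corpus.txt", encoding="utf-8") as f:
        for line in f:
            yield line

new_tokenizer = english_tokenizer.train_new_from_iterator(
    corpus_lines(), vocab_size=english_tokenizer.vocab_size
)

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Re-initialize the input embeddings for the new vocabulary. GPT-2 ties the
# output projection to the input embeddings, so this resets the output
# layer as well.
model.resize_token_embeddings(len(new_tokenizer))
model.transformer.wte.weight.data.normal_(mean=0.0, std=0.02)

# Step 1: freeze everything except the (tied) embeddings and train with the
# usual language-modeling objective on the new-language data.
for param in model.parameters():
    param.requires_grad = False
model.transformer.wte.weight.requires_grad = True
# ... run a standard LM training loop here ...

# Step 2: unfreeze the whole model and finetune it end-to-end.
for param in model.parameters():
    param.requires_grad = True
# ... continue training on the same data ...
```

The point of the embedding-only first step is that the new vocabulary starts
from random vectors, and letting gradients into the frozen transformer body at
that stage would destroy what the English pretraining learned.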