Machine Translation Weekly 98: XGLM: GPT-3 for 30 languages
At the end of the year, Meta AI (formerly Facebook AI) published a pre-print
introducing a multilingual version of GPT-3 called XGLM. As its title –
Few-shot Learning with Multilingual Language
Models – suggests, it explores the few-shot
learning capabilities of such models. The main takeaways are:
- Good news: It is indeed possible to train such a model, and it works to some extent.
- Bad news 1: Cross-lingual transfer of few-shot-learned tasks is not as good as I would expect.
- Bad news 2: Huge models are needed for reasonable performance.
- Ambiguous news: In the few-shot setup, it is better to machine-translate everything into English and proceed in English (see the prompting sketch below).
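To make the few-shot setup concrete, here is a minimal sketch of in-context prompting with one of the publicly released XGLM checkpoints. It assumes the Hugging Face transformers library and the facebook/xglm-564M model from the Hub; the sentiment-classification prompt is an illustrative template of mine, not the exact one from the paper (which evaluates classification by scoring candidate labels rather than free-form generation).

```python
# A minimal sketch of few-shot prompting with a released XGLM checkpoint.
# Assumes the Hugging Face `transformers` library and the publicly
# available `facebook/xglm-564M` model; the prompt format is illustrative,
# not the template used in the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/xglm-564M")
model = AutoModelForCausalLM.from_pretrained("facebook/xglm-564M")
model.eval()

# Few-shot prompt: two in-context demonstrations followed by the test input.
prompt = (
    "Review: The film was a delight from start to finish. Sentiment: positive\n"
    "Review: I walked out halfway through. Sentiment: negative\n"
    "Review: A clever script and great acting. Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=2,  # only the label token(s) are needed
        do_sample=False,   # greedy decoding for a deterministic label
    )

# Strip the prompt and print only the newly generated continuation.
generated = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```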
The model is in principle simple: It is an autoregressive language model