Machine Translation Weekly 46: The News GPT-3 has for Machine Translation
Back in 2013, a friend of mine enthusiastically told me how excited he was
about deep learning democratizing AI (while, by the way, saying it was not
relevant for NLP at all): there was no need for large CPU clusters, all you
needed was to buy a gaming PC, start training models, and publish
ground-breaking papers. Now, it is 2020 and there is GPT-3…
Some weeks ago, OpenAI published a pre-print about their giant language
model, which they call GPT-3. It was trained on 300 billion words, has 175
billion parameters, and is probably the biggest artificial neural network
ever trained.
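To get a feel for what 175 billion parameters means in practice, here is a rough back-of-envelope sketch. The parameter count comes from the paper; the bytes-per-parameter figures are my own illustrative assumptions, not details reported by OpenAI.

```python
# Back-of-envelope estimate of what merely storing GPT-3's weights would take.
# The 175e9 parameter count is from the paper; the precision choices below
# are illustrative assumptions.
PARAMS = 175e9

for name, bytes_per_param in [("float32", 4), ("float16", 2)]:
    size_gb = PARAMS * bytes_per_param / 1e9
    print(f"{name}: ~{size_gb:,.0f} GB just to hold the parameters")

# float32: ~700 GB, float16: ~350 GB -- far more than fits on a single
# gaming GPU, which is the point of the contrast with the 2013 advice above.
```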
This sounds cool, but there is nothing really innovative about it: they used
the standard Transformer architecture and a lot of data, more data than
anyone before, and that is it. Experiments at this scale are out of reach
for most research groups in the world, so the only thing that we