Machine Translation Weekly 92: Multilingual Machine Translation with Plug-and-Play Embeddings

Deep learning models are prone to so-called catastrophic forgetting when
finetuned on slightly different data than they were originally trained on.
They also often generalize badly when confronted with data that does not look exactly like the data they were trained on. On the other hand, there are more and more tricks for reusing parts of a model that was trained for something else, and it just works. I could hardly believe it when Tom Kocmi and Ondřej Bojar took already trained models, used them as initialization for totally unrelated languages, and it worked much better than initializing the models randomly. Recently, a similar new “recipe paper” appeared on arXiv, advising to re-train just the word embeddings and keep the other weights frozen. The title of the paper
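The mechanics of the recipe are easy to sketch. Below is a minimal PyTorch illustration of swapping in fresh embedding tables and freezing everything else; it is my own sketch rather than the authors' code, and the toy `TinyNMT` class, its attribute names, the vocabulary sizes, and the learning rate are all hypothetical.

```python
import torch
import torch.nn as nn


class TinyNMT(nn.Module):
    """Toy encoder-decoder with separate embedding tables (hypothetical, for illustration)."""

    def __init__(self, vocab_size: int, d_model: int = 128):
        super().__init__()
        self.src_embed = nn.Embedding(vocab_size, d_model)
        self.tgt_embed = nn.Embedding(vocab_size, d_model)
        # Positional encodings omitted for brevity.
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.generator = nn.Linear(d_model, vocab_size, bias=False)
        # Tie the output projection to the target embeddings.
        self.generator.weight = self.tgt_embed.weight

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        hidden = self.transformer(self.src_embed(src), self.tgt_embed(tgt))
        return self.generator(hidden)


# Pretend this model was already trained on the original language pair.
model = TinyNMT(vocab_size=32000)

# Swap in fresh embedding tables for the new languages (new vocabulary)
# and re-tie the output projection to the new target embeddings.
new_vocab_size = 16000
model.src_embed = nn.Embedding(new_vocab_size, 128)
model.tgt_embed = nn.Embedding(new_vocab_size, 128)
model.generator = nn.Linear(128, new_vocab_size, bias=False)
model.generator.weight = model.tgt_embed.weight

# Freeze everything except the embeddings (and the tied output projection).
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(("src_embed", "tgt_embed"))

print([n for n, p in model.named_parameters() if p.requires_grad])
# -> ['src_embed.weight', 'tgt_embed.weight']

# The optimizer only ever sees the embedding parameters.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

With this setup, the frozen Transformer body plays the role of a fixed, language-independent component, and only the re-initialized embedding matrices are updated for the new languages.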
