Highlights from Machine Translation and Multilinguality in May 2024
Here are short summaries of three pre-prints that I enjoyed reading in May.

Zero-Shot Tokenizer Transfer

Folks from the University of Cambridge and the University of Edinburgh propose a nice trick for changing the vocabulary of an already trained language model. They train a hyper-network (a neural network that predicts the parameters of a different neural network) that predicts what embedding a token would have if it had been trained with the rest of the model. For each training batch, they build […]