Highlights from Machine Translation and Multilinguality in September 2022
Here are my monthly highlights from papers on machine translation and
multilinguality.
A preprint from the Nara Institute of
Science and Technology shows that target-language-specific fully connected
layers in the Transformer decoder improve multilingual and zero-shot MT
compared to the current practice of indicating the target language with a
special token. A very similar idea appears in a preprint from Tianjin
University, but in this case, they add language-specific parameters to a
different part of the Transformer decoder, the self-attention sublayers. Of
course, the improvement comes at the cost of a higher parameter count, and, as
very often happens, the preprints do not include baselines with the same
number of parameters as
their improved models. This makes it hard to assess how much of the gain comes
from the language-specific design and how much from the added capacity alone.
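
To make the idea more concrete, here is a minimal PyTorch sketch (mine, not
from either preprint) of what a target-language-specific feed-forward sublayer
might look like; the class name, dimensions, and language codes are made up for
illustration. The Tianjin preprint applies the same principle to the
self-attention sublayers rather than the feed-forward ones.

```python
import torch
import torch.nn as nn


class LanguageSpecificFFN(nn.Module):
    """Decoder feed-forward sublayer with separate weights per target language."""

    def __init__(self, languages, d_model=512, d_ff=2048):
        super().__init__()
        # One feed-forward block per target language instead of a single shared one.
        self.ffn = nn.ModuleDict({
            lang: nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.ReLU(),
                nn.Linear(d_ff, d_model),
            )
            for lang in languages
        })
        self.norm = nn.LayerNorm(d_model)

    def forward(self, hidden, tgt_lang):
        # Standard residual connection and layer norm around the sublayer;
        # only the routing by target language differs from a vanilla decoder.
        return self.norm(hidden + self.ffn[tgt_lang](hidden))


ffn = LanguageSpecificFFN(["de", "cs", "fr"])
states = torch.randn(8, 20, 512)   # (batch, target length, d_model)
out = ffn(states, tgt_lang="cs")   # routed through the Czech-specific block
```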