Issue #127 – On the Sparsity of Neural MT Models

22 Apr 2021 | Author: Dr. Jingyi Han, Machine Translation Scientist @ Iconic

Introduction: Looking at the evolution of Neural Machine Translation (NMT), from simple feed-forward approaches to the current state-of-the-art Transformer architecture, models have become increasingly complex, using ever larger numbers of parameters to fit massive amounts of data. As a consequence, over-parameterization is a common problem in NMT models, and it is […]

Read more
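
The teaser above only motivates the problem, but a concrete picture of what "sparsity" means in practice may help. Magnitude pruning, a standard technique (not necessarily the one examined in the full article), simply zeroes out the smallest-magnitude weights of an over-parameterized model. A minimal PyTorch sketch, with the function name, sparsity level, and matrix size chosen here purely for illustration:

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries of a weight tensor.

    `sparsity` is the fraction of weights to remove (0.7 removes 70%).
    Illustrative helper only; not taken from the article.
    """
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    # The k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold
    return weight * mask

# Example: prune 70% of a random 512x512 projection matrix.
w = torch.randn(512, 512)
w_sparse = magnitude_prune(w, 0.7)
print(f"nonzero fraction: {w_sparse.ne(0).float().mean():.2f}")
```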

Issue #95 – Constrained Parameter Initialisation for Deep Transformers in Neural MT

20 Aug 2020 | Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Introduction: As the Transformer model is the state of the art in Neural MT, researchers have tried to build wider (with higher-dimensional vectors) and deeper (with more layers) Transformer networks. Wider networks are more costly in terms of training and generation time, so they are not the best option in production environments. However, adding encoder […]

Read more
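
The excerpt stops before describing the initialisation scheme itself. As a generic illustration of the idea behind constrained parameter initialisation for deep Transformers (not necessarily the exact scheme from the article), one can shrink each layer's initial weights as depth grows, so that activations and gradients stay bounded through the residual stack. A minimal PyTorch sketch, with names, dimensions, and the scaling rule assumed for illustration:

```python
import math
import torch
import torch.nn as nn

def depth_scaled_init(linear: nn.Linear, layer_index: int) -> None:
    """Xavier-initialise a linear layer, then shrink it by sqrt(depth).

    Down-scaling deeper layers is one common way to keep residual
    activations (and hence gradients) from growing as layers are stacked.
    Illustrative only; the article's constrained scheme may differ.
    """
    nn.init.xavier_uniform_(linear.weight)
    with torch.no_grad():
        linear.weight.mul_(1.0 / math.sqrt(layer_index + 1))
    if linear.bias is not None:
        nn.init.zeros_(linear.bias)

# Example: initialise the feed-forward projections of a 12-layer encoder.
ffn_layers = [nn.Linear(512, 2048) for _ in range(12)]
for i, layer in enumerate(ffn_layers):
    depth_scaled_init(layer, i)
```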