Issue #95 – Constrained Parameter Initialisation for Deep Transformers in Neural MT
20 Aug 2020
Author: Dr. Patrik Lambert, Senior Machine Translation Scientist @ Iconic

Introduction

As the Transformer model is the state of the art in Neural MT, researchers have tried to build wider (with higher-dimensional vectors) and deeper (with more layers) Transformer networks. Wider networks are more costly in terms of training and generation time, and are therefore not the best option in production environments. However, adding encoder […]
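To make the width-versus-depth cost argument above concrete, here is a minimal PyTorch sketch comparing parameter counts for a wider versus a deeper encoder stack. The specific dimensions (a 512-dimensional, 6-layer "base" configuration and its widened and deepened variants) are assumptions for illustration and are not taken from the article.

```python
# Back-of-the-envelope comparison of the "wider vs. deeper" trade-off.
# The concrete dimensions below are illustrative assumptions only.
import torch.nn as nn

def encoder(d_model: int, n_layers: int, n_heads: int = 8) -> nn.TransformerEncoder:
    """Build a plain Transformer encoder stack with the given width and depth."""
    layer = nn.TransformerEncoderLayer(
        d_model=d_model,
        nhead=n_heads,
        dim_feedforward=4 * d_model,  # common convention: FFN width is 4x the model width
    )
    return nn.TransformerEncoder(layer, num_layers=n_layers)

def n_params(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters())

# A 6-layer "base" encoder vs. a wider one vs. a deeper one.
base = encoder(d_model=512, n_layers=6)
wide = encoder(d_model=1024, n_layers=6)   # doubling the width ~quadruples the per-layer size
deep = encoder(d_model=512, n_layers=12)   # doubling the depth only ~doubles the total size

for name, model in [("base", base), ("wide", wide), ("deep", deep)]:
    print(f"{name:5s}: {n_params(model) / 1e6:.1f}M parameters")
```

Since per-layer parameters grow roughly quadratically with the model width but only linearly with the number of layers, going deeper is the cheaper way to add capacity, which is why deep rather than wide Transformers are the more attractive option for production systems.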