The Bahdanau Attention Mechanism
Conventional encoder-decoder architectures for machine translation encoded every source sentence into a fixed-length vector, regardless of the sentence's length, from which the decoder would then generate a translation. This made it difficult for the neural network to cope with long sentences and essentially resulted in a performance bottleneck.
The Bahdanau attention mechanism was proposed to address this bottleneck, achieving significant improvements over the conventional approach.
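To make the idea concrete before diving in, here is a minimal NumPy sketch of the additive alignment model at the heart of Bahdanau attention: a small feed-forward network scores how well each encoder annotation matches the previous decoder state, a softmax turns the scores into weights, and the weighted sum yields a context vector that is recomputed at every decoding step instead of relying on a single fixed-length encoding. All dimensions and the randomly initialised weights `W_a`, `U_a`, and `v_a` below are illustrative assumptions, not trained parameters.

```python
import numpy as np

np.random.seed(42)

# Illustrative (assumed) dimensions: 4 encoder annotations of size 8,
# a decoder state of size 8, and an alignment layer of size 10
T, enc_dim, dec_dim, att_dim = 4, 8, 8, 10

h = np.random.randn(T, enc_dim)    # encoder annotations h_1..h_T
s_prev = np.random.randn(dec_dim)  # previous decoder hidden state s_{t-1}

# Parameters of the alignment model (randomly initialised for this sketch;
# in a real model they are learned jointly with the rest of the network)
W_a = np.random.randn(att_dim, dec_dim)
U_a = np.random.randn(att_dim, enc_dim)
v_a = np.random.randn(att_dim)

# Alignment scores: e_j = v_a^T tanh(W_a s_{t-1} + U_a h_j)
e = np.array([v_a @ np.tanh(W_a @ s_prev + U_a @ h_j) for h_j in h])

# Normalise the scores into attention weights with a softmax
alpha = np.exp(e - e.max())
alpha /= alpha.sum()

# Context vector: attention-weighted sum of the encoder annotations
c = alpha @ h  # shape: (enc_dim,)

print("attention weights:", np.round(alpha, 3))
print("context vector shape:", c.shape)
```

Because the weights `alpha` are recomputed for each decoder step, the decoder can focus on different parts of the source sentence as it generates each target word, which is exactly what lets the approach scale to long sentences.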
In this tutorial, you will discover the Bahdanau attention mechanism for neural machine translation.
After completing this tutorial, you will know:
- Where the Bahdanau attention mechanism derives its name and the challenge it addresses
- The role of the different components that form part of the Bahdanau attention mechanism