Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention
Last Updated on August 14, 2019
The encoder-decoder architecture for recurrent neural networks has proven powerful on a host of sequence-to-sequence prediction problems in natural language processing.
Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences and, more generally, speeds up learning and lifts the skill of the model on sequence-to-sequence prediction problems.
In this post, you will discover patterns for implementing the encoder-decoder model with and without attention.
After reading this post, you will know:
- The direct versus the recursive implementation pattern for the encoder-decoder recurrent neural network.
- How attention fits into the direct implementation pattern for the encoder-decoder model.
- How attention can be implemented with the recursive implementation pattern for the encoder-decoder model.
Kick-start your project with my new book Long Short-Term Memory Networks With Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
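To make the direct pattern concrete before we dive in, below is a minimal sketch of an encoder-decoder model without attention, written in Keras. The layer size and the sequence lengths and feature dimension (128 units, 8 input steps, 4 output steps, 16 features) are illustrative assumptions, not values taken from a worked example in this post.

```python
# A minimal sketch of the direct encoder-decoder pattern (no attention).
# All sizes below are assumptions for illustration only.
from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

n_features = 16   # assumed size of the one-hot encoded vocabulary
n_in, n_out = 8, 4  # assumed input and output sequence lengths

model = Sequential()
# Encoder: read the entire input sequence into a fixed-length vector.
model.add(LSTM(128, input_shape=(n_in, n_features)))
# Repeat the encoding once for each output time step.
model.add(RepeatVector(n_out))
# Decoder: produce the whole output sequence in one forward pass.
model.add(LSTM(128, return_sequences=True))
model.add(TimeDistributed(Dense(n_features, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```

The RepeatVector layer copies the fixed-length encoding once per output time step, which is what allows the decoder to emit the entire target sequence in a single shot; the recursive pattern and the attention variants discussed later in this post change exactly this part of the model.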