Implementation Patterns for the Encoder-Decoder RNN Architecture with Attention
Last Updated on August 14, 2019
The encoder-decoder architecture for recurrent neural networks has proven powerful on a host of sequence-to-sequence prediction problems in natural language processing.
Attention is a mechanism that addresses a limitation of the encoder-decoder architecture on long sequences and, more generally, speeds up learning and lifts the skill of the model on sequence-to-sequence prediction problems.
In this post, you will discover patterns for implementing the encoder-decoder model with and without attention.
After reading this post, you will know:
- The direct versus the recursive implementation pattern for the encoder-decoder recurrent neural network.
- How attention fits into the direct implementation pattern for the encoder-decoder model.
- How attention can be implemented with the recursive implementation pattern for the encoder-decoder model.
Kick-start your project with my new book Long Short-Term Memory Networks With Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
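To make the direct pattern concrete before we dive in, below is a minimal sketch of an encoder-decoder model without attention, written in Keras. The layer size and the sequence lengths and feature dimension (128 units, 8 input steps, 4 output steps, 16 features) are illustrative assumptions, not values taken from a worked example in this post.

```python
# A minimal sketch of the direct encoder-decoder pattern (no attention).
# All sizes below are assumptions for illustration only.
from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

n_features = 16   # assumed size of the one-hot encoded vocabulary
n_in, n_out = 8, 4  # assumed input and output sequence lengths

model = Sequential()
# Encoder: read the entire input sequence into a fixed-length vector.
model.add(LSTM(128, input_shape=(n_in, n_features)))
# Repeat the encoding once for each output time step.
model.add(RepeatVector(n_out))
# Decoder: produce the whole output sequence in one forward pass.
model.add(LSTM(128, return_sequences=True))
model.add(TimeDistributed(Dense(n_features, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam')
```

The RepeatVector layer copies the fixed-length encoding once per output time step, which is what allows the decoder to emit the entire target sequence in a single shot; the recursive pattern and the attention variants discussed later in this post change exactly this part of the model.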