Joining the Transformer Encoder and Decoder Plus Masking
We have arrived at a point where we have implemented and tested the Transformer encoder and decoder separately, and we may now join the two together into a complete model. We will also see how to create the padding and look-ahead masks that suppress the input values which should not be considered in the encoder or decoder computations. Our end goal remains to apply the complete model to Natural Language Processing (NLP).
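To make the idea concrete before we begin, here is a minimal sketch of the two kinds of masks in TensorFlow. The function names `padding_mask` and `lookahead_mask`, and the convention that padded tokens have the value zero, are assumptions for illustration rather than the final implementation developed later in the tutorial.

```python
from tensorflow import cast, float32, linalg, math, ones

def padding_mask(input):
    # Mark zero-valued (padded) tokens with 1.0 so that the attention
    # layers can assign them a large negative bias before the softmax
    mask = math.equal(input, 0)
    return cast(mask, float32)

def lookahead_mask(shape):
    # Upper-triangular matrix of ones: each position may attend only
    # to itself and to earlier positions in the target sequence
    return 1 - linalg.band_part(ones((shape, shape)), -1, 0)
```

In both cases, a value of 1.0 marks a position to be suppressed, while a value of 0.0 marks a position that attention is allowed to use.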
In this tutorial, you will discover how to implement the complete Transformer model and create padding and look-ahead masks.
After completing this tutorial, you will know:
- How to create a padding mask for the encoder and decoder
- How to create a look-ahead mask for the decoder