Articles About Machine Learning

The Transformer Positional Encoding Layer in Keras, Part 2

In part 1, a gentle introduction to positional encoding in transformer models, we discussed the positional encoding layer of the transformer model. We also showed how you could implement this layer and its functions yourself in Python. In this tutorial, you’ll implement the positional encoding layer in Keras and Tensorflow. You can then use this layer in a complete transformer model. After completing this tutorial, you will know: Text vectorization in Keras Embedding layer in Keras How to subclass the […]

Read more

How to Implement Scaled Dot-Product Attention from Scratch in TensorFlow and Keras

Having familiarized ourselves with the theory behind the Transformer model and its attention mechanism, we’ll start our journey of implementing a complete Transformer model by first seeing how to implement the scaled-dot product attention. The scaled dot-product attention is an integral part of the multi-head attention, which, in turn, is an important component of both the Transformer encoder and decoder. Our end goal will be to apply the complete Transformer model to Natural Language Processing (NLP). In this tutorial, you […]

Read more

How to Implement Multi-Head Attention from Scratch in TensorFlow and Keras

We have already familiarized ourselves with the theory behind the Transformer model and its attention mechanism. We have already started our journey of implementing a complete model by seeing how to implement the scaled-dot product attention. We shall now progress one step further into our journey by encapsulating the scaled-dot product attention into a multi-head attention mechanism, which is a core component. Our end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you […]

Read more

The Vision Transformer Model

With the Transformer architecture revolutionizing the implementation of attention, and achieving very promising results in the natural language processing domain, it was only a matter of time before we could see its application in the computer vision domain too. This was eventually achieved with the implementation of the Vision Transformer (ViT).  In this tutorial, you will discover the architecture of the Vision Transformer model, and its application to the task of image classification. After completing this tutorial, you will know: […]

Read more

Implementing the Transformer Encoder from Scratch in TensorFlow and Keras

Having seen how to implement the scaled dot-product attention and integrate it within the multi-head attention of the Transformer model, let’s progress one step further toward implementing a complete Transformer model by applying its encoder. Our end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you will discover how to implement the Transformer encoder from scratch in TensorFlow and Keras.  After completing this tutorial, you will know: The layers that form part of the […]

Read more

Implementing the Transformer Decoder from Scratch in TensorFlow and Keras

There are many similarities between the Transformer encoder and decoder, such as their implementation of multi-head attention, layer normalization, and a fully connected feed-forward network as their final sub-layer. Having implemented the Transformer encoder, we will now go ahead and apply our knowledge in implementing the Transformer decoder as a further step toward implementing the complete Transformer model. Your end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you will discover how to […]

Read more

Joining the Transformer Encoder and Decoder Plus Masking

We have arrived at a point where we have implemented and tested the Transformer encoder and decoder separately, and we may now join the two together into a complete model. We will also see how to create padding and look-ahead masks by which we will suppress the input values that will not be considered in the encoder or decoder computations. Our end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you will discover […]

Read more

Training the Transformer Model

We have put together the complete Transformer model, and now we are ready to train it for neural machine translation. We shall use a training dataset for this purpose, which contains short English and German sentence pairs. We will also revisit the role of masking in computing the accuracy and loss metrics during the training process.  In this tutorial, you will discover how to train the Transformer model for neural machine translation.  After completing this tutorial, you will know: How […]

Read more

Plotting the Training and Validation Loss Curves for the Transformer Model

from tensorflow.keras.optimizers import Adam from tensorflow.keras.optimizers.schedules import LearningRateSchedule from tensorflow.keras.metrics import Mean from tensorflow import data, train, math, reduce_sum, cast, equal, argmax, float32, GradientTape, function from keras.losses import sparse_categorical_crossentropy from model import TransformerModel from prepare_dataset import PrepareDataset from

Read more

Inferencing the Transformer Model

We have seen how to train the Transformer model on a dataset of English and German sentence pairs and how to plot the training and validation loss curves to diagnose the model’s learning performance and decide at which epoch to run inference on the trained model. We are now ready to run inference on the trained Transformer model to translate an input sentence. In this tutorial, you will discover how to run inference on the trained Transformer model for neural […]

Read more
1 28 29 30 31 32 226