Implementing the Transformer Decoder from Scratch in TensorFlow and Keras

There are many similarities between the Transformer encoder and decoder, such as their implementation of multi-head attention, layer normalization, and a fully connected feed-forward network as their final sub-layer. Having implemented the Transformer encoder, we will now apply that knowledge to implementing the Transformer decoder as a further step toward building the complete Transformer model. Our end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you will discover how to […]
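
As a rough preview of where the tutorial is headed, a single decoder layer can be sketched with Keras building blocks. This is a minimal sketch only, using the built-in MultiHeadAttention layer rather than the from-scratch implementation developed in the tutorial; the constructor arguments and layer names are illustrative assumptions.

from tensorflow.keras.layers import Layer, MultiHeadAttention, LayerNormalization, Dense, Dropout

class DecoderLayer(Layer):
    def __init__(self, h, d_model, d_ff, rate, **kwargs):
        super().__init__(**kwargs)
        # Masked self-attention and encoder-decoder (cross) attention
        self.self_attention = MultiHeadAttention(num_heads=h, key_dim=d_model // h)
        self.cross_attention = MultiHeadAttention(num_heads=h, key_dim=d_model // h)
        # Position-wise fully connected feed-forward network
        self.ffn_hidden = Dense(d_ff, activation="relu")
        self.ffn_out = Dense(d_model)
        self.norm1 = LayerNormalization()
        self.norm2 = LayerNormalization()
        self.norm3 = LayerNormalization()
        self.dropout = Dropout(rate)

    def call(self, x, encoder_output, lookahead_mask, padding_mask, training=False):
        # Masks follow Keras' convention: positions marked 1/True may be attended to
        attn1 = self.self_attention(x, x, attention_mask=lookahead_mask)
        x = self.norm1(x + self.dropout(attn1, training=training))
        attn2 = self.cross_attention(x, encoder_output, attention_mask=padding_mask)
        x = self.norm2(x + self.dropout(attn2, training=training))
        ff = self.ffn_out(self.ffn_hidden(x))
        return self.norm3(x + self.dropout(ff, training=training))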

Read more

Joining the Transformer Encoder and Decoder Plus Masking

We have arrived at a point where we have implemented and tested the Transformer encoder and decoder separately, and we may now join the two together into a complete model. We will also see how to create padding and look-ahead masks by which we will suppress the input values that will not be considered in the encoder or decoder computations. Our end goal remains to apply the complete model to Natural Language Processing (NLP). In this tutorial, you will discover […]
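
As a preview, the two masks might be implemented as in the sketch below, assuming that token id 0 is reserved for padding; the convention here (1 marks a position to suppress) matches adding a large negative value to the attention scores before the softmax.

import tensorflow as tf

def padding_mask(input_seq):
    # Mark positions holding the padding token (id 0) with 1.0
    mask = tf.cast(tf.math.equal(input_seq, 0), tf.float32)
    # Add broadcastable dimensions for the attention heads and queries
    return mask[:, tf.newaxis, tf.newaxis, :]

def lookahead_mask(seq_len):
    # Upper-triangular 1s: position i may not attend to any position j > i
    return 1 - tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)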

Read more

Training the Transformer Model

We have put together the complete Transformer model, and now we are ready to train it for neural machine translation. We shall use a training dataset for this purpose, which contains short English and German sentence pairs. We will also revisit the role of masking in computing the accuracy and loss metrics during the training process. In this tutorial, you will discover how to train the Transformer model for neural machine translation. After completing this tutorial, you will know: How […]
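
As a taste of the role masking plays, a padding-aware loss might look like the following minimal sketch, which assumes token id 0 is reserved for padding and that the model outputs logits:

import tensorflow as tf
from tensorflow.keras.losses import sparse_categorical_crossentropy

def loss_fcn(target, prediction):
    # Build a mask so that padding positions do not contribute to the loss
    mask = tf.cast(tf.math.not_equal(target, 0), tf.float32)
    loss = sparse_categorical_crossentropy(target, prediction, from_logits=True) * mask
    # Average only over the unmasked (real token) positions
    return tf.reduce_sum(loss) / tf.reduce_sum(mask)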

Read more

Plotting the Training and Validation Loss Curves for the Transformer Model

from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import LearningRateSchedule
from tensorflow.keras.metrics import Mean
from tensorflow import data, train, math, reduce_sum, cast, equal, argmax, float32, GradientTape, function
from keras.losses import sparse_categorical_crossentropy
from model import TransformerModel
from prepare_dataset import PrepareDataset
from […]

Read more

Inferencing the Transformer Model

We have seen how to train the Transformer model on a dataset of English and German sentence pairs and how to plot the training and validation loss curves to diagnose the model’s learning performance and decide at which epoch to run inference on the trained model. We are now ready to run inference on the trained Transformer model to translate an input sentence. In this tutorial, you will discover how to run inference on the trained Transformer model for neural […]
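
A minimal greedy-decoding sketch is shown below; the model's call signature, the start and end-of-sequence token ids, and the maximum output length are all assumptions for illustration:

import tensorflow as tf

def translate(transformer, encoder_input, start_id, eos_id, max_len=50):
    # Seed the target sequence with the <START> token and extend it
    # one predicted token at a time
    output = tf.constant([[start_id]], dtype=tf.int64)
    for _ in range(max_len):
        prediction = transformer(encoder_input, output, training=False)
        # Pick the most probable token at the last position (greedy search)
        next_id = tf.argmax(prediction[:, -1, :], axis=-1)[:, tf.newaxis]
        output = tf.concat([output, next_id], axis=-1)
        if int(next_id) == eos_id:  # stop once <EOS> is produced
            break
    return output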

Read more

A Brief Introduction to BERT

Having learned what a Transformer is and how we might train the Transformer model, we can see that it is a great tool for making a computer understand human language. However, the Transformer was originally designed as a model for translating one language into another. If we repurpose it for a different task, we would likely need to retrain the whole model from scratch. Given that the time it takes to train a Transformer model is enormous, we would like to […]

Read more

One-Dimensional Tensors in PyTorch

PyTorch is an open-source deep learning framework based on the Python language. It allows you to build, train, and deploy deep learning models, offering a lot of versatility and efficiency. PyTorch is primarily focused on tensor operations, where a tensor can be a number, a matrix, or a multi-dimensional array. In this tutorial, we will perform some basic operations on one-dimensional tensors, as they are fundamental mathematical objects and an essential part of the PyTorch library. Therefore, before going into the detail […]
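
As a quick preview of the basics, here is a minimal sketch of creating a one-dimensional tensor and applying a few common operations:

import torch

v = torch.tensor([1.0, 2.0, 3.0, 4.0])  # a 1-D (vector) tensor
print(v.dtype, v.shape)                  # torch.float32 torch.Size([4])
print(v[1], v[1:3])                      # indexing and slicing
print(v + v, v * 2)                      # element-wise arithmetic
print(torch.dot(v, v))                   # dot product of a vector with itself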

Read more

Two-Dimensional Tensors in PyTorch

Two-dimensional tensors are analogous to two-dimensional matrices. Like a two-dimensional matrix, a two-dimensional tensor has $n$ rows and $m$ columns. Let's take a gray-scale image as an example, which is a two-dimensional matrix of numeric values, commonly known as pixels. Ranging from '0' to '255', each number represents a pixel intensity value. Here, the lowest intensity number ('0') represents black regions in the image, while the highest intensity number ('255') represents white regions in […]
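
A small sketch illustrates the idea, treating a tiny made-up 'image' as a two-dimensional tensor of pixel intensities:

import torch

image = torch.tensor([[0, 128, 255],
                      [64, 192, 32]], dtype=torch.uint8)  # 2 rows, 3 columns
print(image.ndim, image.shape)   # 2 torch.Size([2, 3])
print(image[0, 2])               # pixel in row 0, column 2 -> 255 (white)
print(image.float().mean())      # average pixel intensity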

Read more

Calculating Derivatives in PyTorch

Derivatives are one of the most fundamental concepts in calculus. They describe how changes in a function's inputs affect its outputs. The objective of this article is to provide a high-level introduction to calculating derivatives in PyTorch for those who are new to the framework. PyTorch offers a convenient way to calculate derivatives for user-defined functions. While we always have to deal with backpropagation (the algorithm known to be the backbone of neural networks) in neural networks, which […]
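
As a preview, computing a derivative with PyTorch's autograd might look like this minimal sketch:

import torch

x = torch.tensor(2.0, requires_grad=True)  # track operations on x
y = x ** 2 + 3 * x                         # y = x^2 + 3x
y.backward()                               # backpropagate to compute dy/dx
print(x.grad)                              # tensor(7.) since dy/dx = 2x + 3 = 7 at x = 2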

Read more

Using Dataset Classes in PyTorch

In machine learning and deep learning problems, a lot of effort goes into preparing the data. Data is usually messy and needs to be preprocessed before it can be used for training a model. If the data is not prepared correctly, the model won't be able to generalize well. Some of the common steps required for data preprocessing include: Data normalization: This includes normalizing the data within a range of values in a dataset. Data augmentation: This includes generating new samples […]
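
As a preview, a custom Dataset can be sketched as follows; the synthetic data and the optional transform hook are assumptions for illustration:

import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    def __init__(self, n=100, transform=None):
        # Synthetic data: x in [0, 1), y = 2x plus a little noise
        self.x = torch.rand(n, 1)
        self.y = 2 * self.x + 0.1 * torch.randn(n, 1)
        self.transform = transform  # hook for normalization or augmentation

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        sample = (self.x[idx], self.y[idx])
        if self.transform:
            sample = self.transform(sample)
        return sample

loader = DataLoader(ToyDataset(), batch_size=16, shuffle=True)  # batches ready for training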

Read more