Loss Functions in TensorFlow

The loss metric is central to training neural networks. Since every machine learning model is ultimately an optimization problem, the loss is the objective function to be minimized. In neural networks, this optimization is carried out with gradient descent and backpropagation. But what are loss functions, and how do they affect your neural networks? In this post, you will learn what loss functions are, delve into some commonly used loss functions, and see how you can apply them to your neural […]
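For a concrete sense of what a loss function looks like in practice, here is a minimal sketch (not taken from the post itself) using two of TensorFlow's built-in Keras losses; the tensors are made-up stand-ins for real labels and predictions:

```python
import tensorflow as tf

# Made-up one-hot label and model prediction for a 3-class problem
y_true = tf.constant([[0.0, 1.0, 0.0]])
y_pred = tf.constant([[0.1, 0.8, 0.1]])

# Mean squared error is typical for regression targets
mse = tf.keras.losses.MeanSquaredError()
# Categorical cross-entropy is typical for multi-class classification
cce = tf.keras.losses.CategoricalCrossentropy()

print("MSE:", mse(y_true, y_pred).numpy())
print("Cross-entropy:", cce(y_true, y_pred).numpy())
```

Either loss object can be passed directly to `model.compile(loss=...)`, which is how the objective enters the gradient descent loop.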

Read more

Image Augmentation with Keras Preprocessing Layers and tf.image

When you work on a machine learning problem related to images, not only do you need to collect some images as training data, but you also need to employ augmentation to create variations in the images. This is especially true for more complex object recognition problems. There are many ways to perform image augmentation: you may use external libraries or write your own functions for that. There are also some modules in TensorFlow and Keras for augmentation. In this post, […]
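As a quick illustration of the two approaches mentioned (a hedged sketch, assuming TensorFlow 2.9 or later, where the preprocessing layers live under `tf.keras.layers`; the image tensor is a random stand-in):

```python
import tensorflow as tf

# A random stand-in for a batch of one 224x224 RGB image
image = tf.random.uniform((1, 224, 224, 3))

# Keras preprocessing layers: composable, and only active when training=True
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.2),
])
augmented = augment(image, training=True)

# tf.image: lower-level, per-call functions you can use in a tf.data pipeline
flipped = tf.image.random_flip_left_right(image)
brightened = tf.image.random_brightness(image, max_delta=0.2)
```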

Read more

Using Depthwise Separable Convolutions in TensorFlow

Looking at very large convolutional neural networks such as ResNets, VGGs, and the like, the question arises of how we can make these networks smaller, with fewer parameters, while still maintaining the same level of accuracy or even improving the model's generalization. One approach is depthwise separable convolutions, also known as separable convolutions in TensorFlow and PyTorch (not to be confused with spatially separable convolutions, which are […]
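To make the parameter savings concrete, here is a small sketch (illustrative, not from the post) comparing a standard convolution with its depthwise separable counterpart in Keras:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(64, 64, 128))

# Standard 3x3 convolution: 3*3*128*256 weights + 256 biases = 295,168 parameters
standard = tf.keras.layers.Conv2D(256, 3, padding="same")(inputs)

# Depthwise separable: a 3x3 depthwise pass (3*3*128 = 1,152 weights) followed by
# a 1x1 pointwise pass (128*256 = 32,768 weights + 256 biases) = 34,176 parameters
separable = tf.keras.layers.SeparableConv2D(256, 3, padding="same")(inputs)

tf.keras.Model(inputs, standard).summary()
tf.keras.Model(inputs, separable).summary()
```

That is roughly an 8.6x reduction in this layer's parameter count for the same output shape.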

Read more

A Bird’s Eye View of Research on Attention

Attention is a concept that is scientifically studied across multiple disciplines, including psychology, neuroscience, and, more recently, machine learning. While all disciplines may have produced their own definitions for attention, one core quality they can all agree on is that attention is a mechanism for making both biological and artificial neural systems more flexible.  In this tutorial, you will discover an overview of the research advances on attention.  After completing this tutorial, you will know: The concept of attention that […]

Read more

What Is Attention?

Attention is becoming increasingly popular in machine learning, but what makes it such an attractive concept? What is the relationship between attention applied in artificial neural networks and its biological counterpart? What components would one expect to form an attention-based system in machine learning? In this tutorial, you will discover an overview of attention and its application in machine learning. After completing this tutorial, you will know: A brief overview of how attention can manifest itself in the human brain […]

Read more

The Attention Mechanism from Scratch

The attention mechanism was introduced to improve the performance of the encoder-decoder model for machine translation. The idea behind the attention mechanism was to permit the decoder to utilize the most relevant parts of the input sequence in a flexible manner, by a weighted combination of all the encoded input vectors, with the most relevant vectors being attributed the highest weights.  In this tutorial, you will discover the attention mechanism and its implementation.  After completing this tutorial, you will know: […]
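The weighted-combination idea fits in a few lines. Here is a minimal NumPy sketch (with made-up vectors) of scoring, normalizing, and combining:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Four encoded input vectors (e.g., encoder states), each of dimension 3
encoded = np.array([[1.0, 0.0, 0.5],
                    [0.2, 1.0, 0.1],
                    [0.3, 0.2, 1.0],
                    [0.9, 0.1, 0.4]])
query = np.array([1.0, 0.2, 0.3])  # e.g., the decoder's current state

scores = encoded @ query     # one alignment score per encoded vector
weights = softmax(scores)    # normalized attention weights
context = weights @ encoded  # weighted combination of all encoded vectors
print(weights, context)
```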

Read more

A Tour of Attention-Based Architectures

As the popularity of attention in machine learning grows, so does the list of neural architectures that incorporate an attention mechanism. In this tutorial, you will discover the salient neural architectures that have been used in conjunction with attention. After completing this tutorial, you will better understand how the attention mechanism is incorporated into different neural architectures and for which purpose.  Kick-start your project with my book Building Transformer Models with Attention. It provides self-study tutorials with working code to […]

Read more

Adding a Custom Attention Layer to a Recurrent Neural Network in Keras

Deep learning networks have gained immense popularity in the past few years. The “attention mechanism” is integrated with deep learning networks to improve their performance. Adding an attention component to the network has shown significant improvement in tasks such as machine translation, image recognition, text summarization, and similar applications. This tutorial shows how to add a custom attention layer to a network built using a recurrent neural network. We’ll illustrate an end-to-end application of time series forecasting using a very […]
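For orientation before the full tutorial, here is one way such a custom layer can look (a hedged sketch; the layer name and scoring function are illustrative, not necessarily those used in the post):

```python
import tensorflow as tf

class SimpleAttention(tf.keras.layers.Layer):
    """Minimal attention over the outputs of an RNN with return_sequences=True."""

    def build(self, input_shape):
        # input_shape: (batch, timesteps, features)
        self.w = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros")

    def call(self, x):
        e = tf.tanh(tf.matmul(x, self.w) + self.b)  # (batch, timesteps, 1) scores
        a = tf.nn.softmax(e, axis=1)                # attention weights over time
        return tf.reduce_sum(x * a, axis=1)         # context: (batch, features)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20, 1)),
    tf.keras.layers.SimpleRNN(32, return_sequences=True),
    SimpleAttention(),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")
```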

Read more

The Bahdanau Attention Mechanism

Conventional encoder-decoder architectures for machine translation encoded every source sentence into a fixed-length vector, regardless of its length, from which the decoder would then generate a translation. This made it difficult for the neural network to cope with long sentences, essentially resulting in a performance bottleneck.  The Bahdanau attention was proposed to address the performance bottleneck of conventional encoder-decoder architectures, achieving significant improvements over the conventional approach.  In this tutorial, you will discover the Bahdanau attention mechanism for neural machine […]
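The heart of the Bahdanau mechanism is an additive alignment score between the decoder state and every encoder state. A rough sketch of that scoring step (dimensions and layer names are illustrative assumptions):

```python
import tensorflow as tf

encoder_states = tf.random.normal((1, 10, 16))  # 10 annotations of dimension 16
decoder_state = tf.random.normal((1, 16))       # previous decoder hidden state

# Additive scoring: score_j = v^T tanh(W1 h_j + W2 s)
W1 = tf.keras.layers.Dense(32, use_bias=False)
W2 = tf.keras.layers.Dense(32, use_bias=False)
v = tf.keras.layers.Dense(1, use_bias=False)

scores = v(tf.tanh(W1(encoder_states) + W2(decoder_state)[:, None, :]))
weights = tf.nn.softmax(scores, axis=1)                    # over source positions
context = tf.reduce_sum(weights * encoder_states, axis=1)  # (1, 16) context vector
```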

Read more

The Luong Attention Mechanism

The Luong attention sought to introduce several improvements over the Bahdanau model for neural machine translation, notably by introducing two new classes of attentional mechanisms: a global approach that attends to all source words and a local approach that only attends to a selected subset of words in predicting the target sentence.  In this tutorial, you will discover the Luong attention mechanism for neural machine translation.  After completing this tutorial, you will know: The operations performed by the Luong attention […]
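By contrast, the simplest of Luong's global scores is a plain dot product between the current decoder state and each encoder state. A brief sketch (shapes are illustrative):

```python
import tensorflow as tf

encoder_states = tf.random.normal((1, 10, 16))  # source annotations
decoder_state = tf.random.normal((1, 16))       # current target hidden state

# Global (dot) scoring: one score per source position
scores = tf.einsum("bte,be->bt", encoder_states, decoder_state)
weights = tf.nn.softmax(scores, axis=-1)
context = tf.einsum("bt,bte->be", weights, encoder_states)  # (1, 16)
```

The local variant would instead restrict the softmax to a window of source positions around a predicted alignment point.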

Read more