Attention in Long Short-Term Memory Recurrent Neural Networks

Last Updated on August 14, 2019 The Encoder-Decoder architecture is popular because it has demonstrated state-of-the-art results across a range of domains. A limitation of the architecture is that it encodes the input sequence to a fixed-length internal representation. This limits the length of input sequences that can be reasonably learned and results in worse performance for very long input sequences. In this post, you will discover the attention mechanism for recurrent neural networks that seeks to […]

Gentle Introduction to the Adam Optimization Algorithm for Deep Learning

Last Updated on August 20, 2020 The choice of optimization algorithm for your deep learning model can mean the difference between good results in minutes, hours, or days. The Adam optimization algorithm is an extension to stochastic gradient descent that has recently seen broader adoption for deep learning applications in computer vision and natural language processing. In this post, you will get a gentle introduction to the Adam optimization algorithm for use in deep learning. After reading this post, you […]

A Tour of Recurrent Neural Network Algorithms for Deep Learning

Last Updated on August 14, 2019 Recurrent neural networks, or RNNs, are a type of artificial neural network that adds additional weights to the network to create cycles in the network graph in an effort to maintain an internal state. The promise of adding state to neural networks is that they will be able to explicitly learn and exploit context in sequence prediction problems, such as problems with an order or temporal component. In this post, you are going to take […]

How to Scale Data for Long Short-Term Memory Networks in Python

Last Updated on August 5, 2019 The data for your sequence prediction problem probably needs to be scaled when training a neural network, such as a Long Short-Term Memory recurrent neural network. When a network is fit on unscaled data that has a range of values (e.g. quantities in the 10s to 100s) it is possible for large inputs to slow down the learning and convergence of your network and in some cases prevent the network from effectively learning your […]

How to Remove Trends and Seasonality with a Difference Transform in Python

Last Updated on June 23, 2020 Time series datasets may contain trends and seasonality, which may need to be removed prior to modeling. Trends can result in a varying mean over time, whereas seasonality can result in a changing variance over time, both of which define a time series as being non-stationary. Stationary datasets are those that have a stable mean and variance, and are in turn much easier to model. Differencing is a popular and widely used data transform for […]

How to One Hot Encode Sequence Data in Python

Last Updated on August 14, 2019 Machine learning algorithms cannot work with categorical data directly. Categorical data must be converted to numbers. This applies when you are working with a sequence classification type problem and plan on using deep learning methods such as Long Short-Term Memory recurrent neural networks. In this tutorial, you will discover how to convert your input or output sequence data to a one hot encoding for use in sequence classification problems with deep learning in Python. […]

What is the Difference Between Test and Validation Datasets?

Last Updated on August 14, 2020 A validation dataset is a sample of data held back from training your model that is used to give an estimate of model skill while tuning the model’s hyperparameters. The validation dataset is different from the test dataset, which is also held back from the training of the model but is instead used to give an unbiased estimate of the skill of the final tuned model when comparing or selecting between final models. There is much […]

Gentle Introduction to Models for Sequence Prediction with RNNs

Last Updated on August 25, 2019 Sequence prediction is a problem that involves using historical sequence information to predict the next value or values in the sequence. The sequence may be symbols like letters in a sentence or real values like those in a time series of prices. Sequence prediction may be easiest to understand in the context of time series forecasting as the problem is already generally understood. In this post, you will discover the standard sequence prediction models […]

5 Examples of Simple Sequence Prediction Problems for LSTMs

Last Updated on August 14, 2019 Sequence prediction is different from traditional classification and regression problems. It requires that you take the order of observations into account and that you use models like Long Short-Term Memory (LSTM) recurrent neural networks that have memory and that can learn any temporal dependence between observations. It is critical to apply LSTMs to learn how to use them on sequence prediction problems, and for that, you need a suite of well-defined problems that allow […]

A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size

Last Updated on August 19, 2019 Stochastic gradient descent is the dominant method used to train deep learning models. There are three main variants of gradient descent and it can be confusing which one to use. In this post, you will discover the one type of gradient descent you should use in general and how to configure it. After completing this post, you will know: What gradient descent is and how it works from a high level. What batch, stochastic, […]
