The Role of Randomization to Address Confounding Variables in Machine Learning

Last Updated on July 31, 2020
A large part of applied machine learning is about running controlled experiments to discover which algorithm or algorithm configuration to use on a predictive modeling problem. One challenge is that some aspects of the problem and the algorithm, called confounding variables, cannot be controlled (held constant) and must instead be controlled for. An example is the use of randomness in a learning algorithm, such as random initialization or random choices during learning. The solution […]


Difference Between a Batch and an Epoch in a Neural Network

Last Updated on October 26, 2019
Stochastic gradient descent is a learning algorithm that has a number of hyperparameters. Two hyperparameters that often confuse beginners are the batch size and number of epochs. They are both integer values and seem to do the same thing. In this post, you will discover the difference between batches and epochs in stochastic gradient descent. After reading this post, you will know: Stochastic gradient descent is an iterative learning algorithm that uses a training […]

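As a rough illustration of the two hyperparameters named in the excerpt above, the sketch below fits a one-parameter model `y = w * x` with mini-batch gradient descent. The function name `run_sgd`, the toy data, the learning rate, and the chosen `batch_size` and `n_epochs` values are all assumptions made for this example and do not come from the post. The batch size sets how many samples feed each weight update, while the number of epochs sets how many full passes are made over the training data.

```python
import random


def run_sgd(xs, ys, batch_size, n_epochs, lr=0.05, seed=0):
    """Fit y = w * x with mini-batch gradient descent; return (w, n_updates)."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w = 0.0
    n_updates = 0
    for _ in range(n_epochs):  # one epoch = one full pass over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            # one batch = the samples used for a single weight update
            batch = indices[start:start + batch_size]
            grad = sum((w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad
            n_updates += 1
    return w, n_updates


# Hypothetical toy dataset: 20 samples generated from y = 3 * x.
xs = [i / 10 for i in range(1, 21)]
ys = [3.0 * x for x in xs]

w, n_updates = run_sgd(xs, ys, batch_size=5, n_epochs=100)
print(w, n_updates)
```

With 20 samples and a batch size of 5 there are 4 updates per epoch, so 100 epochs give 400 updates in total.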

When to Use MLP, CNN, and RNN Neural Networks

Last Updated on August 19, 2019
What neural network is appropriate for your predictive modeling problem? It can be difficult for a beginner to the field of deep learning to know what type of network to use. There are so many types of networks to choose from, and new methods are being published and discussed every day. To make things worse, most neural networks are flexible enough that they work (make a prediction) even when used with the wrong type of […]


How to Calculate McNemar’s Test to Compare Two Machine Learning Classifiers

Last Updated on August 8, 2019
The choice of a statistical hypothesis test is a challenging open problem for interpreting machine learning results. In his widely cited 1998 paper, Thomas Dietterich recommended McNemar’s test in those cases where it is expensive or impractical to train multiple copies of classifier models. This describes the current situation with deep learning models that are both very large and trained and evaluated on large datasets, often requiring days or weeks to train […]

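A minimal sketch of how McNemar’s test could be computed for two classifiers evaluated on the same test set is given below. It uses the continuity-corrected chi-squared form of the statistic, which is one common variant and not necessarily the one used in the post. The function name `mcnemar_test` and the example predictions are hypothetical. For a small number of disagreements, an exact binomial version of the test is usually preferred over this approximation.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test; returns (statistic, p_value)."""
    # b: cases classifier A gets right and classifier B gets wrong
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    # c: cases classifier A gets wrong and classifier B gets right
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree on correctness
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical predictions on 100 test cases, all with true label 1.
y_true = [1] * 100
pred_a = [1] * 15 + [0] * 5 + [1] * 70 + [0] * 10
pred_b = [0] * 15 + [1] * 5 + [1] * 70 + [0] * 10

statistic, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(statistic, p_value)
```

Only the cases where the two classifiers disagree (15 and 5 here) enter the statistic; the 80 cases where both are right or both are wrong do not affect it.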

How to Configure the Number of Layers and Nodes in a Neural Network

Last Updated on August 6, 2019
Artificial neural networks have two main hyperparameters that control the architecture or topology of the network: the number of layers and the number of nodes in each hidden layer. You must specify values for these parameters when configuring your network. The most reliable way to configure these hyperparameters for your specific predictive modeling problem is via systematic experimentation with a robust test harness. This can be a tough pill to swallow for beginners to […]


How to Code the Student’s t-Test from Scratch in Python

Last Updated on August 8, 2019
Perhaps one of the most widely used statistical hypothesis tests is the Student’s t-test. Because you may use this test yourself someday, it is important to have a deep understanding of how the test works. As a developer, this understanding is best achieved by implementing the hypothesis test yourself from scratch. In this tutorial, you will discover how to implement the Student’s t-test statistical hypothesis test from scratch in Python. After completing this […]

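To give a flavor of what a from-scratch implementation might look like, the sketch below computes the t statistic and degrees of freedom for the independent two-sample Student’s t-test with pooled variance. This is an illustrative assumption about the form of the test and is not the code from the tutorial. The function name `independent_ttest` and the sample data are hypothetical. Turning the statistic into a p-value needs the t distribution, which is left out here; the statistic can be compared against a critical value from a t table instead.

```python
import math
from statistics import mean, variance


def independent_ttest(sample1, sample2):
    """Two-sample Student's t-test (equal variances); returns (t_stat, df)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (variance() is the sample variance).
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two sample means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean(sample1) - mean(sample2)) / std_err
    return t_stat, df


# Hypothetical samples with means 3 and 4 and equal sample variance.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]

t_stat, df = independent_ttest(group_a, group_b)
print(t_stat, df)
```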

Why Initialize a Neural Network with Random Weights?

Last Updated on March 26, 2020
The weights of artificial neural networks must be initialized to small random numbers. This is an expectation of stochastic gradient descent, the stochastic optimization algorithm used to train the model. To understand this approach to problem solving, you must first understand the role of nondeterministic and randomized algorithms as well as the need for stochastic optimization algorithms to harness randomness in their search process. In this post, you will discover […]

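A small sketch of what initializing weights to small random numbers can look like in practice follows. The function name `init_weights`, the uniform range of plus or minus 0.05, and the layer sizes are assumptions chosen for illustration; the post may use a different scheme. Fixing the seed reproduces the same starting point, while a different seed gives a different one.

```python
import random


def init_weights(n_inputs, n_outputs, limit=0.05, seed=None):
    """Return an n_outputs x n_inputs weight matrix drawn from U(-limit, limit)."""
    rng = random.Random(seed)
    return [[rng.uniform(-limit, limit) for _ in range(n_inputs)]
            for _ in range(n_outputs)]


# Hypothetical layer with 3 inputs and 2 nodes.
weights_run1 = init_weights(3, 2, seed=1)
weights_run2 = init_weights(3, 2, seed=1)   # same seed: same starting point
weights_run3 = init_weights(3, 2, seed=2)   # different seed: different starting point

print(weights_run1 == weights_run2, weights_run1 == weights_run3)
```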

Statistics for Machine Learning (7-Day Mini-Course)

Last Updated on August 8, 2019
Statistics for Machine Learning Crash Course. Get on top of the statistics used in machine learning in 7 days. Statistics is a field of mathematics that is universally agreed to be a prerequisite for a deeper understanding of machine learning. Although statistics is a large field with many esoteric theories and findings, the nuts-and-bolts tools and notations taken from the field are required for machine learning practitioners. With a solid foundation of […]


11 Classical Time Series Forecasting Methods in Python (Cheat Sheet)

Last Updated on August 20, 2020
Machine learning methods can be used for classification and forecasting on time series problems. Before exploring machine learning methods for time series, it is a good idea to ensure you have exhausted classical linear time series forecasting methods. Classical time series forecasting methods may be focused on linear relationships. Nevertheless, they are sophisticated and perform well on a wide range of problems, assuming that your data is suitably prepared and the method is well […]


Taxonomy of Time Series Forecasting Problems

Last Updated on August 5, 2019
When you are presented with a new time series forecasting problem, there are many things to consider. The choice that you make directly impacts each step of the project from the design of a test harness to evaluate forecast models to the fundamental difficulty of the forecast problem that you are working on. It is possible to very quickly narrow down the options by working through a series of questions about your time series […]
