A Gentle Introduction to Bayes Theorem for Machine Learning

Last Updated on December 4, 2019 Bayes Theorem provides a principled way for calculating a conditional probability. It is a deceptively simple calculation, although it can be used to easily calculate the conditional probability of events where intuition often fails. Although it is a powerful tool in the field of probability, Bayes Theorem is also widely used in the field of machine learning. Including its use in a probability framework for fitting a model to a training dataset, referred to […]

September 29, 2020 Machine Learning

How to Develop a Naive Bayes Classifier from Scratch in Python

Last Updated on January 10, 2020 Classification is a predictive modeling problem that involves assigning a label to a given input data sample. The problem of classification predictive modeling can be framed as calculating the conditional probability of a class label given a data sample. Bayes Theorem provides a principled way for calculating this conditional probability, although in practice requires an enormous number of samples (very large-sized dataset) and is computationally expensive. Instead, the calculation of Bayes Theorem can be […]

September 29, 2020 Machine Learning

How to Implement Bayesian Optimization from Scratch in Python

Last Updated on August 22, 2020 In this tutorial, you will discover how to implement the Bayesian Optimization algorithm for complex optimization problems. Global optimization is a challenging problem of finding an input that results in the minimum or maximum cost of a given objective function. Typically, the form of the objective function is complex and intractable to analyze and is often non-convex, nonlinear, high dimension, noisy, and computationally expensive to evaluate. Bayesian Optimization provides a principled technique based on […]

September 29, 2020 Machine Learning

A Gentle Introduction to Bayesian Belief Networks

Probabilistic models can define relationships between variables and be used to calculate probabilities. For example, fully conditional models may require an enormous amount of data to cover all possible cases, and probabilities may be intractable to calculate in practice. Simplifying assumptions such as the conditional independence of all random variables can be effective, such as in the case of Naive Bayes, although it is a drastically simplifying step. An alternative is to develop a model that preserves known conditional dependence […]

September 29, 2020 Machine Learning

A Gentle Introduction to Information Entropy

Last Updated on July 13, 2020 Information theory is a subfield of mathematics concerned with transmitting data across a noisy channel. A cornerstone of information theory is the idea of quantifying how much information there is in a message. More generally, this can be used to quantify the information in an event and a random variable, called entropy, and is calculated using probability. Calculating information and entropy is a useful tool in machine learning and is used as the basis […]

September 29, 2020 Machine Learning

Information Gain and Mutual Information for Machine Learning

Last Updated on August 28, 2020 Information gain calculates the reduction in entropy or surprise from transforming a dataset in some way. It is commonly used in the construction of decision trees from a training dataset, by evaluating the information gain for each variable, and selecting the variable that maximizes the information gain, which in turn minimizes the entropy and best splits the dataset into groups for effective classification. Information gain can also be used for feature selection, by evaluating […]

September 29, 2020 Machine Learning

How to Calculate the KL Divergence for Machine Learning

Last Updated on November 1, 2019 It is often desirable to quantify the difference between probability distributions for a given random variable. This occurs frequently in machine learning, when we may be interested in calculating the difference between an actual and observed probability distribution. This can be achieved using techniques from information theory, such as the Kullback-Leibler Divergence (KL divergence), or relative entropy, and the Jensen-Shannon Divergence that provides a normalized and symmetrical version of the KL divergence. These scoring […]

September 29, 2020 Machine Learning

Naive Bayes Classifier From Scratch in Python

Last Updated on October 25, 2019 In this tutorial you are going to learn about the Naive Bayes algorithm including how it works and how to implement it from scratch in Python (without libraries). We can use probability to make predictions in machine learning. Perhaps the most widely used example is called the Naive Bayes algorithm. Not only is it straightforward to understand, but it also achieves surprisingly good results on a wide range of problems. After completing this tutorial […]

September 29, 2020 Machine Learning

A Gentle Introduction to Cross-Entropy for Machine Learning

Last Updated on December 20, 2019 Cross-entropy is commonly used in machine learning as a loss function. Cross-entropy is a measure from the field of information theory, building upon entropy and generally calculating the difference between two probability distributions. It is closely related to but is different from KL divergence that calculates the relative entropy between two probability distributions, whereas cross-entropy can be thought to calculate the total entropy between the distributions. Cross-entropy is also related to and often confused […]

September 29, 2020 Machine Learning

A Gentle Introduction to Maximum Likelihood Estimation for Machine Learning

Last Updated on November 5, 2019 Density estimation is the problem of estimating the probability distribution for a sample of observations from a problem domain. There are many techniques for solving density estimation, although a common framework used throughout the field of machine learning is maximum likelihood estimation. Maximum likelihood estimation involves defining a likelihood function for calculating the conditional probability of observing the data sample given a probability distribution and distribution parameters. This approach can be used to search […]

« 1 … 860 861 862 863 864 … 928 »