Statistics for Machine Learning (7-Day Mini-Course)

Last Updated on August 8, 2019

Statistics for Machine Learning Crash Course. Get on top of the statistics used in machine learning in 7 days. Statistics is a field of mathematics that is widely agreed to be a prerequisite for a deeper understanding of machine learning. Although statistics is a large field with many esoteric theories and findings, the nuts-and-bolts tools and notations taken from the field are required for machine learning practitioners. With a solid foundation of […]

17 Statistical Hypothesis Tests in Python (Cheat Sheet)

Last Updated on November 28, 2019

Quick-reference guide to the 17 statistical hypothesis tests that you need in applied machine learning, with sample code in Python. Although there are hundreds of statistical hypothesis tests that you could use, there is only a small subset that you may need to use in a machine learning project. In this post, you will discover a cheat sheet for the most popular statistical hypothesis tests for a machine learning project with examples using the Python […]

Arithmetic, Geometric, and Harmonic Means for Machine Learning

Last Updated on August 19, 2020

Calculating the average of a variable or a list of numbers is a common operation in machine learning. It is an operation you may use every day either directly, such as when summarizing data, or indirectly, such as a smaller step in a larger procedure when fitting a model. The average is a synonym for the mean, a number that represents the central tendency (the expected value) of a probability distribution. As such, there are multiple […]
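As a rough illustration of the three means named in the post title, the sketch below computes each one with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The helper name `three_means` and the sample values are made up for this example and are not taken from the post.

```python
from statistics import mean, geometric_mean, harmonic_mean


def three_means(values):
    """Return the arithmetic, geometric, and harmonic means of positive numbers.

    Hypothetical helper for illustration only; it is not part of the post.
    """
    return {
        "arithmetic": mean(values),           # sum of values divided by their count
        "geometric": geometric_mean(values),  # n-th root of the product of the values
        "harmonic": harmonic_mean(values),    # count divided by the sum of reciprocals
    }


# Illustrative data only (all values must be positive for the geometric mean).
result = three_means([2.0, 4.0, 8.0])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.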

A Gentle Introduction to Degrees of Freedom in Machine Learning

Last Updated on August 19, 2020

Degrees of freedom is an important concept from statistics and engineering. It is often employed to summarize the number of values used in the calculation of a statistic, such as a sample statistic or in a statistical hypothesis test. In machine learning, the degrees of freedom may refer to the number of parameters in the model, such as the number of coefficients in a linear regression model or the number of weights in a […]

Hypothesis Test for Comparing Machine Learning Algorithms

Last Updated on September 1, 2020

Machine learning models are chosen based on their mean performance, often calculated using k-fold cross-validation. The algorithm with the best mean performance is expected to be better than those algorithms with worse mean performance. But what if the difference in the mean performance is caused by a statistical fluke? The solution is to use a statistical hypothesis test to evaluate whether the difference in the mean performance between any two algorithms is real or […]
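To make the idea concrete, here is a minimal sketch of one possible test: a paired t-statistic computed on per-fold scores from two algorithms. This is an assumption for illustration, not necessarily the procedure the post recommends; scores from k-fold cross-validation are not independent, so a plain paired t-test can be optimistic. The function name `paired_t_statistic` and the accuracy values are invented for this example.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired per-fold scores.

    Hypothetical helper: both lists must hold scores from the same folds,
    in the same order.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired scores")
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(n))  # mean difference over its standard error
    return t, n - 1


# Made-up fold accuracies for two algorithms (illustration only).
algo_a = [0.81, 0.79, 0.84, 0.80, 0.83]
algo_b = [0.78, 0.80, 0.81, 0.77, 0.80]
t_stat, dof = paired_t_statistic(algo_a, algo_b)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

The statistic would then be compared against a t-distribution with the returned degrees of freedom to decide whether the observed difference in mean performance is plausibly due to chance.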
