10 Examples of How to Use Statistical Methods in a Machine Learning Project

Last Updated on August 8, 2019 Statistics and machine learning are two very closely related fields. In fact, the line between the two can be very fuzzy at times. Nevertheless, there are methods that clearly belong to the field of statistics that are not only useful, but invaluable when working on a machine learning project. It would be fair to say that statistical methods are required to effectively work through a machine learning predictive modeling project. In this post, you […]


What is Statistics (and why is it important in machine learning)?

Last Updated on August 8, 2019 Statistics is a collection of tools that you can use to get answers to important questions about data. You can use descriptive statistical methods to transform raw observations into information that you can understand and share. You can use inferential statistical methods to reason from small samples of data to whole domains. In this post, you will discover why statistics is important, both in general and for machine learning, and the types of […]


The Close Relationship Between Applied Statistics and Machine Learning

Last Updated on August 8, 2019 The machine learning practitioner has a tradition of algorithms and a pragmatic focus on results and model skill above other concerns such as model interpretability. Statisticians work on much the same type of modeling problems under the names of applied statistics and statistical learning. Coming from a mathematical background, they have more of a focus on the behavior of models and explainability of predictions. The very close relationship between the two approaches to the […]


Statistics for Evaluating Machine Learning Models

Last Updated on August 14, 2020 Tom Mitchell’s classic 1997 book “Machine Learning” provides a chapter dedicated to statistical methods for evaluating machine learning models. Statistics provides an important set of tools used at each step of a machine learning project. A practitioner cannot effectively evaluate the skill of a machine learning model without using statistical methods. Unfortunately, statistics is an area that is foreign to most developers and computer science graduates. This makes the chapter in Mitchell’s seminal machine […]


How to Generate Random Numbers in Python

Last Updated on September 4, 2020 The use of randomness is an important part of the configuration and evaluation of machine learning algorithms. From the random initialization of weights in an artificial neural network, to the splitting of data into random train and test sets, to the random shuffling of a training dataset in stochastic gradient descent, generating random numbers and harnessing randomness is a required skill. In this tutorial, you will discover how to generate and work with random […]
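As a minimal sketch of the train/test splitting mentioned above, the following uses only Python's standard `random` module. The function name `train_test_split_indices` and its parameters are hypothetical, chosen for illustration; the tutorial itself may take a different approach.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.25, seed=1):
    """Return (train_indices, test_indices) from a seeded random shuffle.

    Hypothetical helper for illustration only.
    """
    # A dedicated generator keeps the split reproducible without
    # touching the global random state.
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = int(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(20)
print(len(train_idx), len(test_idx))
```

Fixing the seed means the same split is produced on every run, which makes experiments repeatable.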


Statistics in Plain English for Machine Learning

Last Updated on August 8, 2019 There is an ocean of books on statistics; where do you start? A big problem in choosing a beginner book on statistics is that a book may suffer from one of two common problems. It may be a mathematical textbook filled with derivations, special cases, and proofs for each statistical method, with little attention to the intuition behind the method or how to use it. Or it may be a playbook for a proprietary or […]


How to Calculate Nonparametric Rank Correlation in Python

Last Updated on August 8, 2019 Correlation is a measure of the association between two variables. It is easy to calculate and interpret when both variables have a well understood Gaussian distribution. When we do not know the distribution of the variables, we must use nonparametric rank correlation methods. In this tutorial, you will discover rank correlation methods for quantifying the association between variables with a non-Gaussian distribution. After completing this tutorial, you will know: How rank correlation methods work […]
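One widely used rank correlation method is Spearman's coefficient, which is the Pearson correlation computed on the ranks of the data rather than on the raw values. The sketch below is a plain-Python illustration under that definition; the function names are hypothetical, and the tutorial may instead rely on a library implementation.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of tied values starting at i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship still has perfectly matching ranks.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman(x, y))
```

Because only the ordering of the values matters, the coefficient makes no assumption about the distribution of either variable.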


A Gentle Introduction to Effect Size Measures in Python

Last Updated on August 8, 2019 Statistical hypothesis tests report on the likelihood of the observed results given an assumption, such as no association between variables or no difference between groups. Hypothesis tests do not comment on the size of the effect if the association or difference is statistically significant. This highlights the need for standard ways of calculating and reporting a result. Effect size methods refer to a suite of statistical tools from the field of estimation statistics […]
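As an illustration of one common effect size measure for a difference between two groups, the sketch below computes Cohen's d: the difference in means divided by the pooled standard deviation. Choosing Cohen's d here is an assumption made for the example, since the excerpt does not name a specific measure, and the function name `cohens_d` is hypothetical.

```python
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5


group_a = [2, 4, 6, 8]
group_b = [1, 3, 5, 7]
print(cohens_d(group_a, group_b))
```

Unlike a p-value, the result is expressed in units of standard deviation, so it describes how large the difference is rather than whether it is detectable.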


A Gentle Introduction to Statistical Power and Power Analysis in Python

Last Updated on April 24, 2020 The statistical power of a hypothesis test is the probability of detecting an effect, if there is a true effect present to detect. Power can be calculated and reported for a completed experiment to comment on the confidence one might have in the conclusions drawn from the results of the study. It can also be used as a tool to estimate the number of observations or sample size required in order to detect an […]
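The relationship between power, effect size, and sample size can be sketched with a normal approximation to a two-sided, two-sample test with equal group sizes. This is a simplification assumed for illustration: an exact calculation for a t-test uses the noncentral t distribution and gives slightly larger sample sizes. The function names are hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    effect_size is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def required_sample_size(effect_size, power=0.8, alpha=0.05, max_n=100_000):
    """Smallest per-group sample size whose approximate power meets the target."""
    for n in range(2, max_n + 1):
        if approx_power(effect_size, n, alpha) >= power:
            return n
    raise ValueError("no sample size up to max_n reaches the requested power")


print(approx_power(0.8, 20))
print(required_sample_size(0.8))
```

The same function serves both uses described above: evaluating the power of a completed experiment, and searching for the sample size needed to reach a target power.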


All of Statistics for Machine Learning

Last Updated on August 8, 2019 A foundation in statistics is required to be effective as a machine learning practitioner. The book “All of Statistics” was written specifically to provide a foundation in probability and statistics for computer science undergraduates who may have an interest in data mining and machine learning. As such, it is often recommended to machine learning practitioners interested in expanding their understanding of statistics. In this post, you will discover the book “All […]
