Tuning Machine Learning Models Using the Caret R Package

Last Updated on August 22, 2019 Machine learning algorithms are parameterized so that they can be best adapted for a given problem. A difficulty is that configuring an algorithm for a given problem can be a project in and of itself. Like selecting ‘the best’ algorithm for a problem, you cannot know beforehand which algorithm parameters will be best for a problem. The best thing to do is to investigate empirically with controlled experiments. The caret R package was […]

Feature Selection with the Caret R Package

Last Updated on August 22, 2019 Selecting the right features in your data can mean the difference between mediocre performance with long training times and great performance with short training times. The caret R package provides tools to automatically report on the relevance and importance of attributes in your data and even select the most important features for you. In this post you will discover the feature selection tools in the Caret R package with standalone recipes in R. After […]

Compare Models And Select The Best Using The Caret R Package

Last Updated on December 13, 2019 The Caret R package allows you to easily construct many different model types and tune their parameters. After creating and tuning many model types, you may want to know and select the best model so that you can use it to make predictions, perhaps in an operational environment. In this post you will discover how to compare the results of multiple models using the caret R package. Kick-start your project with my new book Machine Learning […]

Discover Feature Engineering, How to Engineer Features and How to Get Good at It

Last Updated on August 15, 2020 Feature engineering is an informal topic, but one that is widely agreed to be key to success in applied machine learning. In creating this guide I went wide and deep and synthesized all of the material I could. You will discover what feature engineering is, what problem it solves, why it matters, how to engineer features, who is doing it well and where you can go to learn more and get good […]

A Data-Driven Approach to Choosing Machine Learning Algorithms

Last Updated on April 4, 2018 If You Knew Which Algorithm or Algorithm Configuration To Use, You Would Not Need To Use Machine Learning. There is no best machine learning algorithm or set of algorithm parameters. I want to cure you of this type of silver bullet mindset. I see these questions a lot, even daily: Which is the best machine learning algorithm? What is the mapping between machine learning algorithms and problems? What are the best parameters for a machine learning algorithm? There […]

How to Build an Intuition for Machine Learning Algorithms

Last Updated on December 13, 2019 Machine learning algorithms are complex. To get good at applying a given algorithm you need to study it from multiple perspectives: algorithmic, mathematical and empirical. It’s this last point I want to stress. You need to build up an intuition for how an algorithm behaves on real data. You need to work on lots of problems. In this post I want to encourage you to use small in-memory datasets when starting out and when […]

Interview: Discover the Methodology and Mindset of a Kaggle Master

Last Updated on July 5, 2019 What does it take to do well in competitive machine learning? To really dig into this question, you need to dig into the people who do well. In 2010 I participated in a Kaggle competition to predict the outcome of future chess games. It was a fascinating problem because it required you to model the rating of the players from historical games and propagate those ratings into the future to make predictions. […]

An Introduction to Feature Selection

Last Updated on August 15, 2020 Which features should you use to create a predictive model? This is a difficult question that may require deep knowledge of the problem domain. It is possible to automatically select those features in your data that are most useful or most relevant for the problem you are working on. This is a process called feature selection. In this post you will discover feature selection, the types of methods that you can use and a […]

Building a Production Machine Learning Infrastructure

Last Updated on June 7, 2016 Midwest.io was a conference in Kansas City on July 14-15, 2014. At the conference, Josh Wills gave a talk on what it takes to build production machine learning infrastructure, titled “From the lab to the factory: Building a Production Machine Learning Infrastructure”. Josh Wills is the Senior Director of Data Science at Cloudera and formerly worked on Google’s ad auction system. In this post you will discover insight into […]

16 Options To Get Started and Make Progress in Machine Learning and Data Science

Last Updated on August 16, 2020 You want to learn machine learning or data science. You might want a job or the opportunity to get a job in machine learning or data science. Alternatively, you might be a student or in a data role and looking to accelerate your learning in the area. If you think your only options are to get a PhD or to read an academic textbook, think again. This post is for you. You have a […]
