Make Better Predictions with Boosting, Bagging and Blending Ensembles in Weka

Last Updated on August 22, 2019

Weka is the perfect platform for studying machine learning. It provides a graphical user interface for exploring and experimenting with machine learning algorithms on datasets, without you having to worry about the mathematics or the programming. In a previous post we looked at how to design and run an experiment running 3 algorithms on a dataset and how to analyse and report the results. We also looked at how to design and run an […]

4-Steps to Get Started in Applied Machine Learning

Last Updated on August 16, 2020

A Top-Down Strategy for Beginners to Start and Practice Machine Learning. Getting started is much easier than you think. In this post I show you the top-down approach for getting started in applied machine learning. You will discover the four steps to this approach. They should feel familiar because it’s probably the same top-down approach that you used to learn how to program. Namely, get the basics, practice a lot and dive into the […]

A Simple Intuition for Overfitting, or Why Testing on Training Data is a Bad Idea

Last Updated on August 21, 2016

When you first start out with machine learning you load a dataset and try models. You might think to yourself, why can’t I just build a model with all of the data and evaluate it on the same dataset? It seems reasonable. More data to train the model is better, right? Evaluating the model and reporting results on the same dataset will tell you how good the model is, right? Wrong. In this post […]

Template for Working through Machine Learning Problems in Weka

Last Updated on August 22, 2019

When you are getting started in Weka, you may feel overwhelmed. There are so many datasets, so many filters and so many algorithms to choose from. There is too much choice. There are too many things you could be doing. Structured process is key. I have talked about process and the need for tasks like spot checking algorithms to overcome the overwhelm and start learning […]

Biggest Mistake I Made When Starting Machine Learning, And How To Avoid It

Last Updated on August 22, 2019

When I first got started in machine learning I implemented algorithms by hand. It was really slow going. I was a terrible programmer at the time. I was trying to figure out the algorithms from books, how to use them on problems and how to write code – all at the same time. This was the biggest mistake I made when getting started. It made everything three times harder and killed my motivation. A friend […]

Feature Selection to Improve Accuracy and Decrease Training Time

Last Updated on August 16, 2020

Working on a problem, you are always looking to get the most out of the data that you have available. You want the best accuracy you can get. Typically, the biggest wins are in better understanding the problem you are solving. This is why I stress that you spend so much time up front defining your problem, analyzing the data, and preparing datasets for your models. A key part of data preparation is creating transforms […]

Project Spotlight: Stack Exchange Clustering using Mahout with Konstantin Slisenko

Last Updated on August 16, 2020

This is a project spotlight with Konstantin Slisenko, a programmer and machine learning enthusiast. Could you please introduce yourself? My name is Konstantin Slisenko, I’m from Belarus. I graduated from the Belarusian State University of Informatics and Radioelectronics. I am currently taking a master’s course. I’m a Java developer and work at the JazzTeam company. I like to learn new technologies. I’m currently interested in big data and machine learning. I like to participate […]

Market Basket Analysis with Association Rule Learning

Last Updated on August 22, 2019

The promise of Data Mining was that algorithms would crunch data and find interesting patterns that you could exploit in your business. The exemplar of this promise is market basket analysis (Wikipedia calls it affinity analysis). Given a pile of transactional records, discover interesting purchasing patterns that could be exploited in the store, such as offers and product layout. In this post you will work through a market basket analysis tutorial using association rule learning […]

Project Spotlight: Event Recommendation in Python with Artem Yankov

Last Updated on June 7, 2016

This is a project spotlight with Artem Yankov. Could you please introduce yourself? My name is Artem Yankov, I have worked as a software engineer for Badgeville for the last 3 years. I use Ruby and Scala there, although my prior background includes use of various languages such as Assembly, C/C++, Python, Clojure and JS. I love hacking on small projects and exploring different fields, for instance two almost random fields I’ve looked at were […]

Classification Accuracy is Not Enough: More Performance Measures You Can Use

Last Updated on June 20, 2019

When you build a model for a classification problem you almost always want to look at the accuracy of that model, measured as the number of correct predictions out of all predictions made. This is the classification accuracy. In a previous post, we looked at evaluating the robustness of a model for making predictions on unseen data using cross-validation and multiple cross-validation, where we used classification accuracy and average classification accuracy. Once you have a […]
