Why Aren’t My Results As Good As I Thought? You’re Probably Overfitting

Last Updated on August 15, 2020

We all know the satisfaction of running an analysis and seeing the results come back the way we want them to: 80% accuracy; 85%; 90%? The temptation is strong just to turn to the Results section of the report we’re writing, and put the numbers in. But wait: as always, it’s not that straightforward. Succumbing to this particular temptation could undermine the impact of otherwise completely valid analysis. With most machine learning algorithms it’s […]


Crash Course in Statistics for Machine Learning

Last Updated on August 15, 2020

You do not need to know statistics before you can start learning and applying machine learning. You can start today. Nevertheless, knowing some statistics can be very helpful for understanding the language used in machine learning. Knowing some statistics will eventually be required when you want to start making strong claims about your results. In this post you will discover a few key concepts from statistics that will give you the confidence you need […]


How to Become a Data Scientist

Last Updated on April 19, 2018

How do you become a data scientist? I think that really depends on where you are now and what you really want to do as a data scientist. Nevertheless, DataCamp posted an infographic recently that described 8 easy steps to becoming a data scientist. In this post I want to highlight and review DataCamp’s infographic.

[Figure: How to become a data scientist. A portion of the infographic posted on the DataCamp blog.]

What is a […]


Data Management Matters And Why You Need To Take It Seriously

Last Updated on March 5, 2020

We live in a world drowning in data. Internet tracking, stock market movement, genome sequencing technologies and their ilk all produce enormous amounts of data. Most of this data is someone else’s responsibility, generated by someone else, stored in someone else’s database, which is maintained and made available by… you guessed it… someone else. But. Whenever we carry out a machine learning project we are working with a small subset of all the […]


Understand Your Problem and Get Better Results Using Exploratory Data Analysis

Last Updated on August 15, 2020

You often jump from problem to problem in applied machine learning and you need to get up to speed on a new dataset, fast. A classical and under-utilised approach that you can use to quickly build a relationship with a new data problem is Exploratory Data Analysis. In this post you will discover Exploratory Data Analysis (EDA), the techniques and tactics that you can use and why you should be performing EDA on your next problem. […]


Evaluate Yourself As a Data Scientist

Last Updated on August 15, 2020

What skills do you need to be a data scientist? I read an interesting data-driven approach to answering this question in the book Doing Data Science: Straight Talk from the Frontline. In this post I summarize this self-assessment approach that you can use to evaluate your strengths as a data scientist and where you might fit into an amazing data science team. You can use applied machine learning practitioner as a synonym for data […]


Assessing and Comparing Classifier Performance with ROC Curves

Last Updated on March 5, 2020

The most commonly reported measure of classifier performance is accuracy: the percent of correct classifications obtained. This metric has the advantage of being easy to understand and makes comparison of the performance of different classifiers trivial, but it ignores many of the factors which should be taken into account when honestly assessing the performance of a classifier.

What Is Meant By Classifier Performance?

Classifier performance is more than just a count of correct classifications. […]


Get Your Dream Job in Machine Learning by Delivering Results

Last Updated on June 7, 2016

You can rise up and take on your desire to become a machine learning practitioner and data scientist. You have to work hard, learn the skills and demonstrate that you can deliver results, but you don’t need a fancy degree or a fancy background. In this post I want to demonstrate that this is possible and even common. You will discover that top managers and CEOs are looking at results and not backgrounds […]


Lessons Learned from Building Machine Learning Systems

Last Updated on September 5, 2016

In a recent presentation at MLConf, Xavier Amatriain described 10 lessons that he has learned about building machine learning systems as the Research/Engineering Manager at Netflix. In this post you will discover these 10 lessons in a summary from his talk and slides.

[Figure: Lessons Learned from Building Machine Learning Systems. Taken from Xavier’s presentation.]

10 Lessons Learned

The 10 lessons that Xavier presents can be summarized as follows:

More data vs./and Better Models

You might […]


Better Naive Bayes: 12 Tips To Get The Most From The Naive Bayes Algorithm

Last Updated on August 12, 2019

Naive Bayes is a simple and powerful technique that you should be testing and using on your classification problems. It is simple to understand, gives good results and is fast to build a model and make predictions. For these reasons alone you should take a closer look at the algorithm. In a recent blog post, you learned how to implement the Naive Bayes algorithm from scratch in Python. In this post you will learn tips […]
