Understand Your Problem and Get Better Results Using Exploratory Data Analysis

Last Updated on August 15, 2020 You often jump from problem to problem in applied machine learning and you need to get up to speed on a new dataset, fast. A classical and under-utilised approach that you can use to quickly build a relationship with a new data problem is Exploratory Data Analysis. In this post you will discover Exploratory Data Analysis (EDA), the techniques and tactics that you can use, and why you should be performing EDA on your next problem. […]

Evaluate Yourself As a Data Scientist

Last Updated on August 15, 2020 What skills do you need to be a data scientist? I read an interesting data-driven approach to answering this question in the book Doing Data Science: Straight Talk from the Frontline. In this post I summarize this self-assessment approach that you can use to evaluate your strengths as a data scientist and where you might fit into an amazing data science team. You can use applied machine learning practitioner as a synonym for data […]

Assessing and Comparing Classifier Performance with ROC Curves

Last Updated on March 5, 2020 The most commonly reported measure of classifier performance is accuracy: the percent of correct classifications obtained. This metric has the advantage of being easy to understand and makes comparison of the performance of different classifiers trivial, but it ignores many of the factors which should be taken into account when honestly assessing the performance of a classifier. What Is Meant By Classifier Performance? Classifier performance is more than just a count of correct classifications. […]
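As a minimal sketch of the point above, consider a classifier that always predicts the majority class on an imbalanced dataset. The labels below are made up for illustration and do not come from the post, and the helper names `accuracy` and `tpr_fpr` are hypothetical. Accuracy looks strong, while the true positive rate (one of the two quantities a ROC curve plots) shows that no positive case is ever found.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def tpr_fpr(y_true, y_pred):
    """True positive rate and false positive rate for binary 0/1 labels.

    Assumes y_true contains at least one positive and one negative,
    otherwise one of the rates is undefined.
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), fp / (fp + tn)


# Illustrative data (an assumption): 5 positives and 95 negatives.
y_true = [1] * 5 + [0] * 95

# A "classifier" that always predicts the negative (majority) class.
always_negative = [0] * len(y_true)

acc = accuracy(y_true, always_negative)
tpr, fpr = tpr_fpr(y_true, always_negative)

print(f"accuracy={acc:.2f} tpr={tpr:.2f} fpr={fpr:.2f}")
```

Here the accuracy is high only because negatives dominate the data; the true positive rate exposes that the classifier is useless for finding positives.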

Get Your Dream Job in Machine Learning by Delivering Results

Last Updated on June 7, 2016 You can rise up and take on your desire to become a machine learning practitioner and data scientist. You have to work hard, learn the skills and demonstrate that you can deliver results, but you don’t need a fancy degree or a fancy background. In this post I want to demonstrate that this is possible and even common. You will discover that top managers and CEOs are looking at results and not backgrounds […]

Lessons Learned from Building Machine Learning Systems

Last Updated on September 5, 2016 In a recent presentation at MLConf, Xavier Amatriain described 10 lessons that he has learned about building machine learning systems as the Research/Engineering Manager at Netflix. In this post you will discover these 10 lessons in a summary from his talk and slides. The 10 lessons that Xavier presents can be summarized as follows: More data vs./and Better Models You might […]

Better Naive Bayes: 12 Tips To Get The Most From The Naive Bayes Algorithm

Last Updated on August 12, 2019 Naive Bayes is a simple and powerful technique that you should be testing and using on your classification problems. It is simple to understand, gives good results, and is fast both to build a model and to make predictions. For these reasons alone you should take a closer look at the algorithm. In a recent blog post, you learned how to implement the Naive Bayes algorithm from scratch in Python. In this post you will learn tips […]

Use Random Forest: Testing 179 Classifiers on 121 Datasets

Last Updated on July 31, 2020 If you don’t know what algorithm to use on your problem, try a few. Alternatively, you could just try Random Forest and maybe a Gaussian SVM. In a recent study these two algorithms were demonstrated to be the most effective when raced against nearly 200 other algorithms averaged over more than 100 data sets. In this post we will review this study and consider some implications for testing algorithms on our own applied machine […]

Practical Machine Learning Books for the Holidays

Last Updated on August 16, 2020 O’Reilly books have a reputation for being practical, hands-on and useful, specifically the nutshell books and so-called animal books. O’Reilly have a few new books out in time for the holidays on the topic of machine learning. I don’t want to bore you with reviews; Amazon has plenty of those. In this post we take a quick look at these new machine learning books and see what might be worth reading in the holiday […]

How To Get Started In Machine Learning: A Self-Study Blueprint

Last Updated on June 7, 2016 How do you get started in machine learning, specifically Deep Learning? This question was asked recently in the machine learning subreddit. Specifically, the original poster of the question had completed the Coursera Machine Learning course but felt like they did not have enough of a background to get started in Deep Learning. I wrote a lengthy reply that I think may be helpful more generally, for other people in the same situation who are […]

How To Work Through A Problem Like A Data Scientist

Last Updated on August 15, 2020 In a 2010 post, Hilary Mason and Chris Wiggins described the OSEMN process as a taxonomy of tasks that a data scientist should feel comfortable working on. The title of the post was “A Taxonomy of Data Science” on the now-defunct dataists blog. This process has also been used as the structure of a recent book, specifically “Data Science at the Command Line: Facing the Future with Time-Tested Tools” by Jeroen Janssens published […]
