The Seductive Trap of Black-Box Machine Learning

Last Updated on April 4, 2018 For as long as I have been participating in data mining and machine learning competitions, I have thought about automating my participation. Maybe it shows that I want to solve the problem of building the tool more than I want to solve the problem at hand. When working on a dataset, I typically spend a disproportionate amount of time thinking about algorithm tuning and running tuning experiments. I am prone to performing post-competition analysis […]

Read more

Best Programming Language for Machine Learning

Last Updated on September 27, 2016 A question I get asked a lot is: What is the best programming language for machine learning? I’ve replied to this question many times, and now it’s about time to explore it further in a blog post. Ultimately, the programming language you use for machine learning should reflect your own requirements and predilections. No one can meaningfully address those concerns for you. What Languages Are Being Used Before […]

Read more

How to Layout and Manage Your Machine Learning Project

Last Updated on June 7, 2016 Project layout is critical for machine learning projects just as it is for software development projects. I think of it like language. A project layout organizes thoughts and gives you context for ideas just like knowing the names for things gives you the basis for thinking. In this post I want to highlight some considerations in the layout and management of your machine learning project. This is very much related to the goals of […]

Read more

The Data Analytics Handbook: CEOs and Managers

Last Updated on August 15, 2020 In a previous blog post we looked at the ebook of interviews with data analysts and data scientists put together by Liou, Tao and Lin. In this blog post we look at the second book in the series, titled The Data Analytics Handbook: CEOs and Managers. What are managers looking for in a Data Analyst and a Data Science position, what skills do they require and how do […]

Read more

Lessons for Machine Learning from Econometrics

Last Updated on August 15, 2020 Hal Varian is the chief economist at Google and gave a talk to the Electronic Support Group at the EECS Department at the University of California at Berkeley in November 2013. The talk was titled Machine Learning and Econometrics and focused on what lessons machine learning can take away from the field of econometrics. Hal started out by summarizing a recent paper of his titled “Big Data: New Tricks for Econometrics” (PDF) which […]

Read more

Bootstrapping Machine Learning: Book Review

Last Updated on June 7, 2016 Louis Dorard has released his book titled Bootstrapping Machine Learning. It’s a book that provides a gentle introduction to the field of machine learning targeted at developers and start-ups with a focus on prediction APIs. I just finished reading this book and I want to share some of my thoughts. If you are interested, I have already reviewed the sample Louis provides on his webpage that covers the first two chapters. Bootstrapping Machine Learning Overview […]

Read more

Machine Learning that Matters

Last Updated on September 5, 2016 While I was reading Bootstrapping Machine Learning, Louis mentioned a paper that I had to go off and read. The title of the paper is Machine Learning that Matters (PDF) by Kiri Wagstaff from JPL, and it was published in 2012. Kiri’s thesis is that the machine learning research community has lost its way. She suggests that much of machine learning is done for machine learning’s sake. She points to three key problems: Overfocus on […]

Read more

Machine Learning with Quantum Computers

Last Updated on June 17, 2019 I recently watched a Google Tech Talk with Eric Ladizinsky, who visited the Quantum AI Lab at Google to talk about his D-Wave quantum computer. The talk is called Evolving Scalable Quantum Computers and is great; I highly recommend it. I’ve had quantum computing on my mind, so when another tech talk went by titled Quantum Machine Learning, I had to jump on it. The talk is by Seth Lloyd from MIT. The talk […]

Read more

The Data Analytics Handbook: Researchers and Academics Review

Last Updated on June 7, 2016 What is the difference between a Data Analyst and a Data Scientist? This question is considered from the perspective of researchers and academics in the third instalment in the series of The Data Analytics Handbook. The first book contained 7 interviews with working analysts and data scientists. The second book contained 9 interviews with CEOs and managers. This third book in the series contains 8 interviews with academics and researchers and is called The Data Analytics Handbook: Researchers and […]

Read more

Data Cleaning: Turn Messy Data into Tidy Data

Last Updated on August 16, 2020 Data preparation is difficult because the process is not objective, or at least it does not feel that way. Questions like “what is the best form of the data to describe the problem?” are not objective. You have to think from the perspective of the problem you want to solve and try a few different representations through your pipeline. Hadley Wickham is an Adjunct Professor at Rice University and Chief Scientist at RStudio and […]

Read more