Characteristics of Good Visual Analytics and Data Discovery Tools

Visual Analytics and Data Discovery allow you to analyze big data sets to find insights and valuable information. This goes well beyond classical Business Intelligence (BI). See this article for more details and motivation: “Using Visual Analytics to Make Better Decisions: the Death Pill Exa…”. Let’s take a look at the characteristics that matter when choosing the right tool for your use cases. Visual Analytics Tool Comparison and Evaluation: Several tools are available on the market for Visual Analytics and Data […]

What’s That Beer Style? Ask a Neighbor, or Two

Beer is delicious, but it is not one thing. If you disagree with the former part of the previous sentence, please keep the latter in mind[1]. Think of sports, for instance. Many would agree with the blanket statement “sports are fun” but, depending on what you have in mind, two people can easily have opposite reactions to being presented with the opportunity to play ping-pong. Sports are not one thing, music is not one thing, and neither is beer. Presented with […]

R, Python or SAS: Which one should you learn first?

Python, R and SAS are the three most popular languages in data science. If you are new to the world of data science and aren’t experienced in any of these languages, it makes sense to be unsure whether to learn R, SAS or Python. Don’t fret: by the time you’re done reading this article, you will know without a doubt which language is the right one for you. Overview: R – R is the lingua franca of statistics. It is a […]

Naive Bayes Classification explained with Python code

Machine Learning is a vast area of Computer Science concerned with designing algorithms that form good models of the world around us (that is, of the data coming from the world around us). Within Machine Learning, many tasks are – or can be reformulated as – classification tasks. In classification tasks, we are trying to produce a model that captures the relationship between the input data and the class each input belongs to. This model is formed from the feature values of the input data. […]

What are the Big Guys Using?

Summary: The largest companies utilizing the most data science resources are moving rapidly toward more integrated advanced analytic platforms. The features they are demanding are evolving to promote speed, simplicity, quality, and manageability. This has some interesting implications for open source R and Python, which are widely taught in schools but significantly less necessary with these more sophisticated platforms. We continue to be dazzled, and perhaps rightly so, by the advances in deep learning and question answering machines like Watson. And […]

Open Source Deep Learning Frameworks and Visual Analytics

Deep Learning is gaining more and more traction. It focuses on one area of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist. What are Deep Learning and Artificial Neural Networks? Deep Learning is the modern buzzword for artificial neural networks, one of many concepts and algorithms in machine learning […]

Learn Python for Data Science from Scratch

Python is a multipurpose programming language that is widely used for data science, which has been termed the sexiest job of this century. Data scientists mine through large datasets to gain insight and make meaningful data-driven decisions. As a general-purpose programming language, Python is also used for web development, networking, scientific computing, etc. We will discuss a series of awesome Python libraries such as NumPy, SciPy and pandas for data manipulation and wrangling and […]

Why R is Bad for You

Summary: Someone had to say it. In my opinion, R is not the best way to learn data science and not the best way to practice it either. More and more large employers agree. Someone had to say it. I know this will be controversial and I welcome your comments, but in my opinion R is not the best way to learn data science and not the best way to practice it either. Why Should We Care What […]

Will Python Replace Java?

According to current IT programming trends, Java is more popular than other programming languages, Python included, in terms of the number of jobs, the number of existing Java developers and overall usage statistics in IT. According to the latest usage statistics posted on a popular technology survey site, Java is used by 3.0% of websites as a server-side programming language, whereas only 0.2% of websites use Python. However, all the recent reports have highlighted that the usage and popularity of Python […]

How to automatically create Baseline Estimators using scikit-learn

For any machine learning problem, say a classifier in this case, it’s always handy to quickly create a baseline classifier against which we can compare our new models. You don’t want to spend a lot of time creating these baseline classifiers; you would rather spend that time building and validating new features for your final model. In this post we will see how we can rapidly create a baseline classifier for any dataset using the scikit-learn package. […]
