An indispensable Python: Data sourcing to Data science

The data analysis ecosystem has grown all the way from SQL to NoSQL and from Excel analysis to visualization. Today, we lack the resources to process ALL (you'd better understand what I mean by ALL) kinds of data coming into the enterprise. Data goes through profiling, formatting, munging or cleansing, pruning, and transformation steps on its way to analytics and predictive modeling. Interestingly, no single tool has proved to be an effective solution for running all these operations { Don’t forget the […]

Random Forests Algorithm

One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random Forests. The Random Forests algorithm is one of the best among classification algorithms – able to classify large amounts of data with accuracy. Random Forests are an ensemble learning method (also thought of as a form of nearest neighbor predictor) for classification and regression that constructs a number of decision trees at training time and outputs the class that is […]
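
The ensemble idea in the excerpt above (many trees built at training time, with the class chosen by vote) can be sketched in plain Python. This is only a toy illustration, not the article's implementation: it assumes one-split trees ("stumps") instead of full decision trees, and the names `fit_stump`, `fit_forest`, `predict` and the sample data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a one-split tree on a single feature.

    Tries every observed value as a threshold and keeps the one that
    misclassifies the fewest training rows.
    """
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls on the right side, reuse the left label.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))       # pick one feature at random
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Return the class that receives the most votes across all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical, well-separated two-class data.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]
labels = ["a", "a", "a", "b", "b", "b"]
forest = fit_forest(rows, labels)
print(predict(forest, (0.5, 0.5)), predict(forest, (9.5, 9.5)))
```

In practice a library implementation such as scikit-learn's `RandomForestClassifier` grows full trees and samples features at every split; the sketch only shows the bootstrap-and-vote skeleton.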

Python Scikit-learn to simplify Machine learning: { Bag of words } To [ TF-IDF ]

Text (word) analysis and tokenized text modeling always send a chill down the spine, especially when you are new to machine learning. Thanks go to Python and its extended libraries for their warm support for text analytics and machine learning. Scikit-learn is a savior and an excellent aid in text processing once you also understand concepts like “Bag of words”, “Clustering” and “vectorization”. Vectorization is a must-know technique for all machine learning learners, text miners and algorithm implementors. I personally consider […]
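
To make "bag of words" and the step to TF-IDF concrete, here is a minimal pure-Python sketch of what vectorization computes. It is an illustration under stated assumptions, not scikit-learn's code: tokenization is a plain whitespace split, the idf is the unsmoothed ln(N / df), and the function names and sample documents are hypothetical. Scikit-learn's `CountVectorizer` and `TfidfTransformer` cover the same two steps but by default apply idf smoothing and L2 normalization, so their numbers will differ.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count_vectors) for a list of text documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # One column per vocabulary term, holding how often it occurs.
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(count_vectors):
    """Reweight raw counts by idf = ln(N / df), with no smoothing."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[vec[j] * idf[j] for j in range(n_terms)] for vec in count_vectors]


docs = ["the cat sat", "the dog sat", "the dog barked"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)

print(vocab)      # ['barked', 'cat', 'dog', 'sat', 'the']
print(counts[0])  # [0, 1, 0, 1, 1]
```

A term such as "the" that appears in every document gets idf ln(3/3) = 0, so it drops out of the weighted vectors, while "cat" (one document only) gets the largest weight.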

Plotly Beta: Graphing and Analytics Platform

Hey Data Scientists, I wanted to reach out about Plot.ly, a new startup for analyzing and beautifully visualizing data. We just launched a beta. It is built for math, science, and data applications. We’d love your thoughts. Overview:  You can import data from anywhere, and analyze it in our grid with stats, fits, functions, and more. Our plotting APIs (R, Python, MATLAB, Arduino, REST, Julia, Perl) and grid make interactive, web-ready, publication-quality graphs.  We have a Python Shell, and interactive graphs […]

New in Plotly: Interactive Graphs with IPython

New! Plotly lets you style interactive graphs in IPython. Then, you can share your Notebook or your Plotly graph. It’s like having the NYTimes graphics department inside your IPython. You can also get these Notebooks on the Plotly GitHub page. Visit Plot.ly to see more documentation. Here’s a preview of how it looks to have your code, data, and graph all interactively available. See the live version.

Data Science – learn R or Python?

Hi Folks, I have a question about whether to learn R from scratch or leverage my basic Python knowledge to extend into data science with scikit, numpy, pandas. So I am a bit confused … I am not shy about learning a new programming language like R, but I really need to know which edges out the other in the market. Maybe I should learn R too along with Python, so your valuable opinion matters. Also I […]

Data Science In The Cloud With DataJoy

DataJoy is an unbelievably fantastic way for a working data scientist to have their favorite tools at hand. I am a minimalist when it comes to being mobile, whether working on the road, traveling for leisure, or sometimes both. I do not like to keep files on my laptop and I do not, for the most part, like to worry about keeping applications updated on my laptop. I have tried as much as possible to push my life into the […]

Picking an Analytic Platform

Summary: Picking an analytic platform when first starting out in data science almost always means working with what we’re most comfortable with. But as organizations grow larger there is a need for standardization and for selecting one or a few analytic tools. Picking an analytic platform when first starting out in data science almost always means working with what we’re most comfortable with. That in turn almost always means whatever we used in college (or in a certificate course), be it R, […]

Machine Learning – Anomaly Detection: “Finding a Needle in a Haystack”

After exploring formulation, classification, and benchmarking, we explore another facet of Machine Learning: anomaly detection. This part is key in the IoT transformation, as it enables internet-connected AI devices to alert, adapt and respond accordingly. Once properly trained, an IoT device could not only warn of and prevent imminent failure, but also execute a response adapted to the anomaly detected. In this process, we’ll explore intrinsic hurdles that make the anomaly detection process a non-trivial task of “finding a needle in a haystack”. Opportunities abound to explore, and any univariate sequential […]
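
As a rough illustration of anomaly detection on a univariate sequence, the sketch below flags readings that sit far from the median. The excerpt does not say which technique the article uses, so this is a swapped-in simple approach: the modified z-score based on the median absolute deviation (MAD), with the commonly used 3.5 cutoff. The function name, the cutoff default and the sensor readings are all assumptions made for the example.

```python
import statistics


def mad_anomalies(series, threshold=3.5):
    """Return indices of points whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD. Using the median
    and MAD keeps a single extreme value from inflating the spread estimate,
    which can hide it from a plain mean/standard-deviation test.
    """
    median = statistics.median(series)
    mad = statistics.median(abs(x - median) for x in series)
    if mad == 0:
        # More than half the points equal the median; the score is undefined.
        return []
    return [
        i for i, x in enumerate(series)
        if 0.6745 * abs(x - median) / mad > threshold
    ]


# Hypothetical temperature readings with one spike at index 6.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1, 19.7, 20.0]
print(mad_anomalies(readings))  # [6]
```

With only ten points, a mean/standard-deviation z-score of 3 would not flag the spike here, because the spike itself inflates the standard deviation; that masking effect is one small example of why the task is harder than it looks.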
