Recommender Engine – Under The Hood

Many of us are bombarded with recommendations in our day-to-day lives, be it on e-commerce sites or social media sites. Some of the recommendations look relevant, but others provoke a range of emotions in people, from confusion to anger. There are basically two types of recommender systems: content-based and collaborative filtering. Both have their pros and cons, depending on the context in which you want to use them. Content based: In content-based recommender systems, keywords […]

Machine Learning with Signal Processing Techniques

Stochastic Signal Analysis is a field of science concerned with the processing, modification and analysis of (stochastic) signals. Anyone with a background in Physics or Engineering knows to some degree about signal analysis techniques, what these techniques are and how they can be used to analyze, model and classify signals. Data Scientists coming from different fields, like Computer Science or Statistics, might not be aware of the analytical power these techniques bring with them. In this blog post, we […]

How to Execute R and Python in SQL Server with Machine Learning Services

Introduction: Did you know that you can write R and Python code within your T-SQL statements? Machine Learning Services in SQL Server eliminates the need for data movement. Instead of transferring large and sensitive data over the network or losing accuracy with sample CSV files, you can have your R/Python code execute within your database. Easily deploy your R/Python code with SQL stored procedures, making it accessible in your ETL processes or to any application. Train and store machine learning models […]

PixieDust Support of Streaming Data

With the rise of Internet of Things (IoT) devices, being able to analyze and visualize live streams of data is becoming more and more important. For example, you could have sensors like thermometers in machines, or portable medical devices like pacemakers, continuously streaming data to a streaming service like Kafka. PixieDust makes it easier to work with live data inside Jupyter Notebooks by providing simple integration APIs to both the PixieApp and display() framework. On the visualization level, PixieDust […]

Career Transition Towards Data Science: Planning a Learning Sabbatical

At the time of writing this post, I am nine months into my learning sabbatical. You can read about my journey here: “Career Transition Towards Data Analytics & Science”. Today I will share with you how you can plan your own, unique learning sabbatical, regardless of its scope and duration – anywhere between 1 and 12 months. Let’s get started. Begin with the end in mind If you have ever read Stephen Covey’s “7 Habits of Highly Effective People” you […]

Why Excel Users Should Learn Python

Latest update: November 16, 2018. Microsoft Excel has been around for over 30 years now, and chances are it’s not going to change in the foreseeable future. In fact, Excel is facing immense competition from challengers such as Google Spreadsheets and well-funded start-ups like Airtable, which are both going after Excel’s massive user base of approximately 500 million worldwide. Tech-savvy small and mid-sized businesses embrace innovative alternatives to Excel. However, making a dent in the large enterprise space is a […]

Should Python Become Your Official Corporate Language, Along With English?

English is becoming the official language in the global business world, being currently spoken by approximately 1.75 billion people worldwide according to Harvard Business Review. While English is the fastest spreading language in human history, a significant proportion of businesses are still resistant to giving up on their native language. Just try having a casual conversation in English with German employees at their corporate headquarters canteen (I am German, just for the record). However, pressures are piling up, not only […]

Mental Framework For A Data Driven Digital Transformation

Over the last few years, my small business has undergone a digital transformation from a marketing service company to a data literacy consultancy. What does a data literacy consultancy do? We teach business users within large enterprises to work with data, and we help them acquire the necessary skills, from state-of-the-art Excel to Python, querying structured, semi-structured and unstructured databases, as well as math, statistics, and probability. Throughout our transition, we applied a set of techniques, principles, and […]

Starting to develop in PySpark with Jupyter installed in a Big Data Cluster

It is no secret that data science tools like Jupyter, Apache Zeppelin, or the more recently launched Cloud Datalab and JupyterLab are must-know tools for day-to-day work. So how can the ease of developing models be combined with the computing capacity of a big data cluster? In this article I will share some very simple steps to start using Jupyter notebooks for PySpark in a Dataproc cluster on GCP. Final goal Prerequisites 1. Have a Google Cloud […]

Why I Am Writing At Data Science Central, And Why You Should, Too

My writing engagement at Data Science Central came up unexpectedly. Back in August 2018, I stumbled upon an excellent write-up on Data Science Central. The author, Bill Vorhies, shared his thoughts on career transitioning toward data science. I wrote him an email, complimenting him on his blog post, and I dropped a few lines about my own transition. Here’s his response: “Congratulations on your remarkable journey. Perhaps you’d like to write one or more articles around this theme as we […]
