Analyzing API Data with MongoDB, Seaborn, and Matplotlib

Introduction: A commonly requested skill for software development positions is experience with NoSQL databases, including MongoDB. This tutorial will explore collecting data using an API, storing it in a MongoDB database, and doing some analysis of the data. However, before jumping into the code, let’s take a moment to go over MongoDB and APIs, to make sure we understand how we’ll be dealing with the data we’re collecting. MongoDB and NoSQL: MongoDB is a form of NoSQL database, enabling the […]

Ensemble/Voting Classification in Python with Scikit-Learn

Introduction: Ensemble classification models can be powerful machine learning tools capable of achieving excellent performance and generalizing well to new, unseen datasets. The value of an ensemble classifier is that, in joining together the predictions of multiple classifiers, it can correct for errors made by any individual classifier, leading to better accuracy overall. Let’s take a look at the different ensemble classification methods and see how these classifiers can be implemented in Scikit-Learn. What are Ensemble Models in Machine Learning? […]

One-Hot Encoding in Python with Pandas and Scikit-Learn

Introduction: In computer science, data can be represented in many different ways, and each of them has its advantages and disadvantages in certain fields. Computers cannot process categorical data directly, because the categories have no meaning for them, so this information has to be prepared before a computer can work with it. This step is called preprocessing. A big part of preprocessing is encoding – representing every single […]

Calculating Mean, Median, and Mode in Python

Introduction: When we’re trying to describe and summarize a sample of data, we probably start by finding the mean (or average), the median, and the mode of the data. These are central tendency measures and are often our first look at a dataset. In this tutorial, we’ll learn how to find or compute the mean, the median, and the mode in Python. We’ll first code a Python function for each measure and then use Python’s statistics module to accomplish the […]

Statistical Hypothesis Analysis in Python with ANOVAs, Chi-Square, and Pearson Correlation

Introduction: Python is an incredibly versatile language, useful for a wide variety of tasks in a wide range of disciplines. One such discipline is statistical analysis of datasets, and along with SPSS, Python is one of the most common tools for statistics. Python’s user-friendly and intuitive nature makes running statistical tests and implementing analytical techniques easy, especially through the use of the statsmodels library. Introducing The statsmodels Library In Python: The statsmodels library is a module for Python that gives […]

Deep Learning in Keras – Data Preprocessing

Introduction: Deep learning is currently one of the most interesting and promising areas of artificial intelligence (AI) and machine learning. With great advances in technology and algorithms in recent years, deep learning has opened the door to a new era of AI applications. In many of these applications, deep learning algorithms have performed on par with human experts and sometimes surpassed them. Python has become the go-to language for Machine Learning and many of the most popular and powerful deep learning libraries […]

Open Source Deep Learning Frameworks and Visual Analytics

Deep Learning is gaining more and more traction. It focuses on one area of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist. What are Deep Learning and Artificial Neural Networks? Deep Learning is the modern buzzword for artificial neural networks, one of many concepts and algorithms in machine learning […]

How to Execute R and Python in SQL Server with Machine Learning Services

Introduction: Did you know that you can write R and Python code within your T-SQL statements? Machine Learning Services in SQL Server eliminates the need for data movement. Instead of transferring large and sensitive data over the network or losing accuracy with sample CSV files, you can have your R/Python code execute within your database. Easily deploy your R/Python code with SQL stored procedures, making it accessible in your ETL processes or to any application. Train and store machine learning models […]

Creating a Simple Recommender System in Python using Pandas

Introduction: Have you ever wondered how Netflix suggests movies to you based on the movies you have already watched? Or how e-commerce websites display options such as “Frequently Bought Together”? These may look like relatively simple options, but behind the scenes a complex statistical algorithm runs in order to predict these recommendations. Such systems are called Recommender Systems, Recommendation Systems, or Recommendation Engines. A Recommender System is one of the most famous applications of data science and machine learning. […]

NumPy Tutorial: A Simple Example-Based Guide

Introduction: The NumPy library is a popular Python library used for scientific computing applications; its name is short for “Numerical Python”. NumPy’s operations are divided into three main categories: Fourier Transform and Shape Manipulation, Mathematical and Logical Operations, and Linear Algebra and Random Number Generation. To make it as fast as possible, NumPy is written in C and Python. In this article, we will provide a brief introduction to the NumPy stack and we will see how the NumPy library […]
