Predicting Movie Genres using NLP – An Awesome Introduction to Multi-Label Classification

Introduction: I was intrigued going through this amazing article on building a multi-label image classification model last week. The data scientist in me started exploring possibilities of transforming this idea into a Natural Language Processing (NLP) problem. That article showcases computer vision techniques to predict a movie’s genre. So I had to find a way to convert that problem statement into text-based data. Now, most NLP tutorials look at solving single-label classification challenges (when there’s only one label per observation). […]
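To make the single-label versus multi-label distinction concrete, here is a minimal sketch of how a movie's genres could be encoded as a multi-hot target vector. The genre vocabulary and the `to_multi_hot` helper are hypothetical names chosen for illustration; they are not taken from the article.

```python
# Hypothetical genre vocabulary, for illustration only.
GENRES = ["action", "comedy", "drama", "romance"]


def to_multi_hot(labels, vocabulary=GENRES):
    """Return a 0/1 vector with one slot per genre in the vocabulary."""
    present = set(labels)
    return [1 if genre in present else 0 for genre in vocabulary]


# A single-label observation sets exactly one slot.
single_label = to_multi_hot(["drama"])

# A multi-label observation can set several slots at once.
multi_label = to_multi_hot(["comedy", "romance"])

print(single_label)
print(multi_label)
```

A multi-label model would then predict one independent yes/no decision per slot instead of choosing a single class.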

10 Powerful Applications of Linear Algebra in Data Science (with Multiple Resources)

Overview: Linear algebra powers various and diverse data science algorithms and applications. Here, we present 10 such applications where linear algebra will help you become a better data scientist. We have categorized these applications into various fields: Basic Machine Learning, Dimensionality Reduction, Natural Language Processing, and Computer Vision.
Introduction: If Data Science was Batman, Linear Algebra would be Robin. This faithful sidekick is often ignored. But in reality, it powers major areas of Data Science including the hot […]

7 Amazing NLP Hack Sessions to Watch out for at DataHack Summit 2019

Picture a world where: machines are able to have human-level conversations with us; computers understand the context of the conversation without having to be told what the subject is; and these machines can even write full-blown essays after being given the theme of the topic. This isn’t a movie script or a futuristic scenario. This is all happening right now thanks to the power of Natural Language Processing (NLP)! Here’s the incredible rise charted by Google Trends in the last […]

Build Your First Text Classification model using PyTorch

Overview: Learn how to perform text classification using PyTorch. Grasp the importance of the pack padding feature. Understand the key points involved in solving text classification.
Introduction: I always turn to state-of-the-art architectures to make my first submission in data science hackathons. Implementing state-of-the-art architectures has become quite easy thanks to deep learning frameworks such as PyTorch, Keras, and TensorFlow. These frameworks provide an easy way to implement complex model architectures and algorithms with […]

Top 10 Applications of Natural Language Processing (NLP)

Introduction: Natural Language Processing is among the hottest topics in the field of data science. Companies are putting tons of money into research in this field. Everyone is trying to understand Natural Language Processing and its applications to make a career around it. Every business out there wants to integrate it into their business somehow. Do you know why? Because in the span of just a few years, natural language processing has evolved into something so powerful and impactful, which […]

Machine Learning in Cyber Security — Malicious Software Installation

Introduction: Monitoring user activities performed by local administrators is always a challenge for SOC analysts and security professionals. Most security frameworks recommend implementing a whitelist mechanism. However, the real world is often not ideal. You will always have developers or users whose local administrator rights let them bypass the specified controls. Is there a way to monitor local administrator activities?

Save Plot as Image with Matplotlib

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. It’s common to share Matplotlib plots and visualizations with others. In this article, we’ll take a look at how to save a plot/graph as an image file using Matplotlib.
Creating a Plot: Let’s first create a simple plot:

    import matplotlib.pyplot as plt
    import numpy as np

    x = np.arange(0, 10, 0.1)
    y = np.sin(x)

    plt.plot(x, y)
    plt.show()

Here, we’ve plotted a sine function, starting at 0 […]
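The excerpt stops before the saving step, so the following is only a minimal sketch of how such a plot could be written to disk with `plt.savefig`. The file name `sine_wave.png` and the use of the non-interactive `Agg` backend are assumptions made here for illustration, not details from the article.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)

# "sine_wave.png" is a hypothetical file name; the extension selects the image format.
plt.savefig("sine_wave.png")
plt.close()
```

Calling `savefig` before `show` (or instead of it) matters in interactive sessions, because the figure may be cleared once the window is closed.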

Machine Translation Weekly 44: Tangled up in BLEU (and not blue)

For quite a while, machine translation has been approached as a behaviorist simulation. You don’t know what a good translation is? It does not matter, you can just simulate what humans do. You don’t know how to measure whether something is a good translation? It does not matter, you can simulate what humans do again. Things seem easy. We learn how to translate from tons of training data that were translated by humans. When we want to measure how well the […]

Machine Translation Weekly 45: Deep Encoder, Shallow Decoder, and the Fall of Non-autoregressive models

Researchers concerned with machine translation speed have invented several methods that are supposed to significantly speed up translation while preserving as much as possible of the translation quality of state-of-the-art models. The methods are usually based on generating as many words as possible in parallel. State-of-the-art models do not generate in parallel; they are autoregressive, meaning that they generate words one by one and condition the decisions about the next words on the previously generated words. On the […]
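The word-by-word generation described here can be sketched with a toy example. The lookup table below is a hypothetical stand-in for a trained translation model; the point is only that each step takes all previously generated words as input, which is why the steps cannot run in parallel.

```python
# Toy lookup table standing in for a trained model (hypothetical data).
NEXT_WORD = {
    (): "the",
    ("the",): "cat",
    ("the", "cat"): "sleeps",
    ("the", "cat", "sleeps"): "<eos>",
}


def next_word(prefix):
    """Pick the next word given all previously generated words."""
    return NEXT_WORD.get(tuple(prefix), "<eos>")


def decode_autoregressive(max_len=10):
    """Generate one word per step, feeding the output back in as context."""
    words = []
    for _ in range(max_len):
        word = next_word(words)
        if word == "<eos>":
            break
        words.append(word)
    return words


print(decode_autoregressive())
```

A non-autoregressive decoder would instead have to predict every position without seeing the other output words, which is what makes it faster and also harder to get right.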

Python with Pandas: DataFrame Tutorial with Examples

Introduction: Pandas is an open-source Python library for data analysis. It is designed for efficient and intuitive handling and processing of structured data. The two main data structures in Pandas are Series and DataFrame. Series are essentially one-dimensional labeled arrays of any type of data, while DataFrames are two-dimensional labeled tables with potentially heterogeneous data types. Heterogeneous means that not all columns need to hold the same type of data. In this article we will go through […]
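As a small illustration of the two structures, the sketch below builds one Series and one DataFrame. The names, ages, and cities are made-up sample values, not data from the article.

```python
import pandas as pd

# A Series: one-dimensional, with a label (index) for each value.
ages = pd.Series([34, 28, 45], index=["ana", "ben", "cho"], name="age")

# A DataFrame: two-dimensional, and each column may have its own data type.
people = pd.DataFrame(
    {"age": [34, 28, 45], "city": ["Lima", "Oslo", "Seoul"]},
    index=["ana", "ben", "cho"],
)

print(ages["ben"])                # look up a Series value by its label
print(people.dtypes)              # the two columns report different dtypes
print(people.loc["cho", "city"])  # look up a cell by row and column label
```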
