Unsupervised Translation of Programming Languages

Abstract A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. COBOL, Python 2) to a modern one. They typically rely on handcrafted rewrite rules, applied to the source code abstract syntax tree. Unfortunately, the resulting translations often lack readability, fail to respect the target language […]

Read more

Text Mining hack: Subject Extraction made easy using Google API

Let’s do a simple exercise. You need to identify the subject and the sentiment in following sentences: Google is the best resource for any kind of information. I came across a fabulous knowledge portal – Analytics Vidhya Messi played well but Argentina still lost the match Opera is not the best browser Yes, like UAE will win the Cricket World Cup. Was this exercise simple? Even if this looks like a simple exercise, now imagine creating an algorithm to do this? How does that […]

Read more

Artificial Intelligence Demystified

Introduction Artificial Intelligence has become a very popular term today. There is sure to be at least one article in the newspaper daily on the revolutionary advancements made in the field. But, there seems to be some confusion about what AI really is. Is it Robotics? Will the Terminator movie actually come true? Or is it something that has crept into our daily lives without us even realizing it? This article will give you a broad understanding on the buzzwords […]

Read more

Text Classification & Word Representations using FastText (An NLP library by Facebook)

Introduction If you put a status update on Facebook about purchasing a car -don’t be surprised if Facebook serves you a car ad on your screen. This is not black magic! This is Facebook leveraging the text data to serve you better ads. The picture below takes a jibe at a challenge while dealing with text data. Well, it clearly failed in the above attempt to deliver the right ad. It is all the more important to capture the context […]

Read more

Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines which can process text data. Thankfully, the amount of text data being generated in this universe has exploded exponentially in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]

Read more

Machine Translation Weekly 58: Poisoning machine translation

Today, I am going to talk about a topic that is rather unknown to me: the safety and vulnerability of machine translation. I will comment on a paper Targeted Poisoning Attacks on Black-Box Neural Machine Translation by authors from the University of Melbourne and Facebook AI. The main issue making machine-translation users vulnerable is that they typically do not understand the target language and do not have any other choice than trusting the system that target-language output is adequate. Most […]

Read more

Comprehensive Hands on Guide to Twitter Sentiment Analysis with dataset and code

Introduction Natural Language Processing (NLP) is a hotbed of research in data science these days and one of the most common applications of NLP is sentiment analysis. From opinion polls to creating entire marketing strategies, this domain has completely reshaped the way businesses work, which is why this is an area every data scientist must be familiar with. Thousands of text documents can be processed for sentiment (and other features including named entities, topics, themes, etc.) in seconds, compared to […]

Read more

The 15 Most Popular Data Science and Machine Learning Articles on Analytics Vidhya in 2018

Introduction What is the one thing you enjoy most about Analytics Vidhya? The most popular answer we receive (and have received since Kunal transformed his idea into reality) is the content we publish. Our content is the one thing take pride in, and 2018 saw us take our high-quality content to a whole new level. We launched multiple top-quality and popular training courses, published knowledge-rich machine learning and deep learning articles and guides, and saw our blog visits cross 2.5 million […]

Read more

8 Excellent Pretrained Models to get you Started with Natural Language Processing (NLP)

Introduction Natural Language Processing (NLP) applications have become ubiquitous these days. I seem to stumble across websites and applications regularly that are leveraging NLP in one form or another. In short, this is a wonderful time to be involved in the NLP domain. This rapid increase in NLP adoption has happened largely thanks to the concept of transfer learning enabled through pretrained models. Transfer learning, in the context of NLP, is essentially the ability to train a model on one dataset […]

Read more

How to Get Started with NLP – 6 Unique Methods to Perform Tokenization

Overview Looking to get started with Natural Language Processing (NLP)? Here’s the perfect first step Learn how to perform tokenization – a key aspect to preparing your data for building NLP models We present 6 different ways to perform tokenization on text data   Introduction Are you fascinated by the amount of text data available on the internet? Are you looking for ways to work with this text data but aren’t sure where to begin? Machines, after all, recognize numbers, […]

Read more
1 723 724 725 726 727 906