Articles About Machine Learning

An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish

Historical corpora are known to contain errors introduced by OCR (optical character recognition) methods used in the digitization process, often said to be degrading the performance of NLP systems. Correcting these errors manually is a time-consuming process and a great part of the automatic approaches have been relying on rules or supervised machine learning… We build on previous work on fully automatic unsupervised extraction of parallel data to train a character-based sequence-to-sequence NMT (neural machine translation) model to conduct OCR […]


Artificial Intelligence Demystified

Introduction Artificial Intelligence has become a very popular term today. There is sure to be at least one article in the newspaper daily on the revolutionary advancements made in the field. But there seems to be some confusion about what AI really is. Is it Robotics? Will the Terminator movie actually come true? Or is it something that has crept into our daily lives without us even realizing it? This article will give you a broad understanding of the buzzwords […]


Text Classification & Word Representations using FastText (An NLP library by Facebook)

Introduction If you put a status update on Facebook about purchasing a car, don’t be surprised if Facebook serves you a car ad on your screen. This is not black magic! This is Facebook leveraging the text data to serve you better ads. The picture below takes a jibe at a challenge while dealing with text data. Well, it clearly failed in the above attempt to deliver the right ad. It is all the more important to capture the context […]


Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines which can process text data. Thankfully, the amount of text data being generated in this universe has exploded exponentially in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]


Comprehensive Hands on Guide to Twitter Sentiment Analysis with dataset and code

Introduction Natural Language Processing (NLP) is a hotbed of research in data science these days and one of the most common applications of NLP is sentiment analysis. From opinion polls to creating entire marketing strategies, this domain has completely reshaped the way businesses work, which is why this is an area every data scientist must be familiar with. Thousands of text documents can be processed for sentiment (and other features including named entities, topics, themes, etc.) in seconds, compared to […]


The 15 Most Popular Data Science and Machine Learning Articles on Analytics Vidhya in 2018

Introduction What is the one thing you enjoy most about Analytics Vidhya? The most popular answer we receive (and have received since Kunal transformed his idea into reality) is the content we publish. Our content is the one thing we take pride in, and 2018 saw us take our high-quality content to a whole new level. We launched multiple top-quality and popular training courses, published knowledge-rich machine learning and deep learning articles and guides, and saw our blog visits cross 2.5 million […]


Anomalous diffusion in nonlinear transformations of the noisy voter model

Voter models are well known in the interdisciplinary community, yet they haven’t been studied from the perspective of anomalous diffusion. In this paper we show that the original voter model exhibits a ballistic regime… Non-linear transformations of the observation variable and time scale allow us to observe other regimes of anomalous diffusion as well as normal diffusion. We show that numerical simulation results coincide with derived analytical approximations describing the temporal evolution of the raw moments.


Anomalous Sound Detection as a Simple Binary Classification Problem with Careful Selection of Proxy Outlier Examples

Unsupervised anomalous sound detection is concerned with identifying sounds that deviate from what is defined as ‘normal’, without explicitly specifying the types of anomalies. A significant obstacle is the diversity and rareness of outliers, which typically prevent us from collecting a representative set of anomalous sounds… As a consequence, most anomaly detection methods use unsupervised rather than supervised machine learning methods. Nevertheless, we will show that anomalous sound detection can be effectively framed as a supervised classification problem if the […]


This Looks Like That, Because … Explaining Prototypes for Interpretable Image Recognition

Image recognition with prototypes is considered an interpretable alternative for black box deep learning models. Classification depends on the extent to which a test image “looks like” a prototype… However, perceptual similarity for humans can be different from the similarity learnt by the model. A user is unaware of the underlying classification strategy and does not know which image characteristic (e.g., color or shape) is the dominant one for the decision. We address this ambiguity and argue that prototypes should […]


Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers

Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right image to infer depth. Rather than matching individual pixels, in this work, we revisit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using position information and attention… This approach, named STereo TRansformer (STTR), has several advantages: It 1) relaxes the limitation of a fixed disparity range, 2) identifies occluded regions and provides confidence of […]
