Articles About Machine Learning

An Intuitive Understanding of Word Embeddings: From Count Vectors to Word2Vec

Introduction Before we start, have a look at the below examples. You open Google and search for a news article on the ongoing Champions trophy and get hundreds of search results in return about it. Nate silver analysed millions of tweets and correctly predicted the results of 49 out of 50 states in 2008 U.S Presidential Elections. You type a sentence in google translate in English and get an Equivalent Chinese conversion.   So what do the above examples have […]

Read more

The Ultimate Learning Path to Becoming a Data Scientist in 2018

Introduction So you’ve taken the plunge. You want to become a data scientist. But where to begin? There are far too many resources out there. How do you decide the starting point? Did you miss out on topics you should have studied? Which are the best resources to learn? Don’t worry, we have you covered! Analytics Vidhya’s learning path for 2016 saw 250,000+ views. In 2017, we went even further and saw an incredible 500,000+ views! So this year, we […]

Read more

Tutorial on Text Classification (NLP) using ULMFiT and fastai Library in Python

Introduction Natural Language Processing (NLP) needs no introduction in today’s world. It’s one of the most important fields of study and research, and has seen a phenomenal rise in interest in the last decade. The basics of NLP are widely known and easy to grasp. But things start to get tricky when the text data becomes huge and unstructured. That’s where deep learning becomes so pivotal. Yes, I’m talking about deep learning for NLP tasks – a still relatively less […]

Read more

PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents

Neural Machine Translation (NMT) has shown drastic improvement in its quality when translating clean input, such as text from the news domain. However, existing studies suggest that NMT still struggles with certain kinds of input with considerable noise, such as User-Generated Contents (UGC) on the Internet… To make better use of NMT for cross-cultural communication, one of the most promising directions is to develop a model that correctly handles these expressions. Though its importance has been recognized, it is still […]

Read more

Handwriting Classification for the Analysis of Art-Historical Documents

Digitized archives contain and preserve the knowledge of generations of scholars in millions of documents. The size of these archives calls for automatic analysis since a manual analysis by specialists is often too expensive… In this paper, we focus on the analysis of handwriting in scanned documents from the art-historic archive of the WPI. Since the archive consists of documents written in several languages and lacks annotated training data for the creation of recognition models, we propose the task of […]

Read more

Pixel-wise Dense Detector for Image Inpainting

Recent GAN-based image inpainting approaches adopt an average strategy to discriminate the generated image and output a scalar, which inevitably lose the position information of visual artifacts. Moreover, the adversarial loss and reconstruction loss (e.g., l1 loss) are combined with tradeoff weights, which are also difficult to tune… In this paper, we propose a novel detection-based generative framework for image inpainting, which adopts the min-max strategy in an adversarial process. The generator follows an encoder-decoder architecture to fill the missing […]

Read more

Lightweight Model For The Prediction of COVID-19 Through The Detection And Segmentation of Lesions in Chest CT Scans

We introduce a lightweight Mask R-CNN model that segments areas with the Ground Glass Opacity and Consolidation in chest CT scans. The model uses truncated ResNet18 and ResNet34 nets with a single layer of Feature Pyramid Network as a backbone net, thus substantially reducing the number of the parameters and the training time compared to similar solutions using deeper networks… Without any data balancing and manipulations, and using only a small fraction of the training data, COVID-CT-Mask-Net classification model with […]

Read more

BGGAN: Bokeh-Glass Generative Adversarial Network for Rendering Realistic Bokeh

A photo captured with bokeh effect often means objects in focus are sharp while the out-of-focus areas are all blurred. DSLR can easily render this kind of effect naturally… However, due to the limitation of sensors, smartphones cannot capture images with depth-of-field effects directly. In this paper, we propose a novel generator called Glass-Net, which generates bokeh images not relying on complex hardware. Meanwhile, the GAN-based method and perceptual loss are combined for rendering a realistic bokeh effect in the […]

Read more

Stochastic Hard Thresholding Algorithms for AUC Maximization

In this paper, we aim to develop stochastic hard thresholding algorithms for the important problem of AUC maximization in imbalanced classification. The main challenge is the pairwise loss involved in AUC maximization… We overcome this obstacle by reformulating the U-statistics objective function as an empirical risk minimization (ERM), from which a stochastic hard thresholding algorithm (texttt{SHT-AUC}) is developed. To our best knowledge, this is the first attempt to provide stochastic hard thresholding algorithms for AUC maximization with a per-iteration cost […]

Read more

A deep learning classifier for local ancestry inference

Local ancestry inference (LAI) identifies the ancestry of each segment of an individual’s genome and is an important step in medical and population genetic studies of diverse cohorts. Several techniques have been used for LAI, including Hidden Markov Models and Random Forests… Here, we formulate the LAI task as an image segmentation problem and develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture. We train our model using complete genome sequences from 982 unadmixed […]

Read more
1 104 105 106 107 108 226