A Quick Guide to Text Cleaning Using the nltk Library

This article was published as a part of the Data Science Blogathon. Introduction NLTK is a string processing library that takes strings as input. The output is in the form of either a string or lists of strings. This library provides a lot of algorithms that helps majorly in the learning purpose. One can compare among different variants of outputs. There are other libraries as well like spaCy, CoreNLP, PyNLPI, Polyglot. NLTK and spaCy are most widely used. Spacy works […]

Read more

Ranking Deep Learning Generalization using Label Variation in Latent Geometry Graphs

Measuring the generalization performance of a Deep Neural Network (DNN) without relying on a validation set is a difficult task. In this work, we propose exploiting Latent Geometry Graphs (LGGs) to represent the latent spaces of trained DNN architectures… Such graphs are obtained by connecting samples that yield similar latent representations at a given layer of the considered DNN. We then obtain a generalization score by looking at how strongly connected are samples of distinct classes in LGGs. This score […]

Read more

Fast Region Proposal Learning for Object Detection for Robotics

Object detection is a fundamental task for robots to operate in unstructured environments. Today, there are several deep learning algorithms that solve this task with remarkable performance… Unfortunately, training such systems requires several hours of GPU time. For robots, to successfully adapt to changes in the environment or learning new objects, it is also important that object detectors can be re-trained in a short amount of time. A recent method [1] proposes an architecture that leverages on the powerful representation […]

Read more

Supercharging Imbalanced Data Learning With Causal Representation Transfer

Dealing with severe class imbalance poses a major challenge for real-world applications, especially when the accurate classification and generalization of minority classes is of primary interest. In computer vision, learning from long tailed datasets is a recurring theme, especially for natural image datasets… While existing solutions mostly appeal to sampling or weighting adjustments to alleviate the pathological imbalance, or imposing inductive bias to prioritize non-spurious associations, we take novel perspectives to promote sample efficiency and model generalization based on the […]

Read more

Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for 3D Reconstruction

We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multiview images. While previous learning based reconstruction approaches performed quite well, most of them estimate depth maps at a fixed resolution using plane sweep volumes with a fixed depth hypothesis at each plane, which requires densely sampled planes for desired accuracy and therefore is difficult to achieve high resolution depth maps… In this paper we introduce a coarseto-fine depth inference strategy to achieve high resolution depth. This strategy […]

Read more

Simple statistical methods for unsupervised brain anomaly detection on MRI are competitive to deep learning methods

Statistical analysis of magnetic resonance imaging (MRI) can help radiologists to detect pathologies that are otherwise likely to be missed. Deep learning (DL) has shown promise in modeling complex spatial data for brain anomaly detection… However, DL models have major deficiencies: they need large amounts of high-quality training data, are difficult to design and train and are sensitive to subtle changes in scanning protocols and hardware. Here, we show that also simple statistical methods such as voxel-wise (baseline and covariance) […]

Read more

The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability

High-dimensional representations for words, text, images, knowledge graphs and other structured data are commonly used in different paradigms of machine learning and data mining. These representations have different degrees of interpretability, with efficient distributed representations coming at the cost of the loss of feature to dimension mapping… This implies that there is obfuscation in the way concepts are captured in these embedding spaces. Its effects are seen in many representations and tasks, one particularly problematic one being in language representations […]

Read more

SAR-Net: A End-to-End Deep Speech Accent Recognition Network

This paper proposes a end-to-end deep network to recognize kinds of accents under the same language, where we develop and transfer the deep architecture in speaker-recognition area to accent classification task for learning utterance-level accent representation. Compared with the individual-level feature in speaker-recognition, accent recognition throws a more challenging issue in acquiring compact group-level features for the speakers with the same accent, hence a good discriminative accent feature space is desired… Our deep framework adopts multitask-learning mechanism and mainly consists […]

Read more

Image Inpainting with Contextual Reconstruction Loss

Convolutional neural networks (CNNs) have been observed to be inefficient in propagating information across distant spatial positions in images. Recent studies in image inpainting attempt to overcome this issue by explicitly searching reference regions throughout the entire image to fill the features from reference regions in the missing regions… This operation can be implemented as contextual attention layer (CA layer) cite{yu2018generative}, which has been widely used in many deep learning-based methods. However, it brings significant computational overhead as it computes […]

Read more

SurFree: a fast surrogate-free black-box attack

Machine learning classifiers are critically prone to evasion attacks. Adversarial examples are slightly modified inputs that are then misclassified, while remaining perceptively close to their originals… Last couple of years have witnessed a striking decrease in the amount of queries a black box attack submits to the target classifier, in order to forge adversarials. This particularly concerns the black-box score-based setup, where the attacker has access to top predicted probabilites: the amount of queries went from to millions of to […]

Read more
1 709 710 711 712 713 911