Residuals-based distributionally robust optimization with covariate information

We consider data-driven approaches that integrate a machine learning prediction model within distributionally robust optimization (DRO) given limited joint observations of uncertain parameters and covariates. Our framework is flexible in the sense that it can accommodate a variety of learning setups and DRO ambiguity sets… We investigate the asymptotic and finite sample properties of solutions obtained using Wasserstein, sample robust optimization, and phi-divergence-based ambiguity sets within our DRO formulations, and explore cross-validation approaches for sizing these ambiguity sets. Through numerical […]

Aligning Hyperbolic Representations: an Optimal Transport-based approach

Hyperbolic spaces are better suited to represent data with underlying hierarchical relationships, e.g., tree-like data. However, it is often necessary to incorporate, through alignment, different but related representations meaningfully… This aligning is an important class of machine learning problems, with applications such as ontology matching and cross-lingual alignment. Optimal transport (OT)-based approaches are a natural choice to tackle the alignment problem as they aim to find a transformation of the source dataset to match a target dataset, subject to some distribution constraints. […]

Simple NLP in Python with TextBlob: N-Grams Detection

Introduction: The constant growth of data on the Internet creates a demand for a tool that can process textual information faster and with no effort from the ordinary user. Moreover, it is important that such a text analysis tool can handle both low- and high-level NLP tasks, such as counting word frequencies, performing sentiment analysis of texts, or detecting patterns in relationships between words. TextBlob is a great lightweight library for a wide variety of […]
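To make the idea of n-gram detection concrete, here is a minimal sketch in plain Python. It does not use TextBlob and is not how the library implements the feature; the helper name `word_ngrams` and the sample sentence are hypothetical, and tokenization is a naive whitespace split.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return all n-grams in `text` as tuples of lowercased words."""
    words = text.lower().split()  # naive tokenization: split on whitespace
    # Slide a window of length n over the word list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sample = "the cat sat on the mat and the cat slept"

# Count how often each bigram occurs and pick the most frequent one.
bigram_counts = Counter(word_ngrams(sample, n=2))
most_common_bigram = bigram_counts.most_common(1)[0]
print(most_common_bigram)
```

Counting the n-grams with `Counter` is one simple way to surface recurring word pairs; a real pipeline would normally also handle punctuation and sentence boundaries.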

Seaborn Bar Plot – Tutorial and Examples

Introduction: Seaborn is one of the most widely used data visualization libraries in Python and is built as an extension to Matplotlib. It offers a simple, intuitive, yet highly customizable API for data visualization. In this tutorial, we’ll take a look at how to plot a bar plot in Seaborn. Bar graphs display numerical quantities on one axis and categorical variables on the other, letting you see how many occurrences there are for the different categories. Bar charts can be used for visualizing […]

Reading and Writing XML Files in Python with Pandas

Introduction: XML (Extensible Markup Language) is a markup language used to store structured data. The Pandas data analysis library provides functions to read and write data for most file types. For example, it includes read_csv() and to_csv() for interacting with CSV files. However, Pandas versions before 1.3 do not include any methods to read and write XML files. In this article, we will take a look at how we can use other modules to read data from an XML file, and load it […]
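As a rough illustration of reading XML with another module, the sketch below uses Python's standard-library `xml.etree.ElementTree` to turn flat records into a list of dictionaries. The XML snippet, the tag names, and the helper `xml_to_rows` are hypothetical and are not taken from the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with one level of child fields per record.
xml_text = """
<students>
    <student><name>Ada</name><age>21</age></student>
    <student><name>Linus</name><age>23</age></student>
</students>
"""


def xml_to_rows(text, record_tag):
    """Parse `text` and return one dict per <record_tag> element."""
    root = ET.fromstring(text.strip())
    # Each direct child of a record becomes a column: tag -> text value.
    return [
        {child.tag: child.text for child in record}
        for record in root.iter(record_tag)
    ]


rows = xml_to_rows(xml_text, "student")
print(rows)
```

All values come back as strings, so numeric columns still need converting. A list of dictionaries like `rows` can be passed to `pandas.DataFrame(rows)` to build a table.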

Issue #110 – Better Out of Vocabulary Translation with Bilingual Terminology Mining

03 Dec 20. Author: Akshai Ramesh, Machine Translation Scientist @ Iconic. Introduction: A significant weakness in conventional neural machine translation (NMT) systems is their inability to correctly translate out-of-vocabulary (OOV) words: end-to-end NMT systems tend to have relatively small vocabularies due to memory limitations, with a single “unknown token” (usually abbreviated in MT slang as “unk”) that represents every possible OOV word. In NMT, byte-pair encoding can […]

Improved Variational Bayesian Phylogenetic Inference with Normalizing Flows

Variational Bayesian phylogenetic inference (VBPI) provides a promising general variational framework for efficient estimation of phylogenetic posteriors. However, the current diagonal Lognormal branch length approximation would significantly restrict the quality of the approximating distributions… In this paper, we propose a new type of VBPI, VBPI-NF, as a first step to empower phylogenetic posterior estimation with deep learning techniques. By handling the non-Euclidean branch length space of phylogenetic models with carefully designed permutation equivariant transformations, VBPI-NF uses normalizing flows to provide […]

GCOMB: Learning Budget-constrained Combinatorial Algorithms over Billion-sized Graphs

There has been an increased interest in discovering heuristics for combinatorial problems on graphs through machine learning. While existing techniques have primarily focused on obtaining high-quality solutions, scalability to billion-sized graphs has not been adequately addressed… In addition, the impact of a budget constraint, which is necessary for many practical scenarios, remains to be studied. In this paper, we propose a framework called GCOMB to bridge these gaps. GCOMB trains a Graph Convolutional Network (GCN) using a novel probabilistic greedy mechanism […]

H-Mem: Harnessing synaptic plasticity with Hebbian Memory Networks

The ability to base current computations on memories from the past is critical for many cognitive tasks such as story understanding. Hebbian-type synaptic plasticity is believed to underlie the retention of memories over medium and long time scales in the brain… However, it is unclear how such plasticity processes are integrated with computations in cortical networks. Here, we propose Hebbian Memory Networks (H-Mems), a simple neural network model that is built around a core hetero-associative network subject to Hebbian plasticity. […]

Learning Semantic-aware Normalization for Generative Adversarial Networks

Recent advances in image generation have been achieved by style-based image generators. Such approaches learn to disentangle latent factors in different image scales and encode latent factors as “style” to control image synthesis… However, existing approaches cannot further disentangle fine-grained semantics from each other, which are often conveyed by feature channels. In this paper, we propose a novel image synthesis approach by learning Semantic-aware relative importance for feature channels in Generative Adversarial Networks (SariGAN). Such a model disentangles latent […]
