Articles About Machine Learning

MotePy: A domain specific language for low-overhead machine learning and data processing

A domain-specific language (DSL) named MotePy is presented. The DSL offers a high-level syntax with low overheads for ML/data processing in time-constrained or memory-constrained systems… The DSL-to-C compiler has a novel static memory allocator that tracks object lifetimes and reuses the static memory, which we call the compiler-managed heap.
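
The compiler-managed heap can be illustrated with a small compile-time planning routine. The sketch below is a minimal interpretation under assumed policies (greedy first-fit over lifetime-sorted events, no coalescing of adjacent holes), not MotePy's actual allocator: objects whose lifetimes do not overlap share static offsets, and the arena's high-water mark is all the memory the compiled program needs.

    def plan_static_heap(objects):
        """objects: list of (name, first_use_step, last_use_step, size_bytes)."""
        events = []
        for name, start, end, size in objects:
            events.append((start, 1, name, size))    # 1 = allocate
            events.append((end + 1, 0, name, size))  # 0 = free; sorts first
        free, placement, heap_top = [], {}, 0
        for _, kind, name, size in sorted(events):
            if kind == 0:                            # lifetime ended: open a hole
                free.append((placement[name], size))
                free.sort()
                continue
            for i, (off, hole) in enumerate(free):   # first fit into a hole
                if hole >= size:
                    placement[name] = off
                    if hole > size:
                        free[i] = (off + size, hole - size)
                    else:
                        free.pop(i)
                    break
            else:                                    # nothing fits: grow the arena
                placement[name] = heap_top
                heap_top += size
        return placement, heap_top

    # Two buffers that never coexist share one slot:
    placement, arena = plan_static_heap(
        [("a", 0, 2, 64), ("b", 3, 5, 64), ("c", 1, 4, 32)])
    assert placement["a"] == placement["b"] and arena == 96  # not 160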

Read more

Estimating Risk-Adjusted Hospital Performance

The quality of healthcare provided by hospitals is subject to considerable variability. Consequently, accurate measurements of hospital performance are essential for various decision-makers, including patients, hospital managers and health insurers… A hospital's performance is assessed via the health outcomes of its patients. However, as patient risk profiles vary between hospitals, measuring hospital performance requires adjustment for patient risk. This task is formalized in the state-of-the-art procedure through a hierarchical generalized linear model that isolates hospital fixed-effects from the effect […]
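
As a generic sketch of the risk-adjustment setup (the paper's exact specification may differ): for patient $j$ treated at hospital $i$, with outcome $y_{ij}$ and risk covariates $x_{ij}$,

$$\operatorname{logit}\Pr(y_{ij} = 1) = \alpha_i + x_{ij}^\top \beta,$$

where $\beta$ captures patient-level risk shared across hospitals, and the isolated hospital effect $\alpha_i$ is the quantity compared across hospitals once patient risk has been adjusted for.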

Read more

A Transfer Learning Approach for Dialogue Act Classification of GitHub Issue Comments

Social coding platforms, such as GitHub, serve as laboratories for studying collaborative problem solving in open source software development; a key feature is their support for issue reporting, which teams use to discuss tasks and ideas. Analyzing the dialogue between team members, as expressed in issue comments, can yield important insights about the performance of virtual teams… This paper presents a transfer learning approach for performing dialogue act classification on issue comments. Since no large labeled corpus […]
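
A rough sketch of the transfer-learning setup, assuming a Hugging Face checkpoint and a hypothetical five-way dialogue-act label set (the paper's taxonomy and source corpus may differ):

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    LABELS = ["question", "answer", "proposal", "agreement", "information"]  # assumed

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=len(LABELS))
    # The classification head above is freshly initialized; in the transfer
    # setting it would first be fine-tuned on labeled dialogue data and then
    # adapted to issue comments.

    enc = tok(["Could we split this issue into two pull requests?"],
              truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**enc).logits.argmax(dim=-1)
    print(LABELS[int(pred[0])])  # meaningless until fine-tuned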

Read more

When Do You Need Billions of Words of Pretraining Data?

NLP is currently dominated by general-purpose pretrained language models like RoBERTa, which achieve strong performance on NLU tasks through pretraining on billions of words. But what exact knowledge or skills do Transformer LMs learn from large-scale pretraining that they cannot learn from less data?… We adopt four probing methods—classifier probing, information-theoretic probing, unsupervised relative acceptability judgment, and fine-tuning on NLU tasks—and draw learning curves that track the growth of these different measures of linguistic ability with respect to pretraining data […]
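
Classifier probing, the first of the four methods, reduces to a simple recipe: freeze the pretrained model, extract sentence representations, and train a lightweight classifier to predict some linguistic property. A minimal sketch with random arrays standing in for real LM features (the shapes and the binary property are assumptions):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    # X: frozen sentence representations (one 768-dim row per sentence);
    # y: gold labels for some linguistic property. Random stand-ins here.
    X_train, y_train = rng.normal(size=(512, 768)), rng.integers(0, 2, 512)
    X_test, y_test = rng.normal(size=(128, 768)), rng.integers(0, 2, 128)

    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("probe accuracy:", probe.score(X_test, y_test))
    # Repeating this across checkpoints pretrained on increasing amounts of
    # text yields one learning curve per linguistic ability.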

Read more

On the State of Social Media Data for Mental Health Research

Data-driven methods for mental health treatment and surveillance have become a major focus in computational science research in the last decade. However, progress in the domain, in terms of both medical understanding and system performance, remains bounded by the availability of adequate data… Prior systematic reviews have not necessarily made it possible to measure the degree to which data-related challenges have affected research progress. In this paper, we offer an analysis specifically on the state of social media data that […]

Read more

Selectively Precoded Polar Codes

In this paper, we propose selectively precoded polar (SPP) codes, built on top of Arikan's capacity-achieving polar codes. We provide the encoding and decoding scheme for SPP codes… Simulation results show that for a target frame erasure rate (FER) of $10^{-5}$, a (128, 64) SPP code is just 0.23 dB away from the information-theoretic limit at this blocklength. Further, such codes are also shown to have very good distance properties compared to other contemporary polar code variants. […]
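
The SPP precoding itself is not reproduced here, but the polar encoding it builds on is compact enough to sketch: a codeword is $x = u F^{\otimes n}$ over GF(2), with $F = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, message bits placed on the most reliable synthetic channels and the rest frozen to zero. The reliability order below is illustrative, not the paper's:

    import numpy as np

    def polar_transform(u):
        """Apply G_N = F^{(tensor) n} over GF(2) via butterflies, O(N log N)."""
        x = np.asarray(u, dtype=np.uint8) % 2
        n, step = len(x), 1
        while step < n:
            for i in range(0, n, 2 * step):
                x[i:i + step] ^= x[i + step:i + 2 * step]
            step *= 2
        return x

    N = 8
    u = np.zeros(N, dtype=np.uint8)   # frozen bits stay 0
    u[[7, 6, 5, 3]] = [1, 0, 1, 1]    # message bits on assumed reliable positions
    print(polar_transform(u))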

Read more

Synonym Knowledge Enhanced Reader for Chinese Idiom Reading Comprehension

Machine reading comprehension (MRC) is the task of asking a machine to answer questions based on a given context. For Chinese MRC, idioms pose unique challenges for machines to understand due to their non-literal and non-compositional semantics… Previous studies tend to treat idioms separately without fully exploiting the relationships among them. In this paper, we first define the concept of literal meaning coverage to measure the consistency between semantics and literal meanings for Chinese idioms. With the definition, […]
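
One plausible way to instantiate such a measure, offered purely as an assumption (the paper's actual definition may differ): compare an idiom's holistic embedding with a simple composition of its characters' embeddings, so high coverage flags near-literal idioms and low coverage flags non-compositional ones.

    import numpy as np

    def literal_coverage(idiom, idiom_vec, char_vecs):
        """Cosine similarity between the idiom's holistic embedding and the
        mean of its characters' embeddings. Mean-composition is an assumed
        choice, not the paper's definition."""
        literal = np.mean([char_vecs[ch] for ch in idiom], axis=0)
        return float(idiom_vec @ literal /
                     (np.linalg.norm(idiom_vec) * np.linalg.norm(literal)))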

Read more

Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling

In practice, it is quite common to face combinatorial optimization problems that combine uncertainty, non-determinism and dynamicity. These three properties call for appropriate algorithms; reinforcement learning (RL) deals with them in a very natural way… Today, despite some efforts, most real-life combinatorial optimization problems remain out of reach of reinforcement learning algorithms. In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem, and apply it to an algorithm commonly executed in […]
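
The decision loop such an agent faces can be sketched as list scheduling where the task-picking rule is the policy's action. The simulator below is a toy stand-in for the paper's environment; swapping the random pick for a learned policy (rewarded, say, with the negative makespan increase) gives the RL formulation:

    import random

    def schedule(dag, durations, n_workers, pick=random.choice):
        """dag: {task: set of predecessors}. Returns (finish times, makespan)."""
        finish = {}                       # task -> completion time
        free_at = [0.0] * n_workers       # next free time per worker
        remaining = {t: set(p) for t, p in dag.items()}
        while remaining:
            ready = [t for t, p in remaining.items() if p.issubset(finish)]
            task = pick(sorted(ready))    # the policy's action
            w = min(range(n_workers), key=free_at.__getitem__)
            start = max([free_at[w]] + [finish[p] for p in dag[task]])
            finish[task] = start + durations[task]
            free_at[w] = finish[task]
            del remaining[task]
        return finish, max(finish.values())

    dag = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    durations = {"a": 2, "b": 3, "c": 1, "d": 2}
    print(schedule(dag, durations, n_workers=2))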

Read more

EfficientPose — An efficient, accurate and scalable end-to-end 6D multi object pose estimation approach

In this paper we introduce EfficientPose, a new approach for 6D object pose estimation. Our method is highly accurate, efficient and scalable over a wide range of computational resources… Moreover, it can detect the 2D bounding boxes of multiple objects and instances as well as estimate their full 6D poses in a single shot. This eliminates the significant increase in runtime that other approaches suffer from when dealing with multiple objects. These approaches aim to first detect 2D targets, e.g. keypoints, […]
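
The single-shot idea can be sketched as one network whose dense heads emit, at every anchor, a class score, a 2D box, and a 6D pose (rotation plus translation), so one forward pass covers all objects and instances. The toy backbone and shapes below are assumptions; the real EfficientPose builds on the EfficientDet detector family:

    import torch
    import torch.nn as nn

    class SingleShotPoseNet(nn.Module):
        def __init__(self, n_classes, n_anchors=9, ch=64):
            super().__init__()
            self.backbone = nn.Conv2d(3, ch, 3, stride=8, padding=1)  # stand-in
            self.cls = nn.Conv2d(ch, n_anchors * n_classes, 3, padding=1)
            self.box = nn.Conv2d(ch, n_anchors * 4, 3, padding=1)
            self.rot = nn.Conv2d(ch, n_anchors * 3, 3, padding=1)    # axis-angle
            self.trans = nn.Conv2d(ch, n_anchors * 3, 3, padding=1)  # x, y, z

        def forward(self, img):
            f = torch.relu(self.backbone(img))
            return self.cls(f), self.box(f), self.rot(f), self.trans(f)

    net = SingleShotPoseNet(n_classes=8)
    outs = net(torch.randn(1, 3, 256, 256))
    print([tuple(o.shape) for o in outs])   # one pass, all objects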

Read more

Scaling Hidden Markov Language Models

The hidden Markov model (HMM) is a fundamental tool for sequence modeling that cleanly separates the hidden state from the emission structure. However, this separation makes it difficult to fit HMMs to large datasets in modern NLP, and they have fallen out of use due to very poor performance compared to fully observed models… This work revisits the challenge of scaling HMMs to language modeling datasets, taking ideas from recent approaches to neural modeling. We propose methods for scaling HMMs […]
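
The computation that has to scale is the forward algorithm, which dominates both training and evaluation of an HMM language model. A minimal log-space, vectorized sketch over an assumed small state space (the paper's methods target much larger ones):

    import numpy as np
    from scipy.special import logsumexp

    def log_likelihood(pi, A, B, obs):
        """pi: (S,) initial dist; A: (S, S) transitions; B: (S, V) emissions;
        obs: sequence of token ids. Returns log p(obs)."""
        log_A, log_B = np.log(A), np.log(B)
        alpha = np.log(pi) + log_B[:, obs[0]]
        for t in obs[1:]:
            alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_B[:, t]
        return logsumexp(alpha)

    S, V = 64, 100
    rng = np.random.default_rng(0)
    A = rng.dirichlet(np.ones(S), size=S)   # row-stochastic transitions
    B = rng.dirichlet(np.ones(V), size=S)   # row-stochastic emissions
    print(log_likelihood(np.full(S, 1 / S), A, B, rng.integers(0, V, 20)))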

Read more