Human-centric Spatio-Temporal Video Grounding With Visual Transformers

In this work, we introduce a novel task – Human-centric Spatio-Temporal Video Grounding (HC-STVG). Unlike the existing referring expression tasks in images or videos, by focusing on humans, HC-STVG aims to localize a spatio-temporal tube of the target person from an untrimmed video based on a given textual description… This task is useful, especially for healthcare and security-related applications, where the surveillance videos can be extremely long but only a specific person during a specific period of time is of interest. […]

Deep Multimodal Fusion by Channel Exchanging

Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart on various applications. Yet, current methods including aggregation-based and alignment-based fusion are still inadequate in balancing the trade-off between inter-modal fusion and intra-modal processing, incurring a bottleneck of performance improvement… To this end, this paper proposes Channel-Exchanging-Network (CEN), a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities. Specifically, the channel exchanging process is […]

DoLFIn: Distributions over Latent Features for Interpretability

Interpreting the inner workings of neural models is a key step in ensuring the robustness and trustworthiness of the models, but work on neural network interpretability typically faces a trade-off: either the models are too constrained to be very useful, or the solutions found by the models are too complex to interpret. We propose a novel strategy for achieving interpretability that — in our experiments — avoids this trade-off… Our approach builds on the success of using probability as the […]

MotePy: A domain specific language for low-overhead machine learning and data processing

A domain-specific language (DSL) named MotePy is presented. The DSL offers a high-level syntax with low overheads for ML/data processing in time-constrained or memory-constrained systems… The DSL-to-C compiler has a novel static memory allocator that tracks object lifetimes and reuses the static memory, which we call the compiler-managed heap.

Estimating Risk-Adjusted Hospital Performance

The quality of healthcare provided by hospitals is subject to considerable variability. Consequently, accurate measurements of hospital performance are essential for various decision-makers, including patients, hospital managers and health insurers… Hospital performance is assessed via the health outcomes of their patients. However, as the risk profiles of patients vary between hospitals, measuring hospital performance requires adjustment for patient risk. This task is formalized in the state-of-the-art procedure through a hierarchical generalized linear model that isolates hospital fixed-effects from the effect […]

A Transfer Learning Approach for Dialogue Act Classification of GitHub Issue Comments

Social coding platforms, such as GitHub, serve as laboratories for studying collaborative problem solving in open source software development; a key feature is their ability to support issue reporting which is used by teams to discuss tasks and ideas. Analyzing the dialogue between team members, as expressed in issue comments, can yield important insights about the performance of virtual teams… This paper presents a transfer learning approach for performing dialogue act classification on issue comments. Since no large labeled corpus […]

When Do You Need Billions of Words of Pretraining Data?

NLP is currently dominated by general-purpose pretrained language models like RoBERTa, which achieve strong performance on NLU tasks through pretraining on billions of words. But what exact knowledge or skills do Transformer LMs learn from large-scale pretraining that they cannot learn from less data? … We adopt four probing methods (classifier probing, information-theoretic probing, unsupervised relative acceptability judgment, and fine-tuning on NLU tasks) and draw learning curves that track the growth of these different measures of linguistic ability with respect to pretraining data […]

On the State of Social Media Data for Mental Health Research

Data-driven methods for mental health treatment and surveillance have become a major focus in computational science research in the last decade. However, progress in the domain, in terms of both medical understanding and system performance, remains bounded by the availability of adequate data… Prior systematic reviews have not necessarily made it possible to measure the degree to which data-related challenges have affected research progress. In this paper, we offer an analysis specifically on the state of social media data that […]

Selectively Precoded Polar Codes

In this paper, we propose the selectively precoded polar (SPP) code, built on top of Arikan’s capacity-achieving polar codes. We provide the encoding and decoding scheme for SPP code… Simulation results show that for a target frame erasure rate (FER) of $10^{-5}$, a (128, 64) SPP code is just 0.23 dB away from the information-theoretic limit at this blocklength. Further, it is also shown that such codes have very good distance properties compared to other contemporary polar code variants. […]

Synonym Knowledge Enhanced Reader for Chinese Idiom Reading Comprehension

Machine reading comprehension (MRC) is the task that asks a machine to answer questions based on a given context. For Chinese MRC, due to the non-literal and non-compositional semantic characteristics, Chinese idioms pose unique challenges for machines to understand… Previous studies tend to treat idioms separately without fully exploiting the relationship among them. In this paper, we first define the concept of literal meaning coverage to measure the consistency between semantics and literal meanings for Chinese idioms. With the definition, […]
