Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion

LiDAR point cloud analysis is a core task for 3D computer vision, especially for autonomous driving. However, due to the severe sparsity and noise interference in a single-sweep LiDAR point cloud, accurate semantic segmentation is non-trivial to achieve… In this paper, we propose a novel sparse LiDAR point cloud semantic segmentation framework assisted by learned contextual shape priors. In practice, an initial semantic segmentation (SS) of a single sweep point cloud can be achieved by any appealing network […]
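As a rough conceptual illustration of such a two-stage pipeline, the sketch below pairs an off-the-shelf point-wise segmentation backbone with a refinement head fed by an auxiliary "shape prior" branch. Every module name, layer size, and the fusion step are assumptions made for illustration; they are not the paper's architecture.

```python
import torch
import torch.nn as nn

class PointSegBackbone(nn.Module):
    """Stand-in for "any appealing network" producing initial per-point logits."""
    def __init__(self, in_dim=4, num_classes=20):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_classes))

    def forward(self, points):                 # points: (N, in_dim) xyz + intensity
        return self.mlp(points)                # (N, num_classes) initial logits

class ShapePriorRefiner(nn.Module):
    """Toy refinement head: fuses the initial logits with features from an
    auxiliary branch imagined to carry contextual shape priors."""
    def __init__(self, num_classes=20, prior_dim=32):
        super().__init__()
        self.prior_encoder = nn.Sequential(nn.Linear(4, prior_dim), nn.ReLU())
        self.fuse = nn.Linear(num_classes + prior_dim, num_classes)

    def forward(self, points, init_logits):
        prior_feat = self.prior_encoder(points)
        return self.fuse(torch.cat([init_logits, prior_feat], dim=-1))

points = torch.randn(1024, 4)                  # one sparse sweep, 1024 points
backbone, refiner = PointSegBackbone(), ShapePriorRefiner()
refined_logits = refiner(points, backbone(points))
print(refined_logits.shape)                    # torch.Size([1024, 20])
```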

Read more

End-to-End Object Detection with Fully Convolutional Network

Mainstream object detectors based on fully convolutional networks have achieved impressive performance. However, most of them still need hand-designed non-maximum suppression (NMS) post-processing, which impedes fully end-to-end training… In this paper, we present an analysis of discarding NMS, which reveals that a proper label assignment plays a crucial role. To this end, for fully convolutional detectors, we introduce a Prediction-aware One-To-One (POTO) label assignment for classification to enable end-to-end detection, which obtains comparable performance with NMS. […]
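The core of one-to-one label assignment can be sketched quickly: each ground-truth box is matched to exactly one prediction by maximising a prediction-aware quality that mixes classification confidence with localisation quality, so duplicate detections stop being rewarded and NMS becomes unnecessary. The cost terms, the weighting `alpha`, and the Hungarian solver below are illustrative assumptions rather than the exact POTO formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def one_to_one_assign(cls_scores, ious, alpha=0.8):
    """Toy prediction-aware one-to-one assignment.

    cls_scores: (num_preds, num_gts) predicted probability of each GT's class
    ious:       (num_preds, num_gts) IoU between predicted and GT boxes
    Returns a {gt_index: pred_index} matching with one prediction per GT box.
    """
    # Quality is a weighted geometric mean of classification and localisation.
    quality = (cls_scores ** (1 - alpha)) * (ious ** alpha)
    pred_idx, gt_idx = linear_sum_assignment(-quality)   # maximise total quality
    return dict(zip(gt_idx.tolist(), pred_idx.tolist()))

cls_scores = np.random.rand(5, 2)    # 5 predictions, 2 ground-truth boxes
ious = np.random.rand(5, 2)
print(one_to_one_assign(cls_scores, ious))   # e.g. {0: 3, 1: 1}
```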

Read more

Backpropagating Linearly Improves Transferability of Adversarial Examples

The vulnerability of deep neural networks (DNNs) to adversarial examples has drawn great attention from the community. In this paper, we study the transferability of such examples, which lays the foundation of many black-box attacks on DNNs… We revisit a not-so-new but definitely noteworthy hypothesis of Goodfellow et al. and disclose that the transferability can be enhanced by improving the linearity of DNNs in an appropriate manner. We introduce linear backpropagation (LinBP), a method that performs backpropagation in […]
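The sentence is cut off mid-description, but the general mechanism of making backpropagation "more linear" can be illustrated with a custom ReLU whose backward pass behaves as if the layer were linear. Treating ReLU this way is an assumption made for illustration here, not a statement of LinBP's exact procedure.

```python
import torch

class LinearBackpropReLU(torch.autograd.Function):
    """ReLU in the forward pass, identity in the backward pass, so gradients
    flow through the layer as if it were linear."""

    @staticmethod
    def forward(ctx, x):
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output               # skip the ReLU derivative entirely

x = torch.randn(4, requires_grad=True)
LinearBackpropReLU.apply(x).sum().backward()
print(x.grad)                             # all ones, even where x < 0
```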

Read more

Fine-Grained Dynamic Head for Object Detection

The Feature Pyramid Network (FPN) presents a remarkable approach to alleviate the scale variance in object representation by performing instance-level assignments. Nevertheless, this strategy ignores the distinct characteristics of different sub-regions in an instance… To this end, we propose a fine-grained dynamic head to conditionally select a pixel-level combination of FPN features from different scales for each instance, which further unlocks the potential of multi-scale feature representation. Moreover, we design a spatial gate with a new activation function to reduce […]
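A heavily simplified sketch of per-pixel gated fusion across FPN levels is given below; the gating convolution, softmax weighting, number of levels, and fusion scale are all generic choices for illustration, not the paper's spatial gate or activation function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelwiseFPNFusion(nn.Module):
    """For every pixel, predict soft weights over FPN levels and mix them."""
    def __init__(self, channels=256, num_levels=3):
        super().__init__()
        self.gate = nn.Conv2d(channels * num_levels, num_levels, kernel_size=1)

    def forward(self, feats):                        # list of (B, C, Hi, Wi)
        target = feats[0].shape[-2:]                 # fuse at the finest scale
        up = [F.interpolate(f, size=target, mode="nearest") for f in feats]
        weights = self.gate(torch.cat(up, dim=1)).softmax(dim=1)   # (B, L, H, W)
        return sum(w.unsqueeze(1) * f                # weighted sum over levels
                   for w, f in zip(weights.unbind(dim=1), up))

feats = [torch.randn(1, 256, s, s) for s in (64, 32, 16)]
print(PixelwiseFPNFusion()(feats).shape)             # torch.Size([1, 256, 64, 64])
```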

Read more

Attention-based Saliency Hashing for Ophthalmic Image Retrieval

Deep hashing methods have proved effective for large-scale medical image search, assisting reference-based diagnosis for clinicians. However, although the salient region plays the dominant discriminative role in ophthalmic images, existing deep hashing methods do not fully exploit the learning ability of the deep network to capture the features of salient regions specifically… Different grades or classes of ophthalmic images may share a similar overall appearance but have subtle differences that can be distinguished by mining […]
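The general pattern of attention-weighted deep hashing can be sketched as follows; the attention module, the tanh relaxation, and the 48-bit code length are generic assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SaliencyAttentionHashing(nn.Module):
    """Generic deep-hashing head: spatial attention emphasises salient regions,
    and the attention-pooled features are mapped to a binary-like code."""
    def __init__(self, in_channels=512, code_bits=48):
        super().__init__()
        self.attention = nn.Sequential(nn.Conv2d(in_channels, 1, 1), nn.Sigmoid())
        self.hash_layer = nn.Linear(in_channels, code_bits)

    def forward(self, feature_map):                   # (B, C, H, W) backbone output
        attn = self.attention(feature_map)            # (B, 1, H, W) saliency weights
        pooled = (feature_map * attn).mean(dim=(2, 3))
        relaxed = torch.tanh(self.hash_layer(pooled)) # continuous codes in (-1, 1)
        return torch.sign(relaxed), relaxed           # binary codes for retrieval

features = torch.randn(2, 512, 14, 14)
binary_codes, relaxed_codes = SaliencyAttentionHashing()(features)
print(binary_codes.shape)                             # torch.Size([2, 48])
```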

Read more

MFST: A Python OpenFST Wrapper With Support for Custom Semirings and Jupyter Notebooks

This paper introduces mFST, a new Python library for working with Finite-State Machines based on OpenFST. mFST is a thin wrapper for OpenFST and exposes all of OpenFST’s methods for manipulating FSTs… Additionally, mFST is the only Python wrapper for OpenFST that exposes OpenFST’s ability to define custom semirings. This makes mFST ideal for developing models that involve learning the weights on an FST or creating neuralized FSTs. mFST has been designed to be easy to get started with […]
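To make the notion of a custom semiring concrete without guessing at mFST's actual API, here is a plain-Python toy tropical semiring ("plus" is min, "times" is addition); any weight type providing these two operations and their identities is, conceptually, what weighted-FST algorithms iterate over.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Tropical:
    """Toy tropical semiring weight (not mFST's API, purely illustrative)."""
    value: float

    @staticmethod
    def zero():                # additive identity: "no path exists"
        return Tropical(math.inf)

    @staticmethod
    def one():                 # multiplicative identity: "empty path"
        return Tropical(0.0)

    def plus(self, other):     # combine alternative paths -> keep the cheaper
        return Tropical(min(self.value, other.value))

    def times(self, other):    # extend a path by one arc -> add the costs
        return Tropical(self.value + other.value)

# Two alternative paths with costs 1.5 + 0.2 and 0.9 + 1.1: keep the cheaper.
path_a = Tropical(1.5).times(Tropical(0.2))
path_b = Tropical(0.9).times(Tropical(1.1))
print(path_a.plus(path_b))     # Tropical(value=1.7)
```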

Read more

Probabilistic Latent Factor Model for Collaborative Filtering with Bayesian Inference

The Latent Factor Model (LFM) is one of the most successful methods for collaborative filtering (CF) in recommender systems, in which both users and items are projected into a joint latent factor space. Based on matrix factorization, as commonly applied in pattern recognition, LFM models user-item interactions as inner products of the corresponding user and item factor vectors in that space and can be solved efficiently by least-squares methods with optimal estimation… However, such optimal estimation methods are prone to overfitting […]
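In symbols, the standard regularised least-squares formulation the paragraph alludes to looks as follows; the L2 regulariser and notation are the conventional ones and are not necessarily the exact objective used in the paper.

```latex
\hat{r}_{ui} = \mathbf{p}_u^{\top}\mathbf{q}_i, \qquad
\min_{\mathbf{P},\,\mathbf{Q}} \sum_{(u,i)\in\mathcal{K}}
\left(r_{ui} - \mathbf{p}_u^{\top}\mathbf{q}_i\right)^2
+ \lambda\left(\lVert\mathbf{p}_u\rVert^2 + \lVert\mathbf{q}_i\rVert^2\right)
```

Here $\mathbf{p}_u$ and $\mathbf{q}_i$ are the user and item factor vectors, $\mathcal{K}$ is the set of observed user-item ratings, and $\lambda$ controls the regularisation strength.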

Read more

QANom: Question-Answer driven SRL for Nominalizations

We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom. This scheme extends the QA-SRL formalism (He et al., 2015), modeling the relations between nominalizations and their arguments via natural language question-answer pairs… We construct the first QANom dataset using controlled crowdsourcing, analyze its quality and compare it to expert nominal-SRL annotations, as well as to other QA-driven annotations. In addition, we train a baseline QANom parser for identifying nominalizations and labeling their arguments with […]
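To give a feel for the scheme, here is a hypothetical QANom-style annotation; the sentence, questions, and answers are invented for illustration and are not taken from the QANom dataset.

```python
# Hypothetical QA-SRL-style annotation for the nominalization "acquisition"
# in an invented sentence (illustrative only, not a QANom dataset item).
example = {
    "sentence": "The acquisition of the startup by MegaCorp surprised analysts.",
    "predicate": "acquisition",           # deverbal noun, verbal reading of "acquire"
    "qa_pairs": [
        {"question": "Who acquired something?", "answer": "MegaCorp"},
        {"question": "What was acquired?",      "answer": "the startup"},
    ],
}
print(example["qa_pairs"][0]["answer"])   # MegaCorp
```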

Read more

Is It a Plausible Colour? UCapsNet for Image Colourisation

Human beings can imagine the colours of a grayscale image with no particular effort thanks to their ability to extract semantic features. Can an autonomous system achieve that?… Can it hallucinate plausible and vibrant colours? This is the colourisation problem. Different from existing works relying on convolutional neural network models pre-trained with supervision, we cast the colourisation problem as a self-supervised learning task. We tackle the problem with the introduction of a novel architecture based on Capsules trained following the […]
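A bare-bones sketch of colourisation posed as self-supervision follows: the grayscale channel of any unlabelled image is the input and its own colour channels are the training target, so no manual labels are required. The Lab-style channel split, the tiny convolutional network, and the L2 loss are generic illustrative choices, not the UCapsNet design.

```python
import torch
import torch.nn as nn

class TinyColouriser(nn.Module):
    """Predict two colour channels (e.g. ab in Lab space) from lightness."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),           # two colour channels
        )

    def forward(self, lightness):                      # (B, 1, H, W)
        return self.net(lightness)

model = TinyColouriser()
lab_batch = torch.randn(8, 3, 64, 64)                  # pretend Lab-encoded images
lightness, colour_target = lab_batch[:, :1], lab_batch[:, 1:]
loss = nn.functional.mse_loss(model(lightness), colour_target)
loss.backward()                                        # the labels come for free
print(float(loss))
```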

Read more

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language

We address the problem of retrieving a specific moment from an untrimmed video by natural language. It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video… Existing methods cannot tackle this challenge well since they do not fully consider the temporal contexts between video moments. In this paper, we model the temporal context between video moments by a set of predefined two-dimensional maps under different temporal scales. […]
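A small sketch of the 2D temporal-map idea: cell (i, j) of the map holds a feature for the candidate moment spanning clips i through j, so relations between moments become spatial relations on the map. The average-pooled clip features and the single scale below are illustrative assumptions, not the paper's multi-scale construction.

```python
import torch

def build_2d_temporal_map(clip_feats):
    """clip_feats: (T, D) features of T video clips.
    Returns (T, T, D) where entry (i, j) represents the moment from clip i to
    clip j; cells with j < i stay zero (invalid moments)."""
    T, D = clip_feats.shape
    prefix = torch.cat([torch.zeros(1, D), clip_feats.cumsum(dim=0)], dim=0)
    moment_map = torch.zeros(T, T, D)
    for i in range(T):
        for j in range(i, T):
            # average of clip features i..j via prefix sums
            moment_map[i, j] = (prefix[j + 1] - prefix[i]) / (j - i + 1)
    return moment_map

clips = torch.randn(16, 128)                # 16 clips, 128-dim features each
print(build_2d_temporal_map(clips).shape)   # torch.Size([16, 16, 128])
```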

Read more