Global Filter Networks for Image Classification

GFNet, created by Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, and Jie Zhou. This repository contains a PyTorch implementation of GFNet. The Global Filter Network is a transformer-style architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity. Our architecture replaces the self-attention layer in vision transformers with three key operations: a 2D discrete Fourier transform, an element-wise multiplication between frequency-domain features and learnable global filters, and a 2D inverse Fourier transform. Global Filter Layer: GFNet is a conceptually […]
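As a rough illustration of the three operations described above, a minimal PyTorch sketch of such a global filter layer might look like the following; the (B, H, W, C) token layout, filter shape, and initialization are assumptions for illustration, not the repository's exact code.

```python
import torch
import torch.nn as nn


class GlobalFilter(nn.Module):
    """Sketch of a global filter layer: 2D FFT -> element-wise multiplication
    with a learnable frequency-domain filter -> inverse 2D FFT."""

    def __init__(self, dim, h=14, w=14):
        super().__init__()
        # Learnable complex-valued global filter, stored as (real, imag) pairs.
        # rfft2 keeps only w // 2 + 1 frequencies along the last spatial axis.
        self.complex_weight = nn.Parameter(torch.randn(h, w // 2 + 1, dim, 2) * 0.02)

    def forward(self, x):
        # x: (B, H, W, C) token features arranged on a 2D grid
        X = torch.fft.rfft2(x, dim=(1, 2), norm='ortho')            # to the frequency domain
        X = X * torch.view_as_complex(self.complex_weight)          # apply the global filter
        x = torch.fft.irfft2(X, s=x.shape[1:3], dim=(1, 2), norm='ortho')  # back to the spatial domain
        return x
```

Because the filter acts on every frequency at once, each output token mixes information from all spatial locations, which is what gives the layer a global receptive field at FFT (log-linear) cost.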

Read more

Drone detection using YOLOv5 in python

Detect_Drone Drone detection using YOLOv5 in Python. Install Python >= 3.6.0 is required, with all requirements.txt dependencies installed: $ git clone https://github.com/tusharsarkar3/Detect_Drone.git $ pip install -r requirements.txt Training The structure of the file system is important here, so these images show the correct way of organizing it. The main folder, named datasets, should be on the same level as this repository. The next steps are elaborated in the images: The two folders with images and labels respectively […]

Read more

A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API

A collection of metrics for evaluating timbre dissimilarity using the TorchMetrics API Installation pip install -e . Usage import timbremetrics datasets = timbremetrics.list_datasets() dataset = datasets[0] # get the first timbre dataset # MAE between target dataset and pred embedding distances metric = timbremetrics.TimbreMAE( margin=0.0, dataset=dataset, distance=timbremetrics.l1 ) # get numpy audio for the timbre dataset audio = timbremetrics.get_audio(dataset) # get arbitrary embeddings for the timbre dataset’s audio embeddings = net(audio) # compute the metric metric(embeddings) Metrics The following metrics […]

Read more

Look before you leap: learning landmark features for one-stage visual grounding

LBYL-Net This repo implements the paper Look Before You Leap: Learning Landmark Features For One-Stage Visual Grounding, CVPR 2021. Getting Started Prerequisites python 3.7 pytorch 1.0.0 cuda 10.0 gcc 4.9.2 or above Installation Clone the repo and install dependencies. git clone https://github.com/svip-lab/LBYLNet.git cd LBYLNet pip install -r requirements.txt You also need to install our landmark feature convolution: cd ext git clone https://github.com/hbb1/landmarkconv.git cd landmarkconv/lib/layers python setup.py install --user We follow the dataset structure of DMS and FAOA. For convenience, we have packed them […]

Read more

A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

clDice CVPR 2021 Authors: Suprosanna Shit and Johannes C. Paetzold et al. @article{shit2020cldice, title={clDice – a Topology-Preserving Loss Function for Tubular Structure Segmentation}, author={Shit, Suprosanna and Paetzold, Johannes C and Sekuboyina, Anjany and Zhylka, Andrey and Ezhov, Ivan and Unger, Alexander and Pluim, Josien PW and Tetteh, Giles and Menze, Bjoern H}, journal={arXiv preprint arXiv:2003.07311}, year={2020} } Abstract Accurate segmentation of tubular, network-like structures, such as vessels, neurons, or roads, is relevant to many fields of research. For such structures, […]
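For context, the centerline Dice (clDice) measure that gives the loss its name combines a topology precision and a topology sensitivity computed between masks and skeletons. The sketch below assumes the soft skeletons have already been extracted (e.g., by the paper's iterative soft-skeletonization) and is an illustrative reimplementation, not the authors' reference code.

```python
import torch


def cl_dice(v_pred, v_true, s_pred, s_true, eps=1e-8):
    """Centerline Dice between soft masks (v_*) and their soft skeletons (s_*),
    all tensors holding values in [0, 1]."""
    # Topology precision: fraction of the predicted skeleton inside the ground-truth mask.
    tprec = (s_pred * v_true).sum() / (s_pred.sum() + eps)
    # Topology sensitivity: fraction of the ground-truth skeleton covered by the prediction.
    tsens = (s_true * v_pred).sum() / (s_true.sum() + eps)
    # Harmonic mean of the two, in analogy to the F1 / Dice score.
    return 2.0 * tprec * tsens / (tprec + tsens + eps)
```

A training loss would typically use 1 - cl_dice(...), usually combined with a standard soft Dice term.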

Read more

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

StyleSpace-pytorch Implementation of StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation (https://arxiv.org/pdf/2011.12799.pdf) in PyTorch. This implementation mostly relies on rosinality's stylegan2-pytorch. Requirements I have tested on: Usage For the index and channel, please check the paper (https://arxiv.org/pdf/2011.12799.pdf), e.g., (11_286) means channel 286 of generator level 11. FFHQ First, download the pretrained model from here and place stylegan2-ffhq-config-f.pkl into the pretrained folder. Open the notebook StyleSpace_FFHQ.ipynb Car LSUN GitHub https://github.com/xrenaa/StyleSpace-pytorch
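To make the (level_channel) notation concrete, the sketch below shifts a single StyleSpace channel of one generator level; the list-of-style-tensors input is an assumption standing in for the per-level style vectors produced in the repo's notebook, not the repo's actual API.

```python
import torch


@torch.no_grad()
def edit_style_channel(styles, layer_idx=11, channel_idx=286, strength=5.0):
    """Shift one channel of one generator level's style vector, e.g. the
    (11_286) control: channel 286 of level 11.

    `styles` is assumed to be a list of per-level style tensors of shape (N, C_l);
    the edited list would then be fed back into the synthesis network."""
    edited = [s.clone() for s in styles]
    edited[layer_idx][:, channel_idx] += strength  # move along one disentangled channel
    return edited
```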

Read more

Lidar sensors are frequently used in environment perception for autonomous vehicles

PointCloudDeNoising Point Cloud Denoising Abstract Lidar sensors are frequently used in environment perception for autonomous vehicles and mobile robotics to complement camera, radar, and ultrasonic sensors. Adverse weather conditions significantly impact the performance of lidar-based scene understanding by causing undesired measurement points, which in turn lead to missing detections and false positives. In heavy rain or dense fog, water drops can be misinterpreted as objects in front of the vehicle, bringing a mobile robot to a full stop. In this paper, […]

Read more

An exact meshing solution from neural networks

AnalyticMesh Analytic Marching is an exact meshing solution from neural networks. Compared to standard methods, it completely avoids the geometric and topological errors that result from insufficient sampling, by means of mathematically guaranteed analysis. This repository gives an implementation of the Analytic Marching algorithm. The algorithm was initially proposed in our conference paper Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks, and then improved in our journal paper Learning and Meshing from Deep Implicit Surface Networks Using an Efficient […]

Read more

Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung Han. * denotes equal contribution. Setup Python: python3.7 Install required packages Install torch and torchvision via the following command (CUDA 10): pip install torch==1.2.0 torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html Install other packages: pip install -r requirements.txt Dataset Download the FashionIQ dataset by following the instructions at this link. We have set the default path for the FashionIQ dataset in data/fashionIQ.py as _DEFAULT_FASHION_IQ_DATASET_ROOT = '/data/image_retrieval/fashionIQ'. You can change this […]

Read more

Source-filter based Decomposed Modeling for Speech Synthesis

FastPitchFormant – PyTorch Implementation PyTorch Implementation of FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis. Dependencies You can install the Python dependencies with pip3 install -r requirements.txt Inference You have to download the pretrained models and put them in output/ckpt/LJSpeech/. For English single-speaker TTS, run python3 synthesize.py --text "YOUR_DESIRED_TEXT" --restore_step 1000000 --mode single -p config/LJSpeech/preprocess.yaml -m config/LJSpeech/model.yaml -t config/LJSpeech/train.yaml The generated utterances will be put in output/result/. Batch Inference Batch inference is also supported, try python3 synthesize.py --source preprocessed_data/LJSpeech/val.txt --restore_step […]

Read more