Implementation of Trajectory Transformer with attention caching and batched beam search
This is a reimplementation of Trajectory Transformer, introduced in the paper Offline Reinforcement Learning as One Big Sequence Modeling Problem. The original implementation has a few problems with inference speed, namely quadratic attention during inference and sequential rollouts. The former slows down planning considerably, while the latter makes it impossible to run rollouts in parallel and fully utilize the GPU. Even after all the changes, it is still not fast compared to traditional methods such as PPO or SAC/DDPG. However, the gains are huge, what […]
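For intuition, here is a minimal sketch of what attention caching means during planning. This is not the code from this repository: it assumes PyTorch, a single attention head, and a hypothetical `CachedSelfAttention` module; a real transformer would cache keys/values per layer and per head. The point is that each new decoding step reuses cached keys/values instead of recomputing attention over the whole context, which removes the quadratic cost per step.

```python
# Minimal sketch, not the repository's module: single-head attention with a KV cache.
import torch
import torch.nn.functional as F


class CachedSelfAttention(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.out = torch.nn.Linear(dim, dim)

    def forward(self, x, cache=None):
        # x: [batch, new_tokens, dim]; cache: (past_keys, past_values) or None
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        if cache is not None:
            past_k, past_v = cache
            # Reuse keys/values computed on previous steps instead of recomputing them.
            k = torch.cat([past_k, k], dim=1)
            v = torch.cat([past_v, v], dim=1)
        scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
        if x.shape[1] > 1:
            # Prefix pass (no single-token decoding yet): apply a causal mask.
            n_new, n_total = q.shape[1], k.shape[1]
            mask = torch.tril(
                torch.ones(n_new, n_total, dtype=torch.bool, device=x.device),
                diagonal=n_total - n_new,
            )
            scores = scores.masked_fill(~mask, float("-inf"))
        out = self.out(F.softmax(scores, dim=-1) @ v)
        # Return the updated cache so the next step only feeds the newest token.
        return out, (k, v)
```

Batched beam search follows the same idea across candidates: all beam candidates (and their caches) are stacked along the batch dimension, so one forward pass scores every rollout in parallel on the GPU instead of rolling them out one by one.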