August 5, 2021 Deep Learning

A modular framework for vision & language multimodal research

MMF is a modular framework for vision and language multimodal research from Facebook AI Research. MMF contains reference implementations of state-of-the-art vision and language models and has powered multiple research projects at Facebook AI Research. See the full list of projects inside or built on MMF here. MMF is powered by PyTorch, allows distributed training, and is un-opinionated, scalable and fast. Use MMF to bootstrap your next vision and language multimodal research project by following the installation instructions. Take […]

August 5, 2021 Deep Learning

Sequence to Sequence Framework in PyTorch

nmtpytorch Sequence to Sequence Framework in PyTorch This project is not actively maintained, so issues created are unlikely to be addressed in a timely way. If you are interested, there’s a recent fork of this repository called pysimt which includes Transformer-based architectures as well. nmtpytorch allows training of various end-to-end neural architectures including but not limited to neural machine translation, image captioning and automatic speech recognition systems. The initial codebase was in Theano and was inspired by the famous dl4mt-tutorial codebase. nmtpytorch received valuable […]

August 5, 2021 Jupyter notebooks

An implementation of WaveNet with fast generation

pytorch-wavenet This is an implementation of the WaveNet architecture, as described in the original paper. Features Automatic creation of a dataset (training and validation/test set) from all sound files (.wav, .aiff, .mp3) in a directory Efficient multithreaded data loading Logging to TensorBoard (Training loss, validation loss, validation accuracy, parameter and gradient histograms, generated samples) Fast generation, as introduced here Requirements python 3 pytorch 0.3 numpy […]

August 5, 2021 PyTorch

Pytorch implementation of Tacotron

Tacotron-pytorch A PyTorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model. Data I used the LJSpeech dataset, which consists of pairs of text scripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code. File description hyperparams.py includes all the hyperparameters that are needed. data.py loads training data and preprocesses text to indices and wav files to spectrograms. Preprocessing code for text is in the text/ directory. module.py contains all methods, including […]

August 5, 2021 Deep Learning

A deep learning nlp library inspired by the fast.ai library

Quick NLP Quick NLP is a deep learning NLP library inspired by the fast.ai library. It follows the same API as fastai and extends it, allowing for quick and easy running of NLP models. Features Python 3.6 code Tight-knit integration with Fast.ai library: Fast.ai style DataLoader objects for sentence to sentence algorithms Fast.ai style DataLoader objects for dialogue algorithms Fast.ai style DataModel objects for training NLP models Can run a seq2seq model with a few lines of code similar to […]

August 5, 2021 Audio

Neural speaker diarization with pyannote-audio

Neural speaker diarization with pyannote-audio Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding. pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pretrained models covering a wide range of domains for voice activity detection, speaker change detection, […]

August 5, 2021 Task

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning Sandeep Subramanian, Adam Trischler, Yoshua Bengio & Christopher Pal ICLR 2018 About GenSen is a technique to learn general purpose, fixed-length representations of sentences via multi-task training. These representations are useful for transfer and low-resource learning. For details please refer to our ICLR paper. Code We provide a PyTorch implementation of our paper along with pre-trained models as well as code to evaluate these models on a variety of […]

August 5, 2021 Speech Recognition

ESPnet: end-to-end speech processing toolkit

ESPnet ESPnet is an end-to-end speech processing toolkit that mainly focuses on end-to-end speech recognition and end-to-end text-to-speech. ESPnet uses chainer and pytorch as its main deep learning engines, and also follows Kaldi style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. Key Features Kaldi style complete recipe Supports a number of ASR recipes (WSJ, Switchboard, CHiME-4/5, Librispeech, TED, CSJ, AMI, HKUST, Voxforge, REVERB, etc.) Supports a number of TTS recipes with […]

August 5, 2021 Tool

A toolkit for validating, forging, scanning and tampering JWTs

jwt_tool.py is a toolkit for validating, forging, scanning and tampering JWTs (JSON Web Tokens). Its functionality includes: Checking the validity of a token Testing for known exploits: (CVE-2015-2951) The alg=none signature-bypass vulnerability (CVE-2016-10555) The RS/HS256 public key mismatch vulnerability (CVE-2018-0114) Key injection vulnerability (CVE-2019-20933/CVE-2020-28637) Blank password vulnerability (CVE-2020-28042) Null signature vulnerability Scanning for misconfigurations or known weaknesses Fuzzing claim values to provoke unexpected behaviours Testing the validity of a secret/key file/Public Key/JWKS key Identifying weak keys via a High-speed Dictionary […]

August 5, 2021 Beginner, Machine Learning, NLP, Project, Python, Text

Identifying The Language of A Document Using NLP!

This article was published as a part of the Data Science Blogathon Introduction The goal of this article is to identify the language of a written text. The text in documents is available in many languages, and when we don’t know the language, it is sometimes very difficult even to tell Google Translate which one it is. For most translators, we have to specify both the input language and the desired language. If you had a text written in Spanish and you […]
