A fast and easy implementation of the Transformer in PyTorch
FasySeq
FasySeq is shorthand for Fast and easy sequential modeling toolkit. It aims to provide researchers and developers with seq2seq models that can be trained efficiently and modified easily. The toolkit is currently based on the Transformer (Vaswani et al., 2017); more seq2seq models will be added in the future.
Dependencies
PyTorch >= 1.4
NLTK
Result
…
Structure
…
To Be Updated
- top-k and top-p sampling
- multi-GPU inference
- length penalty in beam search
- …
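Among the planned features above, top-k and top-p (nucleus) sampling can be sketched as follows. This is an illustrative PyTorch implementation, not the toolkit's own code; the function name `top_k_top_p_filtering` is hypothetical:

```python
import torch

def top_k_top_p_filtering(logits, top_k=0, top_p=1.0):
    """Mask logits outside the top-k / top-p (nucleus) set with -inf.

    logits: 1-D tensor of unnormalized scores over the vocabulary.
    top_k=0 and top_p=1.0 disable the respective filters.
    """
    logits = logits.clone()
    if top_k > 0:
        # Keep only the k highest-scoring tokens.
        kth_value = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth_value, float("-inf"))
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cum_probs = torch.softmax(sorted_logits, dim=-1).cumsum(dim=-1)
        # Remove tokens whose cumulative probability exceeds top_p,
        # shifting by one so the most probable token is always kept.
        to_remove = cum_probs > top_p
        to_remove[1:] = to_remove[:-1].clone()
        to_remove[0] = False
        logits[sorted_idx[to_remove]] = float("-inf")
    return logits
```

A next token would then be drawn with `torch.multinomial(torch.softmax(filtered_logits, dim=-1), 1)`.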
Preprocess
Build Vocabulary
createVocab.py
Named Arguments | Description |
---|---|
-f/--file | The files used to build the vocabulary. Type: List |
--vocab_num | The maximum size of the vocabulary; words beyond this limit are discarded by frequency. Type: Int Default: -1 |
--min_freq | The minimum frequency of a token in the vocabulary. |