LSTM and QRNN Language Model Toolkit for PyTorch
This repository contains the code used for two Salesforce Research papers:
The model comes with instructions to train:
- word level language models over the Penn Treebank (PTB), WikiText-2 (WT2), and WikiText-103 (WT103) datasets
- character level language models over the Penn Treebank (PTBC) and Hutter Prize dataset (enwik8)
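As a rough illustration of the difference between the two settings, the sketch below tokenizes the same line at the word level and at the character level. It is not the repository's data loader: the helper names (`word_tokens`, `char_tokens`, `build_vocab`) and the `<eos>` end-of-line marker are assumptions made for this example.

```python
# Illustrative sketch only; helper names and the "<eos>" marker are
# assumptions for this example, not part of the repository's code.

def word_tokens(line):
    """Split on whitespace and append an end-of-line marker."""
    return line.split() + ["<eos>"]


def char_tokens(line):
    """Treat every character, including spaces, as one token."""
    return list(line)


def build_vocab(tokens):
    """Map each distinct token to an integer id in first-seen order."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


line = "the cat sat on the mat"

words = word_tokens(line)
chars = char_tokens(line)

word_vocab = build_vocab(words)
char_vocab = build_vocab(chars)

# Integer id sequences are what a language model would consume.
word_ids = [word_vocab[t] for t in words]
char_ids = [char_vocab[t] for t in chars]

print(len(words), len(word_vocab))
print(len(chars), len(char_vocab))
```

A character level model sees much longer sequences drawn from a much smaller vocabulary than a word level model does for the same text.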
The model can be composed of an LSTM or a Quasi-Recurrent Neural Network (QRNN), which is two or more times faster than the cuDNN LSTM in this setup while achieving equivalent or better accuracy.
- Install PyTorch 0.4
- Run `getdata.sh` to acquire the Penn Treebank and WikiText-2 datasets
- Train the base model using `main.py`
- (Optionally) Finetune the model using `finetune.py`
- (Optionally) Apply the continuous cache pointer to