InferSent
InferSent is a sentence embeddings method that provides semantic representations for English sentences. It is trained on natural language inference data and generalizes well to many different tasks.
We provide our pre-trained English sentence encoder from our paper and our SentEval evaluation toolkit.
Recent changes: Removed train_nli.py and kept only the pretrained models for simplicity. I no longer have time to maintain the repo beyond the simple scripts for getting sentence embeddings.
Dependencies
This code is written in Python. Dependencies include:
- Python 2/3
- PyTorch (recent version)
- NLTK >= 3
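A quick way to confirm the dependencies above are importable before going further is sketched below. The helper name `missing_dependencies` is hypothetical (not part of this repo), and the sketch only checks that the modules can be found under their import names `torch` and `nltk`; it does not verify versions.

```python
import importlib.util


def missing_dependencies(modules=("torch", "nltk")):
    """Return the names of modules that cannot be found on the import path.

    Hypothetical helper for illustration; it does not check version numbers.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```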
Download word vectors
Download the GloVe vectors (used by InferSent V1) or the fastText vectors (used by InferSent V2):
mkdir GloVe
curl -Lo GloVe/glove.840B.300d.zip http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip GloVe/glove.840B.300d.zip -d GloVe/
mkdir fastText
curl -Lo fastText/crawl-300d-2M.vec.zip https://dl.fbaipublicfiles.com/fasttext/vectors-english/crawl-300d-2M.vec.zip
unzip fastText/crawl-300d-2M.vec.zip -d fastText/
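After unzipping, it can be useful to sanity-check a vector file by reading its first few entries. The following is a minimal sketch, not the loader used by this repo: `read_vectors` is a hypothetical helper, and it assumes the usual text layout of one word followed by space-separated floats per line, with fastText `.vec` files starting with a `<count> <dim>` header line. The paths in the usage note assume the archives unpack to `GloVe/glove.840B.300d.txt` and `fastText/crawl-300d-2M.vec`.

```python
def read_vectors(path, dim=300, limit=5):
    """Yield up to `limit` (word, vector) pairs from a word-vector text file.

    Hypothetical helper for illustration. Assumes one entry per line:
    a word followed by `dim` space-separated floats.
    """
    yielded = 0
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if yielded >= limit:
                break
            parts = line.rstrip().split(" ")
            # fastText .vec files begin with a "<count> <dim>" header; skip it.
            if i == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
                continue
            # Skip lines that are too short to hold a word plus `dim` values.
            if len(parts) < dim + 1:
                continue
            # Take the last `dim` fields as the vector so that tokens
            # containing spaces are kept whole.
            word = " ".join(parts[:-dim])
            vector = [float(x) for x in parts[-dim:]]
            yielded += 1
            yield word, vector
```

For example, `for word, vec in read_vectors("GloVe/glove.840B.300d.txt"): print(word, len(vec))` should print a handful of tokens, each with a vector length of 300, if the download and unzip succeeded.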