Vector space based Information Retrieval System for Text Processing – Information retrieval
Sequence of operations

1. Install Requirements
2. Add given Wikipedia files to the corpus directory.
3. Download glove.6B.100d.txt dataset (Ignore if already present) and place it in the project root directory.
4. Run construct_index.py
5. Run construct_index.py --zoned_index True
6. Run trim_embeddings.py
7. Run test_queries.py
8. Run test_queries.py --score_title True
9. Run test_queries.py --expand_query True

Installing Requirements:

pip install -r requirements.txt

corpus

Contains the files to be indexed. Add files directly to this directory. Do not create subdirectories. For this assignment, we have used the following files present in the […]
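For readers new to vector space retrieval, here is a minimal sketch of the general idea behind indexing and querying. It is not the code in construct_index.py or test_queries.py. The function names (`tokenize`, `build_index`, `weight`, `rank`), the log-scaled tf-idf weighting, and the three toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a real system would also strip punctuation.
    return text.lower().split()


def weight(counts, idf):
    # Log-scaled term frequency times idf, then L2-normalised.
    # Terms missing from the index vocabulary are ignored.
    vec = {t: (1 + math.log(c)) * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def build_index(docs):
    """Return (idf, doc_vectors) for a dict of {doc_id: text}."""
    n_docs = len(docs)
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    # Document frequency: number of documents containing each term.
    df = Counter()
    for counts in term_counts.values():
        df.update(counts.keys())
    idf = {term: math.log(n_docs / freq) for term, freq in df.items()}
    doc_vectors = {
        doc_id: weight(counts, idf) for doc_id, counts in term_counts.items()
    }
    return idf, doc_vectors


def rank(query, idf, doc_vectors, k=3):
    # Cosine similarity reduces to a dot product because all vectors are unit length.
    q_vec = weight(Counter(tokenize(query)), idf)
    scores = {
        doc_id: sum(w * d_vec.get(t, 0.0) for t, w in q_vec.items())
        for doc_id, d_vec in doc_vectors.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Hypothetical toy corpus, standing in for the files in the corpus directory.
docs = {
    "d1": "vector space model for information retrieval",
    "d2": "glove word embeddings for query expansion",
    "d3": "cooking pasta with tomato sauce",
}
idf, doc_vectors = build_index(docs)
results = rank("information retrieval model", idf, doc_vectors)
```

In this sketch `results` is a list of `(doc_id, score)` pairs sorted by descending cosine similarity. Zoned indexing, title scoring, and GloVe-based query expansion, which the flags above suggest the project supports, are not covered here.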