Efficient Extractive Question Answering on CPU using QUIP
![](https://www.deeplearningdaily.com/wp-content/uploads/2022/10/efficient-extractive-question-answering-on-cpu-using-quip_635851fa51b30-375x210.png)
TLDR: Extractive question answering is an important task for providing a good user experience in many applications. The popular Retriever-Reader framework for QA using BERT can be difficult to scale because it requires re-processing candidate documents in the context of each question in real time. By using phrase embeddings, we can process the question and the context independently, which drastically reduces runtime demands. In a limited experiment, I found QUIP to be 4x faster than a comparable QA model on