Combining Embedding- and Keyword-Based Search for Improved Performance
TLDR: Ensembling keyword and embedding models is one of the quickest and easiest ways to improve search performance over the standard embedding-based search paradigm. A large body of evidence in the machine learning literature suggests that this helps with in-domain performance, out-of-domain generalization, and multilingual transfer. The reason seems to be that sparse and dense representations capture complementary linguistic qualities of the text they encode.
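To make "ensembling" concrete, here is a minimal sketch of one common way to combine the two signals: rescale each model's scores to a shared range, then take a weighted average. This is an illustration under assumptions, not necessarily the method used in the work discussed here. The function names (`min_max_normalize`, `hybrid_scores`, `rank`), the `alpha` weight, and the toy scores are all hypothetical; in practice the keyword scores might come from something like BM25 and the embedding scores from cosine similarity.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two models are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so there is no ranking signal to keep.
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(
    keyword: Dict[str, float],
    embedding: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of normalized keyword and embedding scores.

    alpha is a hypothetical tuning knob: 1.0 means keyword only,
    0.0 means embedding only. A document missing from one model's
    results contributes 0.0 for that model.
    """
    kw = min_max_normalize(keyword)
    emb = min_max_normalize(embedding)
    doc_ids = set(kw) | set(emb)
    return {
        d: alpha * kw.get(d, 0.0) + (1.0 - alpha) * emb.get(d, 0.0)
        for d in doc_ids
    }


def rank(scores: Dict[str, float]) -> List[str]:
    """Document ids sorted by descending score, ties broken by id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    keyword = {"a": 12.0, "b": 3.0, "c": 0.0}
    embedding = {"a": 0.2, "b": 0.9, "c": 0.8}
    print(rank(hybrid_scores(keyword, embedding, alpha=0.5)))
```

The normalization step matters because raw keyword scores and cosine similarities live on very different scales; without it, one model would dominate the average regardless of `alpha`. Rank-based fusion (for example, reciprocal rank fusion) is a common alternative that avoids score scaling entirely.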