Qdrant unveils hybrid vector algorithm for improved RAG

Open source vector database supplier Qdrant has developed BM42, a search algorithm that combines vector search with standard BM25 keyword search to get better RAG results, and claims the method also lowers cost.

A vector database stores vector embeddings, numerical encodings of chunks of text, audio, image, and video data, which are searched to supply context when a large language model (LLM) generates responses to users’ natural language requests. Because LLMs have usually been trained on a wide range of general knowledge documents, retrieval-augmented generation (RAG) makes an organization’s proprietary information available to the LLM to help prevent responses that omit information or contain false information. That proprietary information is converted into vector embeddings so that it can be searched semantically.

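Traditional search engines are based on looking for keywords, and the BM25 algorithm is often the basis for this. It is a text-based ranking function that scores documents on how well they match the search terms (one or more keywords), taking into account how often each term appears in a document, how rare the term is across the collection, and the document’s length.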

Qdrant explanation of BM25 formula
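
As a rough illustration of the ranking function described above, the following Python sketch scores tokenized documents with the commonly published Okapi BM25 formula. The function name `bm25_scores`, the sample documents, and the parameter defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration; none of them is taken from Qdrant’s implementation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    k1 and b are common textbook defaults, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            # Rare terms get a higher inverse document frequency (IDF).
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Term frequency saturates via k1 and is normalized by length via b.
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical three-document collection.
docs = [
    "the white house occupant signed the bill".split(),
    "the house on the hill is white".split(),
    "river bank erosion report".split(),
]
scores = bm25_scores("white house occupant".split(), docs)
```

In this toy collection the first document matches all three query terms, the second matches two, and the third matches none, so the scores fall in that order.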

Keyword search cannot distinguish between the possible different meanings of words that use the same spelling (homonyms) – such as bank, bow, and lead – whereas semantic search can. A hybrid search method, such as Qdrant’s BM42 algorithm, uses vectors of different kinds, and aims to combine the two approaches.

According to Qdrant CTO and co-founder Andrey Vasnetsov: “By moving away from keyword-based search to a fully vector-based approach, Qdrant sets a new industry standard. BM42, for short texts which are more prominent in RAG scenarios, provides the efficiency of traditional text search approaches, plus the context of vectors, so is more flexible, precise and efficient.”

Qdrant says its algorithm integrates sparse and dense vectors to accurately pinpoint relevant information within a document. Sparse vectors handle exact term matching, while dense vectors handle semantic relevance and deeper meaning.
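
The article does not say how the sparse and dense results are merged. One generic way to combine two ranked lists is reciprocal rank fusion (RRF), sketched below in Python. This is a swapped-in illustrative technique, not a description of what BM42 does internally; the function name, the document IDs, and the constant `k=60` are all assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used constant, assumed here for illustration.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical results: one list from sparse (exact-term) search,
# one from dense (semantic) search.
sparse_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
```

Documents that appear near the top of both lists, such as `doc_b` and `doc_a` here, rise above documents found by only one of the two searches.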

Our understanding is that most elements of a sparse vector are zero. Sparse vectors are used to represent data in a high-dimensional space where only a small number of the dimensions (features) are non-zero. Keywords can be represented as sparse vectors of term frequency or TF-IDF (Term Frequency-Inverse Document Frequency) weights, with each keyword corresponding to one dimension in the vector space.

Qdrant diagram showing sparse vector-like approach
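
To make the sparse-vector idea concrete, here is a minimal Python sketch that turns tokenized documents into TF-IDF vectors stored as dictionaries holding only the non-zero dimensions. The function names, the sample documents, and the smoothed IDF variant are assumptions for illustration, not Qdrant’s storage format or weighting scheme.

```python
import math
from collections import Counter


def tfidf_sparse_vectors(docs):
    """Return (vocab, vectors); each vector maps dimension index -> weight."""
    # Each distinct keyword is assigned its own dimension.
    vocab = {}
    for doc in docs:
        for term in doc:
            vocab.setdefault(term, len(vocab))

    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for doc in docs:
        vector = {}
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            # Smoothed IDF (an assumed variant) keeps every weight positive.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vector[vocab[term]] = tf * idf
        vectors.append(vector)
    return vocab, vectors


def sparse_dot(a, b):
    """Dot product that only visits dimensions stored in the smaller vector."""
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[i] for i, weight in a.items() if i in b)


# Hypothetical documents.
docs = [
    "white house occupant".split(),
    "white house press briefing".split(),
    "river bank erosion".split(),
]
vocab, vectors = tfidf_sparse_vectors(docs)
```

The vocabulary here spans eight dimensions, but each document stores only three or four entries. Documents that share no keywords, such as the first and third, have a dot product of zero.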

With the BM25 algorithm, a search for documents containing the phrase “White House occupant” would return results ranked by how frequently and prominently the query terms appear in each listed document. Qdrant’s BM42 algorithm would return results ranked this way and also include ranked documents or data points semantically related to “White House occupant” through sparse vector embedding entries in the Qdrant vector database.

Vasnetsov thinks vector search technology will become dominant. “While Qdrant envisions a future centered on vector-based search, this release helps to make vector search more universally applicable and marks a significant step toward the inevitable shift toward a vector-first approach.”

Qdrant reckons existing hybrid search approaches have limited scalability and accuracy or are “prohibitively expensive.” The company asserts its new hybrid search system works better, “providing an efficient, and cost-effective solution for both new and existing users.”

We asked Vasnetsov how BM42 differed from Pinecone’s sparse and dense vector approach. He told us: “The difference lies in our sparse vectors and how they compare to alternatives. We can compute the IDF (Inverse Document Frequency) in real time, which makes the work with BM25 and BM42 embeddings efficient, doesn’t require pre-computation of statistics, and allows dynamic updates.”
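
One way to picture IDF being computed at query time instead of precomputed is an index that keeps its document-frequency counters up to date as documents are added and removed. The Python sketch below is a toy under that assumption; the class `LiveIdfIndex` and its methods are hypothetical and do not represent Qdrant’s API or internals.

```python
import math
from collections import Counter


class LiveIdfIndex:
    """Toy index (hypothetical) that maintains document frequencies on write."""

    def __init__(self):
        self.doc_terms = {}        # doc_id -> set of distinct terms
        self.doc_freq = Counter()  # term -> number of documents containing it

    def upsert(self, doc_id, tokens):
        # Remove any previous version so the counters stay consistent.
        self.delete(doc_id)
        terms = set(tokens)
        self.doc_terms[doc_id] = terms
        self.doc_freq.update(terms)

    def delete(self, doc_id):
        for term in self.doc_terms.pop(doc_id, ()):
            self.doc_freq[term] -= 1
            if self.doc_freq[term] == 0:
                del self.doc_freq[term]

    def idf(self, term):
        # Computed from the current counters, with no offline statistics pass.
        n_docs = len(self.doc_terms)
        df = self.doc_freq.get(term, 0)
        return math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)


index = LiveIdfIndex()
index.upsert(1, ["vector", "search"])
index.upsert(2, ["keyword", "search"])
idf_before = index.idf("vector")

# Adding a document that contains "vector" makes the term less rare.
index.upsert(3, ["vector", "database"])
idf_after = index.idf("vector")

# Deleting that document restores the earlier statistics.
index.delete(3)
idf_restored = index.idf("vector")
```

Because the counters change with every write, the IDF of a term reflects the collection as it stands at query time.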

A Qdrant blog provides background information about its ideas on hybrid search, and this article by Vasnetsov provides specific BM42 information.