Qdrant promises 10x faster indexing with GPU-powered vector database

Open source vector database supplier Qdrant says it can use GPUs to calculate vector indices ten times faster than x86 CPUs.

A vector database holds vector embeddings: numerical representations computed from segments of text, audio, images, and video, which large language models (LLMs) search when generating responses to users’ natural language requests. The search looks for items close to the query item in a vector space, and it requires an index over the vector embeddings. Building this index can become computationally intensive as the item count scales into the billions and beyond. Qdrant’s latest v1.13 release enables AMD, Intel, and Nvidia GPUs to be used to build such indices.
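The core operation is conceptually simple: find the stored embedding most similar to a query embedding. A minimal brute-force sketch in plain Python, using cosine similarity over toy three-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and production databases like Qdrant use approximate indices rather than this exhaustive scan):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, embeddings):
    """Return the id of the stored embedding closest to the query vector."""
    return max(embeddings, key=lambda k: cosine_similarity(query, embeddings[k]))

# Toy "embeddings" keyed by document id.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.2],
    "doc_c": [0.85, 0.2, 0.1],
}
print(nearest([1.0, 0.0, 0.0], docs))  # -> doc_a
```

This exhaustive scan is O(n) per query, which is why an index becomes essential at scale.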

Andrey Vasnetsov, Qdrant

Qdrant CTO and co-founder Andrey Vasnetsov stated: “Index building is often a bottleneck for scaling vector search applications. By introducing platform-independent GPU acceleration, we’ve made it faster and more cost-effective to build indices for billions of vectors while giving users the flexibility to choose the hardware that best suits their needs.”

The company bases its indexing technology on HNSW (Hierarchical Navigable Small World), a graph-based approximate nearest neighbor search algorithm used in many vector databases. A blog by David Myriel, Qdrant’s Director of Developer Relations, states that Qdrant developed its GPU acceleration software in-house rather than using third-party code.
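The building block that HNSW repeats at every layer of its hierarchy is a greedy walk over a proximity graph: start at an entry point, hop to whichever neighbor is closest to the query, and stop at a local minimum. A minimal single-layer sketch (the graph here is hand-built; real HNSW constructs it incrementally and uses multiple layers plus a beam of candidates):

```python
import math

def greedy_search(graph, vectors, query, entry):
    """Walk the proximity graph from `entry`, always moving to the
    neighbor closest to `query`; stop at a local minimum."""
    current = entry
    while True:
        best = min(graph[current], key=lambda n: math.dist(vectors[n], query))
        if math.dist(vectors[best], query) >= math.dist(vectors[current], query):
            return current
        current = best

# A tiny hand-built proximity graph over 2D points.
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.5), 3: (3.0, 0.0)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(greedy_search(graph, vectors, (2.9, 0.1), entry=0))  # walks 0 -> 1 -> 2 -> 3
```

Building the graph means running many such searches to choose each new point’s neighbors, which is the computationally heavy step that GPU acceleration targets.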

He says: “Qdrant doesn’t require high-end GPUs to achieve significant performance improvements,” and supplies a table showing indexing times and costs with and without using various common GPUs:

Qdrant figures
Quoted prices are from Google Cloud Platform; the configuration of the CPU-only servers is not specified

Below is a chart visualizing the table’s two timing columns for clearer comparison:

Qdrant vector indexing times

The v1.13 release also includes:

  • Strict Mode to limit computationally intensive operations such as unindexed filtering, oversized batches, and certain search parameters, helping keep resource usage predictable in multi-tenant deployments.
  • HNSW Graph Compression to reduce storage use through delta encoding, storing only the differences (or “deltas”) between successive values rather than the values themselves.
  • Named Vector Filtering for when you store multiple vectors of different sizes and types in a single data point. The blog says: “This makes it easy to search for points based on the presence of specific vectors. For example, if your collection includes image and text vectors, you can filter for points that only have the image vector defined.”
  • Custom Storage – a custom storage backend that replaces RocksDB to avoid random, latency-increasing compaction spikes, ensuring consistent performance by requiring a constant number of disk operations per read and write, regardless of data size.
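To illustrate the delta-encoding idea behind the graph compression feature: an HNSW graph is largely sorted lists of neighbor IDs, and consecutive IDs in a sorted list tend to be close together, so the gaps are small numbers that compress well. A generic sketch of the technique (Qdrant’s actual on-disk format is internal and not shown in the blog):

```python
def delta_encode(ids):
    """Store a sorted list of ids as the first value plus successive gaps."""
    ids = sorted(ids)
    if not ids:
        return []
    return [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]

def delta_decode(deltas):
    """Rebuild the original list by accumulating the gaps."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

neighbors = [1000, 1003, 1007, 1050]
deltas = delta_encode(neighbors)
print(deltas)                 # [1000, 3, 4, 43]
print(delta_decode(deltas))   # [1000, 1003, 1007, 1050]
```

The small gap values can then be packed into fewer bytes than the full IDs, which is where the storage saving comes from.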

Qdrant says this release creates new possibilities for AI-powered applications – such as live search, personalized recommendations, and AI agents – that demand real-time responsiveness, frequent reindexing, and the ability to make immediate decisions on dynamic data streams.

The company says there have been more than 10 million installs of its vector database.