Why I stopped reaching for vector databases by default

For the better part of two years, my reflexive answer when a client asked about retrieval was to set up a vector database. That reflex has changed. The reasons are worth writing down, because I see the same default in a lot of teams, and I do not think it is serving them.

Where vector search actually wins

Vector search wins when the query and the document share semantic content but not lexical content. A user types "how do I cancel my subscription" and the relevant document says "unsubscribing from a recurring billing plan." The terms do not overlap. Vector embeddings can recover the relationship. Lexical search cannot.
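To make the gap concrete, here is a minimal sketch that tokenizes the two strings from the example and checks for shared terms. The tokenizer is a deliberately naive assumption (lowercase, split on non-alphanumerics); a real analyzer would do more, but the point survives.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

query = "how do I cancel my subscription"
document = "unsubscribing from a recurring billing plan"

query_terms = tokens(query)
doc_terms = tokens(document)
shared = query_terms & doc_terms

# No token appears in both strings, so a purely lexical scorer
# has nothing to match on for this pair.
print(shared)  # set()
```

With no shared terms, any scoring function built on term overlap gives this document a score of zero for this query.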

This is the case where vector search does work that nothing else does. It is also a smaller fraction of real retrieval workloads than the marketing suggests.

Where keyword search is fine

A surprising number of retrieval problems are well served by BM25, or even by simpler keyword indexes. This holds in three cases.

The query and the document are written in the same vocabulary. This is most internal documentation. Engineers searching code docs use the same terms the docs use. Customer support agents searching policy documents use the same terms the policies use. The semantic gap is small.

The query is a short specific term, like a SKU, an error code, or a person's name. Vector search is bad at exact match. Keyword search is good at it. For these queries, BM25 is faster and more accurate.

The corpus is small. Below a few thousand documents, the recall benefit of vector search is often outweighed by the precision cost of fuzzy matching. A well-tuned BM25 over a small corpus is hard to beat.
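For a sense of how little machinery BM25 needs, here is a minimal sketch of Okapi BM25 in plain Python. The corpus, the error code E4012, and the parameter values k1=1.5 and b=0.75 are assumptions for illustration; in practice you would use a search engine's built-in implementation rather than this.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class BM25:
    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.term_freqs = [Counter(tokenize(d)) for d in docs]
        self.doc_lens = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = sum(self.doc_lens) / len(docs)
        self.n_docs = len(docs)
        # Number of documents each term appears in.
        self.doc_freq = Counter()
        for tf in self.term_freqs:
            self.doc_freq.update(tf.keys())

    def idf(self, term):
        n = self.doc_freq.get(term, 0)
        return math.log(1 + (self.n_docs - n + 0.5) / (n + 0.5))

    def search(self, query, top_k=3):
        """Return (doc_index, score) pairs, best first; empty if nothing matches."""
        query_terms = tokenize(query)
        scored = []
        for i, tf in enumerate(self.term_freqs):
            # Length normalisation: longer-than-average documents are penalised.
            norm = 1 - self.b + self.b * self.doc_lens[i] / self.avg_len
            score = 0.0
            for term in query_terms:
                f = tf.get(term, 0)
                if f:
                    score += self.idf(term) * f * (self.k1 + 1) / (f + self.k1 * norm)
            if score > 0:
                scored.append((i, score))
        return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Hypothetical corpus for illustration.
docs = [
    "Error E4012 means the payment gateway timed out",
    "Refund policy for annual plans",
    "Resetting a password from the login page",
]
index = BM25(docs)
print(index.search("E4012"))       # only document 0 matches
print(index.search("kubernetes"))  # []
```

The exact-match query lands on the one document containing the code, and the off-corpus query returns an empty list.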

Where hybrid wins

The strongest production setups I have shipped recently are hybrid. BM25 and vector search run in parallel. The two result lists are merged, then ordered by a learned or hand-tuned reranker.

The hybrid pattern captures the strengths of both. Keyword catches the queries where exact terms matter. Vector catches the queries where semantic content matters. The reranker sorts the merged candidate list against a quality criterion.

The infrastructure cost of running both is real but bounded. The implementation cost is moderate. The retrieval quality gain is meaningful, especially in heterogeneous corpora where the query distribution is mixed.
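As one illustration of the merge step, here is a minimal sketch using reciprocal rank fusion. This is a simple stand-in, not the learned or hand-tuned reranker described above; a reranker would typically run over the fused list. The document ids are hypothetical, and k=60 is a commonly used constant that I am assuming here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, breaking ties on doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from the two retrievers, best first.
bm25_hits = ["doc_sku", "doc_policy", "doc_faq"]
vector_hits = ["doc_cancel", "doc_sku", "doc_faq"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc_sku', 'doc_faq', 'doc_cancel', 'doc_policy']
```

Rank-based fusion sidesteps the problem that BM25 scores and cosine similarities are not on comparable scales, which is why it makes a reasonable first pass before investing in a learned reranker.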

Why I stopped reaching for vector first

A few specific lessons from deployments.

Vector search is sensitive to the embedding model. Changing the model means re-embedding the corpus. For a large corpus, this is a real operational cost. A keyword index does not depend on an embedding model, so it survives model changes untouched. That makes the system more robust over time.

Vector search hides quality problems. A nearest-neighbour search will always return something. Whether that something is relevant is harder to evaluate. Teams ship vector retrieval with weak evaluation and discover the quality problems in production. Keyword search fails more visibly. The failure mode is no results, which is easier to detect and act on.
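The asymmetry is easy to reproduce. The sketch below uses toy three-dimensional vectors as stand-ins for embeddings; every vector, document name, and query here is a hypothetical value chosen for illustration.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" and the texts they stand for (both hypothetical).
doc_vectors = {
    "billing_faq": [0.9, 0.1, 0.0],
    "login_help": [0.1, 0.9, 0.0],
}
doc_texts = {
    "billing_faq": "how to update billing details",
    "login_help": "how to reset your login password",
}

def nearest(query_vec, k=1):
    """Top-k by cosine similarity. Always returns k results if k docs exist."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vectors.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

def keyword_match(query):
    """Doc ids sharing at least one term with the query; may be empty."""
    terms = set(query.lower().split())
    return [d for d, text in doc_texts.items() if terms & set(text.split())]

# An off-topic query: its toy vector points away from both documents.
off_topic_vec = [0.2, 0.1, 0.97]
vector_hits = nearest(off_topic_vec)
keyword_hits = keyword_match("kubernetes ingress")

print(vector_hits)   # one hit, billing_faq, with a low similarity score
print(keyword_hits)  # []
```

The vector path returns a hit with a low similarity score, and nothing downstream sees a failure unless someone is logging and checking that score. The keyword path returns an empty list, which is trivial to count and alert on.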

Vector search has hidden costs at scale. The infrastructure for fast nearest-neighbour search at scale is non-trivial. Index size is a real constraint. Ingest latency is a real constraint. Keyword search at scale is a solved problem with mature tooling.
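A back-of-the-envelope calculation shows why index size becomes a constraint. The sketch assumes vectors are stored as uncompressed float32 values (4 bytes each); the corpus size and dimensionality are hypothetical numbers, and the figure excludes whatever overhead the index structure itself adds, which varies by index type.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Storage for the raw vectors alone, before any index overhead."""
    return num_vectors * dimensions * bytes_per_value

# Hypothetical corpus: 10 million chunks, 768-dimensional embeddings.
size = raw_vector_bytes(10_000_000, 768)
print(f"{size / 1e9:.2f} GB")  # 30.72 GB
```

Fast nearest-neighbour search generally wants much of that resident in memory, which is where the infrastructure bill comes from.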

The benefits of vector search compound with reranking and metadata filtering. A naked vector search is rarely the best configuration. The good systems pre-filter on metadata, run vector search on the filtered set, and rerank with a stronger model. By the time you have built that pipeline, the vector step is one stage among several rather than the centre of the system.
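The three-stage shape described above can be sketched as follows. Everything here is a toy assumption: the Doc fields, the two-dimensional vectors, and especially term_overlap, which stands in for the stronger reranking model.

```python
import math
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str
    product: str   # hypothetical metadata field
    vector: list   # toy embedding

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def term_overlap(query_text, doc_text):
    """Stand-in reranker: counts shared terms. A real one would be a model."""
    return len(set(query_text.lower().split()) & set(doc_text.lower().split()))

def retrieve(query_text, query_vec, docs, product, k=2, rerank=None):
    # Stage 1: metadata pre-filter shrinks the candidate set.
    candidates = [d for d in docs if d.product == product]
    # Stage 2: vector search over the filtered set only.
    candidates.sort(key=lambda d: cosine(query_vec, d.vector), reverse=True)
    candidates = candidates[:k]
    # Stage 3: optional rerank of the short list with a stronger scorer.
    if rerank is not None:
        candidates.sort(key=lambda d: rerank(query_text, d.text), reverse=True)
    return [d.doc_id for d in candidates]

docs = [
    Doc("a", "cancel a recurring billing plan", "billing", [0.9, 0.1]),
    Doc("b", "update the card on a billing plan", "billing", [0.8, 0.3]),
    Doc("c", "cancel a pending login request", "auth", [0.95, 0.05]),
]
query_text, query_vec = "cancel billing plan", [0.85, 0.25]

print(retrieve(query_text, query_vec, docs, "billing"))                       # ['b', 'a']
print(retrieve(query_text, query_vec, docs, "billing", rerank=term_overlap))  # ['a', 'b']
```

In this toy run the filter removes one document before any similarity is computed, and the reranker reverses the order the vector step produced. The final ranking owes as much to the first and last stages as to the middle one.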

The decision framework I now use

For any new retrieval problem, I now go through this sequence.

What does the query distribution look like? If queries are mostly short specific terms, BM25 first.

What does the corpus look like? If documents are written in the same vocabulary as the queries, BM25 first.

How large is the corpus? Below a few thousand documents, BM25 is almost always sufficient. Above a few hundred thousand, hybrid usually pays off.

What is the operational maturity of the team? If the team is small and operationally constrained, BM25 is one less moving part. If the team can run a vector index well, hybrid is the strong default.
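The sequence above can be written down as a small function. This is one possible reading, not a rule: the field names are hypothetical, and the 5,000-document threshold is my own stand-in for "a few thousand". Corpora above that size fall through to the team-maturity question, which also covers the large-corpus case where hybrid usually pays off.

```python
from dataclasses import dataclass

@dataclass
class RetrievalProblem:
    mostly_short_specific_queries: bool   # SKUs, error codes, names
    shared_vocabulary: bool               # queries and docs use the same terms
    num_documents: int
    team_can_run_vector_index: bool

def first_pattern_to_try(p, small_corpus=5_000):
    """Walk the questions in order and return the first pattern to try."""
    if p.mostly_short_specific_queries:
        return "bm25"
    if p.shared_vocabulary:
        return "bm25"
    if p.num_documents < small_corpus:
        return "bm25"
    if not p.team_can_run_vector_index:
        return "bm25"  # one less moving part for a constrained team
    return "hybrid"

# Hypothetical example: mixed queries over a large, heterogeneous corpus.
problem = RetrievalProblem(
    mostly_short_specific_queries=False,
    shared_vocabulary=False,
    num_documents=800_000,
    team_can_run_vector_index=True,
)
print(first_pattern_to_try(problem))  # hybrid
```

Note that no branch returns vector-only search. In this reading, vector search enters only as half of a hybrid.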

The point of the framework is that vector search is a tool, not a default. The default should be the simplest pattern that meets the recall and quality requirements. For a lot of retrieval problems, that is not vector search.