RAG is a tactic, not a strategy

A vendor pitched me last month on a RAG system for an enterprise client. The pitch was technically competent. They had embeddings, vector search, reranking, a citation layer, and an evaluation harness that measured retrieval recall and answer faithfulness. What they did not have was an answer to a single question: why is retrieval the right architecture for this problem?

That question matters because retrieval is not the only architecture available, and it is not always the right one. RAG has become a default the way microservices became a default ten years ago. Useful in the cases that fit. Damaging in the cases that do not.

What RAG is good at

Retrieval works when the problem has three properties. The body of knowledge changes faster than you can fine-tune. The relevant context for any given query is small enough to fit in a prompt. And the answer is genuinely a function of retrieved text rather than of reasoning over a global structure.

Customer support against an evolving product knowledge base fits beautifully. Compliance lookup against a regulatory corpus fits. Internal search across a wiki fits. These are not coincidences. They share the property that the model is being asked to find the right paragraph and rephrase it, not to construct a multi-step inference over the corpus as a whole.

What RAG is bad at

RAG falls down in three places I see repeatedly.

The first is anything where the answer requires aggregation. If a user asks how many of our healthcare customers missed a payment in Q3, the right architecture is not retrieval over invoice PDFs. It is a query against a structured store. RAG can produce a plausible answer to that question by retrieving an invoice and quoting from it, which is worse than no answer because it looks grounded while saying nothing about the full set of invoices.
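To make the contrast concrete, here is a minimal sketch of the structured-store version of that question, using Python's built-in sqlite3 module. The schema, the sample rows, the definition of a "missed payment" (unpaid, or paid after the due date), and the choice of calendar Q3 2024 are all assumptions invented for illustration, not a description of any real system.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, sector TEXT);
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        due_date TEXT,      -- ISO 8601 date
        paid_date TEXT      -- NULL if never paid
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, "Acme Clinic", "healthcare"),
    (2, "Borealis Health", "healthcare"),
    (3, "Cobalt Freight", "logistics"),
])
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?)", [
    (10, 1, "2024-07-15", "2024-07-10"),  # paid on time
    (11, 1, "2024-08-15", "2024-09-02"),  # paid late
    (12, 2, "2024-09-01", None),          # never paid
    (13, 2, "2024-09-20", None),          # same customer, counted once
    (14, 3, "2024-08-01", None),          # missed, but not healthcare
])


def healthcare_customers_with_missed_payment(conn, start, end):
    """Count distinct healthcare customers with a late or unpaid invoice
    whose due date falls in [start, end]. The 'missed' rule is an assumption."""
    row = conn.execute("""
        SELECT COUNT(DISTINCT c.id)
        FROM customers c
        JOIN invoices i ON i.customer_id = c.id
        WHERE c.sector = 'healthcare'
          AND i.due_date BETWEEN ? AND ?
          AND (i.paid_date IS NULL OR i.paid_date > i.due_date)
    """, (start, end)).fetchone()
    return row[0]


missed = healthcare_customers_with_missed_payment(conn, "2024-07-01", "2024-09-30")
print(missed)
```

The count comes from every qualifying row, and the definition of "missed" is written down where someone can argue with it. A retrieved invoice, however well cited, gives you neither.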

The second is anything where the relevant context is large. A contract review use case where the meaning of clause forty depends on the definitions in clauses one through ten and the integration clauses in fifty through sixty does not fit a chunk-and-retrieve pattern. The chunks lose the cross-reference structure that makes the document make sense.
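A toy sketch of that failure, under loud assumptions: the contract text is invented, chunking is a naive five-clauses-per-chunk split, and a keyword match stands in for similarity search. Real pipelines are more sophisticated, but the structural problem is the same.

```python
def chunk_clauses(clauses, per_chunk):
    """Naive chunking: group consecutive clauses into fixed-size chunks."""
    return [
        "\n".join(clauses[i:i + per_chunk])
        for i in range(0, len(clauses), per_chunk)
    ]


def retrieve(chunks, term):
    """Keyword match as a stand-in for vector similarity search."""
    return [c for c in chunks if term in c]


# Hypothetical sixty-clause contract; only two clauses carry meaning here.
clauses = [f"Clause {n}. Standard wording." for n in range(1, 61)]
clauses[0] = 'Clause 1. "Affiliate" means any entity that controls the Supplier.'
clauses[39] = "Clause 40. The Supplier indemnifies each Affiliate."

chunks = chunk_clauses(clauses, per_chunk=5)

# A question about indemnities retrieves the chunk holding clause 40...
hits = retrieve(chunks, "indemnifies")

# ...but the definition that clause 40 depends on lives in a different chunk.
definition_retrieved = any('"Affiliate" means' in c for c in hits)

print(len(chunks), len(hits), definition_retrieved)
```

The retrieved chunk uses the term "Affiliate" without the definition that gives it meaning. Larger chunks or overlap windows push the boundary around, but they do not restore the cross-reference structure.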

The third is anything where the source documents are themselves unreliable. RAG with citation gives the user a feeling of confidence that the answer is grounded. If the underlying documents are out of date, internally contradictory, or written by hand without review, the citation is doing reputational work it cannot back up.

The strategic question RAG masks

The reason teams reach for RAG too quickly is that it lets them defer a harder question: what does the organisation actually know, where does it live, and how reliable is it? Retrieval is being used as a substitute for knowledge management. The vector store becomes the place where the company hopes its institutional knowledge has been encoded, and the LLM is supposed to surface it.

That works as a workaround for some cases. For most of the cases where a serious decision depends on the answer, it does not. You end up with a system that looks like it knows things and produces answers grounded in the worst version of the company's documents.

What I do instead

When I scope LLM work now, I split the question early. Is the problem about retrieval over a corpus, structured query against a warehouse, multi-step reasoning over a small structured input, or generation against a tightly defined schema? The answer dictates the architecture, and only one of those four answers leads to RAG.

For the structured query case, I push for a SQL-generation pattern backed by a semantic layer the model can read. For the multi-step case, I push for a small agent with tool use rather than retrieval. For the schema generation case, I push for structured outputs and validation rather than free-form generation. RAG remains the right answer for genuine retrieval-shaped problems, and only those.
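For the schema generation case, the validation half can be sketched with nothing but the standard library. Everything here is hypothetical: the Ticket schema, its field names, and the allowed severities are invented for illustration, and the raw strings stand in for whatever a model returns. In practice a schema library would likely replace the hand-written checks.

```python
import json
from dataclasses import dataclass

# Hypothetical schema values, invented for illustration.
ALLOWED_SEVERITIES = {"low", "medium", "high"}
EXPECTED_KEYS = {"title", "severity", "affected_users"}


@dataclass(frozen=True)
class Ticket:
    title: str
    severity: str
    affected_users: int


def parse_ticket(raw: str) -> Ticket:
    """Parse model output and reject anything that does not match the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"expected keys {sorted(EXPECTED_KEYS)}, got {sorted(data)}")
    title, severity, users = data["title"], data["severity"], data["affected_users"]
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")
    if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(ALLOWED_SEVERITIES)}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(users, bool) or not isinstance(users, int) or users < 0:
        raise ValueError("affected_users must be a non-negative integer")
    return Ticket(title=title, severity=severity, affected_users=users)


good = '{"title": "Login fails", "severity": "high", "affected_users": 120}'
bad = '{"title": "Login fails", "severity": "catastrophic", "affected_users": 120}'

ticket = parse_ticket(good)
try:
    parse_ticket(bad)
    rejected = False
except ValueError:
    rejected = True

print(ticket, rejected)
```

The point of the pattern is that an answer outside the schema fails loudly at the boundary instead of flowing downstream as plausible free text.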

RAG will continue to be the default for another year or two, because vendors have built sales motions around it and because it photographs well in a slide deck. That does not make it the right architecture. It makes it the easiest one to sell. There is a difference, and the leaders who learn to spot it will save their organisations a great deal of money.