Guardrails for LLM systems in regulated industries

The word "guardrails" has become a marketing term. Vendors use it to describe everything from a content filter to a policy engine. Inside a regulated environment, that vagueness is dangerous, because the controls that survive a regulator's visit are very specific, and they are mostly not the ones that vendors talk about first.

Three layers of control

I think of LLM guardrails as three concentric layers, each doing different work.

The innermost layer is the model layer. It includes the system prompt, fine-tuning, and any model-side filtering. This is the layer most vendors describe when they say guardrails. It is also the weakest of the three. Anything that lives only in the model layer can be defeated by adversarial inputs, by drift across model versions, and by the unavoidable fact that language models are statistical and will occasionally produce outputs they were told not to produce.

The middle layer is the application layer. It includes input validation, output validation against a schema, deterministic checks against business rules, retrieval boundaries, tool-use permissions, and structured logging. This is the layer where most of the production weight should sit. A model that is told to extract a field will sometimes produce the wrong field. A schema validator that rejects responses that do not conform is a cheap way to catch most of those failures before they leave the system.
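As one illustration of an application-layer control, here is a minimal sketch of a tool-use permission check. It assumes a simple role-to-tool allowlist; the role names, tool names, and the authorise_tool_call function are all hypothetical and not taken from any particular framework. The point is only that the decision is made by deterministic code outside the model.

```python
# Hypothetical allowlist: which tools each caller role may invoke.
# Role and tool names are illustrative assumptions.
TOOL_PERMISSIONS: dict[str, frozenset[str]] = {
    "claims_analyst": frozenset({"lookup_policy", "summarise_claim"}),
    "read_only": frozenset({"lookup_policy"}),
}


class ToolNotPermitted(Exception):
    """Raised when a model-requested tool call is outside the caller's allowlist."""


def authorise_tool_call(role: str, tool_name: str) -> None:
    """Deny by default: unknown roles and unlisted tools are both rejected."""
    allowed = TOOL_PERMISSIONS.get(role, frozenset())
    if tool_name not in allowed:
        raise ToolNotPermitted(f"role {role!r} may not call tool {tool_name!r}")
```

The check runs before the tool is executed, so a model that asks for a tool it should not have is refused no matter how the prompt was phrased.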

The outer layer is the operational layer. It includes the human review queue, the audit log, the rollback procedure, the cost ceiling, the alerting on anomalous behaviour, and the documented escalation path. This is the layer that an auditor will spend the most time on, and the layer that most teams build last.

What regulators actually want to see

I have been through enough audits to know that regulators care about specific things. They care about whether you can answer, in a defensible way, the question "what did this system do on this day for this user?" They care about whether you have a documented process for handling a model that has produced a wrong output that affected a real decision. They care about whether you have a kill switch and a rollback that has been tested.
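To make the "what did this system do on this day for this user" question concrete, here is a rough sketch of an append-only audit log with a lookup by user and day. The record fields, the choice to store a hash of the prompt rather than the prompt itself, and the in-memory storage are all assumptions made for illustration; a real deployment would need durable, tamper-evident storage and a retention policy agreed with compliance.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import date, datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str       # ISO 8601 with UTC offset
    user_id: str
    model_version: str
    prompt_sha256: str   # assumption: hash stored instead of the raw prompt
    output: str
    decision: str        # e.g. "accepted", "rejected_schema", "escalated"


class AuditLog:
    """In-memory stand-in for an append-only store (illustrative only)."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def record(self, user_id: str, model_version: str, prompt: str,
               output: str, decision: str,
               now: Optional[datetime] = None) -> AuditRecord:
        now = now or datetime.now(timezone.utc)
        rec = AuditRecord(
            timestamp=now.isoformat(),
            user_id=user_id,
            model_version=model_version,
            prompt_sha256=hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
            output=output,
            decision=decision,
        )
        # One JSON document per entry; entries are only ever appended.
        self._lines.append(json.dumps(asdict(rec), sort_keys=True))
        return rec

    def query(self, user_id: str, day: date) -> list[AuditRecord]:
        """Return every record for this user on this (UTC) calendar day."""
        matches = []
        for line in self._lines:
            rec = AuditRecord(**json.loads(line))
            if (rec.user_id == user_id
                    and datetime.fromisoformat(rec.timestamp).date() == day):
                matches.append(rec)
        return matches
```

Recording the model version alongside each output matters because the same prompt can produce different results after a model upgrade.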

None of that is about the model. It is about the operational layer. Teams that have invested in the model layer at the expense of the other two are usually surprised by how badly the audit goes.

The schema validator earns its keep

The single most useful piece of code I have shipped in LLM systems is not a model and not a retrieval pipeline. It is a schema validator that sits between the model and the application and rejects any output that does not match a declared structure. The validator does three things at once. It catches model errors before they reach a user. It produces a log entry every time the model fails the schema, which becomes a quality signal. And it makes the contract between the model and the rest of the system explicit, which means engineers can reason about it.
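A minimal sketch of such a validator, using only the Python standard library, might look like the following. The INVOICE_SCHEMA fields and the function names are hypothetical, and the schema format here (field name mapped to a Python type) is a deliberate simplification; production systems would more likely use a dedicated schema library.

```python
import json
import logging

logger = logging.getLogger("llm.schema")


class SchemaViolation(Exception):
    """Raised when model output does not match the declared structure."""


# Hypothetical declared contract: field name -> required Python type.
INVOICE_SCHEMA: dict[str, type] = {
    "invoice_id": str,
    "amount_cents": int,
    "currency": str,
}


def _reject(reason: str) -> None:
    # Every failure is logged, so the rejection rate becomes a quality signal.
    logger.warning("schema_violation reason=%s", reason)
    raise SchemaViolation(reason)


def validate_output(raw: str, schema: dict[str, type]) -> dict:
    """Parse raw model output and reject anything that deviates from the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        _reject(f"invalid JSON: {exc.msg}")
    if not isinstance(data, dict):
        _reject("top-level value is not an object")
    if set(data) != set(schema):
        _reject(f"fields {sorted(data)} do not match {sorted(schema)}")
    for name, expected in schema.items():
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if is_stray_bool or not isinstance(value, expected):
            _reject(f"field {name!r} is not of type {expected.__name__}")
    return data
```

The strictness is intentional: missing fields, extra fields, and wrong types are all rejected, so downstream code never has to guess what shape it received.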

Most of the production-grade LLM work I do now is built around this validator. The model is asked to produce structured output against a schema with strict types. If it fails, the system either retries with feedback or escalates. The result is a system whose failure modes are visible and bounded.
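The retry-or-escalate loop can be sketched roughly as follows. This is an assumption-laden illustration rather than a definitive implementation: call_model and validate are hypothetical callables supplied by the caller, validate is assumed to raise ValueError on a bad output, and the feedback wording and the limit of three attempts are arbitrary choices.

```python
from typing import Callable


class Escalated(Exception):
    """Raised when the model never produces valid output; route to human review."""


def generate_structured(
    call_model: Callable[[str], str],   # hypothetical: prompt in, raw text out
    validate: Callable[[str], dict],    # hypothetical: raises ValueError if invalid
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Ask the model, validate, retry with feedback, and escalate after a bounded number of tries."""
    feedback = None
    for _ in range(max_attempts):
        if feedback is None:
            full_prompt = prompt
        else:
            # Feed the validator's complaint back so the model can correct itself.
            full_prompt = f"{prompt}\n\nYour previous answer was rejected: {feedback}"
        raw = call_model(full_prompt)
        try:
            return validate(raw)
        except ValueError as exc:
            feedback = str(exc)
    raise Escalated(f"no valid output after {max_attempts} attempts: {feedback}")
```

Bounding the number of attempts is what keeps the failure mode visible: the caller either gets validated data or an explicit escalation, never an unvalidated answer.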

Cost as a guardrail

A control I rarely see discussed, but one that has saved my clients real money, is a hard cost ceiling per request, per user, and per day. LLM systems can produce runaway costs through recursive tool use, retry loops, or a single user asking the system to summarise a corpus of its own outputs. The cost ceiling is not only a financial control. It is also a safety control. A system that cannot exceed a cost budget cannot spiral, and a system that cannot spiral is much easier to run.
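A hard ceiling of this kind can be sketched as a small in-memory ledger. The class and parameter names are hypothetical, costs are plain integers in whatever unit you bill in, and I am assuming the per-user ceiling resets daily; a real system would keep the counters in shared storage so that every worker sees the same totals.

```python
from collections import defaultdict
from datetime import date


class BudgetExceeded(Exception):
    """Raised when a charge would push spend past a hard ceiling."""


class CostCeiling:
    """Tracks spend per request, per user per day, and per day overall."""

    def __init__(self, per_request: int, per_user_per_day: int, per_day: int) -> None:
        self._limits = {
            "request": per_request,
            "user": per_user_per_day,
            "day": per_day,
        }
        self._spent: dict[tuple, int] = defaultdict(int)

    def charge(self, request_id: str, user_id: str, day: date, cost: int) -> None:
        """Record a cost, or raise before recording if any ceiling would be breached."""
        keys = {
            "request": ("request", request_id),
            "user": ("user", user_id, day),
            "day": ("day", day),
        }
        # Check all three scopes first so a refused charge changes nothing.
        for scope, key in keys.items():
            if self._spent[key] + cost > self._limits[scope]:
                raise BudgetExceeded(f"{scope} ceiling exceeded")
        for key in keys.values():
            self._spent[key] += cost
```

Calling charge before each model or tool invocation means a retry loop or recursive tool chain stops at the ceiling instead of running until someone notices the bill.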

Where to start

If you are running an LLM system inside a regulated environment and you are not yet confident in your guardrails, the order I recommend is to start with the schema validator, then the audit log, then the cost ceiling, then the human review queue, and finally the model-level controls. That order is roughly the opposite of where most vendor pitches start. It is the order that survives an audit.