A verification loop is the practice of treating an AI agent's output as unproven until it passes a repeatable verification step, then feeding any failure back into another attempt, so reliability comes from an iterated check-and-correct cycle rather than a single one-time review.

How it works

An agent gathers context, takes an action, and then verifies the result before it is treated as done, repeating the cycle when verification fails. The verification can be deterministic (a test suite, a type check, a rule-based gate) or judgment-based (a second model scoring the output against a rubric, or a human review), and the two differ in what they catch and what they cost. What makes it a loop rather than a gate is the feedback: a failed check does not just block, it returns a signal the agent uses to revise, and the cycle runs again on the new attempt. What the verifier needs is independent authority, which is not the same as a different agent: an externally defined test that the working agent runs on its own output still counts, because the standard came from outside even if the runner did not. The weak case is unstructured self-critique that rests on the same basis that produced the work, which tends to confirm its own mistakes. The loop ends when the check passes or when a limit is reached and the work escalates to a human.
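The cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `verification_loop`, `CheckResult`, `LoopOutcome`, and the toy agent and check are all hypothetical names introduced here, and the agent is reduced to a function that receives the previous failure message and returns a new attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    passed: bool
    feedback: str = ""  # what failed and why; empty on success


@dataclass
class LoopOutcome:
    output: Optional[str]  # the attempt that passed, or None if none did
    attempts: int
    escalated: bool        # True when the limit was reached without a pass


def verification_loop(
    produce: Callable[[str], str],
    verify: Callable[[str], CheckResult],
    max_attempts: int = 3,
) -> LoopOutcome:
    """Run produce -> verify, feeding each failure into the next attempt."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = produce(feedback)
        result = verify(candidate)
        if result.passed:
            return LoopOutcome(candidate, attempt, escalated=False)
        feedback = result.feedback  # the signal the next attempt revises against
    # Limit reached: hand the work to a human instead of looping further.
    return LoopOutcome(None, max_attempts, escalated=True)


# Toy stand-ins (assumptions for illustration only).
def toy_agent(feedback: str) -> str:
    return "total = 4" if "expected 4" in feedback else "total = 5"


def toy_check(candidate: str) -> CheckResult:
    if candidate == "total = 4":
        return CheckResult(True)
    return CheckResult(False, f"got {candidate!r}, expected 4")


outcome = verification_loop(toy_agent, toy_check)
stuck = verification_loop(lambda _feedback: "total = 5", toy_check, max_attempts=2)
```

Here `outcome` passes on the second attempt because the first failure message told the agent what to change, while `stuck` never satisfies the check and ends with `escalated=True`. The check is defined outside the agent, which is the independent authority the paragraph above calls for.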

Why it matters

A one-time review treats the first output as the output, which is the wrong default for a probabilistic system that is right most of the time and wrong unpredictably. A loop changes the question from "is this output good" to "does this output pass a check I trust", which is testable where the first is a judgment call. The honest limit is that a loop is only as good as its verifier: a check that cannot see a class of errors passes them through as confidently as correct ones, so a green loop with a weak check is a false sense of safety, not safety. Deterministic checks are cheap to run but catch only what they can articulate; judgment-based checks catch more but are themselves probabilistic, so a robust loop usually layers them: a deterministic floor with a judgment layer above it. The loop also costs latency and compute on every iteration, so the discipline is matching the rigor of the check to the cost of being wrong.
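One way to picture the layering is a verifier that runs the deterministic checks first and consults the judgment layer only when they all pass. The sketch below is an assumption-laden illustration: `layered_verify`, the two floor checks, the `summary` field, and the 0.7 threshold are hypothetical, and `judge` stands in for a model or human score between 0 and 1.

```python
import json
from typing import Callable, List, Tuple

Check = Callable[[str], Tuple[bool, str]]


def is_valid_json(candidate: str) -> Tuple[bool, str]:
    """Deterministic floor check: the output must parse as JSON."""
    try:
        json.loads(candidate)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    return True, ""


def has_summary_field(candidate: str) -> Tuple[bool, str]:
    """Deterministic floor check: a string field named 'summary' must exist."""
    data = json.loads(candidate)
    if isinstance(data, dict) and isinstance(data.get("summary"), str):
        return True, ""
    return False, "missing string field 'summary'"


def layered_verify(
    candidate: str,
    floor: List[Check],
    judge: Callable[[str], float],
    threshold: float = 0.7,
) -> Tuple[bool, str]:
    # Cheap, unambiguous checks run every time and short-circuit on failure.
    for check in floor:
        ok, why = check(candidate)
        if not ok:
            return False, f"floor: {why}"
    # The costlier, probabilistic layer runs only on output that cleared the floor.
    score = judge(candidate)
    if score < threshold:
        return False, f"judge: score {score:.2f} below {threshold:.2f}"
    return True, ""


floor_checks = [is_valid_json, has_summary_field]
judge_calls: List[str] = []


def stub_judge(candidate: str) -> float:
    judge_calls.append(candidate)  # record calls to show when the judge runs
    return 0.4 if "todo" in candidate.lower() else 0.9


r_bad_json = layered_verify("not json", floor_checks, stub_judge)
r_low_score = layered_verify('{"summary": "TODO"}', floor_checks, stub_judge)
r_ok = layered_verify('{"summary": "Retry logic added."}', floor_checks, stub_judge)
```

The malformed input is rejected at the floor without spending a judge call, which is the cost argument for ordering the layers this way. The judge's verdict remains probabilistic, so a pass here means "cleared both layers", not "proven correct".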

In practice

An agent writes a code change and does not consider it done when the edit is applied. A test suite runs against the change, and if a test fails the failure output goes back to the agent, which revises and runs the suite again, looping until the tests pass or a retry limit sends the change to a human. The trustworthy artifact is not the first edit the agent produced; it is the edit that survived the check, and the check, not the agent's confidence, is what the team relies on.
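The step where "the failure output goes back to the agent" might look like the helper below, which runs a test command and returns its captured output on failure. This is a sketch under stated assumptions: `run_suite` is a hypothetical name, the command is whatever the project already uses to run its tests, and a nonzero exit code is taken to mean failure.

```python
import subprocess
import sys
from typing import List, Tuple


def run_suite(command: List[str], timeout_s: float = 60.0) -> Tuple[bool, str]:
    """Run a test command; return (passed, failure_output)."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False, f"suite timed out after {timeout_s:.0f}s"
    if proc.returncode == 0:
        return True, ""
    # Keep both streams: the failing assertion is what the agent revises against.
    return False, (proc.stdout + proc.stderr).strip()


# Stand-in for a real suite: a one-line Python program with a failing assertion.
failing_command = [sys.executable, "-c", "assert 1 + 1 == 3, 'math is off'"]
passed, failure_output = run_suite(failing_command)

passing_command = [sys.executable, "-c", "pass"]
passed_ok, no_output = run_suite(passing_command)
```

In the failing case `failure_output` carries the traceback, including the assertion message, and that text is what would be handed to the agent for its next revision. The timeout keeps a hung suite from stalling the loop.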

Practical considerations

The first design decision is what the verifier can see, since a loop built around a check that misses the failure modes that matter is busywork dressed as rigor. Deterministic and judgment-based verifiers belong at different layers: put the cheap, unambiguous checks (does it build, does it pass the suite, does it satisfy the schema) at the floor where they run every time, and reserve a model judge or human review for the dimensions a rule cannot express. A loop needs a stopping rule, because an agent that revises indefinitely against a check it cannot satisfy burns cost without converging, so a retry limit and an escalation path are part of the design, not an afterthought. The feedback matters as much as the check, since a verifier that only reports pass or fail gives the agent less to work with than one that reports what failed and why. The loop is also not a substitute for the surrounding delivery controls, since tests, review, and continuous integration still catch what a given verifier was not built to see.
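The difference between a bare pass/fail and feedback that says what failed and why can be made concrete with a small report structure. The names below (`Failure`, `render_feedback`, and the example check) are hypothetical and only illustrate one possible shape for the message returned to the agent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Failure:
    check: str      # which check failed
    expected: str   # what the check required
    actual: str     # what the output contained instead
    hint: str = ""  # optional pointer toward a fix


def render_feedback(failures: List[Failure]) -> str:
    """Turn structured failures into text the agent can revise against."""
    if not failures:
        return "PASS"
    lines = [f"FAIL: {len(failures)} check(s) failed"]
    for failure in failures:
        line = f"- {failure.check}: expected {failure.expected}, got {failure.actual}"
        if failure.hint:
            line += f" ({failure.hint})"
        lines.append(line)
    return "\n".join(lines)


report = render_feedback(
    [Failure("schema", "field 'id'", "missing", "add an integer id")]
)
clean = render_feedback([])
```

A verifier that returned only `"FAIL"` would leave the agent to guess at the cause; the rendered report names the check, the expectation, and the observed value, which gives the next attempt something specific to change.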

Related standards and prior art

Defined by Ready Solutions AI