A review bar is the explicit evidence standard an AI-authored change must clear before it ships, owned by a named reviewer or role and defined independently of the checkpoint that enforces it: which checks must pass, what a reviewer must actually read, and what claim an approval certifies.

How it works

The bar is the standard and the gate is the mechanism that enforces it, and the two are separable: platform features like required status checks, approval counts, and code-owner routing can enforce the parts of a bar that are expressible as checks, counts, and routing, but they cannot decide what the bar should be. The classic statement of an owned bar comes from code-review practice: approve only when the change definitely improves the overall health of the system, with the reviewer carrying ownership and responsibility for what they approve. Agent authorship adds the pressure the bar has to survive: agent-authored changes arrive frequent, large, and polished-looking, and reviewers report feeling better about approving them, so without size caps and calibration the de facto standard falls without any policy ever being changed. An owned bar therefore names its clauses explicitly: tests not weakened or deleted, edge cases exercised for the touched paths, security boundaries validated, oversized changes rejected as unreviewable. It also names which clauses automation certifies and which a human must read, so the human clause stays small enough to honor at volume. Accountability completes it: someone owns the bar, with standing to change it and the obligation to defend what an approval under it means.
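The separation described above, a bar with an owner and explicit clauses that each name who certifies them, can be sketched as plain data. This is a minimal illustration rather than a prescribed schema: the class names, the `platform-review-lead` owner, and the assignment of each clause to automation or a human are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Certifier(Enum):
    AUTOMATION = "automation"  # expressible as a required check
    HUMAN = "human"            # a reviewer must actually read this


@dataclass(frozen=True)
class Clause:
    name: str
    claim: str                 # what an approval certifies
    certifier: Certifier


@dataclass(frozen=True)
class ReviewBar:
    owner: str                 # named reviewer or role accountable for the bar
    clauses: tuple[Clause, ...]

    def human_clauses(self) -> list[Clause]:
        """Clauses a reviewer has to honor by reading."""
        return [c for c in self.clauses if c.certifier is Certifier.HUMAN]

    def automated_clauses(self) -> list[Clause]:
        """Clauses a gate can enforce mechanically."""
        return [c for c in self.clauses if c.certifier is Certifier.AUTOMATION]


def approval_is_valid(bar: ReviewBar, evidenced: set[str]) -> bool:
    """An approval holds only if every clause has recorded evidence."""
    return all(c.name in evidenced for c in bar.clauses)


# Hypothetical bar built from the clauses named in the text; which clauses
# go to automation versus a human is an assumption for this sketch.
BAR = ReviewBar(
    owner="platform-review-lead",
    clauses=(
        Clause("tests_not_weakened", "No test was weakened or deleted", Certifier.AUTOMATION),
        Clause("edge_cases", "Edge cases are exercised for the touched paths", Certifier.HUMAN),
        Clause("security_boundaries", "Security boundaries are validated", Certifier.HUMAN),
        Clause("size_cap", "The change is small enough to be reviewable", Certifier.AUTOMATION),
    ),
)
```

Keeping the bar as data, separate from any gate configuration, is what lets the human clause list be counted and kept small.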

Why it matters

Review evidence is what everything downstream leans on: an audit trail that says a human approved this assumes the approval certified something, and a drifted bar quietly converts that evidence into noise. Substituting machine review for the human clause lowers the bar silently unless the tool is proven against it: on benchmarks built from human-review ground truth, review agents resolve only a minority of the cases and weight different aspects, evidence the benchmark authors read as positioning them as complements rather than replacements. The bar is also a budget: maintaining it at agent volume is real labor, unevenly distributed across changes, and an unowned bar degrades to whatever the busiest reviewer can sustain that week. Naming the bar separately from the gate gives drift a reference point: the gate's configuration can be identical in January and June while approvals certify far less in practice. The honest limit is that a bar is a floor on evidence, not a guarantee of quality: a change can clear every stated clause and still be wrong in a way no clause anticipated.

In practice

A team turns on agent-authored pull requests with a merge gate requiring green builds and one approval, and within weeks an approval means little more than a click. They write the bar down: quality thresholds may not be weakened, deleted tests block, touched paths need exercised edge cases, security-sensitive files route to their owners, and changes above a stated size are split before review. The mechanical clauses become required checks, and the human clause shrinks to reading the flagged sections and the consequential parts of the diff. Approval volume stays the same, but an approval once again certifies something a teammate can rely on.
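The mechanical clauses in this scenario could be turned into a check along the following lines. It is a sketch under stated assumptions: the 400-line cap, the glob patterns for tests and security-sensitive paths, and the `FileChange` shape are all hypothetical, and a real team would read the changes from its own platform's diff data.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

MAX_CHANGED_LINES = 400                      # assumed size cap
TEST_GLOBS = ("tests/*", "*_test.py")        # assumed test layout
SENSITIVE_GLOBS = ("auth/*", "secrets/*")    # assumed security-sensitive paths


@dataclass(frozen=True)
class FileChange:
    path: str
    added: int
    deleted: int
    removed: bool = False  # True when the file itself was deleted


def _matches(path: str, globs: tuple[str, ...]) -> bool:
    # fnmatchcase's "*" also matches "/" so "tests/*" covers nested files.
    return any(fnmatchcase(path, g) for g in globs)


def check_mechanical_clauses(changes: list[FileChange]) -> dict[str, list[str]]:
    """Return blocking violations and the paths that must go to their owners."""
    blocking: list[str] = []

    # Clause: deleted tests block.
    deleted_tests = [c.path for c in changes if c.removed and _matches(c.path, TEST_GLOBS)]
    if deleted_tests:
        blocking.append("deleted tests: " + ", ".join(deleted_tests))

    # Clause: changes above a stated size are split before review.
    total = sum(c.added + c.deleted for c in changes)
    if total > MAX_CHANGED_LINES:
        blocking.append(f"size: {total} changed lines exceeds {MAX_CHANGED_LINES}")

    # Clause: security-sensitive files route to their owners.
    route = [c.path for c in changes if _matches(c.path, SENSITIVE_GLOBS)]

    return {"blocking": blocking, "route_to_owners": route}
```

Anything this function cannot express, such as whether the exercised edge cases are the right ones, stays in the human clause.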

Practical considerations

Write the bar as claims an approval certifies, not as exhortations to review carefully, because only claims can be checked, delegated, or audited. Give each clause its evidence and its outcome on violation, and keep the bar versioned next to the gate configuration it drives. Promote high-signal clauses a machine can hold into required checks, run newer or noisier ones in advisory mode until they prove out, and treat the remainder as the human clause, sized to be honorable on the worst day. Route by risk, since code-owner rules concentrate scarce review depth on the paths where a missed defect is expensive. Watch for drift signals, such as approvals getting faster while changes grow larger or review comments thinning, and make them visible by sampling approved changes against the bar on a schedule rather than trusting the written document. Revisit the bar on a cadence as agent volume grows, because a standard set for human-authored volume will not survive machine-authored volume unchanged.
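The drift signals named above can be made visible with a small comparison between a baseline sample of approved changes and a current one. The sketch below is illustrative only: the `ApprovedChange` fields and the choice of comparing medians are assumptions, and a real audit would also read the sampled changes against the bar itself rather than rely on these proxies.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class ApprovedChange:
    minutes_to_approve: float
    changed_lines: int
    review_comments: int


def _median_of(sample: list[ApprovedChange], field: str) -> float:
    # Both samples must be non-empty; median() raises on empty input.
    return median(getattr(change, field) for change in sample)


def drift_signals(baseline: list[ApprovedChange],
                  current: list[ApprovedChange]) -> list[str]:
    """Compare two samples of approved changes and name any drift signals."""
    signals: list[str] = []

    faster = _median_of(current, "minutes_to_approve") < _median_of(baseline, "minutes_to_approve")
    larger = _median_of(current, "changed_lines") > _median_of(baseline, "changed_lines")
    thinner = _median_of(current, "review_comments") < _median_of(baseline, "review_comments")

    # Signal: approvals getting faster while changes grow larger.
    if faster and larger:
        signals.append("approvals faster while changes grow larger")
    # Signal: review comments thinning.
    if thinner:
        signals.append("review comments thinning")
    return signals
```

A non-empty result is a prompt to sample the underlying changes, not proof that the bar has dropped.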

Related standards and prior art

Defined by Ready Solutions AI