A fail-open gate is an enforcement control around an AI agent that lets the action proceed when the control itself fails, whether it was skipped, returned the wrong signal, fired on the wrong route, or errored silently, leaving a system that looks guarded in configuration while being unguarded in behavior.
How it works
Security engineering has long named the weakness: a system that falls back to a less secure, more permissive state when it hits an error condition is failing open, and the inverse requirement, fail-secure, terminates function in a way that preserves the secure state. Agentic enforcement stacks inherit the same physics at every layer: an interceptor whose error or wrong exit signal is read as proceed, a deny rule that a different code path never consults, an approval step that a crafted input routes around, a validator that passes when its input does not parse. Classic permissive-on-error handling and never-consulted routing bypasses are named together here because the observable result is identical. The defining property is the asymmetry between appearance and behavior: the control is present in configuration, visible in review, and absent at the moment of enforcement. Documented incidents and disclosed vulnerabilities trace the same shapes: an approval check that did not inspect commands wrapped in substitution syntax, policy enforced in the interface but missing at the API the agent actually calls, and permissive defaults inherited silently. What makes the agentic case sharper is the caller: a tireless agent exercises the gate at machine rate, so a fail-open route is not a rare unlucky path but one whose chance of being hit grows with every run, fastest under broad autonomy, varied inputs, or adversarial pressure. The countermeasure is a design default, deny on the gate's own failure, plus testing the gate the way the agent actually reaches it, because configuration review cannot see runtime routing.
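The deny-on-own-failure default can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `Decision`, `fail_closed`, and `parse_and_check` are hypothetical names, the "verb:path" action format is an assumption, and timeouts are left out for brevity.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


def fail_closed(check: Callable[[str], Decision]) -> Callable[[str], Decision]:
    """Wrap a gate check so that the gate's own failure is a denial."""

    def guarded(action: str) -> Decision:
        try:
            result = check(action)
        except Exception:
            # The gate itself errored: deny rather than proceed unguarded.
            return Decision.DENY
        # Only an explicit ALLOW lets the action through; None, a stray
        # string, or any other wrong signal is treated as a denial.
        return Decision.ALLOW if result is Decision.ALLOW else Decision.DENY

    return guarded


def parse_and_check(action: str) -> Decision:
    # Hypothetical validator: expects "verb:path" and raises on anything else.
    verb, path = action.split(":")
    if verb == "write" and path.startswith("/protected/"):
        return Decision.DENY
    return Decision.ALLOW


gate = fail_closed(parse_and_check)
```

The design choice is that the wrapper, not each individual check, owns the failure mode, so a new check cannot inherit permissive-on-error behavior by omission.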
Why it matters
A fail-open control subtracts safety rather than merely failing to add it, because the design leans on it: reviewers sample less, autonomy widens, and the audit trail keeps certifying protection that the gate no longer provides. Standards guidance for agent security is explicit on the requirement: deny the operation when risk classification, approval validation, policy lookup, or audit logging fails, rather than proceeding while the safety machinery is down. The class is invisible to health signals that only count executions and blocks, since a gate that never blocks looks identical to a gate that never needed to block; making it visible takes deny probes, coverage counters, and alerts on the absence of expected denials. Fail-closed has a real cost in availability: a gate that blocks on its own uncertainty stalls legitimate work every time it errors, which is why teams quietly configure fail-open and why the failure class persists. The honest framing is a trade made explicitly, layer by layer: fail closed and pay in uptime, or fail open and pay in the assurance the rest of the system was built on.
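One way to make the absence of denials visible is to count probes separately from ordinary traffic. The sketch below assumes a hypothetical `GateHealth` counter fed by the gate's caller; the field names, the alert strings, and the `min_evaluations` threshold of 100 are illustrative choices, not values from any standard.

```python
from dataclasses import dataclass


@dataclass
class GateHealth:
    evaluations: int = 0    # every decision the gate returned
    denials: int = 0        # decisions that blocked the action
    probes_sent: int = 0    # attempts that are expected to be blocked
    probes_denied: int = 0  # probes the gate actually blocked

    def record(self, denied: bool, is_probe: bool = False) -> None:
        self.evaluations += 1
        if denied:
            self.denials += 1
        if is_probe:
            self.probes_sent += 1
            if denied:
                self.probes_denied += 1

    def alerts(self, min_evaluations: int = 100) -> list[str]:
        found = []
        if self.probes_sent > self.probes_denied:
            # A request that must be blocked was approved.
            found.append("deny probe was approved")
        if self.probes_sent == 0:
            # Coverage gap: nothing has tested the blocking path.
            found.append("no deny probes sent")
        if self.evaluations >= min_evaluations and self.denials == 0:
            # Silence is a prompt to probe, not evidence of compliance.
            found.append("no denials observed")
        return found
```

A gate that approved 100 ordinary requests and was never probed raises two alerts here, which is the distinction a plain execution counter cannot draw.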
In practice
A team registers a pre-execution hook that blocks writes to protected paths, and a later configuration change adds a broad allow rule that short-circuits rule evaluation before the hook is consulted. Every component-level inspection looks correct: the hook is registered, the rule list contains the deny, and the tests that call the hook directly all pass. A scheduled gate probe, an attempt at exactly the write the gate exists to block, comes back approved, which is the one signal in the whole stack that tells the truth. The fix is small; without a precedence audit or a calling-side probe, finding it usually waits for the incident.
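The scenario can be reproduced in miniature. In this sketch every name (`protected_path_hook`, `evaluate_first_match`, `evaluate_deny_first`, `gate_probe`, the `/protected/` prefix) is hypothetical, and the first-match evaluator is an assumed stand-in for whatever precedence logic a real stack uses.

```python
from typing import Callable, List, Tuple

Rule = Tuple[str, str]  # (effect, path prefix); effect is "allow" or "deny"

# The broad allow was added later but sits ahead of the deny.
RULES: List[Rule] = [("allow", "/"), ("deny", "/protected/")]


def protected_path_hook(path: str) -> bool:
    """Return True when the write must be blocked."""
    return path.startswith("/protected/")


def evaluate_first_match(rules: List[Rule], path: str,
                         hook: Callable[[str], bool]) -> str:
    # Fail-open precedence: the first matching rule returns immediately,
    # so a broad allow means neither the deny rule nor the hook is consulted.
    for effect, prefix in rules:
        if path.startswith(prefix):
            return "approved" if effect == "allow" else "blocked"
    return "blocked" if hook(path) else "approved"


def evaluate_deny_first(rules: List[Rule], path: str,
                        hook: Callable[[str], bool]) -> str:
    # Blocking checks run before any allow can short-circuit them.
    if hook(path):
        return "blocked"
    if any(e == "deny" and path.startswith(p) for e, p in rules):
        return "blocked"
    if any(e == "allow" and path.startswith(p) for e, p in rules):
        return "approved"
    return "blocked"  # no rule matched: default deny


def gate_probe(evaluate, rules: List[Rule],
               hook: Callable[[str], bool]) -> bool:
    """Calling-side probe: True only if a protected write is blocked."""
    return evaluate(rules, "/protected/probe.txt", hook) == "blocked"
```

Calling `protected_path_hook` directly still returns the right answer under both evaluators; only `gate_probe`, which goes through the same entry point the agent uses, separates the two.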
Practical considerations
Default enforcement layers that guard irreversible or high-consequence actions to deny on their own error, timeout, or unparseable input; advisory and detective layers can fail soft, shedding privilege or queuing work instead of blocking, so long as the choice is recorded rather than inherited. Verify precedence end to end, because an allow rule that short-circuits evaluation before a blocking check runs is a fail-open route even when every individual rule is correct in isolation. Test gates from the calling side on a schedule, with attempts that should be blocked, since a gate's own unit tests can't see the routing around it. Treat a long silence of block events as a prompt to probe rather than as evidence of compliance. Where fail-closed is genuinely too costly for a layer, name the fail-open choice explicitly and compensate with an independent layer behind it. Audit the audit, because a logging layer that fails open mints gaps precisely where the record matters most.
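Recording the failure mode rather than inheriting it can be enforced mechanically. The following sketch assumes a hypothetical registry of `Layer` records and a `validate` function that flags any non-deny choice lacking a rationale, and any fail-open or irreversible-action layer lacking a distinct compensating layer; the exact policy is an illustrative reading of the guidance above, not a prescribed one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class OnFailure(Enum):
    DENY = "deny"
    FAIL_SOFT = "fail_soft"  # shed privilege or queue work
    FAIL_OPEN = "fail_open"


@dataclass(frozen=True)
class Layer:
    name: str
    guards_irreversible: bool
    on_failure: OnFailure
    compensating_layer: Optional[str] = None
    rationale: str = ""


def validate(layers: Dict[str, Layer]) -> List[str]:
    problems = []
    for layer in layers.values():
        if layer.on_failure is OnFailure.DENY:
            continue  # the default needs no justification
        mode = layer.on_failure.value
        if not layer.rationale:
            problems.append(f"{layer.name}: {mode} without a recorded rationale")
        needs_backup = (layer.on_failure is OnFailure.FAIL_OPEN
                        or layer.guards_irreversible)
        has_backup = (layer.compensating_layer in layers
                      and layer.compensating_layer != layer.name)
        if needs_backup and not has_backup:
            problems.append(
                f"{layer.name}: {mode} without an independent compensating layer")
    return problems
```

Run in CI against the deployed configuration, a check like this turns a quietly configured fail-open into a reviewable line item.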
Related standards and prior art
- MITRE CWE-636: not failing securely (failing open) · continuously updated · the named weakness class: a design that falls back to a less secure, more permissive state on error, listed under the alternate term failing open
- NIST CSRC glossary: fail secure · continuously updated · the inverse termination mode: system function ends in a way that prevents loss of the secure state when a failure occurs or is detected
- OWASP: AI agent security cheat sheet · continuously updated · the agent-specific design requirement: fail closed when risk classification, approval validation, policy lookup, or audit logging fails
- OWASP GenAI: exploit round-up report Q1 2026 · 2026-04-14 · quarterly incident analysis documenting production agent enforcement layers that did not hold, including approval gates that destructive actions ignored
- Fog Security via Help Net Security: Amazon Quick chat agent authorization bypass · 2026-05-12 · a documented production instance of the interface-versus-API shape: restrictions enforced in the user interface while the agent-facing API behind it carried no authorization check
- Oso: AI agents gone rogue incident registry · continuously updated · a maintained registry of agent enforcement failures, including a disclosed command-validation bypass via shell process substitution that let an agent skip the human approval step
- Claude Code docs: hooks reference · continuously updated · a production interceptor contract in which blocking requires the documented signal for that event and most other failures proceed, the shape that makes a wrong signal a silent pass
Defined by Ready Solutions AI