You wrote the rule into CLAUDE.md. Then you moved it to the top of the file. Then you bolded it, and a week later you typed it again in ALL CAPS. Claude read it, summarized it back to you, and four tool calls later did the thing anyway.
I hear this story in nearly every training I run; at a 90-minute session with 400 engineers, PMs, and ops folks in May, the most common shape of it was "Claude Code is ignoring our rules." Anthropic's own best-practices page names the mechanism without flinching: "If Claude keeps doing something you don't want despite having a rule against it, the file is probably too long and the rule is getting lost." An ignored rule is not an emphasis problem. It is a routing problem, and Claude Code hooks are the layer to route it to. Alongside permission rules, hooks are the place in the configuration stack where a rule stops being a request the model weighs and becomes a gate the harness enforces. Of the two, hooks are the only one you can program.
This is a visual guide to that layer. The lifecycle map of where hooks fire. The gate that decides allow, ask, or deny before a tool runs. And the enforcement ladder that explains why the layers above it keep dropping your rules.
Where Claude Code hooks fire: the lifecycle map
A Claude Code hook is a handler, most commonly a shell command, that you register against a named event in the session lifecycle. It runs when that event fires: before a tool call, when you submit a prompt, when the session starts, when context is about to compact. Anthropic's hooks guide states the contract in one sentence: hooks "provide deterministic control over Claude Code's behavior, ensuring certain actions always happen rather than relying on the LLM to choose to run them."
That last clause is the entire story. Everything else in your configuration is something the model reads. A hook is something the harness executes.
The loop in the middle is the workhorse. In the parent session, every tool call Claude proposes, whether it is a file edit, a shell command, or a web fetch, passes through PreToolUse on the way in and PostToolUse on the way out, and that cycle repeats dozens or hundreds of times in a working session. (That "parent session" qualifier is load-bearing; the last section explains why.) The events outside the loop are quieter but just as useful. Two of my eight production hooks never block anything: they run at SessionStart and inject context instead. One reloads blog-pipeline state; the other re-injects the five directives my CLAUDE.md most needs Claude to remember, at startup and again after every compaction, because compaction is precisely where advisory text gets summarized into mush.
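A context-injecting hook of the kind described above can be very small. The following is a minimal sketch, not my production hook: it assumes that stdout from a SessionStart hook exiting 0 is added to Claude's context and that the event JSON carries a `source` field (such as `startup` or `compact`). The directive text and the function name `build_context` are hypothetical.

```python
#!/usr/bin/env python3
"""SessionStart hook sketch: re-inject a short list of directives as context."""
import json
import sys

# Hypothetical directives; substitute the handful your CLAUDE.md most needs kept.
DIRECTIVES = [
    "Never commit directly to main.",
    "Never force-push; open a PR instead.",
]


def build_context(event: dict) -> str:
    """Return the text this hook prints for the harness to add to context."""
    # `source` is assumed to say why the session started (e.g. startup, compact).
    source = event.get("source", "unknown")
    lines = [f"Directives re-injected at session start (source: {source}):"]
    lines += [f"- {directive}" for directive in DIRECTIVES]
    return "\n".join(lines)


def main() -> int:
    raw = sys.stdin.read()
    event = json.loads(raw) if raw.strip() else {}
    print(build_context(event))
    return 0  # this hook never blocks anything


if __name__ == "__main__":
    main()
```

Because the hook only prints and exits 0, it cannot block a session; the worst failure is a missing reminder.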
PreToolUse: the gate that doesn't negotiate
Before any tool executes, the harness hands your hook the full proposal as JSON on stdin: which tool, the exact command or file path, the working directory, the session's permission mode. The hook answers one of two ways: through its exit code, or by exiting 0 and printing a JSON verdict. The docs are explicit that you pick one channel per hook; JSON printed alongside exit code 2 is ignored.
The two channels have different tempers. Exit code 2 is the blunt one: the call is blocked, and whatever the hook wrote to stderr goes back to Claude as the reason, so the model learns why instead of retrying blind. The JSON channel is finer-grained. Through it, the hooks reference documents permissionDecision verdicts of allow, deny, ask, and defer, with permissionDecisionReason carrying the explanation back to Claude; exit 0 with no output is the documented way to wave the call through to the normal permission flow. Here is what the conversation between harness and hook looks like:
Harness to hook, on stdin:

    {
      "session_id": "f3a91c",
      "cwd": "/Users/you/repo",
      "permission_mode": "default",
      "hook_event_name": "PreToolUse",
      "tool_name": "Bash",
      "tool_input": { "command": "git push --force" }
    }

Hook to harness, on stdout:

    {
      "hookSpecificOutput": {
        "hookEventName": "PreToolUse",
        "permissionDecision": "deny",
        "permissionDecisionReason": "Force-push blocked. Open a PR."
      }
    }

Why does placement in the stack matter so much? Anthropic's permissions documentation draws the line in two sentences: "Permission rules are enforced by Claude Code, not by the model. Instructions in your prompt or CLAUDE.md shape what Claude tries to do, but they don't change what Claude Code allows." Hooks live on the enforcement side of that line.
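For the JSON channel, a hook that produces the deny verdict from the force-push exchange might look like the sketch below. It is an illustration under assumptions, not a hardened gate: `is_force_push` and `decide` are hypothetical names, and the detection is a token heuristic that will miss variants such as a `+refspec` push.

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: deny `git push --force` through the JSON channel."""
import json
import shlex
import sys


def is_force_push(command: str) -> bool:
    """Heuristic: the command mentions git, push, and a force flag."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes; fall back to a plain split
        tokens = command.split()
    if "git" not in tokens or "push" not in tokens:
        return False
    return any(t == "-f" or t.startswith("--force") for t in tokens)


def decide(event: dict):
    """Return a deny verdict, or None to leave the call to the normal flow."""
    if event.get("tool_name") != "Bash":
        return None
    command = event.get("tool_input", {}).get("command", "")
    if not is_force_push(command):
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": "Force-push blocked. Open a PR.",
        }
    }


def main() -> int:
    raw = sys.stdin.read()
    event = json.loads(raw) if raw.strip() else {}
    verdict = decide(event)
    if verdict is not None:
        print(json.dumps(verdict))
    return 0  # exit 0 on this channel; the JSON carries the decision


if __name__ == "__main__":
    main()
```

Printing nothing and exiting 0 is the pass-through case, so every command the heuristic does not recognize falls back to the normal permission flow.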
One asymmetry is worth internalizing before you write your first gate. PreToolUse fires before execution and can prevent it. PostToolUse fires after the tool has already run, so a blocking exit there produces a warning, not a rollback. If the rule is "this must never happen," it belongs in front of the action.
The enforcement ladder: why your rule keeps slipping
Use hooks for actions that must happen every time with zero exceptions.
Picture your configuration stack as a ladder. Each rung down trades flexibility for reliability.
The advisory rungs share a failure mode: they are text in a context window, competing with everything else in it. That is not a Claude-specific weakness. A December 2025 study of instruction-following reliability tested 46 models on 541 instructions, each expanded with nine variant phrasings (rewordings, added distractors, reshuffled constraints). GPT-5, the best performer in the study, dropped from 95.9% on the benchmark prompts to 78.4% reliable adherence across the variants, an 18.3% relative decline. Smaller models lost half their reliability. Probabilistic adherence is what language models do. Nine out of ten is a fine hit rate for a tone preference. It is a terrible one for a force-push rule.
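To see why nine out of ten is a terrible rate for a hard rule, it helps to compound it. The sketch below is back-of-envelope arithmetic, not a result from the study: it assumes each attempt is independent and uses a flat 0.9 per-call adherence as an illustrative figure.

```python
def chance_of_zero_misses(per_call_adherence: float, calls: int) -> float:
    """Probability a rule holds on every one of `calls` independent attempts."""
    return per_call_adherence ** calls


# Illustrative only: 0.9 stands in for "nine out of ten".
for calls in (1, 10, 50):
    print(calls, round(chance_of_zero_misses(0.9, calls), 3))
```

Under those assumptions, a rule followed 90% of the time survives ten relevant tool calls without a single miss only about 35% of the time.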
| Layer | How it's applied | Can it block a tool call? |
|---|---|---|
| Chat prompt | Read once, weighed in context | No |
| CLAUDE.md + memory | Loaded each session, weighed | No |
| Skills + agent prompts | Loaded on trigger, weighed | No |
| Permission rules | Consulted by the harness | Yes |
| Hooks | Executed by the harness | Yes, before it runs |
The instinct I watch engineers reach for, and the one I reached for myself for weeks, is to write the slipping rule louder. More emphasis, more repetition, a sterner tone. That escalates volume on the same rung. The diagnostic that works is different: a rule that keeps getting ignored, and whose misses you cannot afford, is the harness telling you it lives on the wrong rung. Demote it. The decision tree I published in April walks the full routing question for CLAUDE.md, settings, skills, and hooks, and the governance guide for engineering managers makes the team-scale version of the argument: mandated standards belong in hooks and managed settings, not in prose that each session may or may not honor.
The rule as prose
- 'Never commit directly to main' sits at line 40 of CLAUDE.md
- Re-read every session, weighed against the whole context
- Holds until context gets crowded, then quietly slips
The rule as a hook
- A PreToolUse script fires on every git commit call
- Exits 2 on a protected branch; stderr tells Claude why
- Holds on the first call and the four-hundredth
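The hook version of that rule can be sketched in a few dozen lines. This is a minimal example under stated assumptions, not the gate from my repo: the `PROTECTED` branch set and the helper names are hypothetical, and the substring match on `git commit` is deliberately crude.

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: block `git commit` on a protected branch (exit code 2)."""
import json
import subprocess
import sys

PROTECTED = {"main", "master"}  # hypothetical; list your own protected branches


def is_git_commit(event: dict) -> bool:
    """True if the proposed tool call is a Bash command containing `git commit`."""
    if event.get("tool_name") != "Bash":
        return False
    return "git commit" in event.get("tool_input", {}).get("command", "")


def should_block(event: dict, branch: str) -> bool:
    """Block only when a commit is proposed while on a protected branch."""
    return is_git_commit(event) and branch in PROTECTED


def current_branch(cwd: str) -> str:
    """Ask git for the checked-out branch; empty string if that fails."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            cwd=cwd or None, capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def main() -> int:
    raw = sys.stdin.read()
    event = json.loads(raw) if raw.strip() else {}
    if is_git_commit(event) and should_block(event, current_branch(event.get("cwd", ""))):
        # stderr is what Claude sees as the reason for the block.
        print("Direct commits to a protected branch are blocked. "
              "Create a feature branch and open a PR.", file=sys.stderr)
        return 2
    return 0


if __name__ == "__main__":
    code = main()
    if code:
        sys.exit(code)
```

Note the design choice: if the branch cannot be determined, this sketch allows the commit. A stricter gate would fail closed and return 2 in that case.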
What deterministic does not mean
I would love to end there: move the rule down, sleep well. My own incident log says the bottom rung needs a skeptic too.
In May 2026 I watched one of my own gates get skipped four sessions in a row. The hook was registered, tested in isolation, green. The cause was an interaction one layer up: a permissions.allow rule matched the command, and the harness approved the call through a path where my hook never ran. The fix was one paired deny rule in the tracked settings file; the proof was a fresh-session probe that finally showed the block firing. The lesson generalizes past my repo: enforcement layers interact, and the only test that counts is the assembled system, not the unit.
I am not the only one cataloguing edges. A community RFC in the official claude-code repository collects the documented failure modes in one place, and they cluster into four buckets. First, tool calls inside a dispatched subagent do not fire the parent session's PreToolUse hooks. Second, a model with write access to settings can edit its own allowlist. Third, a blocked tool can be routed around: block the Write tool and the same file edit can go through a Bash heredoc instead. The official docs add a caveat of their own here: the hook-level if filter is best-effort, and for a hard allow or deny they point you at the permission system. The fourth bucket is determinism cutting both ways. Check Point researchers showed that Claude Code could be tricked into executing code from a malicious repository, its hooks included, before the developer ever accepted the startup trust dialog. That one is CVE-2025-59536, patched in version 1.0.111; Check Point published the full write-up in February 2026. The property that makes a hook reliable for you makes it reliable for an attacker.
None of this sends me back up the ladder. It means the bottom rung is a layer, not a guarantee, and you treat it the way you treat any production system: defense in depth plus verification. My current fleet is eight hooks. Two inject context at session start. Six are gates on writes, merges, shell probes, and subagent dispatches; the pair guarding parallel sessions took about an hour to build in late May, after three parallel sessions corrupted each other's work in this very repo. The numbers since are one repo's evidence, and they are why I trust the rung:
FAQ
What is the difference between Claude Code hooks and CLAUDE.md?
CLAUDE.md is advisory context. Claude reads it at session start and weighs it against everything else in the window, and Anthropic's docs warn that rules in a long file get lost in the noise. A hook is a handler, most commonly a shell command, that the harness executes at a named lifecycle event. The model does not get a vote on whether it runs.
Can a hook block a tool call before it runs?
Yes. A PreToolUse hook fires before execution and blocks two ways: exit code 2 with stderr fed back to Claude, or a JSON permissionDecision of deny. PostToolUse cannot prevent anything; it fires after the tool has already run.
Do hooks fire for subagent tool calls?
Parent-session PreToolUse hooks do not fire for tool calls made inside a dispatched subagent. If the rule must hold across multi-agent work, cover the subagent surface explicitly and pair the hook with permissions deny rules rather than relying on the parent hook alone.
Where are hooks configured?
In the settings hierarchy: ~/.claude/settings.json for user-global hooks, .claude/settings.json for project hooks you commit and share, .claude/settings.local.json for personal project-local ones, plus managed policy settings and plugins. Each entry names the event, an optional matcher, and the command.
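As a rough illustration of the shape of an entry (event, optional matcher, command), the sketch below builds a hooks fragment and prints it as JSON you could adapt for .claude/settings.json. It is generated from Python only to keep this guide's examples in one language. The script paths are hypothetical, and the use of `$CLAUDE_PROJECT_DIR` to anchor them is an assumption about your setup.

```python
import json


def build_settings() -> dict:
    """Return a hooks fragment: one PreToolUse gate and one SessionStart injector."""
    return {
        "hooks": {
            "PreToolUse": [
                {
                    "matcher": "Bash",  # only Bash tool calls reach this hook
                    "hooks": [
                        {
                            "type": "command",
                            # hypothetical script path
                            "command": 'python3 "$CLAUDE_PROJECT_DIR"/.claude/hooks/block_force_push.py',
                        }
                    ],
                }
            ],
            "SessionStart": [
                {
                    # no matcher here: the matcher is optional
                    "hooks": [
                        {
                            "type": "command",
                            # hypothetical script path
                            "command": 'python3 "$CLAUDE_PROJECT_DIR"/.claude/hooks/inject_directives.py',
                        }
                    ],
                }
            ],
        }
    }


print(json.dumps(build_settings(), indent=2))
```

Committing the fragment to the project settings file shares the gate with the whole team; the same structure in the local settings file keeps it personal.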
Route by required reliability
The ladder gives you the routing rule in one line: match the rung to the cost of a miss. A style preference can live in CLAUDE.md and miss occasionally; nobody gets paged. A rule whose tenth-time failure is a force-push, a leaked key, or a corrupted worktree belongs where the model cannot outvote it, behind a gate that runs every time. When a rule slips, resist the louder rewrite. Demote it instead, and start with the routing decision tree if you want the full map of which layer owns what.
Building this enforcement layer for teams is a large share of what my Claude Code infrastructure work looks like in practice. If you want help with yours, book 15 minutes and bring the rule Claude ignored last week. We'll find the rung it belongs on.