Context engineering is the practice of deliberately curating what an AI agent holds in its context window (the system prompt, retrieved data, tool results, memory, and conversation history) and treating that content as an engineered system rather than as ad-hoc prompt wording.
How it works
Context engineering starts from the premise that a context window is finite and has diminishing returns as it fills, so the discipline is finding the smallest set of high-signal tokens that gets the desired outcome. The system prompt, the tool definitions, retrieved documents, prior tool results, and the running conversation all compete for the same budget, and each is a lever the engineer sets rather than a fixed given. Three recurring moves manage the budget over a long task: compaction distills a full window into a high-fidelity summary so the agent can continue past its limit, tool-result clearing drops old re-fetchable results while keeping the record that the call happened, and memory writes durable notes to storage outside the window so progress survives across sessions. Subagents are a further lever, isolating a piece of work in its own window so its intermediate tokens never enter the main one. The work is deciding, per task, which of these to apply and how aggressively.
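The three moves described above can be sketched in a few small functions. This is an illustrative sketch only, not any vendor's API: the `Message` class, the `refetchable` flag, the word-count tokenizer, and the `summarize` callback (which would be a model call in a real system) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

CLEARED = "[result cleared; re-run the tool call if needed]"


@dataclass
class Message:
    role: str                  # "system", "user", "assistant", or "tool"
    content: str
    refetchable: bool = False  # hypothetical flag: cheap to fetch again


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def window_size(messages: list[Message]) -> int:
    return sum(count_tokens(m.content) for m in messages)


def clear_tool_results(messages: list[Message], keep_last: int = 1) -> list[Message]:
    """Stub out old re-fetchable tool results but keep the message itself,
    so the record that the call happened stays in the history."""
    idx = [i for i, m in enumerate(messages) if m.role == "tool" and m.refetchable]
    to_clear = set(idx[: max(len(idx) - keep_last, 0)])
    return [
        Message(m.role, CLEARED, m.refetchable) if i in to_clear else m
        for i, m in enumerate(messages)
    ]


def compact(
    messages: list[Message],
    summarize: Callable[[list[Message]], str],
    keep_recent: int = 2,
) -> list[Message]:
    """Replace older non-system messages with a single summary message.
    Assumes system messages sit at the start of the history."""
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    split = max(len(rest) - keep_recent, 0)
    if split == 0:
        return list(messages)
    summary = Message("user", "Summary of earlier work: " + summarize(rest[:split]))
    return system + [summary] + rest[split:]


def save_note(memory: dict[str, str], key: str, note: str) -> None:
    # Memory lives outside the window; a dict stands in for a file or database.
    memory[key] = note
```

Clearing is the cheaper move, because it keeps the shape of the history and only drops payloads that can be fetched again. Compaction is lossier, since everything older than `keep_recent` survives only through whatever the summary preserves.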
Why it matters
Prompt wording is a small part of why an agent succeeds or fails on a long task; what is in the window when it acts matters more, and that is engineered, not phrased. Treating context as a system is what lets an agent run past the length of a single window with far less degradation: a well-compacted history, cleared stale results, and durable memory keep the signal high where a window that simply fills up dilutes it. The trade-off is that most of these moves trade away information, so an over-aggressive compaction or an eager tool-result clear can drop the one detail the task later needed, and the engineering is in tuning that loss rather than avoiding it. Context engineering also does not make a model correct; it makes the model's inputs legible and controllable, which is necessary for reliability but not sufficient for it.
In practice
A research agent works across many documents over a long session. Instead of loading every document into one window, it reads each in turn, writes a short structured note to memory, and clears the document from context once the note is taken. When the window approaches its limit, the running history is compacted into a summary and the agent continues, carrying its notes forward rather than the raw documents. The agent finishes a task far longer than a single window because the context was curated at each step, not because the window was large.
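The loop in this scenario can be sketched as follows. It is a toy model under stated assumptions: `take_note` is a hypothetical stand-in for the model call that writes a structured note, tokens are approximated by word counts, and the compaction step writes a fixed summary line where a real agent would generate one.

```python
def tokens(context: list[str]) -> int:
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return sum(len(item.split()) for item in context)


def take_note(doc_id: str, text: str) -> str:
    # Hypothetical stand-in for a model call that writes a structured note.
    return f"{doc_id}: {text.split('.')[0].strip()}"


def run_research(documents: dict[str, str], budget: int):
    memory: list[str] = []   # durable notes, stored outside the window
    context: list[str] = []  # what the agent holds in its window right now
    peak = 0                 # largest window size seen during the run
    for doc_id, text in documents.items():
        context.append(text)                         # read one document
        peak = max(peak, tokens(context))
        memory.append(take_note(doc_id, text))       # write the note to memory
        context[-1] = f"Read {doc_id}; note saved."  # clear raw text, keep a record
        if tokens(context) > budget:                 # window is near its limit
            context = [f"Summary: {len(memory)} documents read; notes are in memory."]
    return memory, context, peak
```

Tracking `peak` makes the point of the scenario measurable: the largest window the agent ever holds is bounded by the budget plus one document, however many documents the task covers.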
Practical considerations
The levers differ in what they cost: compaction trades fidelity for room, tool-result clearing assumes a result is cheap to re-fetch if it is needed again, and memory adds a store the agent has to write and read deliberately rather than getting for free. Retrieval is the highest-leverage and highest-risk lever, since pulling the right document into the window is what makes an answer grounded, and pulling the wrong one is how an agent confidently cites something irrelevant. Whatever a system loads automatically (a memory file at session start, a default tool set, boilerplate instructions) is paid for on every turn, so the always-on context deserves the tightest curation. The failure mode to watch is silent context rot: a window that has slowly filled with stale tool output and superseded history still produces an answer, just a worse-grounded one, and nothing surfaces the degradation unless the system is built to surface it. Context engineering is most of the work behind an agent that stays reliable past the easy first turns, and it is the part that does not show up in the prompt at all.
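Two of these points reduce to simple arithmetic, sketched below. The category labels (`stale_tool_result`, `superseded_history`) and the 30% warning threshold are hypothetical choices for illustration. How a real system decides that an item is stale is left open here.

```python
from collections import Counter


def always_on_cost(always_on_tokens: int, turns: int) -> int:
    # Automatically loaded context is paid for on every turn,
    # so its total cost scales with the number of turns.
    return always_on_tokens * turns


def window_report(
    items: list[tuple[str, int]],
    stale_kinds: tuple[str, ...] = ("stale_tool_result", "superseded_history"),
    warn_at: float = 0.3,
) -> dict:
    """Summarize a window given (kind, token_count) pairs and flag it
    when the stale share reaches the warning threshold."""
    totals = Counter()
    for kind, n in items:
        totals[kind] += n
    total = sum(totals.values())
    stale = sum(totals[k] for k in stale_kinds)
    share = stale / total if total else 0.0
    return {"total": total, "stale_share": share, "warn": share >= warn_at}
```

A report like this is one way to make context rot visible: the window still produces answers as it degrades, so the signal has to come from measuring what is in the window rather than from the output.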
Related standards and prior art
- Anthropic: context engineering tools (cookbook) · continuously updated · operationalizes context engineering as finding the smallest set of high-signal tokens, via compaction, tool-result clearing, and memory
- Anthropic: effective context engineering for AI agents · 2025-09-29 · the seminal article naming context engineering as curating and maintaining the optimal set of tokens during inference
Defined by Ready Solutions AI