The lethal trifecta is the combination of three agent capabilities (access to private data, exposure to untrusted content, and the ability to communicate externally) that together turn a prompt injection into a practical path for stealing that data.

How it works

The three capabilities, the "corners" of the trifecta, are access to private data, exposure to untrusted content an attacker can control, and a channel to communicate externally. Each is useful and common on its own. The danger is their intersection: untrusted content can carry a prompt injection, the injection can instruct the agent to read private data, and the external channel lets the agent send that data to the attacker. Because a model reads instructions and data through the same channel, it has no reliable way to tell an injected command from a legitimate one, so once all three corners are present nothing dependable stands between the injection and the data. The trifecta is a design lens rather than a single vulnerability: it asks which of the three corners an agent actually needs, on the premise that removing any one corner closes the exfiltration path even when the injection still lands.
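The intersection rule can be written down as a small check. This is a minimal sketch, not a reference implementation: the class name AgentCapabilities, its three fields, and both helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of one agent's three corners."""
    reads_private_data: bool
    sees_untrusted_content: bool
    communicates_externally: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three corners are present at the same time."""
    return (
        caps.reads_private_data
        and caps.sees_untrusted_content
        and caps.communicates_externally
    )


def missing_corners(caps: AgentCapabilities) -> List[str]:
    """Names of the corners this agent does not have."""
    corners = {
        "private data": caps.reads_private_data,
        "untrusted content": caps.sees_untrusted_content,
        "external communication": caps.communicates_externally,
    }
    return [name for name, present in corners.items() if not present]


full = AgentCapabilities(True, True, True)
no_outbound = AgentCapabilities(True, True, False)
print(has_lethal_trifecta(full))
print(has_lethal_trifecta(no_outbound), missing_corners(no_outbound))
```

The check deliberately says nothing about whether an injection lands. It only reports whether a landed injection would have a complete path to exfiltrate data.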

Why it matters

The trifecta reframes agent security from "can I stop every injection" to "what can an injection actually do here," which is the more honest question because reliably stopping injection is unsolved. An agent missing any one corner can still be tricked, but the trick cannot steal data, so the blast radius is bounded by capability rather than by the model holding the line. The catch is that the most useful agents are exactly the ones that want all three corners, so the lens is a design constraint with a real cost. Closing a corner often means giving up a feature, and a team that wants the capability anyway has to accept that it is defending a path it cannot fully close. The trifecta is a triage tool, not a fix: it shows where the danger concentrates rather than removing it.

In practice

Consider an assistant that can read a user's private email, browse arbitrary web pages, and send messages on the user's behalf. A malicious web page it visits carries hidden text instructing it to find the latest password-reset email and forward the contents to an address the attacker controls. All three corners are present, so the injected instruction has a complete path: read the private data, then use the external channel to exfiltrate it. Deny the assistant any outbound send and the same injection can still mislead the user, but it can no longer steal the email.
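The same scenario can be expressed as a tool inventory. The sketch below assumes three hypothetical tool names (read_email, browse_web, send_message) and, as a simplification, maps each tool to exactly one corner. In a real system a tool may open more than one: a browser that can request attacker-chosen URLs, for example, may also act as an outbound channel.

```python
from enum import Enum
from typing import Iterable, Set


class Corner(Enum):
    PRIVATE_DATA = "private data"
    UNTRUSTED_CONTENT = "untrusted content"
    EXTERNAL_COMMS = "external communication"


# Hypothetical tool names mapped to the corner each one opens.
# Simplification: one corner per tool (see the note above the block).
TOOL_CORNERS = {
    "read_email": {Corner.PRIVATE_DATA},
    "browse_web": {Corner.UNTRUSTED_CONTENT},
    "send_message": {Corner.EXTERNAL_COMMS},
}


def corners_for(tools: Iterable[str]) -> Set[Corner]:
    """Union of the corners opened by the given tools (KeyError if unknown)."""
    corners: Set[Corner] = set()
    for tool in tools:
        corners |= TOOL_CORNERS[tool]
    return corners


def exfiltration_path_open(tools: Iterable[str]) -> bool:
    """True when the tool set covers all three corners."""
    return corners_for(tools) == set(Corner)


assistant = ["read_email", "browse_web", "send_message"]
without_send = ["read_email", "browse_web"]
print(exfiltration_path_open(assistant))
print(exfiltration_path_open(without_send))
```

Dropping send_message from the inventory is the code-level equivalent of denying the outbound send: the injected page is still read, but the tool set no longer covers the third corner.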

Practical considerations

The practical move is to enumerate, per agent, which of the three corners it truly needs, then remove the cheapest one to give up rather than trying to make untrusted content trustworthy. The external-communication corner is often the most removable, since many agents can do their job reading and acting locally without an outbound channel an attacker can address. When all three corners are genuinely required, the residual risk does not disappear, so it is managed with a human approval step on consequential actions, isolation that bounds what the data corner can reach, and a record of what the agent actually sent. The lens composes with prompt-injection defenses rather than replacing them: input handling lowers how often an injection lands, while the trifecta governs how much an injection that does land can cost.
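When all three corners have to stay, the approval step and the send record can sit in one wrapper around the outbound channel. This is a sketch under stated assumptions: OutboundGate, the approve callback, and the transport function are hypothetical names, and a real deployment would prompt a person rather than call a lambda.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class OutboundGate:
    """Hypothetical wrapper: every outbound send needs approval and is logged."""
    approve: Callable[[str, str], bool]  # stands in for a human decision
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def send(
        self,
        recipient: str,
        body: str,
        transport: Callable[[str, str], None],
    ) -> bool:
        approved = self.approve(recipient, body)
        # Record the attempt whether or not it was approved.
        self.audit_log.append(
            {"recipient": recipient, "body": body, "approved": approved}
        )
        if approved:
            transport(recipient, body)
        return approved


delivered = []
gate = OutboundGate(approve=lambda to, body: to.endswith("@example.com"))
gate.send("alice@example.com", "meeting notes", lambda to, b: delivered.append(to))
gate.send("drop@attacker.test", "reset link", lambda to, b: delivered.append(to))
print(delivered)
print(len(gate.audit_log))
```

The gate does not make the injection less likely to land. It narrows what a landed injection can send without a person seeing it, and it keeps a record of every attempt for later review.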

Related standards and prior art

Defined by Ready Solutions AI