TL;DR
- Agentic AI risk is about autonomous action, not generation. Treat the tool layer as the security boundary.
- Five risk categories matter: unbounded tool use, prompt injection, data exfiltration, accountability gaps, and evaluation drift.
- The fastest path to a working governance posture is a small, documented tool scope per agent, approval gates on consequential actions, and full trace capture.
- The mistake to avoid is blocking the program outright in the name of risk. Start with a tightly scoped pilot instead.
Why this requires a new posture
Generative AI is a content-production risk. Autonomous agents, by contrast, take actions — they write to systems, send messages, approve transactions. That shifts the risk conversation from "what might the model say" to "what might the agent do." For most CISOs we work with, the existing AI policy was written for the first conversation and does not hold up against the second.
The five risk categories
1. Unbounded tool use
An agent calls a tool with arguments the designer did not anticipate. Example: an agent meant to summarize emails is given write access "just in case" and ends up sending a message to the wrong distribution list. The remediation is scoped tool permissions and input validation at the tool boundary, not at the model.
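A minimal sketch of that remediation, enforcing scope at the tool boundary rather than trusting the model. The `ToolScope` and `validate_call` names, and the email example, are illustrative assumptions, not from any specific framework:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolScope:
    """Documented, per-agent scope for one tool (illustrative)."""
    tool: str
    allowed_actions: frozenset
    allowed_recipient_domains: frozenset = frozenset()

def validate_call(scope: ToolScope, action: str, args: dict) -> None:
    """Reject any tool call outside the agent's documented scope."""
    if action not in scope.allowed_actions:
        raise PermissionError(f"{scope.tool}: action '{action}' not in scope")
    recipient = args.get("to", "")
    domain = recipient.rsplit("@", 1)[-1] if "@" in recipient else ""
    if action == "send" and domain not in scope.allowed_recipient_domains:
        raise PermissionError(f"{scope.tool}: recipient domain '{domain}' not allowed")

# The summarizer agent gets read-only scope; the "just in case"
# write access simply never exists.
summarizer_scope = ToolScope("email", frozenset({"read"}))
```

With this in place, the misdirected-send scenario fails at the boundary regardless of what the model generates.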
2. Prompt injection via content the agent reads
The agent ingests an email, a document, or a web page that contains an instruction meant to hijack its behavior. The best-known variant is a calendar invite that tells the agent to forward the inbox to an external address. The remediation is treating all model input as untrusted (as you already treat user input) and running a content-safety pass before the input reaches the reasoning model.
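As a sketch of the content-safety pass, here is a coarse pattern-based screen. Production systems would use a trained classifier rather than regexes; the patterns below are illustrative assumptions only:

```python
import re

# Illustrative injection phrasings; a real deployment would use a
# classifier, not a fixed pattern list.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"forward .* to .*@",
    r"you are now",
]

def screen_untrusted(text: str) -> bool:
    """Return True if ingested content looks safe to pass to the reasoning model."""
    lowered = text.lower()
    return not any(re.search(p, lowered) for p in INJECTION_PATTERNS)
```

The design point is architectural, not the pattern list: the screen runs before the reasoning model sees the content, so a hit can quarantine the input rather than relying on the model to resist it.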
3. Data exfiltration through tool chains
The agent is authorized for tool A and tool B. Neither tool individually exfiltrates data, but the sequence (read sensitive record from A, write it to B which is less restricted) does. The remediation is policy at the composition layer, not just per-tool.
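One way to express policy at the composition layer is session-level taint tracking: record the highest sensitivity the agent has read, and block writes to any less-restricted sink. The sensitivity levels and tool names below are assumptions for illustration:

```python
# Per-tool data classification (illustrative values).
SENSITIVITY = {"crm": 2, "wiki": 0, "vault": 3}

class SessionPolicy:
    """Tracks data sensitivity across a tool-call sequence, not per tool."""
    def __init__(self):
        self.max_read_level = 0

    def on_read(self, tool: str) -> None:
        self.max_read_level = max(self.max_read_level, SENSITIVITY[tool])

    def check_write(self, tool: str) -> None:
        # Writes are only allowed to sinks at least as restricted
        # as anything the session has read.
        if SENSITIVITY[tool] < self.max_read_level:
            raise PermissionError(
                f"write to '{tool}' blocked: session holds level-"
                f"{self.max_read_level} data"
            )
```

Each tool call is individually authorized; only the sequence (read from `crm`, write to `wiki`) is blocked, which is exactly the gap per-tool policy misses.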
4. Accountability gaps
The agent takes an action under ambiguity. A dispute follows. Nobody can reconstruct what information the agent had, what it considered, and why it chose the action it did. The remediation is complete trace capture — prompt, context, tool calls, arguments, results — with immutable storage and a retention policy that matches your regulatory obligations.
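A sketch of what a trace record might capture, with a hash chain so tampering is detectable. Field names and the chaining scheme are illustrative assumptions; the storage backend and retention enforcement are out of scope here:

```python
import hashlib
import json
import time

def trace_record(agent_id, prompt, context, tool_calls, prev_hash="0" * 64):
    """Build one append-only trace entry for a single agent step."""
    record = {
        "ts": time.time(),
        "agent_id": agent_id,
        "prompt": prompt,
        "context": context,
        "tool_calls": tool_calls,   # name, arguments, result for each call
        "prev_hash": prev_hash,     # hash chain makes tampering detectable
    }
    payload = json.dumps(record, sort_keys=True)
    record["hash"] = hashlib.sha256(payload.encode()).hexdigest()
    return record
```

Because each record embeds the hash of its predecessor, the dispute scenario above becomes answerable: the full chain reconstructs what the agent saw and did, in order.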
5. Evaluation drift
The agent performs well in development on a golden dataset, then degrades in production as the input distribution shifts. Without evaluation running continuously, the degradation is silent. The remediation is production evaluation as a first-class system, with statistical monitoring and alerting tied to the incident process.
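The monitoring side can be sketched as a rolling comparison of production evaluation scores against the development baseline. The 10% drop threshold is an illustrative assumption; a real system would use a proper statistical test rather than a mean comparison:

```python
import statistics

def drift_alert(baseline_scores, production_window, drop_threshold=0.10):
    """Fire when the production mean falls more than drop_threshold
    below the golden-dataset baseline."""
    baseline = statistics.mean(baseline_scores)
    current = statistics.mean(production_window)
    return (baseline - current) > drop_threshold * baseline
```

Wiring the boolean into the incident process is the point: the degradation stops being silent the moment it crosses the threshold.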
The governance pattern that works
We recommend a three-layer model to clients starting an agentic program:
- Platform controls. Identity, logging, secrets, tool registry, guardrails, evaluation. Built once. Every agent inherits.
- Agent-scoped policy. Per agent: which tools, which data, which users, which approval thresholds. Documented, reviewed, versioned.
- Workflow-level approval gates. For consequential actions (money movement, external messages, irreversible changes), a human approval step is not optional. Make it a small step; do not make it a blocker.
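The approval-gate layer can be sketched as a dispatcher that queues consequential actions for sign-off instead of executing them. The action names and the synchronous `approved` flag are assumptions; in practice approval would flow through a ticketing or messaging integration:

```python
# Illustrative list of consequential action types.
CONSEQUENTIAL = {"move_money", "send_external_message", "delete_record"}

def execute(action: str, args: dict, approved: bool = False) -> dict:
    """Route consequential actions to human approval; run the rest directly."""
    if action in CONSEQUENTIAL and not approved:
        return {"status": "pending_approval", "action": action, "args": args}
    return {"status": "executed", "action": action}
```

Kept this small, the gate is a queue entry and a click, not a process blocker.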
What to do in the next 90 days
- Inventory every agent in production or in development. In our experience, the actual count is roughly double what most CISOs expect.
- Define the tool registry. No agent calls an API that is not in the registry with an approved scope.
- Stand up a trace capture pipeline with 30+ day retention. You cannot govern what you cannot see.
- Run a tabletop for the five risk categories above. Identify the first agent you would shut down if a category-1 or category-3 event fired.
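The registry requirement in the second item above reduces to a single check in the dispatch path: every call resolves against a central table of approved (agent, tool, action) entries before it runs. The entries and names below are illustrative assumptions:

```python
# Central tool registry: (agent, tool) -> approved action scope (illustrative).
REGISTRY = {
    ("inbox-summarizer", "email"): {"read"},
    ("invoice-bot", "erp"): {"read", "create_draft"},
}

def dispatch_allowed(agent: str, tool: str, action: str) -> bool:
    """An unregistered agent, tool, or action is denied by default."""
    return action in REGISTRY.get((agent, tool), set())
```

Default-deny is the design choice that matters: an agent that is not in the registry cannot call anything, which is what makes the inventory in the first item enforceable.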
Where to go next
Our Cybersecurity Solutions practice and our AI & Generative AI practice run agentic AI governance programs jointly. Engagements typically start with a tool-scope assessment across your existing AI initiatives and a gap analysis against the NIST AI RMF.