May 19, 2026
Most "how to manage AI agents securely" content tells you to have principles. That's not what buyers in a procurement cycle are asking for. They're not trying to develop a philosophy; they're trying to choose between Microsoft Agent 365 and a startup, or between building on AWS Bedrock and licensing a control plane.
The real question: what concrete capabilities must a platform have for you to trust it in production?
Recent data suggests the bar is lower than most people realize:
Most platforms can tell you what their agents did. Many cannot stop them.
That gap between monitoring and actual control is the right lens for evaluating platforms. The seven capabilities below are roughly ordered from architectural prerequisites to operational features. If a platform can't credibly demonstrate the first three, the rest barely matter.
Every agent in your environment is a non-human identity that authenticates to systems, calls APIs, and takes actions on someone's behalf. Gradient Flow's research suggests non-human identities outnumber human identities by roughly 80 to 1 in enterprise environments.
Yet most organizations still run agents under shared service accounts. When something goes wrong, attribution becomes forensic archaeology instead of a quick log lookup.
What to require:
Key question for vendors: How do you handle the case where an agent built by one team is invoked by another team's user? Is the audit trail still complete?
For more on what this looks like in practice, see What Breaks First Without an Agent Registry?.
When an agent connects to Google Drive, Confluence, Salesforce, or any other source system, there are two basic approaches:
If an agent searches Confluence on behalf of an engineer, it should only see what that engineer can see, not what the agent admin can see.
Most claims that "agents respect ACLs" are really claims about the agent-builder's permissions, not the end user's. That's a problem. If your CFO asks an agent to summarize compensation data and the agent has full HR access because that's what IT granted, you have a permissions failure disguised as a feature.
What to require:
Key question for vendors: Walk me through what an agent sees when invoked by an intern versus a director. If the answer is the same, why?
For more depth, see How does Credal create secure, scalable Actions?.
"Send email" is not a useful policy unit. The same tool, with the same authorization, can:
If a platform only enforces policy at the tool level, you're forced to either block the tool entirely or trust the agent's judgment. Neither is acceptable in production.
What to require:
Key question for vendors: How does your policy engine handle cases where the action type is allowed but specific parameters should not be?
See What is Governance for AI and AI Agents? for the broader framework.
The MCP ecosystem introduces a class of attacks with no direct analog in traditional API security: tools whose descriptions or behavior change after they've been authorized.
Security researcher Simon Willison has documented these as "rug pulls". A tool you approved on day one quietly reroutes your API keys on day seven. Tenable's research on MCP prompt injection shows how tool descriptions themselves can be weaponized to manipulate agents into unintended sequences of calls.
If your platform connects agents to third-party MCP tools, you need to know when those tools change.
What to require:
Without this, you've effectively delegated trust to every external tool maintainer, indefinitely.
For why MCP traffic needs its own controls, see MCP vs API Security. For the gateway architecture that enables this inspection, see What is MCP Gateway?.
When your security team asks, "What did our agents do yesterday?", they should get one answer, not five.
In reality, most organizations have:
Reconciling these into a single timeline turns a five-minute investigation into a five-day incident review.
What to require:
For more on why this matters, see The Benefits of AI Audit Logs for Maximizing Security and Enterprise Value.
Your governance problem is bigger than the agents you know about.
Gartner and Kiteworks both highlight shadow AI (agents built without IT's knowledge) as one of the fastest-growing risks of 2026. Gravitee's data is stark: only 14.4% of agents in surveyed organizations went live with full security approval. The other 85.6% are operating somewhere in the environment without sanctioned oversight.
A platform that governs only the agents it's formally introduced to misses the real problem.
What to require:
Key question for vendors: How does your discovery work? Is it gateway-traffic-based (which misses agents that bypass the gateway), or identity-based (which catches more)?
A production agent without change control is a code path without code review.
Once an agent is used by customers or making business-critical decisions, the rules for editing it should match the rules for changing production code:
The common failure mode:
What to require:
Key question for vendors: Can a production agent be edited in place by its creator? If the answer is yes, that's a governance gap.
The next time you sit down with an AI agent platform vendor, use these seven capabilities as your evaluation framework. Walk through each one and ask for a specific demonstration, not just a slide.
If a vendor can describe their answer to three or fewer of these with technical specificity, the platform is not ready for the kind of production deployment most enterprises are now contemplating.
Credal was built around these capabilities: per-agent identity, permission mirroring from connected source systems, action-level policy, MCP tool drift detection, unified audit log across every chat surface, agent discovery, and production controls including versioning and change review.
If you're in the middle of an evaluation cycle, we'd be happy to walk you through how Credal scores against these seven, and what specifically to ask the other vendors on your shortlist.
One platform for all agents. Full visibility for admins, full access for teams.