AI Agent Security: 7 Capabilities to Require From Any Platform You Evaluate

by Jessica Shen

May 19, 2026

Most "how to manage AI agents securely" content tells you to have principles. That's not what buyers in a procurement cycle are asking for. They're not trying to develop a philosophy; they're trying to choose between Microsoft Agent 365 and a startup, or between building on AWS Bedrock and licensing a control plane.

The real question: what concrete capabilities must a platform have for you to trust it in production?

Recent data suggests that most deployments clear a lower bar than people realize:

Gravitee's 2026 survey of 900+ executives and practitioners found 88% of organizations had confirmed or suspected AI agent security incidents in the past year.
Kiteworks' 2026 Data Security and Compliance Risk Forecast found that 60% of organizations cannot terminate a misbehaving agent once it has started operating, and 63% cannot enforce purpose limitations on what their agents are authorized to do.

Most platforms can tell you what their agents did. Many cannot stop them.

That gap between monitoring and actual control is the right lens for evaluating platforms. The seven capabilities below are roughly ordered from architectural prerequisites to operational features. If a platform can't credibly demonstrate the first three, the rest barely matter.

1. Per-agent identity, not shared credentials

Every agent in your environment is a non-human identity that authenticates to systems, calls APIs, and takes actions on someone's behalf. Gradient Flow's research suggests non-human identities outnumber human identities by roughly 80 to 1 in enterprise environments.

Yet most organizations still run agents under shared service accounts. When something goes wrong, attribution becomes forensic archaeology instead of a quick log lookup.

What to require:

Every agent has a distinct identity.
Every action is attributable to that agent and the human user who delegated to it.
Credentials are centrally managed, not scattered across config files and source code.
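
As a rough illustration of the requirements above, here is a minimal Python sketch of an audit record that carries both identities. The `AuditEvent` fields, the `record_action` helper, and the identifier formats are all hypothetical, not any vendor's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    agent_id: str         # distinct identity of the agent that acted
    delegating_user: str  # human who invoked the agent
    owning_team: str      # team that built the agent (may differ from the user's team)
    action: str
    timestamp: datetime


def record_action(log, agent_id, delegating_user, owning_team, action):
    """Append an event; refuse anything that lacks either identity."""
    if not agent_id or not delegating_user:
        raise ValueError("every action needs both an agent and a delegating user")
    event = AuditEvent(agent_id, delegating_user, owning_team, action,
                       datetime.now(timezone.utc))
    log.append(event)
    return event


def actions_by_agent(log, agent_id):
    """Attribution becomes a lookup instead of a forensic exercise."""
    return [e for e in log if e.agent_id == agent_id]


# Hypothetical data: an agent built by one team, invoked by another team's user.
log = []
record_action(log, "agent:contract-summarizer", "user:dana@sales",
              "team:legal-eng", "confluence.search")
record_action(log, "agent:expense-bot", "user:li@finance",
              "team:finance-eng", "email.send")
```

Because the record stores the owning team separately from the delegating user, the trail stays complete even when the two belong to different teams.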

Key question for vendors: How do you handle the case where an agent built by one team is invoked by another team's user? Is the audit trail still complete?

For more on what this looks like in practice, see What Breaks First Without an Agent Registry?.

2. Source-system permission mirroring

When an agent connects to Google Drive, Confluence, Salesforce, or any other source system, there are two basic approaches:

Shallow approach: give the agent a broad service account with wide access.
Correct approach: inherit the calling user's actual permissions in that source system.

If an agent searches Confluence on behalf of an engineer, it should only see what that engineer can see, not what the agent admin can see.

Most claims that "agents respect ACLs" are really claims about the agent-builder's permissions, not the end user's. That's a problem. If your CFO asks an agent to summarize compensation data and the agent has full HR access because that's what IT granted, you have a permissions failure disguised as a feature.

What to require:

The agent's effective access mirrors the invoking user's permissions in each source system.
Different users (e.g., intern vs. director) see different data through the same agent.
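
To make the intern-versus-director case concrete, here is a small Python sketch in which the agent filters results by the invoking user's access and adds none of its own. The `Document` type and the in-memory ACL are assumptions for illustration; a real integration would more likely call the source system with the user's delegated credentials and let that system decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    allowed_users: frozenset  # who can read this in the source system


def search_as_user(documents, query, invoking_user):
    """Return only matches the invoking user could open themselves."""
    return [
        d.title
        for d in documents
        if query.lower() in d.title.lower() and invoking_user in d.allowed_users
    ]


# Hypothetical corpus: one restricted document, one widely readable.
DOCS = [
    Document("Compensation bands 2026", frozenset({"director"})),
    Document("Compensation FAQ", frozenset({"director", "intern"})),
]

intern_view = search_as_user(DOCS, "compensation", "intern")
director_view = search_as_user(DOCS, "compensation", "director")
```

Here `intern_view` contains only the FAQ, while `director_view` contains both documents, even though both calls go through the same agent code path.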

Key question for vendors: Walk me through what an agent sees when invoked by an intern versus a director. If the answer is the same, why?

For more depth, see How does Credal create secure, scalable Actions?.

3. Action-level policy enforcement

"Send email" is not a useful policy unit. The same tool, with the same authorization, can:

Do something benign (send an internal status update), or
Do something high-risk (forward customer data to an external address).

If a platform only enforces policy at the tool level, you're forced to either block the tool entirely or trust the agent's judgment. Neither is acceptable in production.

What to require:

Policy that scopes by the content of the action parameters, not just the action type.
Examples:
An agent can send internal emails without approval, but external emails trigger human review.
An agent can update Salesforce opportunities only when the stage is "Qualification", not when it's "Closed-Won".
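
The two examples above can be sketched as a policy function that inspects parameters rather than just the action name. This is a minimal Python sketch: the action names, the parameter keys, and the `example.com` internal domain are hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


INTERNAL_DOMAIN = "example.com"  # assumption: the organization's own mail domain


def evaluate(action, params):
    """Decide based on the content of the action, not only its type."""
    if action == "email.send":
        recipients = params.get("to", [])
        if not recipients:
            return Decision.DENY
        external = [r for r in recipients
                    if not r.lower().endswith("@" + INTERNAL_DOMAIN)]
        # Internal mail goes through; any external recipient triggers human review.
        return Decision.REQUIRE_APPROVAL if external else Decision.ALLOW
    if action == "salesforce.update_opportunity":
        # Editable only while the opportunity is still in Qualification.
        return Decision.ALLOW if params.get("stage") == "Qualification" else Decision.DENY
    return Decision.DENY  # default-deny for anything the policy does not name
```

The same `email.send` tool yields different decisions depending on its recipients, which is the distinction a tool-level allowlist cannot express.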

Key question for vendors: How does your policy engine handle cases where the action type is allowed but specific parameters should not be?

See What is Governance for AI and AI Agents? for the broader framework.

4. Tool drift protection

The MCP ecosystem introduces a class of attacks with no direct analog in traditional API security: tools whose descriptions or behavior change after they've been authorized.

Security researcher Simon Willison has documented these as "rug pulls". A tool you approved on day one quietly reroutes your API keys on day seven. Tenable's research on MCP prompt injection shows how tool descriptions themselves can be weaponized to manipulate agents into unintended sequences of calls.

If your platform connects agents to third-party MCP tools, you need to know when those tools change.

What to require:

Detection when a tool's description or parameters change after authorization.
Automatic re-review requirements before the changed tool can be used again.
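
One plausible way to implement both requirements is to fingerprint each tool definition at approval time and compare it on every use. The Python sketch below assumes tool definitions arrive as dictionaries with `name`, `description`, and `inputSchema` keys, as in MCP tool listings; the `ToolRegistry` class is hypothetical.

```python
import hashlib
import json


def fingerprint(tool):
    """Stable hash of the fields an agent sees: name, description, input schema."""
    canonical = json.dumps(
        {key: tool.get(key) for key in ("name", "description", "inputSchema")},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolRegistry:
    def __init__(self):
        self._approved = {}  # tool name -> fingerprint recorded at approval

    def approve(self, tool):
        """Record a human review of this exact definition."""
        self._approved[tool["name"]] = fingerprint(tool)

    def is_usable(self, tool):
        """False for unknown tools and for tools changed since approval."""
        return self._approved.get(tool["name"]) == fingerprint(tool)
```

A changed description makes `is_usable` return False until someone calls `approve` again. This catches changes to the declared definition only; a tool whose server-side behavior changes while its description stays the same needs other controls.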

Without this, you've effectively delegated trust to every external tool maintainer, indefinitely.

For why MCP traffic needs its own controls, see MCP vs API Security. For the gateway architecture that enables this inspection, see What is MCP Gateway?.

5. Unified audit log across surfaces

When your security team asks, "What did our agents do yesterday?", they should get one answer, not five.

In reality, most organizations have:

Logs in their MCP gateway
Logs in each chat surface (Claude, ChatGPT, etc.)
Logs in their orchestration layer
Logs in each source system

Reconciling these into a single timeline turns a five-minute investigation into a five-day incident review.

What to require:

A platform-level audit log that captures:
Every prompt
Every tool call
Every approval decision
Every data access
Each event tagged with both user identity and agent identity.
Easy export to your existing analysis stack (Snowflake, BigQuery, SIEM, etc.).
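
As a sketch of what a unified log means structurally, the Python below merges per-source event streams into one timeline where every event carries both identities. The `Event` fields and sample sources are hypothetical, and it assumes each stream is already time-ordered with UTC timestamps in one ISO 8601 format, so string order matches time order.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: str  # ISO 8601 UTC, e.g. "2026-05-18T09:00:01Z"
    source: str     # where the event was captured
    user_id: str
    agent_id: str
    kind: str       # "prompt", "tool_call", "approval", or "data_access"


def unified_timeline(*streams):
    """Merge time-ordered per-source streams into a single ordered timeline."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))


# Hypothetical events captured by two different surfaces.
gateway_log = [
    Event("2026-05-18T09:00:05Z", "mcp-gateway", "user:dana", "agent:summarizer", "tool_call"),
]
chat_log = [
    Event("2026-05-18T09:00:01Z", "chat", "user:dana", "agent:summarizer", "prompt"),
    Event("2026-05-18T09:00:09Z", "chat", "user:dana", "agent:summarizer", "approval"),
]

timeline = unified_timeline(gateway_log, chat_log)
```

The hard part in practice is usually getting every surface to emit the same user and agent identifiers, not the merge itself.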

For more on why this matters, see The Benefits of AI Audit Logs for Maximizing Security and Enterprise Value.

6. Agent discovery and shadow agent visibility

Your governance problem is bigger than the agents you know about.

Gartner and Kiteworks both highlight shadow AI (agents built without IT's knowledge) as one of the fastest-growing risks of 2026. Gravitee's data is stark: only 14.4% of agents in surveyed organizations went live with full security approval. The other 85.6% are running somewhere in the environment without that full approval.

A platform that governs only the agents it's formally introduced to misses the real problem.

What to require:

Continuous discovery of agents operating in your environment across:
SaaS tools
Browser extensions
Internal applications
MCP server connections
Coverage regardless of whether they were built on the platform itself.

Key question for vendors: How does your discovery work? Is it gateway-traffic-based (which misses agents that bypass the gateway), or identity-based (which catches more)?
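
The difference between the two discovery approaches can be illustrated with simple set arithmetic. In this Python sketch, the three input sets (registered agents, agent identities seen by an identity provider, and agents seen in gateway traffic) are hypothetical inputs that a real platform would have to collect from its own telemetry.

```python
def find_shadow_agents(registered, seen_by_identity_provider, seen_by_gateway):
    """Compare what is sanctioned with what is actually authenticating."""
    return {
        # Authenticating somewhere, but never registered for governance.
        "unregistered": sorted(seen_by_identity_provider - registered),
        # Active identities that gateway-only discovery would never observe.
        "bypassing_gateway": sorted(seen_by_identity_provider - seen_by_gateway),
    }


report = find_shadow_agents(
    registered={"agent:helpdesk", "agent:summarizer"},
    seen_by_identity_provider={"agent:helpdesk", "agent:summarizer", "agent:sales-scraper"},
    seen_by_gateway={"agent:helpdesk"},
)
```

In this made-up data, `agent:sales-scraper` appears in both lists, and `agent:summarizer` is registered yet still invisible to gateway-based discovery.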

7. Production controls

A production agent without change control is a code path without code review.

Once an agent is used by customers or making business-critical decisions, the rules for editing it should match the rules for changing production code:

Versioning
Change review
Publish gates
Immutable production configurations

The common failure mode:

An agent ships and becomes load-bearing for a workflow.
Someone with edit access tweaks the prompt to fix an edge case.
No one reviews the change, tests regressions, or provides a rollback path.
The "quick fix" becomes the root cause of next week's incident.

What to require:

Clear separation between development, staging, and production versions of agents.
Change review and approval before updates hit production.
Immutable configs for production agents, with explicit versioned releases and rollbacks.
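
These requirements can be sketched as a release history in which published versions are immutable, publishing requires a second reviewer, and rollback only moves a pointer. The class and field names in this Python sketch are hypothetical, and it covers only the production stage rather than the full development, staging, and production split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a published version cannot be edited in place
class AgentVersion:
    number: int
    prompt: str
    author: str
    approved_by: str


class AgentReleases:
    def __init__(self):
        self._versions = []
        self._live = None

    @property
    def live(self):
        return self._live

    def publish(self, prompt, author, approved_by):
        """Create a new version; the author cannot approve their own change."""
        if not approved_by or approved_by == author:
            raise PermissionError("a second reviewer must approve before publishing")
        version = AgentVersion(len(self._versions) + 1, prompt, author, approved_by)
        self._versions.append(version)
        self._live = version
        return version

    def rollback(self, number):
        """Point production back at an earlier version; history is kept."""
        if not 1 <= number <= len(self._versions):
            raise ValueError(f"unknown version {number}")
        self._live = self._versions[number - 1]
        return self._live
```

Fixing an edge case then means publishing a reviewed new version, and undoing it means calling `rollback` instead of hand-editing the live prompt.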

Key question for vendors: Can a production agent be edited in place by its creator? If the answer is yes, that's a governance gap.

Use this as your scorecard

The next time you sit down with an AI agent platform vendor, use these seven capabilities as your evaluation framework. Walk through each one and ask for a specific demonstration, not just a slide.

If a vendor can answer only three or fewer of these with technical specificity, the platform is not ready for the kind of production deployment most enterprises are now contemplating.

Credal was built around these capabilities: per-agent identity, permission mirroring from connected source systems, action-level policy, MCP tool drift detection, a unified audit log across every chat surface, agent discovery, and production controls including versioning and change review.

If you're in the middle of an evaluation cycle, we'd be happy to walk you through how Credal scores against these seven, and what specifically to ask the other vendors on your shortlist.
