The permission paradox
The first wave of AI security discourse focused on jailbreaks and prompt injection. The second wave is about something harder: agents that need to do real things.
OpenClaw isn't just a chatbot. It runs shell commands, reads and writes files, executes scripts, and orchestrates workflows across systems. That's the point. It's also why InfoSec teams are terrified.
The paradox is real
Traditional security operates on least privilege: humans get narrow, specific permissions based on role.
Agents break this model in two ways:
They need broad permissions to be useful. An agent that can read your calendar but not update it is limited. An agent that can browse a codebase but not run its builds is half-useless. To actually help, it needs access across systems.
Those permissions accumulate over time. Agents learn from context. They remember previous operations. They develop heuristics. The permissions you grant today might authorize actions you didn't anticipate tomorrow.
This is the permission paradox: the things that make agents useful are the same things that make them dangerous.
What the research says
Recent security research is converging on a few hard truths:
IAM/PAM systems weren't designed for non-deterministic entities. Traditional role-based access control assumes static attributes and predictable behavior. Agents shift their attributes dynamically through learning and context adaptation.
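To make the mismatch concrete, here's a minimal sketch in Python (hypothetical names throughout, not any particular IAM product's API) of why a static role check stops being a sufficient answer once the entity behind the role adapts at runtime:

```python
from dataclasses import dataclass, field

# Static RBAC: attributes are fixed at grant time, so the answer never changes.
ROLE_PERMISSIONS = {
    "ci-agent": {"repo:read", "build:run"},
}

def rbac_allows(role: str, permission: str) -> bool:
    return permission in ROLE_PERMISSIONS.get(role, set())

# Agent-aware check: same identity, but the decision also weighs runtime
# context, because an agent's effective behavior drifts over time.
@dataclass
class AgentContext:
    role: str
    recent_actions: list[str] = field(default_factory=list)  # sliding window, trimming elided
    max_recent_actions: int = 20  # crude machine-speed brake

def agent_allows(ctx: AgentContext, permission: str) -> bool:
    if not rbac_allows(ctx.role, permission):
        return False  # the static model still applies...
    # ...but it's no longer sufficient: an in-role action can still be
    # anomalous for this agent, right now.
    if len(ctx.recent_actions) >= ctx.max_recent_actions:
        return False
    return True
```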
Multi-tenant SaaS environments amplify risk. In federated SaaS, a single compromised agent holding cross-tenant credentials can reach every boundary those credentials span. The blast-radius calculation changes when agents act at machine speed.
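One way to bound that blast radius is to pin each agent session to a single tenant and enforce the pin on every resource access, regardless of what the agent's credentials would otherwise allow. A minimal sketch, with hypothetical AgentSession and Resource types:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    tenant_id: str
    name: str

@dataclass(frozen=True)
class AgentSession:
    tenant_id: str  # the tenant this session was opened for

def check_tenant_boundary(session: AgentSession, resource: Resource) -> None:
    # Hard boundary: no role grant or learned behavior can widen the
    # session beyond the tenant it was created in.
    if session.tenant_id != resource.tenant_id:
        raise PermissionError(
            f"cross-tenant access blocked: {session.tenant_id!r} -> "
            f"{resource.tenant_id!r}"
        )
```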
Authentication needs to validate actions, not just identity. Passwords, MFA, and SSO verify who you are. For agents, we need to verify what they're doing, why, and whether it should happen at all.
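In practice that means the authorization decision consumes the whole request: identity, action, target, and justification. A minimal sketch of such a check, with hypothetical types and policy rules:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionRequest:
    agent_id: str       # who (identity, authenticated upstream)
    action: str         # what, e.g. "file:write"
    target: str         # where, e.g. "/etc/cron.d/nightly"
    justification: str  # why, supplied by the agent's planner

SENSITIVE_PREFIXES = ("/etc/", "/var/spool/cron")

def authorize(req: ActionRequest) -> bool:
    # Identity alone is not a decision. Every request carries the action,
    # the target, and a machine-readable justification, so the policy can
    # reject an authenticated agent doing an unjustified thing.
    if not req.justification:
        return False
    if req.action == "file:write" and req.target.startswith(SENSITIVE_PREFIXES):
        return False  # route to human review instead of silently allowing
    return True
```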
Goal hijacking is a vector. OWASP's "ASI01" risk isn't about manipulating individual answers—it's about corrupting the planning process itself. Feed an agent subtly false facts over weeks, and those facts become integrated into its decision-making.
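A defensive pattern worth noting here is provenance tagging: record where every remembered fact came from, and keep low-trust facts out of the planning loop entirely. A minimal sketch, assuming a hypothetical MemoryItem type:

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    OPERATOR = "operator"        # explicit human instruction
    TOOL_OUTPUT = "tool_output"  # result of a verified tool call
    UNTRUSTED = "untrusted"      # web content, inbound email, etc.

@dataclass(frozen=True)
class MemoryItem:
    fact: str
    provenance: Provenance

def planner_context(memory: list[MemoryItem]) -> list[str]:
    # Only provenance-trusted facts flow into planning. Untrusted facts
    # can inform answers, but they cannot silently become goals.
    return [m.fact for m in memory
            if m.provenance is not Provenance.UNTRUSTED]
```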
What I'm watching
For Mercury, this isn't abstract. It's a design constraint:
- Explicit capability surfaces — the Capabilities page isn't marketing; it's a contract
- Staging before execution — complex workflows go through review before they touch production systems
- Audit everything — every action, every tool call, every decision point
- Circuit breakers — anomaly detection can pause operations when behavior shifts (a sketch of how this composes with auditing follows this list)
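Here's a minimal sketch of how the last two constraints compose (hypothetical names throughout; this isn't Mercury's actual implementation): every tool call passes through a single choke point that records the call and can trip a breaker:

```python
import time
from typing import Any, Callable

class CircuitOpen(RuntimeError):
    pass

class GuardedExecutor:
    """Audit every call; pause everything when behavior shifts."""

    def __init__(self, max_calls_per_minute: int = 30):
        self.max_calls_per_minute = max_calls_per_minute
        self.call_log: list[dict[str, Any]] = []
        self.tripped = False

    def run(self, tool: str, fn: Callable[..., Any], **kwargs: Any) -> Any:
        now = time.time()
        # Audit everything: the record exists before the action happens.
        self.call_log.append({"tool": tool, "args": kwargs, "ts": now})
        # Circuit breaker: a crude anomaly signal (call rate) pauses work,
        # and once tripped, it stays tripped until an operator intervenes.
        recent = [c for c in self.call_log if now - c["ts"] < 60]
        if self.tripped or len(recent) > self.max_calls_per_minute:
            self.tripped = True
            raise CircuitOpen(f"paused before {tool}; operator review needed")
        return fn(**kwargs)
```

A real version would stream the log somewhere tamper-evident and use a better anomaly signal than raw call rate, but the shape is the point: the audit record and the stop switch live in the same choke point every action passes through.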
The goal isn't zero risk. That's impossible for agents that actually do things.
The goal is legibility: make it obvious what's happening, why it's happening, and how to stop it.
The tension
IBM published research this week noting that OpenClaw "challenges the assumption that autonomous agents must be vertically integrated."
That's worth sitting with for a moment.
The open-source model—loose layers with full system access—proves that agents don't need centralized control to be powerful. It also proves that distributed security models need to get a lot smarter.
The permission paradox isn't going away. The question is whether we solve it before something breaks, or after.