Field Note: Observability Protects Safety
If you can’t explain what your agent did, you didn’t deploy an agent—you deployed a rumor.
The safety conversation around agents tends to fixate on permissions: file access, APIs, browsing, money. That’s necessary. It’s not sufficient.
Safety requires observability. Not because dashboards are cool, but because humans are the escalation path.
The minimal receipts an agent owes you
When something goes wrong, you need answers that are specific enough to act on:
- What was the goal? (user request, scheduled job, or derived task)
- What inputs were used? (including the “hidden” stuff: memory, retrieved docs, tool outputs)
- What tools were called? (with arguments, redacted where sensitive, and return values)
- What changed in the world? (files written, messages sent, calendar events created)
- What did it cost? (tokens, dollars, latency, retries)
- What did it not do? (skipped steps, blocked actions, policy denials)
If you can’t produce receipts, every incident becomes a debate about vibes.
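The receipts above are really just a schema. As a minimal sketch, assuming nothing about your stack: every name here (`Receipt`, `ToolCall`, the field set) is hypothetical, mapped one-to-one from the list above, with one JSON line emitted per run so receipts stay grep-able and replayable.

```python
from dataclasses import dataclass, field, asdict
import json
import time

@dataclass
class ToolCall:
    name: str
    arguments: dict        # redact secrets before these hit the log
    result_summary: str

@dataclass
class Receipt:
    goal: str                                         # user request, scheduled job, or derived task
    inputs: list = field(default_factory=list)        # incl. the "hidden" stuff: memory, retrieved docs
    tool_calls: list = field(default_factory=list)    # what was called, with what, returning what
    side_effects: list = field(default_factory=list)  # files written, messages sent, events created
    cost: dict = field(default_factory=dict)          # tokens, dollars, latency, retries
    denials: list = field(default_factory=list)       # skipped steps, blocked actions, policy denials
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # One JSON line per run: append to a log, replay during an incident.
        return json.dumps(asdict(self), default=str)

receipt = Receipt(
    goal="summarize today's inbox",
    inputs=["memory:user_prefs", "mail:last_24h"],
    tool_calls=[ToolCall("send_message", {"to": "[REDACTED]"}, "delivered")],
    side_effects=["1 message sent"],
    cost={"tokens": 1840, "retries": 0},
    denials=["blocked: attach_file (policy: no outbound attachments)"],
)
print(receipt.to_json())
```

The point is not this exact shape; it's that every field answers one of the questions above, so an incident review starts from a record instead of a reconstruction.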
Why this is a safety primitive (not just DX)
Observability is how you convert “AI unpredictability” into a system you can operate:
- It makes failures legible (you can replay, triage, and patch)
- It makes abuse detectable (anomalous tool calls, unusual access patterns)
- It makes compliance possible (audits require artifacts)
- It makes trust earned (trust is a function of explainability over time)
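Abuse detection, for instance, falls out of the receipts almost for free. A toy sketch, assuming a hypothetical per-tool baseline of expected call counts (the function name and threshold are illustrative, not from any library):

```python
from collections import Counter

def flag_anomalous_calls(calls, baseline, threshold=3.0):
    """Flag tools called far more often than their historical baseline.

    calls: list of tool names from a run's receipts.
    baseline: expected call count per tool for a comparable run.
    """
    counts = Counter(calls)
    flagged = []
    for tool, n in counts.items():
        expected = baseline.get(tool, 0.5)  # unseen tools get a low prior, so they trip quickly
        if n / expected >= threshold:
            flagged.append(tool)
    return flagged

# A run that suddenly sends nine emails against a baseline of one:
calls = ["read_file"] * 4 + ["send_email"] * 9
baseline = {"read_file": 5.0, "send_email": 1.0}
print(flag_anomalous_calls(calls, baseline))  # ['send_email']
```

Crude, but it only works at all because the tool calls were logged in the first place; no receipts, no baseline, no anomaly.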
Permissions constrain what could happen.
Observability tells you what did happen.
And in production, the second one is the difference between “safe enough to delegate” and “never again.”