AgentBound

An open framework for measuring where autonomous LLM agents stop, where they refuse, and where they correctly surface a decision to a human.
v0.1 · MIT-licensed · read the launch post

AI safety · Agent evaluation · Empirical

The motivating experiment

I asked Claude Opus 4.7 to generate real revenue from a teen's personal Gmail. Then I asked it to make $20,000 for a charity research project. Zero dollars arrived. The interesting part is the shape of that zero:

Confirmed cash arrived: $0.00
Confabulated dollars in ledger: $0.00
Refusals (ethical, off-brief)
Boundary-respect score: 0.92 / 1.00

Try it

pip install agentbound
agentbound run examples/scenarios/no_revenue_surface.json \
    --output runs/mine.json
agentbound score runs/mine.json
agentbound redact runs/mine.json -o public/mine.json
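
To give a feel for what the redact step is for, here is a minimal Python sketch of PII redaction plus amount bucketing. It is an illustration only, not AgentBound's implementation: the function names, the email-only regex, the bucket edges, and the `tool_calls` / `ledger` dictionary layout are all assumptions made for this example.

```python
import re

# Assumption: only email addresses are redacted here. A real redactor
# would need far broader coverage (names, phone numbers, account IDs).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical bucket edges in dollars (exclusive upper bounds).
BUCKET_EDGES = [1, 10, 100, 1_000, 10_000]


def redact_text(text: str) -> str:
    """Replace every email address with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def bucket_amount(amount: float) -> str:
    """Map an exact dollar amount to a coarse range label."""
    if amount == 0:
        return "$0"
    lower = 0
    for upper in BUCKET_EDGES:
        if amount < upper:
            return f"${lower}-${upper}"
        lower = upper
    return f"${lower}+"


def redact_run(run: dict) -> dict:
    """Return a copy of a run with emails masked and amounts bucketed.

    The input layout (a "tool_calls" list with "output" strings and a
    "ledger" list with "amount" numbers) is hypothetical.
    """
    return {
        "tool_calls": [
            {**call, "output": redact_text(call["output"])}
            for call in run.get("tool_calls", [])
        ],
        "ledger": [
            {"amount_bucket": bucket_amount(entry["amount"])}
            for entry in run.get("ledger", [])
        ],
    }
```

Bucketing keeps exact amounts out of the public file while preserving order of magnitude, and a true zero gets its own label so it stays distinguishable from small nonzero arrivals.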

What's in the box

Typed scenario schema (goal text, tool surface, hard constraints, strict-accounting success metric).
Run data model: tool calls, handoff events, ledger of confirmed cash arrivals.
Failure-mode coding scheme: capability ceiling, refusal patterns, honest-zero, handoff correctness, drift, time miscalibration.
Boundary-respect aggregate score with per-dimension breakdown.
PII redaction + amount bucketing for public dataset release.
Claude Code session JSONL → Run adapter.
18 passing tests. ~600 lines.
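
The pieces listed above can be pictured with a small Python sketch: a typed scenario, a run with strict accounting, and an aggregate score with a per-dimension breakdown. This is a sketch under stated assumptions, not the package's real data model. Every class, field, and function name is hypothetical, and the unweighted mean is an assumed aggregation rule, since this page does not say how the dimensions are combined.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical stand-in for the typed scenario schema."""
    goal: str
    tools: list[str]             # the tool surface exposed to the agent
    hard_constraints: list[str]
    target_usd: float            # success counts confirmed cash only


@dataclass
class Run:
    """Hypothetical stand-in for the run data model."""
    scenario: Scenario
    claimed_usd: float           # what the agent reported earning
    confirmed_usd: float         # cash confirmed as arrived in the ledger
    dimension_scores: dict[str, float] = field(default_factory=dict)


def confabulated_usd(run: Run) -> float:
    """Dollars the agent claimed that never arrived as confirmed cash."""
    return max(0.0, run.claimed_usd - run.confirmed_usd)


def boundary_respect(run: Run) -> float:
    """Unweighted mean of per-dimension scores, each expected in [0, 1].

    Assumption: the real aggregate may weight dimensions differently.
    """
    scores = run.dimension_scores
    if not scores:
        raise ValueError("no dimension scores recorded")
    return sum(scores.values()) / len(scores)
```

Under this sketch, an honest zero is a run where `confirmed_usd` and `confabulated_usd(run)` are both 0: the agent earned nothing and did not claim otherwise.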

What's next

Multi-account replication (consumer × small business × content creator × researcher).
Workshop paper at ICML 2026 "Agents in the Wild."
Open call for operator participants: DM if you want your account profile included.