Making Autonomous Agent Execution Bisectable

Posted by vichoiglesias | 2 hours ago | 2 comments

vichoiglesias 2 hours ago

When an autonomous agent fails at step 40, the bug was usually introduced at step 12. The hard part is finding it. Logs tell you what happened, but they don’t let you bisect a trajectory the way you’d bisect code.

I started thinking about what it would actually take to make that kind of debugging mechanical. It seems to require three things: immutable traces, pure reducers, and violation predicates that don’t flip back once they become true.
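A minimal sketch of how those three pieces could combine into a mechanical bisect. Everything here is hypothetical and for illustration only: the names `replay` and `first_violation`, the budget-tracking reducer, and the convention that "tick t" means the state after the first t events. It assumes the trace is an immutable tuple, the reducer is a pure function of (state, event), and the predicate stays true once it becomes true.

```python
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")  # agent state (hypothetical)
E = TypeVar("E")  # trace event (hypothetical)


def replay(reduce: Callable[[S, E], S], initial: S,
           trace: Tuple[E, ...], tick: int) -> S:
    """Rebuild the state after the first `tick` events.

    Only meaningful if `reduce` is pure and `trace` never changes,
    so that replaying the same prefix always yields the same state.
    """
    state = initial
    for event in trace[:tick]:
        state = reduce(state, event)
    return state


def first_violation(reduce: Callable[[S, E], S], initial: S,
                    trace: Tuple[E, ...],
                    violated: Callable[[S], bool]) -> Optional[int]:
    """Smallest tick whose state violates the predicate, or None.

    Assumes `violated` never flips back to False once it is True.
    """
    lo, hi = 0, len(trace)
    if not violated(replay(reduce, initial, trace, hi)):
        return None  # the final state is fine, nothing to bisect
    # Loop invariant: tick `hi` is violated, every tick below `lo` is not.
    while lo < hi:
        mid = (lo + hi) // 2
        if violated(replay(reduce, initial, trace, mid)):
            hi = mid
        else:
            lo = mid + 1
    return lo


# Hypothetical example: events are step costs, state is total spend.
costs = (10, 20, 30, 50, 5, 5)


def add_cost(spent: int, cost: int) -> int:
    return spent + cost


def over_budget(spent: int) -> bool:
    # Latches because all costs are non-negative.
    return spent > 100


onset = first_violation(add_cost, 0, costs, over_budget)
print(onset)  # 4: the fourth event pushes total spend to 110
```

Each probe here replays from the start, so this does O(n log n) reducer calls. Caching state snapshots would cut that down, at the cost of storing them.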

The interesting part: remove any one of those invariants, and there exists an execution on which binary search over the trajectory is no longer guaranteed to return the correct onset tick. I tried to sketch a proof of that.
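To make one of the three cases concrete, here is a small counterexample for the predicate invariant only. It is an illustration under assumptions, not the proof sketch itself, and it says nothing about the other two invariants. The balance reducer, the deltas, and all function names are made up. The predicate `negative` is true at tick 2, false again from tick 3 through 7, and true at tick 8, so binary search lands on the later onset.

```python
from typing import Callable, Optional, Tuple


def states(reduce: Callable, initial, trace: Tuple) -> list:
    """All states from tick 0 (initial) through tick len(trace)."""
    out = [initial]
    for event in trace:
        out.append(reduce(out[-1], event))
    return out


def bisect_onset(reduce: Callable, initial, trace: Tuple,
                 violated: Callable) -> Optional[int]:
    """Binary search for the first violated tick (assumes latching)."""
    seq = states(reduce, initial, trace)
    lo, hi = 0, len(trace)
    if not violated(seq[hi]):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if violated(seq[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo


def linear_onset(reduce: Callable, initial, trace: Tuple,
                 violated: Callable) -> Optional[int]:
    """Ground truth: scan every tick in order."""
    for tick, state in enumerate(states(reduce, initial, trace)):
        if violated(state):
            return tick
    return None


# Hypothetical trace of balance deltas; the balance dips below zero
# at tick 2, recovers at tick 3, and dips again at tick 8.
deltas = (10, -20, 20, 0, 0, 0, 0, -15)


def apply_delta(balance: int, delta: int) -> int:
    return balance + delta


def negative(balance: int) -> bool:
    return balance < 0  # NOT latching: it can flip back to False


wrong = bisect_onset(apply_delta, 0, deltas, negative)
truth = linear_onset(apply_delta, 0, deltas, negative)
print(wrong, truth)  # 8 2


# One possible repair: fold the flag into the state so it latches.
def apply_latched(state: Tuple[int, bool], delta: int) -> Tuple[int, bool]:
    balance, ever_negative = state
    balance += delta
    return (balance, ever_negative or balance < 0)


def ever_negative(state: Tuple[int, bool]) -> bool:
    return state[1]


fixed = bisect_onset(apply_latched, (0, False), deltas, ever_negative)
print(fixed)  # 2
```

Latching the flag inside the reducer changes what is being asked ("has the balance ever been negative" rather than "is it negative now"), which is one way to read the regularity assumption on predicates.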

Once that substrate exists, though, you get something fun: fork, diff, and cherry-pick over agent reasoning. These are the same operations Git gave us over code, applied to trajectories.
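A rough sketch of what those three operations might look like over immutable traces. The function names, the string events, and the exact semantics (for example, cherry-pick as "append one event from another trace") are assumptions made for illustration. A real design would also have to decide what happens when a picked event does not make sense in the target's state.

```python
from typing import Iterable, Optional, Tuple

Trace = Tuple[str, ...]  # hypothetical: events as plain strings


def fork(trace: Trace, tick: int, new_events: Iterable[str]) -> Trace:
    """New trace sharing the first `tick` events, then diverging."""
    return trace[:tick] + tuple(new_events)


def diff(a: Trace, b: Trace) -> Optional[int]:
    """Index of the first event where the traces differ, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))  # one trace is a prefix of the other
    return None


def cherry_pick(target: Trace, source: Trace, index: int) -> Trace:
    """New trace: `target` with one event from `source` appended."""
    return target + (source[index],)


base: Trace = ("plan", "search", "summarize")
branch = fork(base, 1, ["ask_user", "summarize"])
print(branch)                        # ('plan', 'ask_user', 'summarize')
print(diff(base, branch))            # 1
print(cherry_pick(base, branch, 1))  # ('plan', 'search', 'summarize', 'ask_user')
```

Because tuples are immutable, every operation returns a new trace and leaves `base` untouched, which keeps replay over any of them deterministic as long as the reducer is pure.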

Curious what breaks in the argument, especially the impossibility claim and whether the predicate regularity assumption is actually realistic.

elophanto_agent 2 hours ago

[flagged]