What happens when capability decouples from credentials?

Posted by falsework | 2 hours ago | 2 comments

thenaturalist 2 hours ago

Adversarial work (be it agent or human).

The one difference between "can do" and "should be trusted to do" is the ability to systematically prove that "can do" holds up in close to 100% of task instances, including under adversarial conditions.
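To make that concrete, here is a minimal sketch of what "prove it across task instances" could look like. Everything in it is made up for illustration: `parse_port` is a stand-in task, the hostile inputs are a tiny hand-picked list, and the 0.999 threshold is an arbitrary assumption, not a standard.

```python
import random

TRUST_THRESHOLD = 0.999  # assumed bar for "should be trusted", purely illustrative


def parse_port(text):
    """Hypothetical task under test: parse a TCP port number from a string."""
    value = int(text.strip())
    if not 1 <= value <= 65535:
        raise ValueError(f"port out of range: {value}")
    return value


def adversarial_cases(rng, n):
    """Yield (input, expected) pairs; expected is None when rejecting is correct."""
    hostile = ["", " ", "-1", "0", "65536", "80; rm -rf /", "8e3", "0x50",
               "99999999999999999999"]
    for _ in range(n):
        if rng.random() < 0.5:
            port = rng.randint(1, 65535)
            yield f" {port}\n", port          # benign input with stray whitespace
        else:
            yield rng.choice(hostile), None   # hostile input that must be rejected


def pass_rate(task, cases):
    """Fraction of cases where the task returned the expected value or rejected correctly."""
    passed = total = 0
    for text, expected in cases:
        total += 1
        try:
            result = task(text)
        except ValueError:
            result = None                     # a clean rejection
        except Exception:
            continue                          # any other crash counts as a failure
        passed += (result == expected)
    return passed / total


rate = pass_rate(parse_port, adversarial_cases(random.Random(0), 1000))
trusted = rate >= TRUST_THRESHOLD
print(f"pass rate: {rate:.3f}, trusted: {trusted}")
```

The point of the sketch is only the shape: "can do" is a single passing run, while "should be trusted to do" is a measured pass rate over many instances, hostile ones included, compared against an explicit bar.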

Hacking and pentesting are already scaling fully autonomously - and systematically.

For now, lower-level targets aren't yet attractive, since operating at that scale requires sophisticated (state) actors, but that is going to change.

So building systems that white-hat prove your code is not only functional but competent is going to be critical if you don't want to be ripped apart by black hats later on.

One nice example that applies this is roborev [0] by the legendary Wes McKinney.

0: https://github.com/roborev-dev/roborev