
Show HN: I decomposed 87 tasks to find where AI agents structurally collapse

Posted by XxCotHGxX | 3 hours ago | 1 comment

XxCotHGxX 3 hours ago

I’ve been tracking the "AI is taking all remote work" narrative, but it felt like the raw data was missing a major variable: Structural Complexity.

I audited 87 micro-tasks against the Scale AI RLI benchmarks and O*NET labor baselines. Decomposing 10 large projects into their atomic requirements, I found a sharp "Complexity Kink" at an instruction entropy of roughly 10^3. Below that threshold, agents are highly applicable; above it, their marginal productivity collapses and the "Human Agency Moat" actually widens.
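
The actual detection procedure is in the paper/repo; purely as illustration, here's one way to locate a kink like this, scanning candidate breakpoints of a two-segment linear fit of productivity against log10 entropy (all names here are illustrative, not the repo's code):

    import numpy as np

    def find_kink(log_entropy, productivity, candidates):
        """Grid-search a breakpoint for a two-segment linear fit.

        log_entropy: log10 of the instruction-entropy metric (defined below)
        productivity: per-task agent productivity score (illustrative)
        candidates: candidate breakpoints in log10 units, e.g. np.linspace(1, 5, 81)
        """
        best = None
        for b in candidates:
            left, right = log_entropy <= b, log_entropy > b
            if left.sum() < 3 or right.sum() < 3:
                continue  # need a few points on each side of the split
            sse = 0.0
            for mask in (left, right):
                coeffs = np.polyfit(log_entropy[mask], productivity[mask], 1)
                resid = productivity[mask] - np.polyval(coeffs, log_entropy[mask])
                sse += float(np.sum(resid ** 2))
            if best is None or sse < best[1]:
                best = (b, sse)
        return best  # (breakpoint, SSE); ~3.0 corresponds to the 10^3 threshold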

Technical details:

  - Metric: Instruction Entropy (E) = Solution_Tokens / Requirement_Tokens (computed in the snippet right after this list)
  - Model: Heckman two-step, correcting for benchmark selection bias (p = 0.012); rough sketch further below
  - Robustness: Rosenbaum sensitivity analysis (stable up to Gamma = 1.8)
  - Standardization: 5% winsorization to normalize for "human slop" in self-reported freelance durations (also in the snippet below)
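
Here's roughly what the metric and the winsorization step look like in code. The file and column names (tasks.csv, solution_tokens, requirement_tokens, reported_duration_hrs) are placeholders rather than the repo's actual schema, and I'm assuming the 5% trim is applied to both tails:

    import pandas as pd
    from scipy.stats.mstats import winsorize

    # Illustrative schema: one row per micro-task, with token counts and
    # the freelancer's self-reported duration.
    tasks = pd.read_csv("tasks.csv")

    # Instruction Entropy (E) = Solution_Tokens / Requirement_Tokens
    tasks["entropy"] = tasks["solution_tokens"] / tasks["requirement_tokens"]

    # 5% winsorization of self-reported durations to blunt "human slop"
    # outliers (assumed two-tailed here).
    tasks["duration_w"] = winsorize(tasks["reported_duration_hrs"].to_numpy(),
                                    limits=[0.05, 0.05])

    # Tasks above the ~10^3 complexity kink
    tasks["above_kink"] = tasks["entropy"] > 1e3
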
The full paper (PDF) and code are in the repo.
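
The selection correction has roughly this shape, continuing from the snippet above (in_benchmark and agent_productivity are illustrative names; the fitted model and exclusion restrictions are in the repo):

    import numpy as np
    import statsmodels.api as sm
    from scipy.stats import norm

    # Step 1: probit for whether a task was selected into the benchmark.
    X_sel = sm.add_constant(tasks[["entropy", "duration_w"]])
    probit = sm.Probit(tasks["in_benchmark"], X_sel).fit(disp=0)
    xb = np.dot(X_sel, probit.params)             # linear index
    tasks["mills"] = norm.pdf(xb) / norm.cdf(xb)  # inverse Mills ratio

    # Step 2: outcome regression on the selected tasks only, with the
    # Mills ratio as an extra regressor to absorb the selection bias.
    sel = tasks["in_benchmark"] == 1
    X_out = sm.add_constant(tasks.loc[sel, ["entropy", "mills"]])
    outcome = sm.OLS(tasks.loc[sel, "agent_productivity"], X_out).fit()
    print(outcome.summary())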

I'd love to hear from people building production agents—are you seeing this 10^3 threshold in your own error logs, or am I over-simplifying the "connective tissue" problem?