
More tokens, less cost: why optimizing for token count is wrong

Posted by nicola_alessi | 2 hours ago | 5 comments

alexbuiko 2 hours ago

This is a brilliant breakdown of the 'Token Mix' paradox. It aligns perfectly with what we’ve been seeing while developing SDAG.

When you optimize for a structured context payload (like your dependency graph), you aren't just hitting Anthropic's prompt cache; you are also reducing routing entropy at the inference level. High-noise inputs force the model into 'exploratory' output paths, which is expensive not just in dollars but in hardware stress.
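A back-of-the-envelope sketch of the caching effect described here: a stable structured prefix that hits a prompt cache vs. re-sending the same tokens uncached on every call. The rates and the discount multiplier are illustrative assumptions for the sketch, not actual Anthropic pricing:

```python
# Illustrative cost comparison for a reusable structured prefix.
# BASE_RATE and CACHE_READ_MULT are assumptions, not real pricing.

BASE_RATE = 3.00 / 1_000_000   # assumed $ per input token
CACHE_READ_MULT = 0.1          # assumed discount for cache-read tokens

def call_cost(prefix_tokens: int, fresh_tokens: int, cached: bool) -> float:
    """Cost of one call: a reusable prefix plus fresh per-call tokens."""
    prefix_rate = BASE_RATE * (CACHE_READ_MULT if cached else 1.0)
    return prefix_tokens * prefix_rate + fresh_tokens * BASE_RATE

# 50k-token dependency-graph prefix, 2k tokens of per-call query, 100 calls.
uncached = sum(call_cost(50_000, 2_000, cached=False) for _ in range(100))
# First call writes the cache at the uncached rate; the rest read it.
cached = call_cost(50_000, 2_000, cached=False) + \
         sum(call_cost(50_000, 2_000, cached=True) for _ in range(99))
print(f"uncached: ${uncached:.2f}  cached: ${cached:.2f}")
```

Under these assumed rates the cached run is roughly 7x cheaper, which is why a stable prefix (rather than a minimal one) can win on cost even with more total tokens.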

We found that 'verbose orientation narration' (the thinking-out-loud part) correlates with higher entropy spikes in memory access. By tightening the input signal-to-noise ratio, you're essentially stabilizing the model's internal routing. Have you noticed any changes in latency variance (jitter) between the pre-indexed and ad-hoc runs? In our tests, lower entropy usually leads to much more predictable TTFT (Time To First Token).
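The jitter comparison asked about here could be harnessed roughly like this. `fake_stream` is a hypothetical stand-in for whatever streaming client is actually in use; only the timing logic is the point:

```python
import statistics
import time

def measure_ttft(stream) -> float:
    """Seconds from request start until the first streamed chunk arrives."""
    start = time.perf_counter()
    next(iter(stream))          # block until the first token
    return time.perf_counter() - start

def ttft_stats(samples: list[float]) -> tuple[float, float]:
    """Mean TTFT and its standard deviation (jitter) across runs."""
    return statistics.mean(samples), statistics.stdev(samples)

def fake_stream(delay: float):
    """Hypothetical stand-in for a streaming API call: sleep, then yield."""
    time.sleep(delay)
    yield "first-token"
    yield "rest"

# Compare a pre-indexed (structured) prompt against an ad-hoc (noisy) one.
preindexed = [measure_ttft(fake_stream(0.01)) for _ in range(5)]
adhoc = [measure_ttft(fake_stream(0.02)) for _ in range(5)]
for name, runs in [("pre-indexed", preindexed), ("ad-hoc", adhoc)]:
    mean, jitter = ttft_stats(runs)
    print(f"{name}: mean TTFT {mean*1000:.1f} ms, jitter {jitter*1000:.2f} ms")
```

With a real client you would swap `fake_stream` for the actual streaming call; the jitter figure (stdev of TTFT) is what you'd compare between the two prompt styles.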

gnabgib 2 hours ago

You're overdoing the self-promotion (this is the 7th time you've submitted vexp); share something with us that you're curious about that you didn't build.

> Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity.

https://news.ycombinator.com/newsguidelines.html

verdverm 24 minutes ago

tl;dr AGENTS.md and the Anthropic post about putting MCPs behind search are a winning idea right now