Quantifying LLM Cost Savings from Cache-Aware Inference Routing

Posted by zxy-action | 2 hours ago | 1 comment

zxy-action 2 hours ago

I’m the founder of Auriko. We ran this study to measure how much cache-aware LLM routing can reduce inference costs.

Critique welcome.