↑
Quantifying LLM Cost Savings from Cache-Aware Inference Routing
Posted by
zxy-action
|
2 hours ago |
1 comments
zxy-action 2 hours ago
I’m the founder of Auriko. We ran this study to measure how much cache-aware llm routing can reduce inference costs.
Critique welcome.