LRU cache - Rushi's

Jun 06, 2026

Your agentic app just ran a search. The tool returned 500 results as JSON. Your agent appended all of it and fired off an API call — 45,000 tokens to answer a question that needed maybe 4,500. Tejas Manohar, a senior engineer at Netflix, hit this problem every day. He was running out of tokens […]

Ctrl+AI+Ship

Tag: LRU cache

Headroom: Cutting LLM Token Costs Without Cutting Answers