Most of the text flowing through an agent’s context window isn’t code, reasoning, or instructions. It’s logs.

The problem nobody talks about

Here’s something I’ve been noticing while watching AI coding agents work. You ask Cursor or Claude Code to fix a failing test. The agent runs the test suite. The test […]


Large Language Models (LLMs) all predict text, but they differ a lot in how they follow instructions, use context, handle tools, and optimize for safety, speed, or cost. If you treat them as interchangeable, you’ll ship brittle prompts. If you treat them as different runtimes with different affordances, you’ll get reliable results. This post explains the major differences across […]
