Pillar 3 of 4

Cost: the bill is the truth.

A model that ships is a model someone is paying for. Cost is the C that decides whether your agent stays in production after the first invoice arrives.

Why cost stops being a footnote

In analytics, cost showed up once a month and someone optimised a query. In agentic AI, cost shows up every time a user hits enter. A loop that calls a frontier model five times for a single task pays for that task roughly five times over, and at volume that can quietly turn a $59 customer into a $590 loss. The teams that survive the next two years are the ones who treat cost as a first-class design constraint, not a Q4 cleanup.

Where the money actually goes

  • Tokens: input usually dominates, because long contexts are re-sent and paid for on every turn.
  • Loops: agents that retry, replan, or self-critique multiply spend per task.
  • Tool calls: retrieval, code execution, and external APIs each carry their own bill.
  • Latency tax: faster models cost more; slower ones cost users.
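To make the list above concrete, here is a minimal sketch of a per-task cost estimate. The prices, token counts, and the names Price, call_cost, and task_cost are illustrative assumptions, not figures or APIs from any provider or from the book. The sketch assumes each turn re-sends the original context plus everything generated so far, which is how most chat-style agents accumulate input tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical prices in USD per million tokens (not real provider rates)."""
    input_per_m: float
    output_per_m: float


def call_cost(price: Price, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single model call."""
    return (input_tokens * price.input_per_m
            + output_tokens * price.output_per_m) / 1_000_000


def task_cost(price: Price, context_tokens: int, output_tokens: int,
              turns: int, tool_cost_per_turn: float = 0.0) -> float:
    """Cost of a task whose every turn re-sends the context plus prior outputs."""
    total = 0.0
    history = context_tokens
    for _ in range(turns):
        total += call_cost(price, history, output_tokens) + tool_cost_per_turn
        history += output_tokens  # the next turn pays for this output as input
    return total


# Assumed numbers: a 20,000-token context and 500 output tokens per turn.
frontier = Price(input_per_m=3.0, output_per_m=15.0)
single_turn = task_cost(frontier, 20_000, 500, turns=1)
five_turn_loop = task_cost(frontier, 20_000, 500, turns=5)
print(f"{single_turn:.4f}")     # 0.0675
print(f"{five_turn_loop:.4f}")  # 0.3525
```

Under these assumptions the five-turn loop costs a little more than five times the single call, because every extra turn also pays again for the growing history.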

Right-sized intelligence

Not every task needs the best model. Routing — small model first, escalate on uncertainty — keeps quality high and the bill low. Cascades and verifier loops let you spend on the answers that matter and save on the ones that do not. The book lays out a routing pattern that any team can ship in two weeks.
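A minimal sketch of "small model first, escalate on uncertainty" might look like the following. It is not the routing pattern from the book. The Answer type, the route function, the 0.8 threshold, and both stub models are hypothetical stand-ins, and how a real system estimates confidence (a verifier, log-probabilities, a self-check) is left open.

```python
from typing import Callable, NamedTuple


class Answer(NamedTuple):
    text: str
    confidence: float  # 0.0 to 1.0, estimated however the caller chooses
    cost_usd: float


# A "model" here is any callable that takes a prompt and returns an Answer.
Model = Callable[[str], Answer]


def route(prompt: str, small: Model, large: Model,
          threshold: float = 0.8) -> tuple[Answer, float]:
    """Try the small model first; escalate only if its confidence is low.

    Returns the chosen answer and the total spend, including the
    small-model call that was paid for before escalating.
    """
    first = small(prompt)
    if first.confidence >= threshold:
        return first, first.cost_usd
    second = large(prompt)
    return second, first.cost_usd + second.cost_usd


# Stub models for illustration only: the small one is "unsure" on long prompts.
def small_stub(prompt: str) -> Answer:
    confidence = 0.95 if len(prompt) < 40 else 0.4
    return Answer("small: " + prompt[:20], confidence, 0.001)


def large_stub(prompt: str) -> Answer:
    return Answer("large: " + prompt[:20], 0.99, 0.03)


easy_answer, easy_cost = route("What is 2 + 2?", small_stub, large_stub)
hard_answer, hard_cost = route(
    "Summarise the indemnity clauses across these three contracts.",
    small_stub, large_stub)
```

The design choice worth noting is that escalation pays for both calls, so the threshold only saves money if the small model handles most traffic on its own.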

Budget-aware design

Give every agent a budget the way you would give a junior employee a credit card. Per-task, per-day, per-tenant. Refuse gracefully when the budget runs out. Watch how this single discipline changes how your team designs prompts, tools, and retrieval.
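One way to sketch this discipline is a small guard that every model or tool call must pass through. The BudgetGuard class, its limits, and the refusal message are illustrative assumptions. The sketch enforces a per-task cap and a per-tenant-per-day cap in memory only, where a production system would need shared, persistent counters.

```python
from collections import defaultdict
from datetime import date


class BudgetGuard:
    """Tracks spend per task and per (tenant, day) against fixed USD limits."""

    def __init__(self, per_task_usd: float, per_tenant_day_usd: float) -> None:
        self.per_task_usd = per_task_usd
        self.per_tenant_day_usd = per_tenant_day_usd
        self._task_spend = defaultdict(float)  # task_id -> USD spent
        self._day_spend = defaultdict(float)   # (tenant, day) -> USD spent

    def try_spend(self, tenant: str, task_id: str,
                  cost_usd: float, today: date) -> bool:
        """Record the spend and return True if it fits both limits.

        If either limit would be exceeded, record nothing and return False.
        """
        day_key = (tenant, today)
        if self._task_spend[task_id] + cost_usd > self.per_task_usd:
            return False
        if self._day_spend[day_key] + cost_usd > self.per_tenant_day_usd:
            return False
        self._task_spend[task_id] += cost_usd
        self._day_spend[day_key] += cost_usd
        return True


def run_step(guard: BudgetGuard, tenant: str, task_id: str,
             estimated_cost_usd: float, today: date) -> str:
    """Refuse gracefully instead of raising when the budget runs out."""
    if not guard.try_spend(tenant, task_id, estimated_cost_usd, today):
        return "Budget reached; returning the best result so far."
    return "step executed"  # placeholder for the real model or tool call


guard = BudgetGuard(per_task_usd=0.10, per_tenant_day_usd=0.15)
day = date(2024, 1, 1)
print(run_step(guard, "acme", "task-1", 0.06, day))  # step executed
print(run_step(guard, "acme", "task-1", 0.06, day))  # Budget reached; ...
```

Checking the estimate before the call, rather than reconciling afterwards, is what lets the agent stop with a useful partial answer instead of an overrun.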

Chapters in this pillar

Go deeper than a page.

The Context Advantage is the full 31-chapter living book on Context, Control, Cost, and Choice — written for data + AI professionals.