Chapter 03, Part 1

Smart Models Still Need Smart Systems


Story

The demo had been spectacular. A single chat box, an entire bank's data behind it, answers in under two seconds. Two months later, the same chat box was wired into production and a customer service rep watched it confidently tell a customer she was approved for a loan top-up.

She was not. She had been flagged by the risk team six days earlier. That flag lived in a different system, behind a different login, owned by a different group. Nobody had told the agent that system existed. The agent had not lied. It had simply done its job inside the small slice of the world it could see — and the world was much bigger than that slice.

Nothing in the model was broken. Everything in the enterprise was. The room afterward did not argue about the model. They argued about the map.

The model was a brilliant guest in a house with twenty rooms, no map, and several locked doors. It had walked through the unlocked rooms confidently. It had no idea the locked rooms existed. Most enterprise AI projects, regardless of the model they pick, are some version of that same guest.

The real problem

The demo worked because the demo was a small world. One database. One set of rules. One user. One question at a time.

Your company is the opposite. Dozens of source systems. Conflicting definitions. Role-based access. Regulatory rules that change by country. Approval workflows that depend on amount, region, and customer tier. Legacy data that no one fully owns.

A model can be brilliant and still be useless inside that complexity, because intelligence is not the bottleneck. Integration with the real organization is.

The simple idea

The agentic era needs more than smart models. It needs four foundations working together.

Context, so the agent understands what your business means. Control, so the agent acts safely inside your rules. Cost, so the agent does not quietly burn your budget. Choice, so you are not locked into one vendor's future.

In this book we call these the four C's. They are not a product. They are not a framework you bolt on at the end. They are the decisions that determine whether your AI is trusted, safe, affordable, and future-ready, or none of those things.

From the field

A North American insurer ran a six-month review of every AI initiative across the company. Twenty-two projects. Some shipped, some stalled, some quietly abandoned.

The team scored each project on a one-to-five scale across the four C's. Every project that shipped successfully had scored at least a three on all four. Every project that stalled had at least one C at one or two. The pattern was not subtle: a project could survive a weak model, but not a weak foundation.

The most painful case was a fraud detection agent that scored a five on context (deep domain knowledge), a five on cost (efficient model routing), a four on choice (open formats), and a one on control. It was technically excellent and operationally ungovernable. The compliance team killed it two weeks before launch. The team had built the right model on the wrong foundation, and nobody had scored that foundation in time.

After the review, the insurer made the four-C score a required field in every AI funding request. Stalls dropped. Honest debates went up. Vanity projects had nowhere to hide.
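The scoring rule from that review is simple enough to write down. What follows is a minimal sketch, not the insurer's actual tooling: the function name `weak_foundations` and the constant `MIN_SCORE` are made up for illustration, and the bar of three is simply the pattern the review observed. The sample scores are the fraud detection agent's from the case above.

```python
from typing import Dict, List

FOUR_CS = ("context", "control", "cost", "choice")
MIN_SCORE = 3  # assumption: the bar observed in the review (shipped projects scored >= 3 on all four)


def weak_foundations(scores: Dict[str, int]) -> List[str]:
    """Return the C's that score below the bar, in a fixed order."""
    missing = [c for c in FOUR_CS if c not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    for c in FOUR_CS:
        if not 1 <= scores[c] <= 5:
            raise ValueError(f"{c} must be scored from 1 to 5, got {scores[c]}")
    return [c for c in FOUR_CS if scores[c] < MIN_SCORE]


# The fraud detection agent: strong everywhere except control.
fraud_agent = {"context": 5, "control": 1, "cost": 5, "choice": 4}
print(weak_foundations(fraud_agent))  # ['control']
```

A funding request form could run the same check and refuse to submit until every weak C has an owner and a plan.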

Why this matters now

Three years ago the conversation was: which model is best? Today it has shifted to: which model is appropriate for which task, what is it allowed to do, how much will it cost at scale, and what happens when a better model comes along next quarter?

Model quality is converging fast. The differences between top models, for most enterprise tasks, are smaller than they have ever been. That means the source of competitive advantage moves elsewhere. It moves to the four C's.

A real-world example

Consider an insurance company rolling out an agent to help adjusters triage claims. Context means the agent knows the difference between a fender bender and a total loss in your policy language, not generic language. Control means the agent can recommend but never auto-approve payouts above a threshold. Cost means simple claims are handled by a small model and only complex ones reach the expensive model. Choice means if a better claims model launches next year, you can swap it in without rewriting your workflows.

Take any one of those C's away and the project either fails, becomes dangerous, or becomes unaffordable.
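Three of those rules can be sketched in a few lines. This is an illustration under assumptions, not a real claims system: the threshold value, the model names, and the `Claim` fields are all hypothetical placeholders. Context is left out because it lives in your policy language, not in a few lines of code.

```python
from dataclasses import dataclass

# Hypothetical values for illustration; real ones come from your policy and vendors.
AUTO_APPROVE_LIMIT = 5_000.0        # control: payouts above this always go to a human
SMALL_MODEL = "small-claims-model"  # choice: model names live in config, so they can be swapped
LARGE_MODEL = "large-claims-model"


@dataclass
class Claim:
    claim_id: str
    payout: float
    is_complex: bool


def pick_model(claim: Claim) -> str:
    """Cost: only complex claims reach the expensive model."""
    return LARGE_MODEL if claim.is_complex else SMALL_MODEL


def decide(claim: Claim) -> dict:
    """Control: recommend only, never auto-approve, above the threshold."""
    needs_human = claim.payout > AUTO_APPROVE_LIMIT
    return {
        "claim_id": claim.claim_id,
        "model": pick_model(claim),
        "action": "recommend_to_adjuster" if needs_human else "auto_approve",
    }


print(decide(Claim("c-1", payout=12_000.0, is_complex=True)))
print(decide(Claim("c-2", payout=800.0, is_complex=False)))
```

Keeping the threshold and the model names outside the decision logic is the point: compliance can change one and procurement can change the other without anyone rewriting the workflow.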

Going deeper

Why these four and not, say, six or twelve? Because the four C's map cleanly onto the four questions every executive sponsor and every regulator will eventually ask.

Does it understand our business (context)? Will it stay inside the lines (control)? Can we afford it at scale (cost)? Can we change our mind (choice)? Every other concern — accuracy, latency, security, fairness, observability — folds neatly into one of these four. Accuracy without context is fluency. Security without control is paperwork. Latency without cost discipline is a budget bomb. Observability without choice is a vendor's audit trail, not yours.

The four C's also map onto the four organizational muscles you need: data and knowledge work for context, governance and security for control, FinOps and platform engineering for cost, and architecture and procurement for choice. If any of those muscles is weak in your company, you can predict which C will be your weakest. Most leaders can name it before they finish reading this paragraph.

The deepest reason for four is simpler still: it fits on a slide, and it survives translation across functions. A framework that the CFO, the CISO, and the head of platform can all hold in their heads at the same time is a framework that gets used. That is worth more than completeness.

What this means for data professionals

Data engineers will spend more time shaping context — the metric layer, the lineage, the business glossary — and less time hand-cranking pipelines.

AI engineers will think less about prompts and more about the trust path: what the agent can see, what it can do, when it must ask a human.

Architects will design for portability, treating today's model as one of many, not the foundation of the stack.

Governance teams will move from policy writers to policy operators, with their rules running in the system, not sitting in a PDF.

Data leaders will translate the four C's into board language so AI investment stops being a leap of faith.

Architecture thinking

Picture the agent's life in four checkpoints.

First it gathers context: what does this question mean in our business? Second it checks control: am I allowed to take this action for this user? Third it picks the right model for the job, balancing quality and cost. Fourth it records what it did, so you can audit it and, if needed, switch tools later without losing history.

Every serious enterprise AI system, no matter the vendor, is some version of this loop. The teams who design it on purpose move faster than the teams who discover it by accident.
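The four checkpoints can be sketched as a single request handler. Treat this as a toy under stated assumptions, not a reference design: the `Agent` and `Request` classes, the glossary lookup, the model names, and the routing rule are all invented for illustration, and a real system would call actual retrieval, policy, and model services at each step.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Request:
    user_role: str
    action: str
    question: str


@dataclass
class Agent:
    glossary: Dict[str, str]          # context: lowercase business term -> meaning
    permissions: Dict[str, Set[str]]  # control: role -> actions that role may take
    audit_log: List[dict] = field(default_factory=list)

    def handle(self, req: Request) -> str:
        # 1. Context: find the business terms this question touches.
        terms = [t for t in self.glossary if t in req.question.lower()]

        # 2. Control: is this role allowed to take this action?
        allowed = req.action in self.permissions.get(req.user_role, set())

        # 3. Model: placeholder routing rule (more terms -> larger model).
        model: Optional[str] = None
        if allowed:
            model = "large-model" if len(terms) > 1 else "small-model"

        # 4. Record: log every request, including the refused ones.
        outcome = "answered" if allowed else "refused"
        self.audit_log.append({
            "role": req.user_role, "action": req.action,
            "terms": terms, "model": model, "outcome": outcome,
        })
        return outcome


agent = Agent(
    glossary={"top-up": "additional lending on an existing loan"},
    permissions={"rep": {"lookup"}},
)
print(agent.handle(Request("rep", "approve_loan", "Is she approved for a top-up?")))  # refused
print(agent.handle(Request("rep", "lookup", "What is a top-up?")))                    # answered
```

Note that the refused request still lands in the audit log. A record that only covers the successes cannot tell you what the agent was asked to do and declined.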

Common mistake

The most common mistake leaders make is to treat AI as a model problem. They pick a vendor, fund a proof of concept, and assume the rest is integration.

The model is rarely where the project fails. Projects fail at the seams: missing meaning, missing guardrails, runaway cost, and the slow realization that they cannot move.

Anti-patterns to watch for
  • The model bake-off as strategy

    Months spent comparing benchmarks while context, control, cost, and choice get zero attention. The winning model still cannot ship.

  • Hidden owners

    No one person can be named for any of the four C's. Decisions stall and nobody knows why.

  • POC tunnel vision

    A proof of concept that ignores production realities (auth, observability, cost ceilings) and is then expected to scale unchanged.

  • The 'we will deal with that later' clause

    Governance, FinOps, or portability deferred to phase two. Phase two arrives as a regulator, a finance review, or a vendor renewal.

  • Treating the framework as a kickoff slide

    Scored once at launch, never revisited. A foundation you do not maintain becomes a foundation that cracks.

A better way

Before picking a model, write one page on each of the four C's for your use case. What context does this agent need? What actions is it allowed to take? What is the budget per thousand interactions? What is the exit plan if you change vendors in two years?

If you cannot write those four pages, do not build the agent yet. Fix the foundations first, and the model choice becomes almost easy.
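The cost page in particular benefits from a number. Here is a back-of-the-envelope sketch of a budget per thousand interactions when traffic is split between a small and a large model. Every figure in it is a made-up placeholder, not a real price, and the function name `cost_per_thousand` is hypothetical.

```python
def cost_per_thousand(small_share: float, small_cost: float, large_cost: float) -> float:
    """Blended cost of 1,000 interactions.

    small_share: fraction of interactions handled by the small model (0 to 1)
    small_cost, large_cost: cost of one interaction on each model
    """
    if not 0.0 <= small_share <= 1.0:
        raise ValueError("small_share must be between 0 and 1")
    blended = small_share * small_cost + (1.0 - small_share) * large_cost
    return 1000 * blended


# Hypothetical inputs: 80% of traffic on a model costing 0.002 per interaction,
# 20% on one costing 0.05, against a budget of 15 per thousand.
estimate = cost_per_thousand(0.8, 0.002, 0.05)
budget = 15.0
print(f"{estimate:.2f}", estimate <= budget)  # 11.60 True
```

The useful exercise is moving `small_share`: if routing degrades and half the traffic reaches the large model, the same inputs give 26 per thousand, well over the budget.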

How you know it is working

Green flags
  • Every AI project has a named owner for each of the four C's.
  • Funding requests include a current score on each C.
  • Weak scores trigger investment, not denial of the problem.
  • The same framework is used in engineering reviews and board updates.

Red flags
  • AI roadmap conversations turn into model name debates within ten minutes.
  • No one can state the per-interaction cost target of any live agent.
  • Vendor lock-in is something the team jokes about, not plans around.
  • Governance is something that happens to the team, not with it.

A simple checklist
  • Have we written down the context this agent depends on?
  • Have we defined the actions it can and cannot take?
  • Do we have a target cost per interaction?
  • Do we know how we would replace the model in twelve months?
  • Does one named person own each of the four C's for this project?

Mental model

Context is the map. Control is the seatbelt. Cost is the meter. Choice is the exit door.

Quotable line

Enterprise AI does not fail because the model is weak. It fails because the four foundations were never built.

The practical takeaway

Smart models are now table stakes. The teams that win will be the ones who treat context, control, cost, and choice as design problems from day one — not as cleanup work after the demo.

The rest of this book is a tour of those four foundations, what they look like in practice, and how your role inside them is changing.

Reflection questions
  1. Which of the four C's is strongest in your current AI projects? Which is weakest?
  2. Who in your organization owns each C today — or is it nobody?
  3. What is one small experiment you could run this quarter to strengthen the weakest C?