The Gap Between Thinking and Doing
Most modern AI agent architectures follow the same pattern: a language model reasons about what to do, then an execution layer does it. The problem is that most teams treat the execution layer as a thin wrapper, little more than a function-call dispatcher. This is a fundamental architectural mistake that leads to non-determinism, drift, and failure at scale.
Consider what happens when an LLM decides to deploy a service. The model outputs something like {"action": "deploy", "target": "prod", "version": "2.4.1"}. In a naive implementation, this gets dispatched directly to your deployment system. But what validates that this action is safe at this point in the sequence? What ensures the model didn’t skip the staging step? What guarantees the same deployment will happen the same way tomorrow?
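To make the problem concrete, here is a minimal sketch of the naive dispatch described above. The `AgentAction` type, the `handlers` table, and the in-memory `deployments` log are hypothetical stand-ins for illustration, not part of any real deployment system.

```typescript
// Hypothetical shape of the action the model emits (mirrors the JSON above).
interface AgentAction {
  action: string;
  target: string;
  version: string;
}

// In-memory stand-in for a real deployment system (assumption for illustration).
const deployments: string[] = [];

// Thin wrapper: one handler per action name, with no notion of prior steps.
const handlers: Record<string, (a: AgentAction) => void> = {
  deploy: (a) => {
    deployments.push(a.version + "@" + a.target);
  },
};

function dispatch(raw: string): void {
  const parsed = JSON.parse(raw) as AgentAction;
  const handler = handlers[parsed.action];
  if (handler === undefined) {
    throw new Error("unknown action: " + parsed.action);
  }
  // Nothing here checks that a build, a test run, or a staging deploy came first.
  handler(parsed);
}

dispatch('{"action": "deploy", "target": "prod", "version": "2.4.1"}');
```

The call succeeds and records a production deployment even though no earlier step ever ran, which is exactly the gap the questions above point at.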
Why Thin Wrappers Fail
The thin-wrapper approach treats agent execution like a REST API: request in, action out. But agent executions are not individual requests — they are directed acyclic graphs (DAGs) of dependent operations. Each step may produce side effects that constrain what can happen next. A deployment must follow a build. A database migration must precede a schema-dependent query.
When you let an LLM directly orchestrate these dependencies, you inherit all the non-determinism of the model. The model might choose different orderings on different runs. It might hallucinate intermediate steps. It might skip validation entirely because its training data included examples where validation was optional.
The LLM should decide what to do. A deterministic control plane should decide how to do it, in what order, with what validations.
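One way to read this split, sketched here under assumptions: the model contributes only an intent, and a table owned by the control plane supplies the ordered steps. The `Intent` type, the `PIPELINES` registry, and `planFor` are hypothetical names chosen for this example.

```typescript
// What the model is allowed to decide (hypothetical shape).
interface Intent {
  goal: "deploy";
  target: "staging" | "prod";
  version: string;
}

// How each goal is carried out is owned by the control plane, not the model.
// The step names are illustrative.
const PIPELINES: Record<string, string[]> = {
  "deploy:staging": ["build", "test", "stage"],
  "deploy:prod": ["build", "test", "stage", "validate", "promote"],
};

// Same intent in, same ordered plan out, however the model phrased its request.
function planFor(intent: Intent): string[] {
  const key = intent.goal + ":" + intent.target;
  const steps = PIPELINES[key];
  if (steps === undefined) {
    throw new Error("no pipeline registered for " + key);
  }
  return steps.slice(); // copy so callers cannot mutate the registry
}

const prodPlan = planFor({ goal: "deploy", target: "prod", version: "2.4.1" });
```

Because the ordering lives in the registry, the model cannot skip staging by omission; it can only ask for a goal the control plane knows how to carry out.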
Execution Graphs as First-Class Citizens
The solution is to model agent executions as explicit graphs. Each node represents an atomic action with defined inputs, outputs, preconditions, and postconditions. Edges represent data flow and dependency constraints. The graph is validated before execution begins, and each step is verified against its postconditions before proceeding to the next.
This is not a new concept in systems engineering. Kubernetes controllers, Terraform plans, and database transaction managers all implement versions of this pattern. The innovation is applying it to the AI agent domain, where the plan comes from an LLM and carries inherent uncertainty.
```typescript
// Define the execution graph.
// The actions (buildImage, runTests, ...) and the precondition predicates
// (codeCommitted, buildSucceeded, ...) are assumed to be defined elsewhere.
const deployGraph = graph({
  steps: [
    { id: "build", action: buildImage, pre: [codeCommitted] },
    { id: "test", action: runTests, pre: [buildSucceeded], after: "build" },
    { id: "stage", action: deployStage, pre: [testsGreen], after: "test" },
    { id: "validate", action: smokeTest, pre: [stageHealthy], after: "stage" },
    { id: "promote", action: deployProd, pre: [smokeGreen], after: "validate" },
  ],
  rollback: rollbackPipeline,
});

// Agent proposes, graph validates and executes
const result = await sudoexec.run(deployGraph, agentPlan);
```
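The snippet above does not show how a runner would enforce those conditions. What follows is a minimal sketch of one possibility, not the `sudoexec` implementation: `State`, `Step`, and `runInOrder` are hypothetical names, the steps are assumed to be in dependency order already, and the actions are synchronous and simulated to keep the example short.

```typescript
// Shared state that actions read and update (hypothetical).
interface State {
  codeCommitted?: boolean;
  buildSucceeded?: boolean;
  testsGreen?: boolean;
  stageHealthy?: boolean;
}

interface Step {
  id: string;
  pre: (s: State) => boolean;  // must hold before the action runs
  action: (s: State) => void;  // the side effect, simulated here
  post: (s: State) => boolean; // must hold before the next step starts
}

interface RunResult {
  ok: boolean;
  completed: string[];
  failedAt?: string;
}

// Runs steps in order and stops at the first violated condition.
function runInOrder(steps: Step[], state: State): RunResult {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.pre(state)) {
      return { ok: false, completed: completed, failedAt: step.id };
    }
    step.action(state);
    if (!step.post(state)) {
      return { ok: false, completed: completed, failedAt: step.id };
    }
    completed.push(step.id);
  }
  return { ok: true, completed: completed };
}

const pipeline: Step[] = [
  {
    id: "build",
    pre: (s) => s.codeCommitted === true,
    action: (s) => { s.buildSucceeded = true; },
    post: (s) => s.buildSucceeded === true,
  },
  {
    id: "test",
    pre: (s) => s.buildSucceeded === true,
    action: (s) => { s.testsGreen = false; }, // simulate a failing test run
    post: (s) => s.testsGreen === true,
  },
  {
    id: "stage",
    pre: (s) => s.testsGreen === true,
    action: (s) => { s.stageHealthy = true; },
    post: (s) => s.stageHealthy === true,
  },
];

const outcome = runInOrder(pipeline, { codeCommitted: true });
```

In this run the simulated test step fails its postcondition, so the runner stops before staging. A fuller runner would call the graph's rollback handler at that point.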
The Control Plane Pattern
A proper agent control plane provides three guarantees that a thin wrapper cannot:
- ✓ Structural validity — The execution plan must form a valid DAG. No cycles, no missing dependencies, no orphaned steps.
- ✓ Semantic validity — Each step’s preconditions must be satisfiable by the outputs of its predecessors. The graph must be type-safe end to end.
- ✓ Execution fidelity — Once validated, the graph executes identically every time. Same inputs, same outputs, same side effects. This is what makes replay possible.
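The structural check in the first guarantee can be sketched with a standard topological sort (Kahn's algorithm). `PlanStep` and `validateStructure` are hypothetical names, and the sketch covers only duplicate ids, missing dependencies, and cycles; it does not attempt the semantic checks from the second guarantee.

```typescript
interface PlanStep {
  id: string;
  after: string[]; // ids of the steps this one depends on
}

// Returns a list of problems; an empty list means the plan is a valid DAG.
function validateStructure(plan: PlanStep[]): string[] {
  const errors: string[] = [];
  // Prototype-free objects, so ids like "toString" cannot collide with builtins.
  const indegree: Record<string, number> = Object.create(null);
  const dependents: Record<string, string[]> = Object.create(null);

  for (const step of plan) {
    if (indegree[step.id] !== undefined) {
      errors.push("duplicate step: " + step.id);
    }
    indegree[step.id] = 0;
    dependents[step.id] = [];
  }
  for (const step of plan) {
    for (const dep of step.after) {
      const list = dependents[dep];
      if (list === undefined) {
        errors.push("missing dependency: " + step.id + " -> " + dep);
        continue;
      }
      list.push(step.id);
      indegree[step.id] = (indegree[step.id] || 0) + 1;
    }
  }
  if (errors.length > 0) {
    return errors; // the cycle check assumes unique ids and known dependencies
  }

  // Kahn's algorithm: repeatedly remove steps that have no unmet dependencies.
  const ready: string[] = [];
  for (const step of plan) {
    if (indegree[step.id] === 0) ready.push(step.id);
  }
  let visited = 0;
  while (ready.length > 0) {
    const id = ready.pop() as string;
    visited += 1;
    for (const next of dependents[id] || []) {
      indegree[next] = (indegree[next] || 0) - 1;
      if (indegree[next] === 0) ready.push(next);
    }
  }
  if (visited !== plan.length) {
    errors.push("cycle detected");
  }
  return errors;
}

const validPlan: PlanStep[] = [
  { id: "build", after: [] },
  { id: "test", after: ["build"] },
  { id: "stage", after: ["test"] },
];
const cyclicPlan: PlanStep[] = [
  { id: "a", after: ["b"] },
  { id: "b", after: ["a"] },
];
const validErrors = validateStructure(validPlan);   // no problems found
const cyclicErrors = validateStructure(cyclicPlan); // reports a cycle
```

Running this check before any action executes is what lets the control plane reject a malformed plan from the model without producing side effects.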
Without these guarantees, you don’t have an agent system. You have a chatbot with a shell prompt. The distinction matters when your agent is managing production infrastructure, processing financial transactions, or orchestrating multi-service deployments.
Implications for Agent Architecture
Adopting a control plane pattern fundamentally changes how you build agent systems. The LLM’s role shrinks to intent extraction and plan proposal. The control plane handles validation, ordering, execution, rollback, and observability. This separation of concerns makes each component independently testable, deployable, and scalable.
Most importantly, it makes your agent system auditable. Every execution produces a complete trace: what was proposed, what was validated, what was executed, and what the outcome was. In regulated industries, this is not optional — it is the price of admission.
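As an illustration of such a trace, the sketch below records the four facts named above in a single structure. The field names, the `buildTrace` helper, and the three outcome labels are assumptions; a real system would choose its own schema and storage.

```typescript
type StepOutcome = "succeeded" | "failed";

interface ExecutedStep {
  id: string;
  outcome: StepOutcome;
}

interface ExecutionTrace {
  proposed: string[];         // what the model proposed
  validationErrors: string[]; // what validation found (empty if it passed)
  executed: ExecutedStep[];   // what actually ran
  outcome: "completed" | "rejected" | "failed"; // overall result
}

// Builds one trace record per run. A plan that fails validation is rejected
// regardless of what is passed as executed steps.
function buildTrace(
  proposed: string[],
  validationErrors: string[],
  executed: ExecutedStep[]
): ExecutionTrace {
  let outcome: ExecutionTrace["outcome"] = "completed";
  if (validationErrors.length > 0) {
    outcome = "rejected";
  } else if (executed.some((s) => s.outcome === "failed")) {
    outcome = "failed";
  }
  return {
    proposed: proposed,
    validationErrors: validationErrors,
    executed: executed,
    outcome: outcome,
  };
}

const trace = buildTrace(
  ["build", "test", "stage"],
  [],
  [
    { id: "build", outcome: "succeeded" },
    { id: "test", outcome: "failed" },
  ]
);

// Serialized form, suitable for appending to an audit log.
const auditLine = JSON.stringify(trace);
```

Keeping the proposal, the validation result, and the executed steps in one record means an auditor can see not only what ran but also what the model asked for and why anything was refused.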