security·2026-03-01·6 min read

The Hallucination Firewall

How execution graph validation prevents AI agents from inventing actions that do not exist.

When AI Invents Reality

Hallucination in conversational AI is annoying. Hallucination in agentic AI is dangerous. When a chatbot invents a fact, a human can verify it. When an agent invents an action — calls an API that does not exist, references a database table that was never created, deploys to a server that is not in the infrastructure — the consequences are immediate and potentially catastrophic.

The root cause is that LLMs do not have a ground-truth model of your infrastructure. They generate plausible action sequences based on patterns in their training data. If the model has seen enough deployment scripts that reference a “staging” environment, it will confidently reference your staging environment even if you do not have one. If it has seen database migration patterns, it will generate migration commands for tables that do not exist in your schema.

The Firewall Architecture

The hallucination firewall is a validation layer that sits between the LLM’s proposed actions and their execution. Every action proposed by the agent is validated against a registry of known, valid actions before it is allowed to execute. Actions that are not in the registry are rejected outright.

Three layers of validation
  • L1 Action Registry — Does this action exist? Is it a known, registered action in the system? Rejects hallucinated API calls, phantom services, and non-existent infrastructure targets.
  • L2 Parameter Validation — Are the parameters valid for this action? Type checking, range validation, enum enforcement. Catches hallucinated parameter values and invalid configurations.
  • L3 Context Validation — Is this action valid in the current context? Can this action be performed given the current state of the system? Prevents actions that are valid in isolation but invalid in sequence.
// Agent proposes an action
const proposal = {
  action: "deploy_to_staging_v2",  // hallucinated action
  params: { version: "latest", region: "us-west-3" }
};

// Firewall validates
const result = await firewall.validate(proposal);
// {
//   valid: false,
//   errors: [
//     { layer: "L1", msg: "Action 'deploy_to_staging_v2' not found in registry" },
//     { layer: "L1", suggestion: "Did you mean 'deploy_to_staging'?" },
//     { layer: "L2", msg: "Region 'us-west-3' is not a valid region" },
//   ]
// }
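
To make the three layers concrete, here is a minimal TypeScript sketch of a validator. Everything in it is an assumption for illustration: the `ActionSpec` shape, the `buildPassed` state flag, and the list of allowed regions are hypothetical and not part of any real firewall API. Unlike the sample output above, this sketch stops at L1 when the action is unknown, because without a registry entry there is no parameter schema to check against.

```typescript
// Hypothetical types and registry contents, for illustration only.
type Proposal = { action: string; params: Record<string, unknown> };
type LayerError = { layer: "L1" | "L2" | "L3"; msg: string };
type SystemState = { buildPassed: boolean };

interface ActionSpec {
  // Allowed values per parameter (enum enforcement).
  params: Record<string, readonly string[]>;
  // Context check against the current system state.
  precondition: (state: SystemState) => boolean;
}

const registry = new Map<string, ActionSpec>([
  [
    "deploy_to_staging",
    {
      params: { region: ["us-west-1", "us-west-2"] },
      precondition: (state: SystemState) => state.buildPassed,
    },
  ],
]);

function validate(
  proposal: Proposal,
  state: SystemState
): { valid: boolean; errors: LayerError[] } {
  // L1: the action must exist in the registry.
  const spec = registry.get(proposal.action);
  if (!spec) {
    const msg = `Action '${proposal.action}' not found in registry`;
    return { valid: false, errors: [{ layer: "L1", msg }] };
  }

  const errors: LayerError[] = [];

  // L2: every declared parameter must carry an allowed value...
  for (const name of Object.keys(spec.params)) {
    const value = proposal.params[name];
    if (typeof value !== "string" || !spec.params[name].includes(value)) {
      errors.push({ layer: "L2", msg: `Invalid value for parameter '${name}'` });
    }
  }
  // ...and no undeclared parameters are accepted.
  for (const name of Object.keys(proposal.params)) {
    if (!(name in spec.params)) {
      errors.push({ layer: "L2", msg: `Unknown parameter '${name}'` });
    }
  }

  // L3: checked only once the parameters are known to be valid.
  if (errors.length === 0 && !spec.precondition(state)) {
    errors.push({ layer: "L3", msg: "Precondition not met in current state" });
  }

  return { valid: errors.length === 0, errors };
}
```

Calling `validate({ action: "deploy_to_staging", params: { region: "us-west-3" } }, { buildPassed: true })` in this sketch yields a single L2 error, while the same call with a valid region and `buildPassed: false` yields a single L3 error.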

Action Registry Design

The action registry is the foundation of the firewall. It is a typed, versioned catalog of every action the agent system can perform. Each action entry defines:
  • the action name and version
  • input parameters with types and constraints
  • preconditions that must be true before execution
  • postconditions that will be true after execution
  • side effects the action produces
  • the rollback procedure if the action needs to be undone

The registry is not generated by the LLM — it is authored by engineers and reviewed like code. It is the single source of truth for what the agent system can do. The LLM can propose actions, but only actions in the registry can be executed.
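
As a rough illustration of what one engineer-authored entry might look like, the sketch below models the fields described above as a TypeScript type. The field names, the `ParamSpec` union, the example actions, and the `buildRegistry` helper are all assumptions made for this example, not a prescribed schema.

```typescript
// Hypothetical parameter constraint: a string with an optional enum,
// or a number with an optional range.
type ParamSpec =
  | { type: "string"; enum?: readonly string[] }
  | { type: "number"; min?: number; max?: number };

// Hypothetical shape of one registry entry.
interface ActionEntry {
  name: string;
  version: string;
  params: Record<string, ParamSpec>;
  preconditions: readonly string[];
  postconditions: readonly string[];
  sideEffects: readonly string[];
  // Name of the registered action that undoes this one, or null if none.
  rollback: string | null;
}

const entries: readonly ActionEntry[] = [
  {
    name: "deploy_to_staging",
    version: "1.2.0",
    params: {
      version: { type: "string" },
      region: { type: "string", enum: ["us-west-1", "us-west-2"] },
    },
    preconditions: ["build_passed"],
    postconditions: ["staging_runs_requested_version"],
    sideEffects: ["restarts_staging_service"],
    rollback: "rollback_staging",
  },
  {
    name: "rollback_staging",
    version: "1.0.0",
    params: {},
    preconditions: ["previous_release_available"],
    postconditions: ["staging_runs_previous_version"],
    sideEffects: ["restarts_staging_service"],
    rollback: null,
  },
];

// Builds a read-only lookup and fails fast on duplicate names or on a
// rollback that points at an action nobody registered.
function buildRegistry(list: readonly ActionEntry[]): ReadonlyMap<string, ActionEntry> {
  const byName = new Map<string, ActionEntry>();
  for (const entry of list) {
    if (byName.has(entry.name)) {
      throw new Error(`Duplicate action '${entry.name}'`);
    }
    byName.set(entry.name, Object.freeze(entry));
  }
  for (const entry of list) {
    if (entry.rollback !== null && !byName.has(entry.rollback)) {
      throw new Error(`Action '${entry.name}' names unregistered rollback '${entry.rollback}'`);
    }
  }
  return byName;
}

const registry = buildRegistry(entries);
```

Running consistency checks like these when the registry is built is one way to treat it like code: a bad entry fails in CI or at startup rather than at the moment an agent proposes it.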

Feedback Loops and Learning

When the firewall rejects a proposed action, the rejection is fed back to the agent with a structured explanation: what was wrong, why it was wrong, and what the closest valid alternative is. This creates a feedback loop that lets the LLM adapt to the boundaries of the system in context, without requiring retraining.

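Over the course of a session, agents that operate behind a hallucination firewall tend to produce fewer invalid proposals because the rejection context becomes part of the conversation history. The agent sees which patterns were rejected and adjusts its later proposals accordingly. The effect lasts only as long as that history stays in the context window; the model’s weights do not change.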
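
One possible way to build that structured rejection is sketched below. It picks the closest registered name by Levenshtein edit distance and appends a JSON explanation to the conversation history. The `Message` shape, the `"tool"` role, and the function names are assumptions for illustration; a real system would use whatever message format its chat API expects, and may pick suggestions differently.

```typescript
// Levenshtein edit distance between two strings (standard dynamic programming).
function editDistance(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the registered action name nearest to the proposed one,
// or null when the registry is empty.
function closestAction(name: string, known: readonly string[]): string | null {
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const candidate of known) {
    const distance = editDistance(name, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Hypothetical message shape for the conversation history.
type Message = { role: string; content: string };

// Packages what was wrong, why, and the closest valid alternative.
function rejectionMessage(action: string, known: readonly string[]): Message {
  return {
    role: "tool",
    content: JSON.stringify({
      rejected: action,
      reason: `Action '${action}' not found in registry`,
      suggestion: closestAction(action, known),
    }),
  };
}

const history: Message[] = [];
history.push(
  rejectionMessage("deploy_to_staging_v2", ["deploy_to_staging", "rollback_staging"])
);
```

Because the rejection is ordinary conversation content, the next model call sees it alongside the original request, which is what allows the agent to correct course without any change to the model itself.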

Security Implications

The hallucination firewall is also a security boundary. In adversarial scenarios — prompt injection, data poisoning, or model compromise — the firewall prevents the agent from executing actions that are not in the registry. Even if an attacker convinces the LLM to propose a malicious action, the execution layer rejects it because it is not a registered action.

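This defense-in-depth approach means that compromising the LLM alone is not sufficient to run arbitrary actions. An attacker who controls the model’s output is still confined to registered actions whose parameters and context pass validation. To go beyond that, the attacker must also compromise the action registry, which is a standard code artifact protected by version control, code review, and access controls.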