Agent Harness
The Harness is the core of cascadeflow's runtime intelligence. It wraps agent execution and makes a decision at every step: should this model call proceed, be switched, or be stopped?

What the Harness Does
At every LLM call or tool execution inside an agent loop, the Harness:

- Checks hard constraints: budget remaining, compliance allowlist, tool call cap, latency limit, energy limit
- Scores soft dimensions: quality, cost, latency, and energy, weighted by KPI priorities
- Decides an action: allow, switch_model, deny_tool, or stop
- Records a trace: action, reason, model, step, cost, budget state

In observe mode, decisions are recorded but not enforced. In enforce mode, they shape execution in real time.
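
The soft-scoring step can be pictured as a weighted average over per-dimension scores. The sketch below is an illustration only: the helper name `kpi_score`, the normalization of every dimension to the range 0 to 1, and the weighted-average formula are all assumptions, not cascadeflow's documented scoring rule.

```python
def kpi_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores (each in [0, 1], higher is better).

    Hypothetical helper for illustration; not part of cascadeflow's API.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one KPI weight must be positive")
    # Dimensions missing from `scores` contribute 0 to the sum.
    return sum(w * scores.get(dim, 0.0) for dim, w in weights.items()) / total_weight


# Two hypothetical candidate models, already normalized per dimension.
candidates = {
    "large-model": {"quality": 0.9, "cost": 0.2, "latency": 0.4, "energy": 0.3},
    "small-model": {"quality": 0.6, "cost": 0.9, "latency": 0.9, "energy": 0.8},
}
weights = {"quality": 0.5, "cost": 0.3, "latency": 0.1, "energy": 0.1}

# Pick the candidate with the highest weighted score.
best = max(candidates, key=lambda name: kpi_score(candidates[name], weights))
```

With these made-up numbers the smaller model wins because cost and latency together outweigh its lower quality score; raising the quality weight shifts the choice back.
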
HarnessConfig — The Full Control Surface
All Harness behavior is configured through a single dataclass, HarnessConfig.

The Three-Tier API
cascadeflow offers three levels of control. Use the one that fits your needs:

Tier 1: Global Init (Zero-Change)
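
A global, zero-change setup usually means configuring one process-wide policy at startup so that existing agent code needs no edits. As a rough illustration of that pattern, the sketch below keeps a single shared policy object. The names `GlobalPolicy`, `init_sketch`, and `active_policy` are hypothetical stand-ins, not cascadeflow's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobalPolicy:
    """Hypothetical process-wide policy; not cascadeflow's actual state object."""
    mode: str = "observe"
    budget: Optional[float] = None


# One shared instance that every instrumented call would consult.
_active = GlobalPolicy()


def init_sketch(mode: str = "observe", budget: Optional[float] = None) -> None:
    """Configure the shared policy once, typically at application startup."""
    if mode not in ("observe", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    _active.mode = mode
    _active.budget = budget


def active_policy() -> GlobalPolicy:
    """What a wrapped LLM call would read before each step."""
    return _active


# Called once; the rest of the application stays unchanged.
init_sketch(mode="observe", budget=1.00)
```
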
Tier 2: Scoped Run (Block-Level Control)
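
Block-level control maps naturally onto a Python context manager: the policy applies inside the `with` block and is restored afterwards. The sketch below shows that pattern with the standard-library `contextvars` module. The name `run_sketch` and its `budget` argument are hypothetical and only illustrate the scoping behavior, not cascadeflow's actual call signature.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Budget visible to code running inside the current scope; None means "no scoped budget".
_scoped_budget = ContextVar("scoped_budget", default=None)


@contextmanager
def run_sketch(budget):
    """Hypothetical scoped run: set a budget for the block, then restore the old value."""
    token = _scoped_budget.set(budget)
    try:
        yield
    finally:
        # Restore whatever was active before the block, even if the block raised.
        _scoped_budget.reset(token)


with run_sketch(budget=0.50):
    inside = _scoped_budget.get()   # budget seen by calls made within the block

outside = _scoped_budget.get()      # back to the default once the block exits
```
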
Tier 3: Agent Decorator (Per-Agent Policy)
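
Per-agent policy is commonly expressed as a decorator that attaches limits to one agent function. The sketch below illustrates that pattern only: `agent_sketch` and the `harness_policy` attribute are hypothetical names, and the keyword arguments reuse the field names `budget` and `max_tool_calls` that appear elsewhere on this page.

```python
import functools


def agent_sketch(**policy):
    """Hypothetical decorator factory that attaches a per-agent policy to a function."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # A real harness would activate `policy` around this call;
            # the sketch simply forwards to the wrapped agent.
            return fn(*args, **kwargs)

        # Store a copy so later inspection cannot mutate the decorator arguments.
        wrapper.harness_policy = dict(policy)
        return wrapper
    return decorate


@agent_sketch(budget=0.20, max_tool_calls=5)
def research_agent(question: str) -> str:
    return f"answer to {question}"
```
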
Decision Priority
When the Harness evaluates a step, it follows a strict priority order:

| Priority | Check | Action if violated |
|---|---|---|
| 1 | Budget exhausted | stop |
| 2 | Compliance allowlist | switch_model or stop |
| 3 | Tool call cap | deny_tool |
| 4 | Latency limit | switch_model |
| 5 | Energy limit | switch_model |
| 6 | KPI scoring | allow or switch_model |
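
The priority table can be read as a chain of early returns in which the first violated check determines the action. The sketch below is a minimal illustration under stated assumptions: the `Step` fields and the `decide` function are hypothetical, the rule for choosing between switch_model and stop on a compliance violation (switch if the allowlist is non-empty, otherwise stop) is a guess, and the KPI result is reduced to a single boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """Hypothetical snapshot of one agent step; field names are illustrative."""
    model: str
    budget_remaining: float = 1.0
    allowed_models: Optional[frozenset] = None   # None means no allowlist
    is_tool_call: bool = False
    tool_calls_used: int = 0
    max_tool_calls: Optional[int] = None
    latency_ms: float = 0.0
    max_latency_ms: Optional[float] = None
    energy_used: float = 0.0
    max_energy: Optional[float] = None
    better_model_available: bool = False         # stand-in for the KPI scoring result


def decide(step: Step) -> tuple[str, str]:
    """Return (action, reason), checking constraints in the table's priority order."""
    if step.budget_remaining <= 0:                                    # priority 1
        return "stop", "budget_exhausted"
    if step.allowed_models is not None and step.model not in step.allowed_models:
        # Priority 2. Assumption: switch if any compliant model exists, else stop.
        return ("switch_model" if step.allowed_models else "stop"), "compliance"
    if (step.is_tool_call and step.max_tool_calls is not None
            and step.tool_calls_used >= step.max_tool_calls):         # priority 3
        return "deny_tool", "tool_call_cap"
    if step.max_latency_ms is not None and step.latency_ms > step.max_latency_ms:
        return "switch_model", "latency_limit"                        # priority 4
    if step.max_energy is not None and step.energy_used > step.max_energy:
        return "switch_model", "energy_limit"                         # priority 5
    # Priority 6: only soft scoring is left.
    return ("switch_model" if step.better_model_available else "allow"), "kpi_scoring"
```

Because the checks return early, a step that both exhausts the budget and hits the tool call cap is reported as stop, not deny_tool.
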
Six Dimensions at a Glance
| Dimension | Hard cap | Soft scoring | Deep dive |
|---|---|---|---|
| Cost | budget | kpi_weights.cost | Budget Enforcement |
| Quality | — | kpi_weights.quality | KPI Optimization |
| Latency | max_latency_ms | kpi_weights.latency | KPI Optimization |
| Compliance | compliance | — | Compliance Gating |
| Energy | max_energy | kpi_weights.energy | Energy Tracking |
| Tool calls | max_tool_calls | — | Budget Enforcement |
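
The field names in the table suggest a configuration object shaped roughly like the one below. This is a sketch, not the actual HarnessConfig definition: the class name `HarnessConfigSketch`, the field types, the defaults, the `mode` field, and the validation are assumptions. Only the names budget, max_latency_ms, compliance, max_energy, max_tool_calls, and the four kpi_weights keys come from this page.

```python
from dataclasses import dataclass, field
from typing import Optional

KPI_DIMENSIONS = {"cost", "quality", "latency", "energy"}


@dataclass
class HarnessConfigSketch:
    """Illustrative stand-in for HarnessConfig; types and defaults are assumed."""
    mode: str = "observe"                    # "observe" or "enforce"
    budget: Optional[float] = None           # hard cap on cost
    max_tool_calls: Optional[int] = None     # hard cap on tool calls
    max_latency_ms: Optional[float] = None   # hard cap on latency
    max_energy: Optional[float] = None       # hard cap on energy
    compliance: Optional[str] = None         # assumed to name a compliance policy
    kpi_weights: dict = field(default_factory=dict)  # soft-scoring weights

    def __post_init__(self) -> None:
        if self.mode not in ("observe", "enforce"):
            raise ValueError(f"unknown mode: {self.mode!r}")
        unknown = set(self.kpi_weights) - KPI_DIMENSIONS
        if unknown:
            raise ValueError(f"unknown KPI dimensions: {sorted(unknown)}")


config = HarnessConfigSketch(budget=0.50, max_tool_calls=10,
                             kpi_weights={"quality": 0.6, "cost": 0.4})
```
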
Observe vs Enforce
| Behavior | Observe | Enforce |
|---|---|---|
| Tracks cost, latency, energy | Yes | Yes |
| Records decision trace | Yes | Yes |
| Blocks on budget exceeded | No | Yes |
| Switches non-compliant models | No | Yes |
| Denies tool calls at cap | No | Yes |
| Stops execution | No | Yes |
| trace() record applied field | false | true |

Use observe to validate your policies against real traffic. Switch to enforce when you are confident the rules are correct.
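
The difference between the two modes comes down to whether a decision changes what actually runs. A minimal sketch, assuming a hypothetical `apply_decision` helper and an `effective_action` key that is not part of the documented trace record:

```python
def apply_decision(action: str, mode: str) -> dict:
    """Build a trace-like record; only enforce mode lets the action take effect."""
    if mode not in ("observe", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    applied = mode == "enforce"
    return {
        "action": action,                                    # what the Harness decided
        "applied": applied,                                  # false in observe, true in enforce
        "effective_action": action if applied else "allow",  # what the step actually does
    }


observed = apply_decision("stop", mode="observe")
enforced = apply_decision("stop", mode="enforce")
```

In observe mode the record still shows that a stop was decided, so the policy can be checked against real traffic before it is allowed to interrupt anything.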