Six Dimensions
| Dimension | What it measures | Hard cap | Soft scoring |
|---|---|---|---|
| Cost | Estimated USD from the pricing table | budget | kpi_weights.cost |
| Latency | Wall-clock milliseconds per LLM call | max_latency_ms | kpi_weights.latency |
| Quality | Model quality priors (0-1 score) | — | kpi_weights.quality |
| Tool calls | Count of tool/function calls | max_tool_calls | — |
| Energy | Compute-intensity coefficient | max_energy | kpi_weights.energy |
| Compliance | Model allowlist per regulation | compliance | — |
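The soft-scoring column can be read as a weighted combination of per-dimension scores. The sketch below is an illustration, not the harness's actual formula: it assumes each dimension has already been scaled to the range 0-1 with higher meaning better (cheaper, faster, lower energy), and the function name `kpi_score` and the sample numbers are hypothetical.

```python
from typing import Dict


def kpi_score(metrics: Dict[str, float], kpi_weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension scores.

    Assumption: every value in `metrics` is already normalized to [0, 1],
    where higher is better. Dimensions missing from `metrics` count as 0.
    """
    total_weight = sum(kpi_weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * metrics.get(dim, 0.0) for dim, w in kpi_weights.items())
    return weighted / total_weight


# Hypothetical weights and candidate models, for illustration only.
weights = {"quality": 0.5, "cost": 0.25, "latency": 0.25, "energy": 0.0}
small_model = {"quality": 0.6, "cost": 1.0, "latency": 0.8, "energy": 0.9}
large_model = {"quality": 1.0, "cost": 0.2, "latency": 0.4, "energy": 0.3}

best = max(
    [("small", small_model), ("large", large_model)],
    key=lambda item: kpi_score(item[1], weights),
)
```

With these weights the cheaper, faster candidate wins despite its lower quality prior; raising the quality weight would shift the balance toward the larger model.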
HarnessConfig
All harness behavior is configured through a single dataclass, HarnessConfig.
Activation
Decision Flow
For each LLM call or tool execution:
- Record model, step number, cumulative cost, latency, energy
- Check compliance: is the model in the allowlist for the configured regulation?
- Check hard caps: budget, tool calls, latency, energy
- Score KPI dimensions: quality, cost, latency, energy, weighted by kpi_weights
- Decide action: allow, switch_model, deny_tool, or stop
- Enforce or log: enforce in enforce mode, log only in observe mode
- Append trace: full decision record for auditability
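The steps above can be sketched as a single decision function. This is a minimal sketch under stated assumptions, not the harness's implementation: the field names come from the hard-cap column of the table, while `RunState`, `decide`, the `mode` field, the allowlist contents, and the mapping from each violation to an action are hypothetical. The soft-scoring step is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical allowlist; the regulation and model names are placeholders.
ALLOWLISTS = {"example-regulation": {"model-a", "model-b"}}


@dataclass
class HarnessConfig:
    # Field names mirror the hard-cap column; defaults are assumptions.
    budget: Optional[float] = None
    max_tool_calls: Optional[int] = None
    max_latency_ms: Optional[float] = None
    max_energy: Optional[float] = None
    compliance: Optional[str] = None
    mode: str = "observe"  # "observe" logs only, "enforce" applies the action


@dataclass
class RunState:
    step: int = 0
    cost: float = 0.0
    latency_ms: float = 0.0  # assumed: latency of the most recent call
    energy: float = 0.0
    tool_calls: int = 0
    trace: list = field(default_factory=list)


def over(value: float, cap: Optional[float]) -> bool:
    """True when a cap is configured and the value exceeds it."""
    return cap is not None and value > cap


def decide(cfg: HarnessConfig, state: RunState, model: str,
           is_tool_call: bool = False) -> str:
    state.step += 1
    # Compliance first, then hard caps. Which action each violation
    # triggers is an assumption made for this sketch.
    if cfg.compliance and model not in ALLOWLISTS.get(cfg.compliance, set()):
        action, reason = "switch_model", "compliance"
    elif over(state.cost, cfg.budget):
        action, reason = "stop", "budget"
    elif (is_tool_call and cfg.max_tool_calls is not None
          and state.tool_calls >= cfg.max_tool_calls):
        action, reason = "deny_tool", "max_tool_calls"
    elif over(state.latency_ms, cfg.max_latency_ms):
        action, reason = "stop", "max_latency_ms"
    elif over(state.energy, cfg.max_energy):
        action, reason = "stop", "max_energy"
    else:
        action, reason = "allow", "ok"

    # In observe mode the decision is only logged; the call proceeds.
    applied = action if cfg.mode == "enforce" else "allow"
    state.trace.append({"step": state.step, "model": model, "action": action,
                        "reason": reason, "applied": applied})
    return applied
```

Keeping the decided action and the applied action as separate trace fields means an observe-mode run still shows what enforcement would have done.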
Supported Models
The harness includes a built-in pricing table for 18 models across OpenAI, Anthropic, and Google. Unknown models are resolved via fuzzy matching (e.g. gpt-5-mini matches even before official pricing is announced).
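One way such fuzzy resolution could work is sketched below with the standard library's `difflib.get_close_matches`. The harness's real matching strategy is not described here, so this is only an assumption; the `resolve_model` helper, the similarity cutoff, and the table entries (names and prices) are made-up placeholders.

```python
import difflib
from typing import Dict, Optional

# Placeholder table: model names and prices are invented for illustration.
PRICING: Dict[str, Dict[str, float]] = {
    "example-large": {"input": 1.00, "output": 2.00},
    "example-mini": {"input": 0.10, "output": 0.20},
}


def resolve_model(name: str,
                  table: Dict[str, Dict[str, float]] = PRICING,
                  cutoff: float = 0.6) -> Optional[str]:
    """Return the table key for `name`: exact match first, then the closest
    known name above the similarity cutoff, else None."""
    if name in table:
        return name
    matches = difflib.get_close_matches(name, list(table), n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A dated variant of a known model falls back to the base entry's pricing.
resolved = resolve_model("example-mini-2025")
```

Returning `None` for names with no close match lets the caller decide whether to reject the model or fall back to a default price.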
See Energy Tracking for the full pricing and energy coefficients table.