The Agent Runtime Intelligence Layer

cascadeflow is infrastructure that runs inside AI agent execution and optimizes each step against business and technical constraints in real time. It is not another model router: it is a decision system inside the agent loop. Every model call, tool call, and sub-agent handoff can be measured, scored, and steered at the point where cost, delay, and failure actually occur.

Install

pip install cascadeflow

import cascadeflow

cascadeflow.init(mode="observe")
# Every OpenAI and Anthropic SDK call is now tracked — zero code changes.

What Makes This Different

| | External Proxy | cascadeflow |
|---|---|---|
| Where it runs | HTTP boundary | Inside the agent loop |
| What it sees | Request/response pairs | Step count, budget, tool history, quality, domain, business context |
| What it optimizes | Cost | Cost + latency + quality + budget + compliance + energy |
| What it does | Observes | allow, switch_model, deny_tool, stop |
| Latency overhead | 40-60ms per call | <1ms in-process |
| In 10-step agent loop | 400-600ms added | ~0ms added |
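
The loop-level figures in the comparison follow from multiplying the per-call overhead by the number of steps. The short sketch below reproduces that arithmetic; it assumes a strictly sequential loop with one intercepted call per step, and the function and constant names are illustrative only.

```python
# Per-call overhead figures taken from the comparison above (milliseconds).
PROXY_OVERHEAD_MS = (40, 60)   # external proxy: low and high estimate per call
IN_PROCESS_OVERHEAD_MS = 1     # in-process: stated upper bound per call


def loop_overhead_ms(per_call_ms: float, steps: int) -> float:
    """Total added latency when every step of a sequential loop pays the overhead."""
    return per_call_ms * steps


steps = 10
proxy_low = loop_overhead_ms(PROXY_OVERHEAD_MS[0], steps)
proxy_high = loop_overhead_ms(PROXY_OVERHEAD_MS[1], steps)
in_process_max = loop_overhead_ms(IN_PROCESS_OVERHEAD_MS, steps)

print(f"proxy: {proxy_low}-{proxy_high} ms added")
print(f"in-process: under {in_process_max} ms added")
```

Under these assumptions the in-process bound for ten steps is below 10 ms, which is small next to the 400-600 ms a proxy would add over the same loop.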

Three Lines to Govern Any Agent

import cascadeflow
cascadeflow.init(mode="observe")
# All LLM calls tracked. No blocking, no changes.
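
To make "tracked with no code changes" concrete, here is a minimal sketch of the general technique of in-process instrumentation: a client method is wrapped once at startup so each call is timed and logged while the result passes through untouched. This is not cascadeflow's implementation; `FakeLLMClient`, `observe`, and `CALL_LOG` are hypothetical names used only for illustration.

```python
import functools
import time


class FakeLLMClient:
    """Stand-in for an SDK client; hypothetical, not a real OpenAI or Anthropic class."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


CALL_LOG = []  # one record per observed call


def observe(cls, method_name: str) -> None:
    """Patch cls.method_name so each call is timed and logged, never blocked."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            # The original result (or exception) passes through unchanged.
            return original(self, *args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            CALL_LOG.append({"method": method_name, "latency_ms": elapsed_ms})

    setattr(cls, method_name, wrapper)


observe(FakeLLMClient, "complete")

# Calling code is unchanged; the patch was applied once at startup.
reply = FakeLLMClient().complete("hello")
print(reply, len(CALL_LOG))
```

Because the wrapper lives in the same process as the agent, it adds a function call and a timer read rather than a network hop.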

Six Dimensions, One Decision

Every agent step is scored across six dimensions simultaneously:
| Dimension | What it controls | Example |
|---|---|---|
| Cost | USD per LLM call from pricing table | Budget cap of $0.50 per run |
| Latency | Wall-clock milliseconds per call | Max 2000ms per call |
| Quality | Model quality priors for routing | 60% weight on quality KPI |
| Budget | Cumulative spend tracking and caps | Per-user daily limits |
| Compliance | Model allowlists per regulation | GDPR: only approved models |
| Energy | Compute-intensity coefficients | Carbon-aware model selection |
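
As an illustration of how several dimensions could collapse into a single action, the sketch below checks a step against budget, cost, compliance, and latency limits and returns one of the four actions named earlier (allow, switch_model, deny_tool, stop). It is a simplified assumption, not cascadeflow's scoring logic: `StepContext`, `Policy`, `decide`, the rule ordering, and the placeholder model names are all hypothetical, and the quality and energy dimensions are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class StepContext:
    """Hypothetical snapshot of one agent step; field names are illustrative."""
    model: str
    spent_usd: float        # cumulative spend so far in this run
    est_cost_usd: float     # estimated cost of the next call
    est_latency_ms: float   # expected latency of the next call
    is_tool_call: bool = False


@dataclass
class Policy:
    budget_cap_usd: float = 0.50    # example: budget cap of $0.50 per run
    max_latency_ms: float = 2000    # example: max 2000 ms per call
    allowed_models: frozenset = frozenset({"model-a", "model-b"})  # placeholder names


def decide(ctx: StepContext, policy: Policy) -> str:
    """Return one of: allow, switch_model, deny_tool, stop."""
    if ctx.spent_usd >= policy.budget_cap_usd:
        return "stop"  # budget already exhausted
    if ctx.spent_usd + ctx.est_cost_usd > policy.budget_cap_usd:
        # The next call would exceed the cap: refuse a tool, or downgrade a model.
        return "deny_tool" if ctx.is_tool_call else "switch_model"
    if ctx.model not in policy.allowed_models:
        return "switch_model"  # compliance allowlist violation
    if ctx.est_latency_ms > policy.max_latency_ms:
        return "switch_model"  # expected to be too slow
    return "allow"


print(decide(StepContext("model-a", 0.10, 0.01, 500), Policy()))
```

A fuller version would weight the dimensions instead of checking them in a fixed order, for example giving quality the 60% weight mentioned in the table.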

Works With Every Major Framework

| Framework | Python | TypeScript | Type |
|---|---|---|---|
| LangChain / LangGraph | cascadeflow[langchain] | @cascadeflow/langchain | Callback handler |
| OpenAI Agents SDK | cascadeflow[openai-agents] | | ModelProvider |
| CrewAI | cascadeflow[crewai] | | llm_hooks |
| Google ADK | cascadeflow[google-adk] | | BasePlugin |
| n8n | | @cascadeflow/n8n-nodes-cascadeflow | Community node |
| Vercel AI SDK | | @cascadeflow/vercel-ai | Middleware |