The Agent Runtime Intelligence Layer
cascadeflow is infrastructure that sits inside AI agent execution and continuously optimizes outcomes across business and technical constraints in real time. This is not another model router. It is a decision system inside the agent loop. Every model call, tool call, and sub-agent handoff can be measured, scored, and steered at the point where cost, delay, and failure actually happen.
Get Started
Install, observe, enforce, and ship to production in minutes.
Why cascadeflow
The business case for inside-the-loop agent intelligence.
Install
What Makes This Different
| | External Proxy | cascadeflow |
|---|---|---|
| Where it runs | HTTP boundary | Inside the agent loop |
| What it sees | Request/response pairs | Step count, budget, tool history, quality, domain, business context |
| What it optimizes | Cost | Cost + latency + quality + budget + compliance + energy |
| What it does | Observes | allow, switch_model, deny_tool, stop |
| Latency overhead | 40-60ms per call | <1ms in-process |
| In a 10-step agent loop | 400-600ms added | ~0ms added |
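
The last two rows of the table follow from simple multiplication: a fixed per-call overhead is paid once per step. The sketch below reproduces that arithmetic. The helper name `added_latency_ms` is hypothetical and is not part of cascadeflow; the per-call figures are the ones quoted in the table.

```python
# Hypothetical helper (not a cascadeflow API): reproduces the table's arithmetic only.
def added_latency_ms(steps: int, per_call_overhead_ms: float) -> float:
    """Total overhead when every step pays the same fixed per-call cost."""
    return steps * per_call_overhead_ms


STEPS = 10  # the 10-step agent loop from the table

# External proxy: 40-60 ms per call, as quoted in the table.
proxy_low = added_latency_ms(STEPS, 40)
proxy_high = added_latency_ms(STEPS, 60)

# In-process: under 1 ms per call, so 1 ms is used here as an upper bound.
in_process_bound = added_latency_ms(STEPS, 1)

print(f"proxy: {proxy_low:.0f}-{proxy_high:.0f} ms added")
print(f"in-process: under {in_process_bound:.0f} ms added")
```

Because the overhead scales linearly with step count, longer agent loops widen the gap between the two approaches.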
Three Lines to Govern Any Agent
Six Dimensions, One Decision
Every agent step is scored across six dimensions simultaneously:
| Dimension | What it controls | Example |
|---|---|---|
| Cost | USD per LLM call from pricing table | Budget cap of $0.50 per run |
| Latency | Wall-clock milliseconds per call | Max 2000ms per call |
| Quality | Model quality priors for routing | 60% weight on quality KPI |
| Budget | Cumulative spend tracking and caps | Per-user daily limits |
| Compliance | Model allowlists per regulation | GDPR: only approved models |
| Energy | Compute-intensity coefficients | Carbon-aware model selection |
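
One way to picture how several dimensions collapse into a single decision is shown below. This is a minimal sketch under stated assumptions, not cascadeflow's implementation or API: the `Candidate` class, the `decide` function, and the scoring formula are all hypothetical. It treats budget, latency, and compliance as hard constraints, weighs quality against cost and energy as a soft score, and returns one of the actions named earlier (`allow`, `switch_model`, `stop`). Tool-level `deny_tool` decisions are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-model estimates; field names are assumptions."""
    name: str
    cost_usd: float    # estimated cost of the next call
    latency_ms: float  # expected wall-clock latency
    quality: float     # quality prior in [0, 1]
    energy: float      # relative compute-intensity coefficient in [0, 1]


def decide(
    current: str,
    candidates: List[Candidate],
    spent_usd: float,
    budget_usd: float,
    max_latency_ms: float,
    allowlist: Set[str],
    quality_weight: float = 0.6,
) -> Tuple[str, Optional[str]]:
    """Return (action, model_name) for the next agent step."""
    remaining = budget_usd - spent_usd
    if remaining <= 0:
        return ("stop", None)  # budget cap already reached

    # Hard constraints: compliance allowlist, remaining budget, latency cap.
    feasible = [
        c for c in candidates
        if c.name in allowlist
        and c.cost_usd <= remaining
        and c.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return ("stop", None)

    # Soft score: reward quality, penalize relative cost and energy equally.
    max_cost = max(c.cost_usd for c in feasible) or 1.0
    penalty_weight = (1 - quality_weight) / 2

    def score(c: Candidate) -> float:
        return (quality_weight * c.quality
                - penalty_weight * (c.cost_usd / max_cost)
                - penalty_weight * c.energy)

    best = max(feasible, key=score)
    action = "allow" if best.name == current else "switch_model"
    return (action, best.name)


models = [
    Candidate("large", cost_usd=0.20, latency_ms=1500, quality=0.9, energy=0.8),
    Candidate("small", cost_usd=0.02, latency_ms=400, quality=0.7, energy=0.2),
]
print(decide("large", models, spent_usd=0.10, budget_usd=0.50,
             max_latency_ms=2000, allowlist={"large", "small"}))
# -> ('switch_model', 'small')
```

The example values mirror the table: a $0.50 budget cap per run, a 2000ms latency cap, and a 60% weight on quality. Separating hard constraints from the weighted score keeps compliance and budget violations from ever being traded off against quality.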
Works With Every Major Framework
| Framework | Python | TypeScript | Type |
|---|---|---|---|
| LangChain / LangGraph | cascadeflow[langchain] | @cascadeflow/langchain | Callback handler |
| OpenAI Agents SDK | cascadeflow[openai-agents] | — | ModelProvider |
| CrewAI | cascadeflow[crewai] | — | llm_hooks |
| Google ADK | cascadeflow[google-adk] | — | BasePlugin |
| n8n | — | @cascadeflow/n8n-nodes-cascadeflow | Community node |
| Vercel AI SDK | — | @cascadeflow/vercel-ai | Middleware |
Explore
Agent Harness
Configure budget, compliance, KPI, and energy controls.
Agent Loop
How cascadeflow operates inside multi-step agent execution.
Examples
42+ Python and 33+ TypeScript examples on GitHub.
Integrations
LangChain, OpenAI Agents, CrewAI, Google ADK, n8n, Vercel AI.
API Reference
Full Python and TypeScript API documentation.
For Coding Agents
Canonical facts, repo map, and implementation entry points.