Observe Mode — Zero-Change Visibility

Observe mode tracks every LLM call without blocking or modifying any behavior. This is the safest way to start: no enforcement, no model switching, just metrics.

Prerequisites

  • cascadeflow installed (Installation)
  • At least one provider API key set (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
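Before initializing, it can help to verify that a key is actually visible to the process. A minimal plain-Python sketch (the `has_provider_key` helper is ours, not part of cascadeflow; the key names are the ones listed above):

```python
import os

# Provider keys listed in the prerequisites above
PROVIDER_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY"]

def has_provider_key(env=os.environ):
    """Return True if at least one provider API key is set and non-empty."""
    return any(env.get(key) for key in PROVIDER_KEYS)

# Example: check the current environment before calling cascadeflow.init()
print(has_provider_key())
```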

1. Add one line

Add cascadeflow.init(mode="observe") before any LLM calls in your application:
import cascadeflow

cascadeflow.init(mode="observe")

# Your existing code — unchanged
from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is cascadeflow?"}],
)
print(response.choices[0].message.content)
Every call is now tracked. Nothing is blocked or changed.

2. See what you spend

Wrap a block with cascadeflow.run() to get aggregate metrics:
import cascadeflow
from openai import OpenAI

cascadeflow.init(mode="observe")
client = OpenAI()

with cascadeflow.run() as session:
    # Run your agent, chain, or direct LLM calls
    response1 = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Summarize this document"}],
    )
    response2 = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Analyze the sentiment"}],
    )

    summary = session.summary()
    print(f"Total cost:    ${summary['cost_total']:.4f}")
    print(f"LLM calls:     {summary['steps']}")
    print(f"Total latency: {summary['latency_total_ms']:.0f}ms")
    print(f"Energy used:   {summary['energy_used']:.1f} units")

3. Read the decision trace

Even in observe mode, cascadeflow records what it would have done:
for record in session.trace():
    print(f"Step {record['step']}: {record['action']} ({record['reason']})")
    print(f"  Model: {record['model']}, Cost so far: ${record['cost_total']:.4f}")
    print(f"  Applied: {record['applied']}")  # Always False in observe mode
This lets you audit compliance violations, budget overruns, and routing decisions before turning on enforcement.
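The trace can also be post-processed offline, for example to see where spend accumulates per model. A minimal sketch (the records below are hand-written stand-ins using the fields shown above; `cost_total` is cumulative, so each step's cost is the difference between consecutive records):

```python
from collections import defaultdict

# Stand-in trace records with the same fields shown above (cost_total is cumulative)
trace = [
    {"step": 1, "model": "gpt-4o-mini", "cost_total": 0.0004, "applied": False},
    {"step": 2, "model": "gpt-4o",      "cost_total": 0.0067, "applied": False},
    {"step": 3, "model": "gpt-4o-mini", "cost_total": 0.0071, "applied": False},
]

def cost_per_model(records):
    """Derive per-step cost from cumulative cost_total and sum it per model."""
    totals = defaultdict(float)
    previous = 0.0
    for record in sorted(records, key=lambda r: r["step"]):
        totals[record["model"]] += record["cost_total"] - previous
        previous = record["cost_total"]
    return dict(totals)

print(cost_per_model(trace))
```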

TypeScript

The TypeScript SDK reports per-run metrics on the result object:
import { CascadeAgent } from '@cascadeflow/core';

const agent = new CascadeAgent({
  models: [
    { name: 'gpt-4o-mini', provider: 'openai', cost: 0.000375 },
    { name: 'gpt-4o', provider: 'openai', cost: 0.00625 },
  ],
});

const result = await agent.run('What is TypeScript?');
console.log(`Model: ${result.modelUsed}`);
console.log(`Cost: $${result.totalCost}`);
console.log(`Saved: ${result.savingsPercentage}%`);

What You Learn in Observe Mode

  • How much each agent run actually costs
  • Which models are called and how often
  • Where latency accumulates across steps
  • Which calls would violate compliance policies
  • Whether budget caps would have triggered
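The last point can be checked offline: replay the per-call costs observed in a session against a candidate cap before turning enforcement on. A sketch (the cap, costs, and helper name are illustrative, not part of cascadeflow):

```python
def first_cap_breach(call_costs, cap):
    """Return the 1-based index of the call where cumulative cost first
    exceeds the cap, or None if the cap is never reached."""
    running = 0.0
    for i, cost in enumerate(call_costs, start=1):
        running += cost
        if running > cap:
            return i
    return None

# Per-call costs observed in an observe-mode session (illustrative numbers)
observed = [0.0004, 0.0063, 0.0004, 0.0125]

print(first_cap_breach(observed, cap=0.01))  # which call would trip a $0.01 cap
print(first_cap_breach(observed, cap=0.05))  # None: this cap is never reached
```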

Next Step

Ready to enforce constraints? Add budget enforcement →