Stream Agent Responses
Stream agent output to the browser in real time instead of waiting for the full response.
The Problem
Your agent takes 15–30 seconds to generate a response. Users see a blank screen, have no way to cancel, and if the connection drops mid-generation the server keeps burning tokens. You need token-by-token streaming with cancellation and safety checks on partial output.
The Solution
Wrap a provider runner with createStreamingRunner, pipe it through createSSETransport in a route handler, and read the stream on the client:
import {
createAgentOrchestrator,
createStreamingRunner,
createSSETransport,
} from '@directive-run/ai';
import { createAnthropicStreamingRunner } from '@directive-run/ai/anthropic';
const streamingRunner = createStreamingRunner(
createAnthropicStreamingRunner({ model: 'claude-sonnet-4-5-20250514' })
);
const orchestrator = createAgentOrchestrator({
runner: streamingRunner,
autoApproveToolCalls: true,
});
// Next.js route handler
export async function POST(request: Request) {
const { input } = await request.json();
const { stream, abort } = orchestrator.runStream('assistant', input, {
signal: request.signal,
});
return createSSETransport(stream).toResponse();
}
How It Works
- orchestrator.runStream() returns { stream, result, abort }. The stream is an AsyncIterable<StreamChunk> with 8 chunk types: text, tool_call, tool_result, thinking, error, done, heartbeat, and metadata.
- createSSETransport converts the async iterable to a text/event-stream response with automatic heartbeats (every 15s), JSON serialization, and error mapping.
- signal: request.signal cancels the LLM call when the client disconnects, so no tokens are wasted on abandoned requests.
- Streaming guardrails evaluate partial output every N tokens. Use createLengthStreamingGuardrail to cap output length and createPatternStreamingGuardrail to block dangerous patterns mid-stream.
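The SSE framing itself is simple to picture. Here is a minimal sketch of how chunks map onto the text/event-stream wire format; the StreamChunk shape below is an illustrative subset, not the library's exact type:

```typescript
// Illustrative subset of the chunk union described above (assumed shape,
// not the library's actual type definition).
type StreamChunk =
  | { type: 'text'; content: string }
  | { type: 'error'; message: string }
  | { type: 'done'; usage: { outputTokens: number } };

// Frame one chunk as an SSE event: a "data:" line carrying the JSON
// payload, terminated by a blank line (the SSE event delimiter).
function toSSEEvent(chunk: StreamChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}
```

This is why the client code later in this guide splits on newlines and strips the leading "data: " prefix before calling JSON.parse.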
Full Example
Server route with streaming guardrails, input validation, and error mapping:
import {
createAgentOrchestrator,
createStreamingRunner,
createSSETransport,
createLengthStreamingGuardrail,
createPatternStreamingGuardrail,
} from '@directive-run/ai';
import { createAnthropicStreamingRunner } from '@directive-run/ai/anthropic';
const streamingRunner = createStreamingRunner(
createAnthropicStreamingRunner({ model: 'claude-sonnet-4-5-20250514' })
);
const orchestrator = createAgentOrchestrator({
runner: streamingRunner,
autoApproveToolCalls: true,
streamingGuardrails: [
createLengthStreamingGuardrail({ maxTokens: 4096 }),
createPatternStreamingGuardrail({
patterns: [/\bpassword\b/i, /\bsecret_key\b/i],
action: 'truncate',
}),
],
});
// POST /api/chat
export async function POST(request: Request) {
const body = await request.json();
const input = typeof body.input === 'string' ? body.input.trim() : '';
if (!input) {
return new Response(JSON.stringify({ error: 'input required' }), {
status: 400,
headers: { 'Content-Type': 'application/json' },
});
}
try {
const { stream } = orchestrator.runStream('assistant', input, {
signal: request.signal,
});
return createSSETransport(stream, {
heartbeatMs: 15000,
onError: (error) => ({
type: 'error',
message: error.message.includes('budget')
? 'Token limit reached'
: 'Generation failed',
}),
}).toResponse();
} catch (error) {
return new Response(JSON.stringify({ error: 'Stream failed' }), {
status: 500,
headers: { 'Content-Type': 'application/json' },
});
}
}
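Under the hood, a pattern guardrail with action: 'truncate' boils down to re-checking the accumulated text as chunks arrive and cutting it at the first match. A self-contained sketch of that idea (a hypothetical stand-in, not the library's implementation):

```typescript
// Sketch of the core check behind a truncating pattern guardrail:
// scan the accumulated output against each blocked pattern, and cut
// the text at the first match so it never reaches the client.
function checkPartialOutput(
  accumulated: string,
  patterns: RegExp[]
): { text: string; truncated: boolean } {
  for (const pattern of patterns) {
    const match = accumulated.match(pattern);
    if (match && match.index !== undefined) {
      // Keep everything before the match, drop the rest.
      return { text: accumulated.slice(0, match.index), truncated: true };
    }
  }
  return { text: accumulated, truncated: false };
}
```

Because the check runs on partial output, a pattern that straddles a chunk boundary is still caught on a later evaluation, once enough text has accumulated to complete the match.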
Client consuming the stream with abort support:
const controller = new AbortController();
const response = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ input: 'Explain quantum computing' }),
signal: controller.signal,
});
const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) {
break;
}
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n');
buffer = lines.pop()!;
for (const line of lines) {
if (!line.startsWith('data: ')) {
continue;
}
const data = JSON.parse(line.slice(6));
if (data.type === 'text') {
appendToUI(data.content);
} else if (data.type === 'error') {
showError(data.message);
} else if (data.type === 'done') {
showComplete(data.usage);
}
}
}
// Cancel button
cancelButton.onclick = () => controller.abort();
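The line-buffering loop above generalizes into a small reusable parser that survives events split across network reads. A sketch, independent of any library:

```typescript
// Feed raw text fragments in; get complete "data:" payloads out.
// An incomplete trailing line stays buffered until the next fragment,
// so an SSE event split across two reads is still parsed correctly.
function createSSEParser(): (fragment: string) => string[] {
  let buffer = '';
  return function feed(fragment: string): string[] {
    buffer += fragment;
    const lines = buffer.split('\n');
    buffer = lines.pop()!; // keep the partial last line for later
    return lines
      .filter((line) => line.startsWith('data: '))
      .map((line) => line.slice(6));
  };
}
```

With this in place, the read loop shrinks to `for (const payload of feed(decoder.decode(value, { stream: true }))) { ... }` followed by JSON.parse on each payload.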
Why not EventSource?
createSSETransport uses custom JSON events with metadata and error types. The native EventSource API only supports GET requests and plain text events. Use fetch with a streaming reader for full control over the request body, headers, and typed events.
Related
- Streaming reference — createStreamingRunner and chunk types
- SSE Transport reference — createSSETransport options
- Guardrails — streaming guardrail configuration
- Handle Agent Errors guide — retry and fallback for failed streams
- Control AI Costs guide — budget limits during streaming

