Smart Model Routing

Route requests to the right model based on complexity, saving money without sacrificing quality.


The Problem

You're using GPT-4o for everything, including simple classification tasks that GPT-4o-mini handles just as well at roughly a tenth of the cost. Complex reasoning tasks need the expensive model, but most requests don't.

The Solution

Use withModelSelection to route requests based on input length, agent name, or regex patterns:

import {
  withModelSelection,
  byInputLength,
  byAgentName,
} from '@directive-run/ai';

// `runner` is an existing runner; see Running Agents (/ai/running-agents) for setup
const smartRunner = withModelSelection(runner, [
  // Short inputs -> cheap model. Rules are checked in order, so this rule
  // also catches short inputs sent to 'reasoner'.
  byInputLength(200, 'gpt-4o-mini'),
  // Specific agents -> specific models
  byAgentName('classifier', 'gpt-4o-mini'),
  byAgentName('reasoner', 'gpt-4o'),
]);

How It Works

  • withModelSelection wraps a runner and overrides the model before each call based on matching rules.
  • Rules are evaluated in order. The first match wins. If no rule matches, the original model is used.
  • byInputLength routes based on character count. Short inputs often need less reasoning power.
  • byAgentName routes based on the agent's name. Assign expensive models to agents that need them.
  • byPattern routes based on regex matches against the input text.
  • Both array and object forms are supported. Pass [...rules] for quick setup or { rules, onModelSelected } when you need the selection callback.
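The first-match-wins evaluation described above can be sketched as follows. This is an illustrative model only, not the library's implementation: `RouteContext`, `Rule`, `inputShorterThan`, `agentNamed`, `inputMatching`, and `selectModel` are hypothetical names, and treating the length threshold as exclusive is an assumption.

```typescript
// Hypothetical types for illustration; the real @directive-run/ai types may differ.
type RouteContext = { agentName: string; input: string };
type Rule = { matches: (ctx: RouteContext) => boolean; model: string };

// Assumption: a length rule matches inputs strictly shorter than maxChars.
const inputShorterThan = (maxChars: number, model: string): Rule => ({
  matches: (ctx) => ctx.input.length < maxChars,
  model,
});

const agentNamed = (name: string, model: string): Rule => ({
  matches: (ctx) => ctx.agentName === name,
  model,
});

// Use a non-global regex here; a /g regex keeps state between test() calls.
const inputMatching = (pattern: RegExp, model: string): Rule => ({
  matches: (ctx) => pattern.test(ctx.input),
  model,
});

// Walk the rules in order; the first match wins, otherwise keep the original model.
function selectModel(rules: Rule[], ctx: RouteContext, originalModel: string): string {
  for (const rule of rules) {
    if (rule.matches(ctx)) {
      return rule.model;
    }
  }
  return originalModel;
}

const rules: Rule[] = [
  inputShorterThan(200, 'gpt-4o-mini'),
  agentNamed('classifier', 'gpt-4o-mini'),
  agentNamed('reasoner', 'gpt-4o'),
];

// A short input to 'reasoner' hits the length rule first.
const shortReasoner = selectModel(rules, { agentName: 'reasoner', input: 'Hi' }, 'base-model');
// A long input skips the length rule and falls through to the agent rule.
const longReasoner = selectModel(rules, { agentName: 'reasoner', input: 'x'.repeat(300) }, 'base-model');
// No rule matches, so the original model is kept.
const unmatched = selectModel(rules, { agentName: 'assistant', input: 'x'.repeat(300) }, 'base-model');
```

Because order decides the outcome, placing agent-specific rules before broad length rules is one way to keep an agent pinned to a particular model regardless of input size.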

Full Example

A multi-agent system with cost-optimized model routing:

import {
  createMultiAgentOrchestrator,
  withModelSelection,
  byInputLength,
  byAgentName,
  byPattern,
} from '@directive-run/ai';

const smartRunner = withModelSelection(runner, { // See Running Agents (/ai/running-agents) for setup
  rules: [
    // Classification tasks -> cheapest model
    byAgentName('classifier', 'gpt-4o-mini'),
    byAgentName('tagger', 'gpt-4o-mini'),

    // Code-related tasks -> best model
    byPattern(/```[\s\S]*```/, 'gpt-4o'),
    byAgentName('code-reviewer', 'gpt-4o'),

    // Short messages -> cheap model, long messages -> expensive model
    byInputLength(500, 'gpt-4o-mini'),
    byInputLength(Infinity, 'gpt-4o'),
  ],
  onModelSelected: (original, selected) => {
    if (original !== selected) {
      console.log(`Model routed: ${original} -> ${selected}`);
    }
  },
});

const orchestrator = createMultiAgentOrchestrator({
  runner: smartRunner,
  agents: {
    classifier: {
      agent: { name: 'classifier', instructions: 'Classify the intent of user messages.' },
    },
    'code-reviewer': {
      agent: { name: 'code-reviewer', instructions: 'Review code for bugs, security issues, and best practices.' },
    },
    assistant: {
      agent: { name: 'assistant', instructions: 'General-purpose assistant for user queries.' },
    },
  },
});

// Classifier always uses gpt-4o-mini (cheap)
await orchestrator.runAgent('classifier', 'I want to return my order');

// Code reviewer always uses gpt-4o (powerful)
await orchestrator.runAgent('code-reviewer', 'Review this function...');

// General assistant routes by input length
await orchestrator.runAgent('assistant', 'Hi'); // -> gpt-4o-mini
// longDetailedPrompt stands in for any string longer than 500 characters
await orchestrator.runAgent('assistant', longDetailedPrompt); // -> gpt-4o
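
The onModelSelected callback can also feed simple routing metrics instead of log lines. A minimal sketch, assuming the callback receives the original and selected model names as strings, as the example above suggests; `createRoutingTally` is a hypothetical helper and not part of @directive-run/ai.

```typescript
// Hypothetical helper: counts how often each "original -> selected" pair occurs.
function createRoutingTally() {
  const counts = new Map<string, number>();

  // Same (original, selected) shape as the onModelSelected callback above.
  const onModelSelected = (original: string, selected: string): void => {
    const key = `${original} -> ${selected}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  };

  return { counts, onModelSelected };
}

const tally = createRoutingTally();

// Simulate three routing decisions by calling the callback directly.
tally.onModelSelected('gpt-4o', 'gpt-4o-mini');
tally.onModelSelected('gpt-4o', 'gpt-4o-mini');
tally.onModelSelected('gpt-4o', 'gpt-4o');
```

In a real setup, `tally.onModelSelected` would be passed as the `onModelSelected` field of the object form, and `tally.counts` could be inspected periodically to see how much traffic actually reaches the cheaper model.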
