RAG Enricher
Embed a query, search a chunk store, and build context-enriched agent inputs automatically.
Overview
createRAGEnricher pairs an embedder with a storage backend to retrieve relevant document chunks by cosine similarity and assemble them into a single enriched input string. The enricher is storage-agnostic – plug in the built-in JSON file store or bring your own vector database.
import {
  createRAGEnricher,
  createJSONFileStore,
  createOpenAIEmbedder,
} from '@directive-run/ai';

const enricher = createRAGEnricher({
  embedder: createOpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
  storage: createJSONFileStore({ filePath: './embeddings.json' }),
  topK: 5,
  minSimilarity: 0.3,
});
API
createRAGEnricher(config)
Returns a RAGEnricher with two methods: retrieve() and enrich().
RAGEnricherConfig
| Property | Type | Default | Description |
|---|---|---|---|
| embedder | EmbedderFn | required | Function to generate query embeddings |
| storage | RAGStorage | required | Storage backend for document chunks |
| topK | number | 5 | Number of top results to return |
| minSimilarity | number | 0.3 | Minimum cosine similarity to include (clamped to [0, 1]) |
| formatChunk | (chunk, similarity) => string | built-in | Custom chunk formatter |
| formatContext | (formattedChunks, query) => string | built-in | Custom context block formatter |
| onError | (error: Error) => void | – | Error callback (embedder/storage errors are non-fatal by default) |
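For example, to observe retrieval failures without aborting enrichment:

const enricher = createRAGEnricher({
  embedder,
  storage,
  // Called when the embedder or storage throws; errors are non-fatal by default
  onError: (error) => console.error('[rag] retrieval failed:', error.message),
});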
retrieve(query, topK?)
Returns the top-K chunks sorted by descending cosine similarity, each annotated with a similarity score. Uses the storage backend's optimized search() method when available, otherwise performs a full scan via getChunks(). Errors from the embedder or storage propagate to the caller; use enrich() for automatic error handling via the onError callback.
const chunks = await enricher.retrieve('How do constraints work?');
// => Array<RAGChunk & { similarity: number }>
for (const chunk of chunks) {
  console.log(`${chunk.id} (${chunk.similarity.toFixed(2)}): ${chunk.content.slice(0, 80)}`);
}
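When the backend has no search() method, retrieval conceptually reduces to embedding the query and scoring every stored chunk. A minimal sketch of that fallback path (illustrative only, not the package's internal code):

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop weak matches, keep the top K
function fullScan<T extends { embedding: number[] }>(
  queryEmbedding: number[],
  chunks: T[],
  topK: number,
  minSimilarity: number,
): Array<T & { similarity: number }> {
  return chunks
    .map((chunk) => ({ ...chunk, similarity: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .filter((c) => c.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
}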
enrich(input, options?)
Calls retrieve() internally, applies an optional filter, formats the results, and assembles all parts (prefix, context, history, query) separated by --- delimiters.
const enrichedInput = await enricher.enrich('How do constraints work?', {
  prefix: 'User is viewing: /docs/constraints',
  history: [
    { role: 'user', content: 'What is Directive?' },
    { role: 'assistant', content: 'A constraint-driven runtime for TypeScript.' },
  ],
  topK: 3,
  filter: (chunk) => chunk.metadata.section === 'constraints',
});
// Pass the enriched input to any agent runner — returns RunResult<T>
const result = await stack.run('docs-qa', enrichedInput);
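With the default formatters, the assembled string looks roughly like this (the exact chunk and history rendering below is an approximation):

User is viewing: /docs/constraints
---
Relevant documentation context:

[Constraints – Basics](/docs/constraints)
Constraints validate agent behavior at runtime.
---
user: What is Directive?
assistant: A constraint-driven runtime for TypeScript.
---
How do constraints work?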
RAGEnrichOptions
| Property | Type | Description |
|---|---|---|
| prefix | string | Prefix line prepended to the enriched input (e.g. current page URL) |
| history | Array<{ role: string, content: string }> | Conversation history included between context and query |
| topK | number | Per-call override for the number of results |
| filter | (chunk: RAGChunk) => boolean | Filter chunks after retrieval but before formatting |
RAGStorage Interface
Any object that implements RAGStorage can be used as the storage backend:
interface RAGStorage {
  /** Return all stored chunks */
  getChunks(): Promise<RAGChunk[]>;

  /** Return the total number of chunks */
  size(): Promise<number>;

  /** Optional: optimized vector search (bypasses full getChunks scan) */
  search?(query: Embedding, topK: number, minSimilarity: number):
    Promise<Array<RAGChunk & { similarity: number }>>;

  /** Reload storage (clear cache, re-read from source) */
  reload?(): Promise<void>;

  /** Dispose of resources */
  dispose?(): void;
}

If your storage backend provides a search() method (e.g. a vector database with built-in ANN search), the enricher will use it instead of loading all chunks and scanning in memory.
Each RAGChunk has the following shape:
interface RAGChunk {
  id: string;
  content: string;
  embedding: Embedding; // Embedding is number[] (vector of floats)
  metadata: Record<string, unknown>;
}
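Only getChunks() and size() are required, so a custom backend can be very small. A minimal in-memory sketch (assuming the RAGStorage and RAGChunk types are exported by the package; check its type declarations):

import type { RAGChunk, RAGStorage } from '@directive-run/ai';

function createMemoryStore(chunks: RAGChunk[]): RAGStorage {
  return {
    // No search() here, so the enricher falls back to a full cosine scan
    getChunks: async () => chunks,
    size: async () => chunks.length,
  };
}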
Metadata Conventions
The metadata field is intentionally left as Record<string, unknown> to support any use case. The built-in defaultFormatChunk recognizes these optional fields:
| Field | Type | Used for |
|---|---|---|
| title | string | Chunk header (e.g. page title) |
| section | string | Sub-section label |
| url | string | Link to the source document |
When all three are present, the default formatter renders [Title – Section](url) as the chunk header. You can add any additional metadata fields for use in custom formatters or filters.
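For example, a chunk whose metadata is { title: 'Constraints', section: 'Basics', url: '/docs/constraints' } is rendered with the header [Constraints – Basics](/docs/constraints) above its content.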
Built-in: JSON File Store
createJSONFileStore creates a storage backend that reads chunks from a JSON file on disk. The file is lazy-loaded and cached in memory.
import { createJSONFileStore } from '@directive-run/ai';
const storage = createJSONFileStore({
  filePath: './embeddings.json',
  ttlMs: 60_000, // Re-read the file every 60 seconds
  mapEntry: (raw) => ({
    id: raw.slug as string,
    content: raw.body as string,
    embedding: raw.vec as number[],
    metadata: { title: raw.title, url: raw.href },
  }),
});
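Given the mapEntry above, the embeddings file would contain raw entries shaped like this (illustrative values, vectors truncated for brevity; this also assumes the file's top level is an array of entries):

[
  {
    "slug": "docs-constraints",
    "title": "Constraints",
    "href": "/docs/constraints",
    "body": "Constraints validate agent behavior at runtime.",
    "vec": [0.0123, -0.0456, 0.0789]
  }
]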
JSONFileStoreOptions
| Property | Type | Default | Description |
|---|---|---|---|
| filePath | string | required | Absolute or relative path to the JSON embeddings file |
| mapEntry | (entry) => RAGChunk | identity | Transform raw JSON entries into RAGChunk objects |
| ttlMs | number | 0 | Cache TTL in ms. 0 means cache forever until reload() or dispose() |
The store uses dynamic import('node:fs') so it can be imported isomorphically without breaking browser bundles. Load failures log a dev-mode warning and return an empty chunk array.
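Because a ttlMs of 0 caches forever, call reload() yourself when the file changes:

// e.g. after a build step rewrites ./embeddings.json
await storage.reload();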
Built-in: OpenAI Embedder
createOpenAIEmbedder creates an EmbedderFn that calls the OpenAI embeddings API directly (no SDK dependency).
import { createOpenAIEmbedder } from '@directive-run/ai';
const embedder = createOpenAIEmbedder({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});
OpenAIEmbedderOptions
| Property | Type | Default | Description |
|---|---|---|---|
| apiKey | string | required | OpenAI API key |
| model | string | 'text-embedding-3-small' | Embedding model name |
| dimensions | number | 1536 | Output embedding dimensions |
| baseURL | string | 'https://api.openai.com/v1' | API base URL (for proxies or compatible APIs) |
| fetch | typeof fetch | globalThis.fetch | Custom fetch implementation |
Requests time out after 30 seconds via AbortSignal.timeout.
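Any function satisfying the EmbedderFn contract works in its place. A hand-rolled sketch, assuming the signature (text: string) => Promise<number[]> and a hypothetical embedding endpoint:

const customEmbedder = async (text: string): Promise<number[]> => {
  // POST the text to your own embedding service (hypothetical URL and response shape)
  const response = await fetch('https://embeddings.internal.example/v1/embed', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input: text }),
  });
  if (!response.ok) throw new Error(`Embedding request failed: ${response.status}`);
  const data = await response.json();
  return data.embedding as number[];
};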
Custom Formatters
Override how chunks and the final context block are formatted:
const enricher = createRAGEnricher({
  embedder,
  storage,
  formatChunk: (chunk, similarity) =>
    `## ${chunk.metadata.title} (${(similarity * 100).toFixed(0)}% match)\n${chunk.content}`,
  formatContext: (formattedChunks, query) =>
    formattedChunks.length > 0
      ? `Use the following documentation to answer "${query}":\n\n${formattedChunks.join('\n\n')}`
      : 'No relevant documentation found.',
});
The default formatChunk renders [Title – Section](url) when all three metadata fields are present, otherwise falls back to title alone or chunk.id. The default formatContext wraps the chunks under a "Relevant documentation context:" header.
Integration with AgentStack and SSE Transport
Combine the RAG enricher with AgentStack and SSE Transport for a full server-side chat pipeline:
import {
  createAgentStack,
  createAnthropicRunner,
  createAnthropicStreamingRunner,
  createRAGEnricher,
  createJSONFileStore,
  createOpenAIEmbedder,
  createSSETransport,
} from '@directive-run/ai';

// 1. Enricher
const enricher = createRAGEnricher({
  embedder: createOpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
  storage: createJSONFileStore({ filePath: './embeddings.json' }),
});

// 2. Agent stack
const stack = createAgentStack({
  runner: createAnthropicRunner({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  streaming: {
    runner: createAnthropicStreamingRunner({ apiKey: process.env.ANTHROPIC_API_KEY! }),
  },
  agents: {
    'docs-qa': {
      agent: {
        name: 'docs-qa',
        instructions: 'Answer questions using the provided documentation context.',
        model: 'claude-sonnet-4-5-20250929',
      },
      capabilities: ['chat'],
    },
  },
});

// 3. SSE transport
const transport = createSSETransport({ maxResponseChars: 10_000 });

// 4. Route handler (Next.js App Router)
export async function POST(request: Request) {
  const { message, history } = await request.json();
  const enrichedInput = await enricher.enrich(message, { history });
  return transport.toResponse(stack, 'docs-qa', enrichedInput);
}
Next Steps
- SSE Transport – Stream enriched responses over HTTP
- Agent Stack – Compose all AI features in one factory
- Guardrails – Input/output validation and safety
- Streaming – Real-time token streaming and backpressure

