ADR-004: Class-Based LLMRouter
Status: Accepted
Date: 2024-12-01
Deciders: Platform architect

Context

The engine routes LLM requests to different providers based on agent tier (strategy → xAI, writing → Anthropic, analytical → OpenAI, etc.). The original single-user system used module-level functions for routing, which created global state that was difficult to test and impossible to configure per-request.

The multi-tenant platform needed LLM routing that:

  • Supports per-tenant BYOK (Bring Your Own Key) API key overrides
  • Is independently testable without making real LLM calls
  • Handles provider fallback chains (e.g., xAI unavailable → fall back to Claude Sonnet)
  • Can be instantiated multiple times in parallel tests without state conflicts

Decision

Implement the LLM router as a class (LLMRouter) that is instantiated with configuration and optional API key overrides. Each test creates its own router instance. In production, a singleton instance is shared via state.get_router().
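
The production singleton might be implemented along these lines (only `state.get_router()` appears in this ADR; the locking details are an assumption sketched here for illustration):

```python
import threading

class LLMRouter:
    """Stand-in for the real router class; construction details are assumed."""

_router = None
_lock = threading.Lock()

def get_router() -> LLMRouter:
    # Lazy, thread-safe singleton: the first caller constructs the router,
    # every later caller gets the same instance.
    global _router
    if _router is None:
        with _lock:
            if _router is None:  # double-checked: another thread may have won
                _router = LLMRouter()
    return _router
```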

The router accepts api_key_overrides: dict to support BYOK — when a tenant has their own API keys, the router uses those instead of platform keys.
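A minimal sketch of that shape: only the class name `LLMRouter` and the `api_key_overrides` parameter come from this ADR; the routing and fallback tables, the other parameters, and the method names are illustrative assumptions.

```python
# Hypothetical sketch, not the real implementation.
class LLMRouter:
    DEFAULT_ROUTES = {"strategy": "xai", "writing": "anthropic", "analytical": "openai"}
    DEFAULT_FALLBACKS = {"xai": ["anthropic"]}  # e.g. xAI down -> Claude Sonnet

    def __init__(self, platform_keys=None, api_key_overrides=None,
                 routes=None, fallbacks=None):
        self.platform_keys = platform_keys or {}
        self.api_key_overrides = api_key_overrides or {}  # tenant BYOK keys
        self.routes = routes or dict(self.DEFAULT_ROUTES)
        self.fallbacks = fallbacks or dict(self.DEFAULT_FALLBACKS)

    def api_key(self, provider: str):
        # BYOK override wins; otherwise use the platform key, if any.
        return self.api_key_overrides.get(provider, self.platform_keys.get(provider))

    def providers_for(self, tier: str) -> list:
        # Primary provider for the tier, then its fallback chain in order.
        primary = self.routes[tier]
        return [primary, *self.fallbacks.get(primary, [])]
```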

Alternatives Considered

1. Module-level functions with global configuration

The original pattern. Works for single-user but creates problems in multi-tenant: global API keys can't be overridden per-tenant, tests pollute shared state, and parallel execution risks race conditions on global config.

2. Dependency injection framework (e.g., python-inject, dependency-injector)

Would formalize the pattern but adds a dependency for a problem solved by basic class instantiation. Rejected as over-engineering.

3. Configuration-based routing (YAML/JSON)

Route configuration lives in a file, loaded at startup. Doesn't support per-request API key overrides or dynamic fallback chains. Rejected because BYOK requires runtime configuration.

Consequences

Positive:

  • Each test creates its own LLMRouter with mock providers — zero global state pollution
  • BYOK support is clean: pass api_key_overrides at instantiation, router uses them transparently
  • Fallback chains are configurable per-instance
  • Production singleton via state.get_router() is lazy-loaded and thread-safe
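
The isolation property might look like this in a test (the fake provider and the constructor shape are illustrative, not the real API):

```python
class FakeProvider:
    """Test double that records prompts instead of calling a real LLM."""
    def __init__(self):
        self.calls = []

    def complete(self, prompt: str) -> str:
        self.calls.append(prompt)
        return "stub"

# Minimal router stand-in, just enough to show the isolation property.
class LLMRouter:
    def __init__(self, providers: dict):
        self.providers = providers

    def complete(self, tier: str, prompt: str) -> str:
        return self.providers[tier].complete(prompt)

def test_routers_do_not_share_state():
    a = LLMRouter({"writing": FakeProvider()})
    b = LLMRouter({"writing": FakeProvider()})
    a.complete("writing", "draft the intro")
    # b's provider saw nothing: no globals, safe for parallel test runs
    assert b.providers["writing"].calls == []
    assert a.providers["writing"].calls == ["draft the intro"]
```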

Negative:

  • Every module that needs LLM access must receive a router instance; this parameter threading runs through ~14 engine files, which is more plumbing than module-level functions required
  • Singleton pattern in production means the shared router doesn't support per-request BYOK; instead, BYOK overrides are applied at the run level before execution starts

Risks:

  • Router instance lifecycle: if a router is created with BYOK keys and then reused for a different tenant's run, keys could leak. Mitigated by creating a fresh router per run when BYOK keys are present.