LLM gateway

Per-capability model routing for Cartographer, Guard, and Trace — molar-managed and BYOK.

LLM gateway

The LLM gateway routes AI workloads to the right model per capability key. Cartographer planners, VLM grounding, export generators, Guard Debugger, and Trace AI features all call through it — never with raw API keys in browser or worker payloads.

Production: hosted gateway behind https://api.molar.it (dashboard API proxy)

Local dev: http://127.0.0.1:8787

Packages: @molar/llm-core, @molar/llm-gateway

Why a gateway

ConcernHow gateway helps
SecurityWorkers send integration IDs, not plaintext keys (SEC-05)
RoutingDifferent models per capability (planner vs VLM vs embeddings)
MeteringDebugger and LLM-heavy tools bill debugger_tokens_* meters
BYOKOrg-owned provider keys via integration IDs

Capability keys

KeyTypical use
plannerCartographer crawl planning
grounding_tier0DOM / accessibility grounding
grounding_tier1_vlmVision-language grounding
export_generatorPlaywright test export
export_healerHeal broken selectors
passive_ux_vlmUX quality signals
embeddingsSemantic search / clustering
stagehand_plannerStagehand-style act() planning
stagehand_vlmStagehand VLM steps

Routing configuration

Org or run settings carry model_routing — integration IDs only:

{
  "model_routing": {
    "planner": {
      "provider_integration_id": "molar-managed",
      "model": "claude-sonnet-4-5-20251022"
    },
    "grounding_tier1_vlm": {
      "provider_integration_id": "byok-openai-prod",
      "model": "gpt-4o"
    }
  }
}
provider_integration_idMeaning
molar-managedMolar-hosted keys; subject to plan debugger budget and spend cap
byok-*Your stored provider credential — see BYOK

OpenAI-compatible API

The gateway exposes an OpenAI-compatible HTTP API for workers:

  • Cartographer FastAPI / Python workers → MOLAR_LLM_BASE_URL
  • TypeScript services → @molar/llm-core route resolution

Service-to-service auth uses MOLAR_LLM_SERVICE_KEY (not your org PAT).

Browser auth path

Browser sessions never hit the gateway directly:

  1. User session via Better Auth (HttpOnly cookie)
  2. Next.js dashboard API at /api/cartographer/v1/* mints short-lived internal JWT (aud: cartographer-bff)
  3. FastAPI workers call gateway with service key

See Cartographer auth docs for the full dashboard API flow.

Local development

# Gateway
cd molar-platform/apps/llm-gateway && pnpm dev   # :8787

# Cartographer workers
export MOLAR_LLM_BASE_URL=http://localhost:8787
export MOLAR_LLM_SERVICE_KEY=dev-molar-llm-service-key
export BFF_JWT_SECRET=dev-bff-jwt-secret-min-32-characters-long

A full Cartographer self-host stack (docker compose up in the bundle) starts LiteLLM proxy (:4000), LLM gateway (:8787), and workers together.

Dashboard visibility

Settings and health probes surface gateway status:

  • GET /api/v1/llm/metrics — latency and error rates (dashboard services health)
  • Spend against debugger budget on Billing → Usage

Spend caps & errors

StatusMeaning
402 org LLM spend cap exceededHosted-model budget exhausted
FixAdd BYOK integration or raise cap in billing settings

BYOK routes bypass the org hosted spend cap but still count toward your provider bill.