Core Concepts

Seed URLs, route maps, agent runs, grounding tiers, verification, export pipeline, demos, credentials, and traces.

Cartographer turns a seed URL and natural-language goals into verified browser trajectories and production-grade Playwright tests. This page explains the moving parts — how discovery, execution, verification, and export fit together.

Dashboard: app.molar.it/dashboard/cartographer · Product: cartographer.molar.it


End-to-end flow

  SEED URL
  ┌─────────────┐     ┌──────────────┐     ┌─────────────┐
  │   CRAWL     │────►│  ROUTE MAP   │────►│ DISCOVER    │
  │  (discover) │     │  (pages +    │     │ FLOWS       │
  └─────────────┘     │   edges)     │     └──────┬──────┘
                      └──────────────┘            │
  ┌─────────────┐     ┌──────────────┐     ┌─────────────┐
  │  PLAYWRIGHT │◄────│   EXPORT     │◄────│ AGENT RUN   │
  │  PROJECT    │     │  PIPELINE    │     │(goal-driven)│
  └─────────────┘     └──────────────┘     └──────┬──────┘
                      ┌──────────────┐            │
                      │   TRACE +    │◄───────────┘
                      │  FINDINGS    │
                      └──────────────┘

| Stage | Input | Output |
| --- | --- | --- |
| Crawl | Seed URL, depth, scope | Route map |
| Agent run | Goal, base URL, credentials | Trajectory (ordered steps) |
| Verification | Per-step assertions | Pass/fail + self-heal or HITL |
| Export | Completed trajectory | Linted .spec.ts + page objects |
| Trace | Every run | Replay, network, Debugger chat |

Projects and seed URLs

A project is the unit of organization for one application under test.

| Concept | Description |
| --- | --- |
| Seed URL | Starting point for crawls and default base_url for runs |
| Multiple seed URLs | Supported; the project hub lets you pick the active base URL |
| Environment tag | staging, production, or custom; labels runs and exports |
| Scope allowlist | URL prefix guard; crawls and runs refuse out-of-scope navigation |
| Default grounding tier | Project-wide default for new runs (0–2) |
| Max grounding tier | Hard ceiling; use 0 for strict privacy (no cloud vision) |

Create projects in the dashboard or with cartographer_create_project via MCP.
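
The scope allowlist can be pictured as a prefix test applied before every navigation. The sketch below is a minimal illustration, not Cartographer's implementation: the helper name `isInScope` and the boundary rule (a prefix must end on a path, query, or fragment boundary) are assumptions made for the example.

```typescript
// Hypothetical helper: the name and the matching rule are illustrative assumptions.
function isInScope(url: string, allowlist: string[]): boolean {
  return allowlist.some((prefix) => {
    if (!url.startsWith(prefix)) return false;
    if (prefix.endsWith("/")) return true;
    // Require the prefix to end on a URL boundary, so that
    // ".../app" admits ".../app/settings" but not ".../appendix".
    const next = url.charAt(prefix.length);
    return next === "" || next === "/" || next === "?" || next === "#";
  });
}

const allowlist = ["https://staging.example.com/app"];
const inside = isInScope("https://staging.example.com/app/settings", allowlist);
const outside = isInScope("https://staging.example.com/admin", allowlist);
const lookalike = isInScope("https://staging.example.com/appendix", allowlist);
```

A plain `startsWith` check alone would accept look-alike paths and hosts, which is why the sketch adds the boundary test.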


Crawl and route map

A crawl discovers what exists on your site without pursuing a specific user goal.

Static discovery

The crawl worker performs breadth-first exploration from the seed URL:

  • Follows same-origin links within scope
  • Records HTTP status, DOM hash, URL patterns, depth
  • Captures screenshots and HAR per page when configured
  • Respects robots.txt unless overridden in project settings
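
The breadth-first exploration above can be sketched over an in-memory link graph. This is a simplified illustration under stated assumptions: `crawl`, `CrawledPage`, and the `fetchLinks` callback are hypothetical names, and the sketch leaves out HTTP status, DOM hashing, screenshots, and robots.txt handling.

```typescript
interface CrawledPage {
  url: string;
  depth: number;
}

// Hypothetical stand-in for loading a page and extracting its links.
type LinkFetcher = (url: string) => string[];

function crawl(
  seed: string,
  maxDepth: number,
  inScope: (url: string) => boolean,
  fetchLinks: LinkFetcher,
): CrawledPage[] {
  const seen = new Set<string>([seed]);
  const queue: CrawledPage[] = [{ url: seed, depth: 0 }];
  const pages: CrawledPage[] = [];

  while (queue.length > 0) {
    const page = queue.shift()!; // FIFO order gives breadth-first traversal
    pages.push(page);
    if (page.depth >= maxDepth) continue; // do not expand beyond the depth limit
    for (const link of fetchLinks(page.url)) {
      if (seen.has(link) || !inScope(link)) continue;
      seen.add(link);
      queue.push({ url: link, depth: page.depth + 1 });
    }
  }
  return pages;
}

// A tiny fake site used only for this example.
const site: Record<string, string[]> = {
  "https://app.example.com/": [
    "https://app.example.com/login",
    "https://app.example.com/pricing",
    "https://other.example.org/",
  ],
  "https://app.example.com/login": ["https://app.example.com/", "https://app.example.com/reset"],
  "https://app.example.com/pricing": [],
  "https://app.example.com/reset": ["https://app.example.com/reset/sent"],
};

const pages = crawl(
  "https://app.example.com/",
  2,
  (url) => url.startsWith("https://app.example.com/"),
  (url) => site[url] ?? [],
);
```

With a depth limit of 2, the traversal visits the seed, `/login`, `/pricing`, and `/reset`; the off-origin link is skipped and `/reset/sent` lies beyond the limit.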

Interactive discovery

For single-page apps (SPAs), Cartographer optionally performs vision-assisted clicks to reveal hidden routes:

| Setting | Default | Purpose |
| --- | --- | --- |
| interactive_crawl | on | Enable click discovery |
| interactive_grounding_tier | 1 | Vision model for ambiguous controls |
| interactive_max_clicks_per_page | bounded | Retry budget per page |

Interactive discovery is why a crawl can find routes that never appear in static HTML alone.

Route map artifacts

Each completed crawl produces a route map:

| Artifact | Use |
| --- | --- |
| Page table | URL, pattern, depth, status, DOM hash |
| Route graph | Nodes = URL patterns, edges = discovered links |
| Screenshots | Visual reference per page |
| HAR | Network waterfall for debugging |
| Markdown preview | Readable page summary |
| Accessibility tree | Structured DOM for grounding |

From the route map you can start a run from any page, compare two crawls (added/removed/changed patterns), or Discover flows to batch-enqueue agent runs.
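
Comparing two crawls amounts to diffing their page tables by URL pattern. The sketch below assumes a pattern counts as "changed" when its DOM hash differs between crawls; that rule, and the names `PageRow` and `diffCrawls`, are illustrative assumptions rather than the product's definition.

```typescript
interface PageRow {
  pattern: string;
  domHash: string;
}

interface CrawlDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

function diffCrawls(before: PageRow[], after: PageRow[]): CrawlDiff {
  const prev = new Map<string, string>(before.map((p): [string, string] => [p.pattern, p.domHash]));
  const next = new Map<string, string>(after.map((p): [string, string] => [p.pattern, p.domHash]));
  const diff: CrawlDiff = { added: [], removed: [], changed: [] };

  for (const [pattern, hash] of next) {
    if (!prev.has(pattern)) diff.added.push(pattern);
    else if (prev.get(pattern) !== hash) diff.changed.push(pattern); // assumed: hash mismatch means changed
  }
  for (const pattern of prev.keys()) {
    if (!next.has(pattern)) diff.removed.push(pattern);
  }
  return diff;
}

const crawlDiff = diffCrawls(
  [
    { pattern: "/", domHash: "a1" },
    { pattern: "/login", domHash: "b2" },
    { pattern: "/legacy", domHash: "c3" },
  ],
  [
    { pattern: "/", domHash: "a1" },
    { pattern: "/login", domHash: "b9" },
    { pattern: "/billing", domHash: "d4" },
  ],
);
```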

Note:

A crawl answers "what pages exist?" An agent run answers "can a user complete this goal?"


Agent runs

An agent run executes a goal in a real browser session. The dashboard labels these Agent runs rather than "Runs", because "Runs" collides with Guard checks.

Planner → actor → verifier loop

Cartographer uses a LangGraph state machine:

  ┌──────────┐    thought + plan    ┌──────────┐
  │ PLANNER  │─────────────────────►│  ACTOR   │
  │ (text)   │                      │ (browser)│
  └────▲─────┘                      └────┬─────┘
       │                                 │
       │         ┌──────────┐            │
       └─────────│ VERIFIER │◄───────────┘
                 │ (assert) │
                 └──────────┘

| Role | Responsibility |
| --- | --- |
| Planner | Reads goal, route context, prior steps; proposes next action |
| Actor | Grounds the action to a selector; executes click, type, navigate, etc. |
| Verifier | Confirms the step succeeded; triggers retry or escalation on failure |
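
The control flow between the three roles can be sketched as a bounded loop. This is a deliberately simplified illustration in plain TypeScript, not the LangGraph state machine itself: the `Roles` interface and `runLoop` are hypothetical, the roles are synchronous stubs, and a failed verification ends the run immediately instead of triggering retry or escalation.

```typescript
type Action =
  | { kind: "click" | "type" | "navigate"; target: string }
  | { kind: "done" };

interface StepRecord {
  action: Action;
  verified: boolean;
}

// Hypothetical role interfaces; the real planner and actor are model- and browser-backed.
interface Roles {
  plan(goal: string, history: StepRecord[]): Action;
  act(action: Action): void;
  verify(action: Action): boolean;
}

interface LoopResult {
  outcome: "passed" | "failed";
  steps: StepRecord[];
}

function runLoop(goal: string, roles: Roles, maxSteps: number): LoopResult {
  const steps: StepRecord[] = [];
  while (steps.length < maxSteps) {
    const action = roles.plan(goal, steps);
    if (action.kind === "done") return { outcome: "passed", steps };
    roles.act(action);
    const verified = roles.verify(action);
    steps.push({ action, verified });
    if (!verified) return { outcome: "failed", steps }; // simplified: no retry or escalation
  }
  return { outcome: "failed", steps }; // max_steps reached before the planner declared success
}

// A scripted planner stands in for the model.
const script: Action[] = [
  { kind: "navigate", target: "/login" },
  { kind: "click", target: "Sign in" },
  { kind: "done" },
];
const scripted: Roles = {
  plan: (_goal, history) => script[history.length] ?? { kind: "done" },
  act: () => {},
  verify: () => true,
};

const completedRun = runLoop("log in", scripted, 10);
const truncatedRun = runLoop("log in", scripted, 1);
```

The second call shows the role of `max_steps`: the same script fails when the step budget runs out before the planner reports completion.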

Run parameters that matter

| Parameter | Effect |
| --- | --- |
| goal | Natural-language objective |
| max_steps | Hard stop to prevent runaway loops |
| grounding_tier | Starting tier for element location |
| max_grounding_tier | Escalation ceiling |
| browser_adapter | Where the browser runs (see below) |
| credentials_alias | Encrypted login for authenticated flows |
| human_preset | Timing behavior (default vs careful) |
| headless | Visible browser when false |
| record_demo | Capture a replayable demonstration |
| anti_bot | Stealth browser for bot-protected sites |

Browser adapters

| Adapter | Where it runs | When to use |
| --- | --- | --- |
| cloakbrowser | Server-side stealth Chromium pool | Default; bot-protected staging |
| camoufox | Server-side Firefox variant | Alternate fingerprint |

Run statuses

| Status | Meaning |
| --- | --- |
| queued | Waiting for agent worker |
| running | Actively executing steps |
| pending_human | Needs approval or manual step (CAPTCHA, MFA, payment) |
| passed / succeeded | Goal completed with verified steps |
| failed / error | Unrecoverable failure |
| cancelled | User or API cancelled |

Live updates stream over WebSocket (/v1/runs/{id}/stream): step_started, step_completed, tier_escalation, finding, artifact_ready, completed.
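
A client consuming the stream might fold events into a running summary. The event names below come from the list above; everything else is an assumption for illustration, including the JSON wire format with a `type` field, the payload fields (`step`, `reason`, `name`, `status`), and the `RunSummary` shape.

```typescript
// Event names are from the documentation; payload fields are assumed.
type RunEvent =
  | { type: "step_started"; step: number }
  | { type: "step_completed"; step: number }
  | { type: "tier_escalation"; step: number; reason: string }
  | { type: "finding"; severity: string }
  | { type: "artifact_ready"; name: string }
  | { type: "completed"; status: string };

interface RunSummary {
  stepsCompleted: number;
  escalations: string[];
  findings: number;
  artifacts: string[];
  finalStatus: string | null;
}

const emptySummary: RunSummary = {
  stepsCompleted: 0,
  escalations: [],
  findings: 0,
  artifacts: [],
  finalStatus: null,
};

function applyEvent(summary: RunSummary, event: RunEvent): RunSummary {
  switch (event.type) {
    case "step_completed":
      return { ...summary, stepsCompleted: summary.stepsCompleted + 1 };
    case "tier_escalation":
      return { ...summary, escalations: [...summary.escalations, event.reason] };
    case "finding":
      return { ...summary, findings: summary.findings + 1 };
    case "artifact_ready":
      return { ...summary, artifacts: [...summary.artifacts, event.name] };
    case "completed":
      return { ...summary, finalStatus: event.status };
    default:
      return summary; // step_started and any unrecognized event leave the summary unchanged
  }
}

// Sample messages as they might arrive over the socket (assumed format).
const rawMessages = [
  '{"type":"step_started","step":1}',
  '{"type":"tier_escalation","step":1,"reason":"icon-only button"}',
  '{"type":"step_completed","step":1}',
  '{"type":"artifact_ready","name":"screenshot-1.png"}',
  '{"type":"completed","status":"passed"}',
];

const summary = rawMessages
  .map((raw) => JSON.parse(raw) as RunEvent)
  .reduce(applyEvent, emptySummary);
```

In a real client, each message would arrive through a WebSocket message handler and be passed to `applyEvent` one at a time; the reducer itself stays free of transport concerns.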


Grounding ladder (Tiers 0–2)

Grounding is how Cartographer decides which element to click or type into. Cartographer prefers cheap, deterministic methods first and escalates only when necessary.

  Tier 0 ──fail──► Tier 1 ──fail──► Tier 2
  (a11y tree)      (cloud VLM)      (local UI-TARS)

Tier 0 — Accessibility tree (default)

| Property | Detail |
| --- | --- |
| Input | Accessibility tree + Set-of-Marks overlay |
| Model | Text LLM (e.g. Claude Sonnet) |
| Cost | Lowest |
| Privacy | No screenshots leave your deployment if tier capped at 0 |
| Best for | Buttons, links, form fields with accessible names |

Design goal: Most steps on typical SaaS apps resolve at Tier 0. Live distribution varies by site; check per-run tier analytics in the dashboard.

Tier 1 — Cloud vision (VLM)

| Property | Detail |
| --- | --- |
| Input | Screenshot + bounding-box candidates |
| Model | Gemini Flash (configurable) |
| Trigger | Icon-only buttons, canvas, ambiguous widgets, crawl interactive clicks |
| Requires | GOOGLE_API_KEY or hosted vision routing |

Escalating to Tier 1 emits a tier_escalation event with a human-readable reason.

Tier 2 — Local vision (optional)

| Property | Detail |
| --- | --- |
| Input | Screenshot to on-prem MLX server |
| Model | UI-TARS (local) |
| Trigger | Explicit grounding_tier=2 or MAX_GROUNDING_TIER=2 |
| Requires | GROUNDING_T2_SERVER_URL; make grounding-serve locally |
| Use case | Air-gapped deploys, no cloud screenshot policy |

Note:

Set max_grounding_tier=0 on a project to enforce privacy mode — no screenshots sent to cloud vision providers. Interactive crawl and Tier 1 escalation are disabled.
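
The ladder and its ceiling can be sketched as a loop that tries each tier in order and stops at the cap. This is a minimal illustration under stated assumptions: the `Grounder` signature, the `ground` function, and the selector strings are hypothetical, and real grounders would call a model rather than a lookup.

```typescript
type Tier = 0 | 1 | 2;

// A grounder returns a selector, or null when it cannot locate the element.
type Grounder = (target: string) => string | null;

interface GroundingResult {
  selector: string;
  tier: Tier;
  escalations: string[]; // one human-readable reason per escalation
}

const ALL_TIERS: Tier[] = [0, 1, 2];

function ground(
  target: string,
  startTier: Tier,
  maxTier: Tier,
  grounders: Record<Tier, Grounder>,
): GroundingResult | null {
  const escalations: string[] = [];
  for (const tier of ALL_TIERS) {
    if (tier < startTier || tier > maxTier) continue; // never go above the ceiling
    const selector = grounders[tier](target);
    if (selector !== null) return { selector, tier, escalations };
    if (tier < maxTier) escalations.push(`tier ${tier} could not ground "${target}"`);
  }
  return null; // every permitted tier failed
}

// Stub grounders for the example: Tier 0 only knows elements with accessible names.
const grounders: Record<Tier, Grounder> = {
  0: (target) => (target === "Save button" ? "role=button[name='Save']" : null),
  1: () => "vision-box-1",
  2: () => "local-box-1",
};

const named = ground("Save button", 0, 2, grounders);
const iconOnly = ground("gear icon", 0, 1, grounders);
const privacyMode = ground("gear icon", 0, 0, grounders);
```

With the ceiling at 0, the icon-only target is left ungrounded rather than escalated, which mirrors the privacy mode described in the note above.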

Per-run tier analytics

Run detail shows:

  • Pie/bar chart of Tier 0 / 1 / 2 step distribution
  • Table of escalation events with reasons
  • Estimated LLM cost per run

Configure defaults in Settings → AI & models and per-project caps in Project settings → Grounding & browser.


Verification

Every actor step passes through verification before the planner advances.

| Mechanism | Description |
| --- | --- |
| Post-action assertion | URL change, element visibility, text content |
| Self-heal | Retry with alternate selectors or tier escalation |
| Failure taxonomy | Structured error codes for export and Trace |
| HITL | Pause for human when automation cannot proceed |

When verification fails after retries, the run moves to failed with a step-level error message, screenshot evidence, and a linked Trace for debugging.
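
The three post-action assertion kinds listed above (URL change, element visibility, text content) can be modeled as a small discriminated union evaluated against a page snapshot. The sketch is illustrative only: `Assertion`, `PageSnapshot`, and `verifyStep` are hypothetical names, and a real verifier would query a live browser instead of a plain object.

```typescript
type Assertion =
  | { kind: "url"; expected: string }
  | { kind: "visible"; selector: string }
  | { kind: "text"; selector: string; contains: string };

// Hypothetical snapshot taken after the action: current URL plus text of visible elements by selector.
interface PageSnapshot {
  url: string;
  visible: Record<string, string>;
}

function holds(assertion: Assertion, page: PageSnapshot): boolean {
  switch (assertion.kind) {
    case "url":
      return page.url === assertion.expected;
    case "visible":
      return assertion.selector in page.visible;
    case "text":
      return (page.visible[assertion.selector] ?? "").includes(assertion.contains);
  }
}

function verifyStep(assertions: Assertion[], page: PageSnapshot): { passed: boolean; failures: Assertion[] } {
  const failures = assertions.filter((assertion) => !holds(assertion, page));
  return { passed: failures.length === 0, failures };
}

const afterLogin: PageSnapshot = {
  url: "https://app.example.com/dashboard",
  visible: { h1: "Welcome back" },
};

const goodStep = verifyStep(
  [
    { kind: "url", expected: "https://app.example.com/dashboard" },
    { kind: "visible", selector: "h1" },
    { kind: "text", selector: "h1", contains: "Welcome" },
  ],
  afterLogin,
);
const badStep = verifyStep([{ kind: "text", selector: "h1", contains: "Error" }], afterLogin);
```

Returning the failed assertions, rather than a bare boolean, gives a retry or escalation policy something concrete to act on.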


Export pipeline

Export transforms a completed trajectory into a Playwright test project your team can run in CI.

Seven stages

  prune → rank selectors → cluster POMs → harvest assertions
       → generate (LLM) → lint → verify + heal

| Stage | Purpose |
| --- | --- |
| Pruner | Remove noisy or redundant steps |
| Selector ranker | Prefer stable locators (getByRole, getByTestId) |
| POM clusterer | Group selectors into page objects |
| Assertion harvester | Collect verifiable oracles from the trajectory |
| Generator | Write .spec.ts and page-object files |
| Linter | Enforce Cartographer style rules (no elementHandle, no waitForTimeout) |
| Verifier + healer | Run generated tests in sandbox; Gemini repairs failures (sandbox E2E in production is still maturing) |
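
The two style rules named for the Linter stage can be approximated with a line-by-line pattern scan. This is a rough sketch, not the actual linter: the rule identifiers, the regular expressions, and the `lintSpec` function are assumptions, and a production linter would more likely inspect the syntax tree than match text.

```typescript
interface LintViolation {
  line: number; // 1-based line number in the generated spec
  rule: string;
  text: string;
}

// Hypothetical rule names; the patterns are a text-level approximation.
const RULES: { rule: string; pattern: RegExp }[] = [
  { rule: "no-element-handle", pattern: /elementHandle/i },
  { rule: "no-wait-for-timeout", pattern: /\.waitForTimeout\s*\(/ },
];

function lintSpec(source: string): LintViolation[] {
  const violations: LintViolation[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) violations.push({ line: index + 1, rule, text: text.trim() });
    }
  });
  return violations;
}

// The spec is held as a string here, so the example does not depend on Playwright being installed.
const spec = [
  "test('checkout', async ({ page }) => {",
  "  await page.getByRole('button', { name: 'Pay' }).click();",
  "  await page.waitForTimeout(1000);",
  "});",
].join("\n");

const violations = lintSpec(spec);
```

Here the fixed sleep on the third line is flagged, while the role-based locator on the second line passes.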

Export outputs

| Output | Description |
| --- | --- |
| *.spec.ts | Runnable Playwright tests |
| Page objects | Clustered selectors per route pattern |
| playwright.config.ts | Stock Chromium by default |
| Zip archive | Full project download |

Export options

| Option | Meaning |
| --- | --- |
| style | poms (page-object model) or flat |
| target_repo | GitHub repo for PR creation |
| uses_cloakbrowser_in_ci | Opt into stealth browser in CI; off by default |

Edit & regenerate: provide a natural-language hint after a failed lint/verify; the healer incorporates your guidance.

Note:

Exported tests are designed for your CI — they do not require Cartographer at runtime unless you opt into CloakBrowser in CI for bot-protected targets.


Demos (demonstration learning)

Demos are user-recorded sessions that teach the planner how to handle tricky flows. Record mode is available via the unpacked browser extension.

| Concept | Description |
| --- | --- |
| Record mode | Browser extension captures rrweb events, intents, and screenshots |
| Chunks | Segmented intents with URL patterns and replay strategy |
| Replay | POST /v1/projects/{id}/demos/{id}/replay creates a new agent run |

Record once for MFA, payment, or multi-step wizards; Cartographer reuses the skill on similar goals without re-recording.

See Browser extension for install and consent flows.


Credentials

Authenticated flows use credential aliases — never inline secrets in goals or MCP calls.

| Property | Behavior |
| --- | --- |
| Storage | Encrypted server-side (Fernet + vault backend in production) |
| Dashboard | Username visible; password never shown after save |
| Runs | Select alias by name; API receives alias reference only |
| TOTP seed | Optional; for time-based OTP automation |
| Extension | Password fields masked; payment/password actions require approval |

Create and manage aliases in Settings → Credentials. Pair with the extension for flows that need a human in the loop.


Traces

Cartographer ships a Trace subsystem in-repo (/traces viewer, /v1/traces API, S3-backed NDJSON). Traces are created when capture/ingest runs — they are not automatically linked from every agent run today unless your pipeline ingests or post-processes artifacts.

| Feature | Status |
| --- | --- |
| Trace list (/traces) | Shipped |
| Step timeline + Events log | Shipped |
| Video playback | Shipped when video.webm exists |
| Debugger chat | Shipped (requires Anthropic key / hosted routing) |
| Share links (/r/[shortId], embed) | Shipped |
| Layer 2 replay API | Shipped (POST /v1/traces/{id}/replay) |
| Layer 2 diff UI | Available when a child replay exists |
| DOM replay (rrweb) | Available on trace detail at app.molar.it/trace |

Open traces from /traces in the web app, or use Open trace on a run's detail page in the dashboard. List and fetch via MCP: cartographer_list_traces, cartographer_get_trace.

For cross-product debugging, see Trace.


Passive UX findings

During crawls and runs, Cartographer can emit findings without blocking the main goal:

| Source | Detects |
| --- | --- |
| axe | Accessibility violations |
| odiff | Visual regression vs baseline |
| vlm | Sampled UX issues from screenshots |

Findings appear in the findings inbox, grouped by severity. Dismiss noise, create visual baselines, or promote findings to export assertions.
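
Grouping by severity, with dismissed items filtered out, might look like the sketch below. The three sources come from the table above; the severity scale (borrowed from axe-core's impact levels), the `Finding` shape, and `groupBySeverity` are assumptions made for the example.

```typescript
type FindingSource = "axe" | "odiff" | "vlm";

// Assumed severity scale, modeled on axe-core impact levels.
type Severity = "critical" | "serious" | "moderate" | "minor";

interface Finding {
  source: FindingSource;
  severity: Severity;
  message: string;
  dismissed?: boolean;
}

const SEVERITY_ORDER: Severity[] = ["critical", "serious", "moderate", "minor"];

function groupBySeverity(findings: Finding[]): { severity: Severity; items: Finding[] }[] {
  return SEVERITY_ORDER
    .map((severity) => ({
      severity,
      items: findings.filter((f) => f.severity === severity && !f.dismissed),
    }))
    .filter((group) => group.items.length > 0); // omit empty severity buckets
}

const inbox = groupBySeverity([
  { source: "odiff", severity: "moderate", message: "Header shifted vs baseline" },
  { source: "axe", severity: "critical", message: "Form input has no accessible name" },
  { source: "vlm", severity: "minor", message: "Low-contrast helper text", dismissed: true },
  { source: "axe", severity: "moderate", message: "Heading levels skip from h1 to h3" },
]);
```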

Fetch via MCP: cartographer_get_findings.


Discover flows

Discover flows analyzes the latest complete route map and proposes batch agent runs — useful after the first crawl of a large app.

  1. Project hub → Discover flows
  2. Confirmation shows N proposed flows
  3. Progress modal enqueues runs
  4. Monitor in Agent runs list

This bridges discovery (crawl) and execution (runs) without manual goal entry per route.


How concepts map to the dashboard

| Concept | Dashboard location |
| --- | --- |
| Project / seed URL | /dashboard/projects, project overview |
| Route map | /dashboard/projects/{id}/explore |
| Agent run | /dashboard/projects/{id}/runs, /runs (legacy list) |
| Trace | /traces (separate TS viewer, not inside /dashboard) |
| Export | /dashboard/projects/{id}/exports |
| Demo / recording | /dashboard/projects/{id}/recordings |
| Credentials | /dashboard/projects/{id}/credentials |
| Findings | /dashboard/projects/{id}/findings |

Full information architecture: Dashboard.


| Page | Topic |
| --- | --- |
| Quick start | Bootstrap and first project |
| Configuration | Model routing and env vars |
| MCP tools | IDE API surface |
| Troubleshooting | Stuck crawls, tier escalation, export lint |