Core Concepts
Cartographer turns a seed URL and natural-language goals into verified browser trajectories and production-grade Playwright tests. This page explains the moving parts — how discovery, execution, verification, and export fit together.
Dashboard: app.molar.it/dashboard/cartographer · Product: cartographer.molar.it
End-to-end flow
   SEED URL
       │
       ▼
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│    CRAWL    │────►│  ROUTE MAP   │────►│  DISCOVER   │
│ (discover)  │     │   (pages +   │     │    FLOWS    │
└─────────────┘     │    edges)    │     └──────┬──────┘
                    └──────────────┘            │
                                                ▼
┌─────────────┐     ┌──────────────┐     ┌──────────────┐
│ PLAYWRIGHT  │◄────│    EXPORT    │◄────│  AGENT RUN   │
│   PROJECT   │     │   PIPELINE   │     │ (goal-driven)│
└─────────────┘     └──────────────┘     └──────┬───────┘
                                                │
                    ┌──────────────┐            │
                    │   TRACE +    │◄───────────┘
                    │   FINDINGS   │
                    └──────────────┘
| Stage | Input | Output |
|---|---|---|
| Crawl | Seed URL, depth, scope | Route map |
| Agent run | Goal, base URL, credentials | Trajectory (ordered steps) |
| Verification | Per-step assertions | Pass/fail + self-heal or HITL |
| Export | Completed trajectory | Linted .spec.ts + page objects |
| Trace | Every run | Replay, network, Debugger chat |
Projects and seed URLs
A project is the unit of organization for one application under test.
| Concept | Description |
|---|---|
| Seed URL | Starting point for crawls and default base_url for runs |
| Multiple seed URLs | Supported — project hub lets you pick the active base URL |
| Environment tag | staging, production, or custom — labels runs and exports |
| Scope allowlist | URL prefix guard; crawls and runs refuse out-of-scope navigation |
| Default grounding tier | Project-wide default for new runs (0–2) |
| Max grounding tier | Hard ceiling — use 0 for strict privacy (no cloud vision) |
Create projects in the dashboard or with cartographer_create_project via MCP.
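The scope allowlist is described above as a URL prefix guard. A minimal sketch of such a guard follows, assuming each allowlist entry is an absolute URL and that matching happens on origin plus path-segment boundaries. The function name `isInScope` is hypothetical and is not part of Cartographer's API; the product's actual matching rules may differ.

```typescript
// Hypothetical helper, not a Cartographer API.
// Each allowlist entry is treated as an absolute URL prefix (origin + path prefix).
function isInScope(candidate: string, allowlist: string[]): boolean {
  let url: URL;
  try {
    url = new URL(candidate);
  } catch {
    return false; // unparseable URLs are never in scope
  }
  return allowlist.some((entry) => {
    const prefix = new URL(entry);
    // Comparing origins rejects look-alike hosts such as "example.com.evil.io".
    if (url.origin !== prefix.origin) return false;
    // Compare on path-segment boundaries so "/app" does not match "/application".
    const base = prefix.pathname.endsWith("/") ? prefix.pathname : prefix.pathname + "/";
    return url.pathname === prefix.pathname || (url.pathname + "/").startsWith(base);
  });
}
```

Comparing parsed origins rather than raw strings is a deliberate choice in this sketch: a plain `startsWith` on the full URL would accept hosts that merely begin with the allowed hostname.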
Crawl and route map
A crawl discovers what exists on your site without pursuing a specific user goal.
Static discovery
The crawl worker performs breadth-first exploration from the seed URL:
- Follows same-origin links within scope
- Records HTTP status, DOM hash, URL patterns, depth
- Captures screenshots and HAR per page when configured
- Respects robots.txt unless overridden in project settings
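The breadth-first exploration described in the list above can be sketched over an in-memory link graph. This is an illustration of the traversal order and the depth and scope limits only; it performs no network requests, and `LinkGraph`, `CrawledPage`, and `crawlBreadthFirst` are hypothetical names rather than Cartographer internals.

```typescript
// Illustrative only: breadth-first walk over an in-memory link graph.
// Keys are page URLs, values are the links found on that page.
type LinkGraph = Record<string, string[]>;

interface CrawledPage {
  url: string;
  depth: number;
}

function crawlBreadthFirst(
  graph: LinkGraph,
  seed: string,
  maxDepth: number,
  inScope: (url: string) => boolean,
): CrawledPage[] {
  const visited = new Set<string>([seed]);
  const queue: CrawledPage[] = [{ url: seed, depth: 0 }];
  const pages: CrawledPage[] = [];

  while (queue.length > 0) {
    const page = queue.shift()!; // FIFO queue gives breadth-first order
    pages.push(page);
    if (page.depth >= maxDepth) continue; // do not expand past the depth limit
    for (const link of graph[page.url] ?? []) {
      if (visited.has(link) || !inScope(link)) continue;
      visited.add(link);
      queue.push({ url: link, depth: page.depth + 1 });
    }
  }
  return pages;
}
```

Marking a URL as visited when it is enqueued, rather than when it is dequeued, keeps a page that is linked from several places from being queued more than once.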
Interactive discovery
For single-page apps (SPAs), Cartographer optionally performs vision-assisted clicks to reveal hidden routes:
| Setting | Default | Purpose |
|---|---|---|
| interactive_crawl | on | Enable click discovery |
| interactive_grounding_tier | 1 | Vision model for ambiguous controls |
| interactive_max_clicks_per_page | bounded | Retry budget per page |
Interactive discovery is why a crawl can find routes that never appear in static HTML alone.
Route map artifacts
Each completed crawl produces a route map:
| Artifact | Use |
|---|---|
| Page table | URL, pattern, depth, status, DOM hash |
| Route graph | Nodes = URL patterns, edges = discovered links |
| Screenshots | Visual reference per page |
| HAR | Network waterfall for debugging |
| Markdown preview | Readable page summary |
| Accessibility tree | Structured DOM for grounding |
From the route map you can start a run from any page, compare two crawls (added/removed/changed patterns), or use Discover flows to batch-enqueue agent runs.
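To make the crawl comparison concrete, here is a small sketch that diffs two route maps reduced to "URL pattern to DOM hash" pairs. Treating a differing DOM hash as "changed" is an assumption for illustration; the page table does record both fields, but how Cartographer decides that a pattern changed is not specified here. `RouteSnapshot` and `diffRouteMaps` are hypothetical names.

```typescript
// Hypothetical shape: a route map reduced to URL pattern -> DOM hash.
type RouteSnapshot = Map<string, string>;

interface RouteDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

function diffRouteMaps(before: RouteSnapshot, after: RouteSnapshot): RouteDiff {
  const beforePatterns = Array.from(before.keys());
  const afterPatterns = Array.from(after.keys());
  return {
    // Patterns present only in the newer crawl.
    added: afterPatterns.filter((p) => !before.has(p)).sort(),
    // Patterns present only in the older crawl.
    removed: beforePatterns.filter((p) => !after.has(p)).sort(),
    // Patterns present in both crawls whose DOM hash differs (assumed meaning of "changed").
    changed: afterPatterns.filter((p) => before.has(p) && before.get(p) !== after.get(p)).sort(),
  };
}
```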
Note:
A crawl answers "what pages exist?" An agent run answers "can a user complete this goal?"
Agent runs
An agent run (labeled Agent runs in the dashboard — not "Runs", which collides with Guard checks) executes a goal in a real browser session.
Planner → actor → verifier loop
Cartographer uses a LangGraph state machine:
┌──────────┐    thought + plan    ┌──────────┐
│ PLANNER  │─────────────────────►│  ACTOR   │
│  (text)  │                      │ (browser)│
└────▲─────┘                      └────┬─────┘
     │                                 │
     │         ┌──────────┐            │
     └─────────│ VERIFIER │◄───────────┘
               │ (assert) │
               └──────────┘
| Role | Responsibility |
|---|---|
| Planner | Reads goal, route context, prior steps; proposes next action |
| Actor | Grounds the action to a selector; executes click, type, navigate, etc. |
| Verifier | Confirms the step succeeded; triggers retry or escalation on failure |
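The control flow between the three roles can be sketched as a plain loop with a hard step limit. This is a simplified stand-in, not the LangGraph state machine itself: every type and function name below (`Action`, `Roles`, `runLoop`) is hypothetical, and the sketch fails immediately on a failed verification, whereas the verifier described above retries or escalates first.

```typescript
// Sketch of a plan -> act -> verify loop with a hard step limit.
// All names are hypothetical and chosen for illustration.
interface Action {
  kind: "click" | "type" | "navigate" | "done";
  target?: string;
}

interface StepResult {
  action: Action;
  verified: boolean;
}

interface Roles {
  plan(goal: string, history: StepResult[]): Action; // planner: proposes the next action
  act(action: Action): void;                         // actor: executes it in the browser
  verify(action: Action): boolean;                   // verifier: confirms the step succeeded
}

function runLoop(
  goal: string,
  roles: Roles,
  maxSteps: number,
): { outcome: "passed" | "failed"; history: StepResult[] } {
  const history: StepResult[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = roles.plan(goal, history);
    if (action.kind === "done") return { outcome: "passed", history };
    roles.act(action);
    const verified = roles.verify(action);
    history.push({ action, verified });
    if (!verified) return { outcome: "failed", history }; // no retry or escalation in this sketch
  }
  return { outcome: "failed", history }; // step limit reached before the planner signalled "done"
}
```

The step limit plays the role that max_steps plays for a real run: it guarantees termination even if the planner never decides the goal is complete.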
Run parameters that matter
| Parameter | Effect |
|---|---|
| goal | Natural-language objective |
| max_steps | Hard stop to prevent runaway loops |
| grounding_tier | Starting tier for element location |
| max_grounding_tier | Escalation ceiling |
| browser_adapter | Where the browser runs (see below) |
| credentials_alias | Encrypted login for authenticated flows |
| human_preset | Timing behavior (default vs careful) |
| headless | Visible browser when false |
| record_demo | Capture a replayable demonstration |
| anti_bot | Stealth browser for bot-protected sites |
Browser adapters
| Adapter | Where it runs | When to use |
|---|---|---|
| cloakbrowser | Server-side stealth Chromium pool | Default; bot-protected staging |
| camoufox | Server-side Firefox variant | Alternate fingerprint |
Run statuses
| Status | Meaning |
|---|---|
| queued | Waiting for agent worker |
| running | Actively executing steps |
| pending_human | Needs approval or manual step (CAPTCHA, MFA, payment) |
| passed / succeeded | Goal completed with verified steps |
| failed / error | Unrecoverable failure |
| cancelled | User or API cancelled |
Live updates stream over WebSocket (/v1/runs/{id}/stream): step_started, step_completed, tier_escalation, finding, artifact_ready, completed.
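A client consuming the stream typically needs to know which statuses are final and how to fold events into a summary. The sketch below models both with TypeScript union types. The status and event names come from this page; the payload fields (`step`, `reason`) and the choice to treat pending_human as non-terminal are assumptions, since the wire format is not documented here.

```typescript
// Status and event names are taken from this page.
// Payload fields (`step`, `reason`) are assumptions made for illustration.
type RunStatus =
  | "queued" | "running" | "pending_human"
  | "passed" | "succeeded" | "failed" | "error" | "cancelled";

const TERMINAL_STATUSES: ReadonlySet<RunStatus> = new Set<RunStatus>([
  "passed", "succeeded", "failed", "error", "cancelled",
]);

// Assumption: pending_human is resumable, so it is not terminal.
function isTerminal(status: RunStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

type StreamEvent =
  | { type: "step_started"; step: number }
  | { type: "step_completed"; step: number }
  | { type: "tier_escalation"; step: number; reason: string }
  | { type: "finding" }
  | { type: "artifact_ready" }
  | { type: "completed" };

interface RunSummary {
  stepsCompleted: number;
  escalationReasons: string[];
  done: boolean;
}

function summarizeEvents(events: StreamEvent[]): RunSummary {
  const summary: RunSummary = { stepsCompleted: 0, escalationReasons: [], done: false };
  for (const event of events) {
    switch (event.type) {
      case "step_completed":
        summary.stepsCompleted += 1;
        break;
      case "tier_escalation":
        summary.escalationReasons.push(event.reason);
        break;
      case "completed":
        summary.done = true;
        break;
      default:
        break; // step_started, finding, and artifact_ready are ignored in this sketch
    }
  }
  return summary;
}
```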
Grounding ladder (Tiers 0–2)
Grounding is how Cartographer decides which element to click or type into. Cartographer prefers cheap, deterministic methods first and escalates only when necessary.
Tier 0 ──fail──► Tier 1 ──fail──► Tier 2
(a11y tree)      (cloud VLM)      (local UI-TARS)
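The ladder can be read as "try the starting tier, then move up one tier at a time until something succeeds or the ceiling is reached". A minimal sketch of that rule follows. `Grounder` is a hypothetical stand-in for whatever locates an element at a given tier, and `groundWithEscalation` is not a Cartographer function; the real escalation logic may apply additional conditions.

```typescript
// Hypothetical sketch of tier escalation with a ceiling.
type Tier = 0 | 1 | 2;

// Returns a selector on success, or null when this tier cannot locate the element.
type Grounder = (description: string) => string | null;

interface GroundingResult {
  tier: Tier;          // tier that finally located the element
  selector: string;
  failedTiers: Tier[]; // tiers tried before it, in order
}

function groundWithEscalation(
  description: string,
  grounders: Record<Tier, Grounder>,
  startTier: Tier,
  maxTier: Tier,
): GroundingResult | null {
  const ladder: Tier[] = [0, 1, 2];
  const failedTiers: Tier[] = [];
  for (const tier of ladder) {
    if (tier < startTier || tier > maxTier) continue; // stay within [startTier, maxTier]
    const selector = grounders[tier](description);
    if (selector !== null) return { tier, selector, failedTiers };
    failedTiers.push(tier);
  }
  return null; // every permitted tier failed
}
```

With a ceiling of 0, the function never calls the Tier 1 or Tier 2 grounders, which mirrors how capping the maximum tier keeps screenshots away from vision models.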
Tier 0 — Accessibility tree (default)
| Property | Detail |
|---|---|
| Input | Accessibility tree + Set-of-Marks overlay |
| Model | Text LLM (e.g. Claude Sonnet) |
| Cost | Lowest |
| Privacy | No screenshots leave your deployment if tier capped at 0 |
| Best for | Buttons, links, form fields with accessible names |
Design goal: Most steps on typical SaaS apps resolve at Tier 0. Live distribution varies by site; check per-run tier analytics in the dashboard.
Tier 1 — Cloud vision (VLM)
| Property | Detail |
|---|---|
| Input | Screenshot + bounding-box candidates |
| Model | Gemini Flash (configurable) |
| Trigger | Icon-only buttons, canvas, ambiguous widgets, crawl interactive clicks |
| Requires | GOOGLE_API_KEY or hosted vision routing |
Tier 1 emits a tier_escalation event with a human-readable reason.
Tier 2 — Local vision (optional)
| Property | Detail |
|---|---|
| Input | Screenshot to on-prem MLX server |
| Model | UI-TARS (local) |
| Trigger | Explicit grounding_tier=2 or MAX_GROUNDING_TIER=2 |
| Requires | GROUNDING_T2_SERVER_URL; run make grounding-serve to start the server locally |
| Use case | Air-gapped deploys; policies that forbid sending screenshots to the cloud |
Note:
Set max_grounding_tier=0 on a project to enforce privacy mode — no screenshots sent to cloud vision providers. Interactive crawl and Tier 1 escalation are disabled.
Per-run tier analytics
Run detail shows:
- Pie/bar chart of Tier 0 / 1 / 2 step distribution
- Table of escalation events with reasons
- Estimated LLM cost per run
Configure defaults in Settings → AI & models and per-project caps in Project settings → Grounding & browser.
Verification
Every actor step passes through verification before the planner advances.
| Mechanism | Description |
|---|---|
| Post-action assertion | URL change, element visibility, text content |
| Self-heal | Retry with alternate selectors or tier escalation |
| Failure taxonomy | Structured error codes for export and Trace |
| HITL | Pause for human when automation cannot proceed |
When verification fails after retries, the run moves to failed with a step-level error message, screenshot evidence, and, where a Trace was captured for the run, a link to it for debugging.
Export pipeline
Export transforms a completed trajectory into a Playwright test project your team can run in CI.
Seven stages
prune → rank selectors → cluster POMs → harvest assertions
→ generate (LLM) → lint → verify + heal
| Stage | Purpose |
|---|---|
| Pruner | Remove noisy or redundant steps |
| Selector ranker | Prefer stable locators (getByRole, getByTestId) |
| POM clusterer | Group selectors into page objects |
| Assertion harvester | Collect verifiable oracles from the trajectory |
| Generator | Write .spec.ts and page-object files |
| Linter | Enforce Cartographer style rules (no elementHandle, no waitForTimeout) |
| Verifier + healer | Run generated tests in sandbox; Gemini repairs failures — sandbox E2E in production is still maturing |
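The Linter row names two banned calls, elementHandle and waitForTimeout. A rough sketch of how such a rule could be checked is shown below. It is a line-by-line text scan written for illustration, not Cartographer's linter: the rule identifiers are made up, and a text scan would also flag matches inside comments or string literals, which a syntax-aware linter would avoid.

```typescript
// Illustrative text scan for the two banned calls named in the Linter row.
// Rule names are hypothetical.
interface LintViolation {
  line: number; // 1-based line number in the spec source
  rule: string;
}

const BANNED_CALLS: { rule: string; pattern: RegExp }[] = [
  { rule: "no-element-handle", pattern: /\belementHandle\s*\(/ },
  { rule: "no-wait-for-timeout", pattern: /\bwaitForTimeout\s*\(/ },
];

function lintSpec(source: string): LintViolation[] {
  const violations: LintViolation[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of BANNED_CALLS) {
      if (pattern.test(text)) violations.push({ line: index + 1, rule });
    }
  });
  return violations;
}
```

Both banned calls have locator-based alternatives in Playwright (auto-waiting locators and web-first assertions), which is presumably why the selector ranker favors getByRole and getByTestId.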
Export outputs
| Output | Description |
|---|---|
| *.spec.ts | Runnable Playwright tests |
| Page objects | Clustered selectors per route pattern |
| playwright.config.ts | Stock Chromium by default |
| Zip archive | Full project download |
Export options
| Option | Meaning |
|---|---|
| style | poms (page-object model) or flat |
| target_repo | GitHub repo for PR creation |
| uses_cloakbrowser_in_ci | Opt into stealth browser in CI (off by default) |
Edit & regenerate: provide a natural-language hint after a failed lint/verify; the healer incorporates your guidance.
Note:
Exported tests are designed for your CI — they do not require Cartographer at runtime unless you opt into CloakBrowser in CI for bot-protected targets.
Demos (demonstration learning)
Demos are user-recorded sessions that teach the planner how to handle tricky flows. Record mode is available via the unpacked browser extension.
| Concept | Description |
|---|---|
| Record mode | Browser extension captures rrweb events, intents, and screenshots |
| Chunks | Segmented intents with URL patterns and replay strategy |
| Replay | POST /v1/projects/{id}/demos/{id}/replay creates a new agent run |
Record once for MFA, payment, or multi-step wizards; Cartographer reuses the skill on similar goals without re-recording.
See Browser extension for install and consent flows.
Credentials
Authenticated flows use credential aliases — never inline secrets in goals or MCP calls.
| Property | Behavior |
|---|---|
| Storage | Encrypted server-side (Fernet + vault backend in production) |
| Dashboard | Username visible; password never shown after save |
| Runs | Select alias by name; API receives alias reference only |
| TOTP seed | Optional — for time-based OTP automation |
| Extension | Password fields masked; payment/password actions require approval |
Create and manage aliases in Settings → Credentials. Pair with the extension for flows that need a human in the loop.
Traces
Cartographer ships a Trace subsystem in-repo (/traces viewer, /v1/traces API, S3-backed NDJSON). Traces are created when capture or ingest runs. Today an agent run is linked to a trace only if your pipeline ingests or post-processes that run's artifacts.
| Feature | Status |
|---|---|
| Trace list (/traces) | Shipped |
| Step timeline + Events log | Shipped |
| Video playback | Shipped when video.webm exists |
| Debugger chat | Shipped (requires Anthropic key / hosted routing) |
| Share links (/r/[shortId], embed) | Shipped |
| Layer 2 replay API | Shipped (POST /v1/traces/{id}/replay) |
| Layer 2 diff UI | Available when a child replay exists |
| DOM replay (rrweb) | Available on trace detail at app.molar.it/trace |
Open traces from /traces in the web app, or use Open trace on the dashboard run detail page. List and fetch via MCP: cartographer_list_traces, cartographer_get_trace.
For cross-product debugging, see Trace.
Passive UX findings
During crawls and runs, Cartographer can emit findings without blocking the main goal:
| Source | Detects |
|---|---|
| axe | Accessibility violations |
| odiff | Visual regression vs baseline |
| vlm | Sampled UX issues from screenshots |
Findings appear in the findings inbox, grouped by severity. Dismiss noise, create visual baselines, or promote findings to export assertions.
Fetch via MCP: cartographer_get_findings.
Discover flows
Discover flows analyzes the latest complete route map and proposes batch agent runs — useful after the first crawl of a large app.
- Project hub → Discover flows
- Confirmation shows N proposed flows
- Progress modal enqueues runs
- Monitor in Agent runs list
This bridges discovery (crawl) and execution (runs) without manual goal entry per route.
How concepts map to the dashboard
| Concept | Dashboard location |
|---|---|
| Project / seed URL | /dashboard/projects, project overview |
| Route map | /dashboard/projects/{id}/explore |
| Agent run | /dashboard/projects/{id}/runs, /runs (legacy list) |
| Trace | /traces (separate TS viewer — not inside /dashboard) |
| Export | /dashboard/projects/{id}/exports |
| Demo / recording | /dashboard/projects/{id}/recordings |
| Credentials | /dashboard/projects/{id}/credentials |
| Findings | /dashboard/projects/{id}/findings |
Full information architecture: Dashboard.
Related
| Page | Topic |
|---|---|
| Quick start | Bootstrap and first project |
| Configuration | Model routing and env vars |
| MCP tools | IDE API surface |
| Troubleshooting | Stuck crawls, tier escalation, export lint |