Debugger & Replay

Layer 1 rrweb DOM replay, Layer 2 deterministic re-execution with Clones, Debugger AI tools, and MCP for coding agents.

Trace ships in three debugging layers: Layer 1 visual forensic replay, Layer 2 deterministic re-execution from step N, and the Debugger — a run-scoped conversational agent. Together they answer what happened, what would happen with a fix, and why — without re-running the entire suite.

Note:

Layer 1 video, share, and Debugger chat are available on trace.molar.it. The full five-ribbon scrubber, DOM/Source modes, and Layer 2 replay modal are on app.molar.it/trace. Both surfaces share the same Trace API.


Architecture overview

┌─────────────────────────────────────────────────────────────────┐
│                     LAYER 1 — visual replay                      │
│  Video + events + (full build) rrweb DOM + five ribbons         │
│  Scrubber sync · share · export · inspector tabs                │
└───────────────────────────────┬─────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────┐
│              LAYER 2 — deterministic re-exec (API shipped)       │
│  Replay from step N · Clone snapshot restore · HAR replay       │
│  Replay from step N · clone restore · step diff                  │
└───────────────────────────────┬─────────────────────────────────┘
┌───────────────────────────────▼─────────────────────────────────┐
│                  DEBUGGER AI (chat shipped)                    │
│  Run-scoped chat · tool calls · Layer 2 trigger · MCP tools   │
└─────────────────────────────────────────────────────────────────┘

Trace combines all three layers with stateful third-party simulators: Clone state in the inspector, in-dashboard Debugger AI, and Layer 2 replay backed by Clone snapshots.


Layer 1 — rrweb DOM replay

Layer 1 is the visual replay viewer — the default experience when you open any trace.

Capture stack

Source              | Wire format                          | Purpose
--------------------+--------------------------------------+---------------------------------------
Playwright trace    | trace.playwright.zip                 | Action attribution, Source viewer mode
rrweb v2 (injected) | dom.mutation, dom.snapshot in NDJSON | Smooth DOM scrubbing
Playwright video    | video.webm                           | Video viewer mode
Network listeners   | network.request/response             | NETWORK ribbon
Console listeners   | console.log/error                    | CONSOLE ribbon
Clone polling       | clone.state                          | Clones inspector tab

Playwright snapshots are coarse (per action). rrweb captures every DOM mutation between actions. Both are required: Playwright for action attribution, rrweb for smooth scrubbing.

rrweb integration

  • Library: rrweb 2.0.x (MIT, upstream — no fork)
  • Player: @posthog/rrweb-player (MIT), modified for Molar time-axis controls
  • Postprocess: trace-postprocess worker extracts rrweb-events.ndjson.zst for the viewer

Privacy masking applies at capture time via scenario frontmatter: mask_inputs, mask_text, block selectors.

Scrubber synchronization

One monotonic time axis spans the full run. Video currentTime, rrweb player position, Network tab filter, Console tab filter, and AGENT ribbon all read the same scrubMs pointer.
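
As an illustration of the shared pointer, the sketch below filters two event families against one scrub position. The Event fields (ts_ms, kind) and the visible_at helper are hypothetical names for this example, not part of the Trace wire format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts_ms: int  # offset from run start on the shared time axis (assumed field name)
    kind: str   # e.g. "network.response" or "console.error"


def visible_at(events, scrub_ms, kind_prefix):
    """Return events of one family that occurred at or before scrub_ms.

    `events` must already be sorted by ts_ms.
    """
    cutoff = bisect_right([e.ts_ms for e in events], scrub_ms)
    return [e for e in events[:cutoff] if e.kind.startswith(kind_prefix)]


events = [
    Event(100, "network.request"),
    Event(250, "console.log"),
    Event(400, "network.response"),
    Event(900, "console.error"),
]

# Every panel reads the same pointer, so they can never drift apart.
scrub_ms = 400
network_tab = visible_at(events, scrub_ms, "network.")
console_tab = visible_at(events, scrub_ms, "console.")
```

Because each panel derives its view from scrub_ms rather than keeping its own clock, moving the scrubber updates all of them consistently.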

A failure diamond marks the first failed step. Playback runs at 0.5×–4× speed for walkthroughs.

Layer 1 limitations (by design):

  • Post-mortem only — no code editing
  • No deterministic re-run — use Layer 2
  • No PR creation — use Mender hand-off

Full viewer layout: Trace detail.


Layer 2 — deterministic replay with Clones

Layer 2 answers: if I fix the code at step N, does the rest of the run pass? — without re-running steps 1…N−1.

The workflow

  1. Open a failed trace at Trace detail
  2. Select step N in the sidebar (or press N to jump to failure)
  3. Press R or click Replay — slide-in panel opens
  4. Edit scenario file and/or source files inline (resolved via GitHub App)
  5. Click Re-run — Layer 2 worker executes the replay pipeline
  6. Child trace streams into side-by-side diff view
  7. If fix confirmed → Promote to fix → Mender hand-off

What gets restored

Nondeterminism     | Capture                 | Replay
-------------------+-------------------------+--------------------------------------
Clone seeds        | run.start.seeds.{clone} | POST /_clone/seed
Clone state at N−1 | clone.snapshot events   | POST /_clone/snapshot/restore
Virtual clock      | clock.advance per clone | POST /_clone/clock/set
RNG (Node/browser) | scenario.event          | MOLAR_REPLAY_RNG_SEED env
Non-clone HTTP     | Playwright HAR          | page.route() from recorded responses
Browser state      | rrweb full snapshot     | cookies, localStorage, sessionStorage
User input         | Scenario file           | Played from step N onward
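
To make the "Non-clone HTTP" row concrete, here is a minimal sketch of looking up recorded responses in a HAR document (the standard log.entries layout). The helper names are hypothetical, and serving repeated requests in recorded order is an assumption of this example, not a documented Trace behavior.

```python
def build_har_index(har):
    """Map (METHOD, url) to the list of recorded responses, in recorded order."""
    index = {}
    for entry in har["log"]["entries"]:
        request, response = entry["request"], entry["response"]
        key = (request["method"].upper(), request["url"])
        index.setdefault(key, []).append({
            "status": response["status"],
            "body": response.get("content", {}).get("text", ""),
        })
    return index


def next_recorded_response(index, method, url):
    """Pop the next recorded response for a request, or None if none is left."""
    queue = index.get((method.upper(), url))
    return queue.pop(0) if queue else None


# Tiny hand-written HAR with two recordings of the same request.
har = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://api.example.test/plans"},
     "response": {"status": 200, "content": {"text": "[\"pro\"]"}}},
    {"request": {"method": "GET", "url": "https://api.example.test/plans"},
     "response": {"status": 304, "content": {}}},
]}}

index = build_har_index(har)
first = next_recorded_response(index, "get", "https://api.example.test/plans")
second = next_recorded_response(index, "GET", "https://api.example.test/plans")
unrecorded = next_recorded_response(index, "POST", "https://api.example.test/plans")
```

A route handler would fulfill the intercepted request from the returned status and body; what to do when the lookup returns None (fail the request or let it through) is a policy choice this sketch leaves open.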

Layer 2 worker pipeline

The trace-layer2 queue (Hatchet worker) executes:

  1. Validate source trace is hot tier (or trigger cold restore)
  2. Resolve code_sha → Docker image or fresh build
  3. Restore each Clone from snapshot at step N−1
  4. Set virtual clock to step N timestamp
  5. Re-seed RNG
  6. Configure Playwright network interception from HAR
  7. Restore browser state from rrweb snapshot
  8. Apply user edits (scenario diff + source diffs)
  9. Start Playwright at step N (skip 1…N−1 via step reporter)
  10. Stream new trace → child row with parent_trace_id set
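
Steps 3 to 5 of the pipeline above can be sketched as an ordered plan. The endpoint paths and the MOLAR_REPLAY_RNG_SEED variable come from the restore table; the request body fields ("step", "ts_ms"), the "target" key, and the build_restore_plan helper are assumptions made for illustration.

```python
def build_restore_plan(clones, from_step, step_ts_ms, rng_seed):
    """Return the ordered restore actions for a replay starting at from_step."""
    if from_step < 1:
        raise ValueError("from_step must be >= 1")

    plan = []
    # Step 3: restore every Clone to its snapshot at step N-1.
    for name in clones:
        plan.append({"target": name, "method": "POST",
                     "path": "/_clone/snapshot/restore",
                     "body": {"step": from_step - 1}})  # body shape is assumed
    # Step 4: set each Clone's virtual clock to the step N timestamp.
    for name in clones:
        plan.append({"target": name, "method": "POST",
                     "path": "/_clone/clock/set",
                     "body": {"ts_ms": step_ts_ms}})  # body shape is assumed
    # Step 5: re-seed the RNG through the replay environment.
    plan.append({"target": "env", "name": "MOLAR_REPLAY_RNG_SEED",
                 "value": str(rng_seed)})
    return plan


plan = build_restore_plan(["stripe"], from_step=7, step_ts_ms=2400, rng_seed=42)
```

Keeping the plan as plain data makes the ordering easy to inspect and test before any request is sent.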

Poll status via GET /v1/replays/{id}, or watch the Replay inspector tab.
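
A polling loop might look like the sketch below. The status strings ("queued", "running", "succeeded", "failed") are assumed for illustration, since the response schema is not given here, and fetch_status stands in for whatever HTTP client wraps GET /v1/replays/{id}.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed"}  # assumed values, not a documented enum


def poll_replay(fetch_status, replay_id, interval_s=2.0, timeout_s=120.0,
                sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(replay_id) until a terminal status or the timeout."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(replay_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"replay {replay_id} still {status!r} after {timeout_s}s")
        sleep(interval_s)


# Usage with a fake fetcher; a real one would issue the GET request.
statuses = iter(["queued", "running", "succeeded"])
naps = []
final = poll_replay(lambda rid: next(statuses), "rp_123", sleep=naps.append)
```

Injecting sleep and clock keeps the loop testable without real delays.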

Side-by-side diff view

Route: /traces/{sourceId}/replay/{replayId} (child trace with parent_trace_id)

Shipped today: step status diff (parent vs child) in the Diff panel.
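
A step status diff of this kind can be sketched in a few lines. The status strings and the labels "fixed", "regressed", "unchanged", and "changed" are hypothetical; the Diff panel's actual vocabulary is not specified here.

```python
def diff_step_status(parent, child):
    """Compare per-step status for the steps the replay actually ran.

    parent and child map step number -> status string.
    """
    rows = []
    for step in sorted(child):
        before, after = parent.get(step), child[step]
        if before == after:
            label = "unchanged"
        elif before == "failed" and after == "passed":
            label = "fixed"
        elif before == "passed" and after == "failed":
            label = "regressed"
        else:
            label = "changed"
        rows.append((step, before, after, label))
    return rows


# The child trace only contains steps from N onward (here N = 7).
parent = {5: "passed", 6: "passed", 7: "failed", 8: "skipped"}
child = {7: "passed", 8: "passed"}
rows = diff_step_status(parent, child)
```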

Target tabs:

Tab         | Shows
------------+------------------------------------------------------------------
Overview    | Fix summary, stats (steps fixed, network changes, duration delta)
Steps       | Per-step status source vs replay
Network     | Method, URL, status deltas; red on mismatch
DOM         | rrweb subtree diffs after step N
Clone state | JMESPath-aware JSON diff per clone
Console     | Lines unique to each side
Patch       | Unified diff of applied source edits

Single scrubber controls both viewers.
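
For intuition about the Clone state tab, the sketch below computes a path-keyed diff of two JSON-like snapshots. It uses plain dotted paths rather than a JMESPath library, and the snapshot contents are made up for the example.

```python
MISSING = "<missing>"  # marker for a key present on only one side


def json_diff(before, after, path=""):
    """Return {dotted.path: (before, after)} for every leaf that differs."""
    if isinstance(before, dict) and isinstance(after, dict):
        changes = {}
        for key in sorted(set(before) | set(after)):
            child_path = f"{path}.{key}" if path else str(key)
            changes.update(json_diff(before.get(key, MISSING),
                                     after.get(key, MISSING),
                                     child_path))
        return changes
    return {} if before == after else {path: (before, after)}


source_state = {
    "customer": {"id": "cus_1", "subscription": {"status": "incomplete"}},
    "webhooks_sent": 1,
}
replay_state = {
    "customer": {"id": "cus_1", "subscription": {"status": "active"}},
    "webhooks_sent": 2,
    "invoices": ["in_1"],
}
delta = json_diff(source_state, replay_state)
```

Lists are compared as whole values here; a fuller diff would need a strategy for matching list elements.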

Layer 2 constraints

Constraint               | Detail
-------------------------+---------------------------------------------------------
Hot tier required        | Cold restore takes 30–90s (Glacier Instant Retrieval)
Buildable code           | Docker image per SHA or Dockerfile at repo root
Wall-clock in app code   | Requires @molar/time middleware; warning banner if absent
Real entropy in app code | Requires @molar/random middleware

Documented limitations surface as UI banners before replay starts (API returns 409 on cold tier today).


Debugger AI

The Debugger is a chat panel docked in the Debugger inspector tab. It has read-only access to one trace's artifacts — it explains, diffs, and (with confirmation) triggers Layer 2. It does not open PRs.

First message

Every new chat on a failed trace auto-generates a forensic summary from:

  • Failure signature
  • Last assertion event
  • Last network error (status ≥ 400)
  • Mender classification (if exists)
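
Two of these inputs can be derived directly from the event stream, as the sketch below shows. The "assertion" kind name and the event dictionary layout are assumptions for illustration; only network.response and the status ≥ 400 rule come from this page.

```python
def forensic_inputs(events):
    """Pick the last assertion event and the last network error from a trace."""
    last_assertion = None
    last_network_error = None
    for event in sorted(events, key=lambda e: e["seq"]):
        if event["kind"] == "assertion":  # assumed kind name
            last_assertion = event
        elif event["kind"] == "network.response" and event.get("status", 0) >= 400:
            last_network_error = event
    return {"last_assertion": last_assertion,
            "last_network_error": last_network_error}


events = [
    {"seq": 847, "kind": "network.response", "status": 200},
    {"seq": 852, "kind": "network.response", "status": 404},
    {"seq": 860, "kind": "assertion", "passed": False},
    {"seq": 840, "kind": "network.response", "status": 500},
]
inputs = forensic_inputs(events)
```

Sorting by seq first means the result does not depend on the order in which events were read.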

Green (passed) traces: Debugger tab available but collapsed by default.

Example transcript

Engineer: why did step 7 fail?

Debugger: Step 7 was "Click 'Upgrade to Pro'." The click succeeded — dom.mutation at +1.2s, network.request to POST /api/checkout/upgrade returned 200. But navigation to /billing?upgraded=true never fired. The Stripe Clone dispatched customer.subscription.created at +2.4s, yet your webhook returned 404. Config at playwright.config.ts:42 points to /api/stripe/webhook; the handler lives at /api/webhooks/stripe (renamed in PR #4521). Cite: (seq:847) (seq:852).

Engineer: show me what happens with the correct URL

Debugger: I can Layer 2 replay from step 7 with the Stripe Clone reconfigured. ~25s. Proceed?

Tool surface

Implemented as in-process MCP tools via Claude Agent SDK:

Tool                                 | Purpose                                    | Cost class
-------------------------------------+--------------------------------------------+-----------
read_artifact(kind, step_id?)        | NDJSON slice by event kind                 | Cheap
query_trace(filter)                  | DSL: kind=network.response AND status>=400 | Cheap
diff_runs(other_id, dimension)       | Network / DOM / clone / assertion diff     | Medium
query_clone_state(clone, step, path) | JMESPath into clone snapshot               | Cheap
run_layer2_replay(from_step, edits)  | Side-effecting; requires confirmation      | Expensive
summarize_mender_analysis()          | Read existing Mender row                   | Cheap

Deliberately excluded: open_pull_request, edit_file — Mender owns write actions.
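
To show how a filter such as kind=network.response AND status>=400 could be evaluated, here is a toy evaluator. It is a sketch only: the real query_trace grammar is not specified on this page, so the supported operators and the AND-only conjunction are assumptions.

```python
import operator
import re

OPERATORS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
             "=": operator.eq, ">": operator.gt, "<": operator.lt}
CLAUSE = re.compile(r"^\s*([\w.]+)\s*(>=|<=|!=|=|>|<)\s*(\S+)\s*$")


def parse_filter(text):
    """Parse 'field OP value AND ...' into (field, comparator, value) triples."""
    clauses = []
    for part in re.split(r"\s+AND\s+", text.strip()):
        match = CLAUSE.match(part)
        if not match:
            raise ValueError(f"cannot parse clause: {part!r}")
        field, op, raw = match.groups()
        value = int(raw) if raw.lstrip("-").isdigit() else raw
        clauses.append((field, OPERATORS[op], value))
    return clauses


def matches(event, clauses):
    """True when the event satisfies every clause."""
    for field, compare, value in clauses:
        actual = event.get(field)
        # Skip events that lack the field or hold a different type.
        if not isinstance(actual, type(value)) or not compare(actual, value):
            return False
    return True


events = [
    {"seq": 846, "kind": "network.response", "status": 200},
    {"seq": 852, "kind": "network.response", "status": 404},
    {"seq": 853, "kind": "console.error"},
]
clauses = parse_filter("kind=network.response AND status>=400")
hits = [e["seq"] for e in events if matches(e, clauses)]
```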

Citations

Every Debugger claim must cite (seq:N) event references or (file:line) source locations. Citation chips are keyboard-focusable; click jumps scrubber to that event.

Citation coverage is validated by prompt unit tests in CI.
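
A check along these lines could back such a test. The regular expressions below are a guess at the two citation shapes, (seq:N) and (file:line), and the helper names are hypothetical.

```python
import re

SEQ_CITATION = re.compile(r"\(seq:(\d+)\)")
# A parenthesised path followed by a line number, excluding the seq form.
FILE_CITATION = re.compile(r"\((?!seq:)([^\s():]+):(\d+)\)")


def extract_citations(text):
    """Return (event sequence numbers, (path, line) pairs) cited in text."""
    seqs = [int(n) for n in SEQ_CITATION.findall(text)]
    files = [(path, int(line)) for path, line in FILE_CITATION.findall(text)]
    return seqs, files


def has_citation(text):
    seqs, files = extract_citations(text)
    return bool(seqs or files)


reply = ("The webhook returned 404 (seq:852) after the click at (seq:847); "
         "the route is configured at (playwright.config.ts:42).")
seqs, files = extract_citations(reply)
uncited = has_citation("The webhook probably failed.")
```

A CI test could then assert has_citation on every sampled Debugger reply.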

Model and budget

Setting     | Default
------------+------------------------------------------------
Model       | Claude Opus 4.7
Cheap mode  | Claude Haiku 4.5 (short queries)
Org budget  | Configurable cents/month in Settings → Debugger
Hard cutoff | 402 at 100% cap; chat input disabled
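
The budget gate and model choice can be pictured as follows. The 80-character threshold for a "short query" is invented for this sketch, as is the route_chat helper; only the model names and the 402 cutoff at 100% of the cap come from the table.

```python
CHEAP_QUERY_MAX_CHARS = 80  # assumed threshold; the real routing rule is not documented here


def route_chat(query, spent_cents, cap_cents):
    """Return (http_status, model); model is None once the cap is reached."""
    if spent_cents >= cap_cents:
        return 402, None  # hard cutoff at 100% of the org budget
    if len(query) <= CHEAP_QUERY_MAX_CHARS:
        return 200, "Claude Haiku 4.5"
    return 200, "Claude Opus 4.7"


short = route_chat("why did step 7 fail?", spent_cents=1200, cap_cents=5000)
long_query = route_chat("x" * 200, spent_cents=1200, cap_cents=5000)
over_cap = route_chat("why did step 7 fail?", spent_cents=5000, cap_cents=5000)
```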

BYOK path: Settings → Debugger → Anthropic API key or Bedrock/Vertex endpoint.

Responses stream over SSE from POST /v1/chats/{id}/messages; the stream ends with type=done on completion.
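
A client could consume such a stream roughly as sketched below. The SSE framing (data: lines, blank-line separators, comment lines starting with a colon) follows the standard format; the JSON payload shape and the "delta" event type are assumptions, with only type=done taken from this page.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each complete server-sent event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:  # a blank line terminates the current event
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
        # other fields and ':' comment lines are ignored in this sketch


def collect_reply(lines):
    """Concatenate streamed text until a type=done event arrives."""
    parts = []
    for data in iter_sse_data(lines):
        event = json.loads(data)
        if event.get("type") == "done":
            return "".join(parts), True
        if event.get("type") == "delta":  # assumed event type
            parts.append(event.get("text", ""))
    return "".join(parts), False


stream = [
    'data: {"type": "delta", "text": "Step 7 "}', "",
    ": keep-alive",
    'data: {"type": "delta", "text": "failed."}', "",
    'data: {"type": "done"}', "",
]
text, finished = collect_reply(stream)
```

The second return value distinguishes a clean finish from a stream that dropped before type=done.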


MCP for coding agents

Trace exposes capabilities via MCP at https://mcp.molar.cloud (OAuth 2.1 + PKCE).

Tools

MCP tool                             | REST equivalent             | Purpose
-------------------------------------+-----------------------------+--------------------------------
list_traces(filters)                 | GET /v1/traces              | Discover runs
get_trace(id)                        | GET /v1/traces/{id}         | Metadata + signed URLs
read_trace_slice(id, kind, step)     | GET /v1/traces/{id}/events  | Event slice
diff_traces(a, b, dimension)         | Debugger diff_runs          | Semantic diff
trigger_replay(id, from_step, edits) | POST /v1/traces/{id}/replay | Layer 2 (confirmation required)

Scopes

Scope          | Allows
---------------+----------------------------
mcp:trace:read | list, get, read events
mcp:debug      | Debugger context attachment
mcp:run        | Layer 2 replay trigger
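
One way to picture scope enforcement is a lookup from tool name to required scope. The mapping for diff_traces is an assumption (this page does not say which scope it needs), and is_allowed is a hypothetical helper.

```python
REQUIRED_SCOPE = {
    "list_traces": "mcp:trace:read",
    "get_trace": "mcp:trace:read",
    "read_trace_slice": "mcp:trace:read",
    "diff_traces": "mcp:trace:read",  # assumption: treated as a read operation
    "trigger_replay": "mcp:run",
}


def is_allowed(tool, granted_scopes):
    """True when the token's scopes cover the tool; unknown tools are denied."""
    required = REQUIRED_SCOPE.get(tool)
    return required is not None and required in granted_scopes


read_only = {"mcp:trace:read"}
can_read = is_allowed("get_trace", read_only)
can_replay = is_allowed("trigger_replay", read_only)
```

Denying unknown tools by default keeps a newly added tool from being callable before it has been assigned a scope.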

Cursor / Claude Code setup

{
  "mcpServers": {
    "molar": {
      "url": "https://mcp.molar.cloud",
      "auth": "oauth"
    }
  }
}

Agent workflow example

  1. CI fails → check output includes short_id
  2. Agent calls get_trace("xY9zQ2mNp") → failure summary + signed URLs
  3. Agent asks Debugger via molar_debug_run — "why did step 7 fail?"
  4. Engineer patches locally
  5. Agent calls trigger_replay(from_step=6, edits={…})
  6. Human reviews diff → Promote to Mender

Solo developers working locally can authenticate with a personal access token (PAT) on the standalone Cartographer MCP (development only).


Comparison matrix

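Capability          | Layer 1       | Layer 2        | Debugger
--------------------+---------------+----------------+----------------------
Watch what happened |               |                | Explains
Scrub DOM/video     |               |                | Cites events
Edit and re-run     |               |                | Triggers with confirm
Clone state diff    | View only     | ✓ side-by-side | query_clone_state
Open fix PR         |               | Promote button | Never (Mender)
IDE access          | Export NDJSON | MCP trigger    | MCP chat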

Page                | When
--------------------+---------------------------------------
Trace detail        | Viewer layout and ribbons
Mender hand-off     | After Layer 2 confirms fix
Integrations        | MCP OAuth, GitHub App for source reads
API reference       | REST + MCP full spec
Workers & ingestion | Postprocess and Layer 2 workers