Debugger & Replay

Trace ships three debugging layers: Layer 1 (visual forensic replay), Layer 2 (deterministic re-execution from step N), and the Debugger (a run-scoped conversational agent). Together they answer what happened, what would happen with a fix, and why, all without re-running the entire suite.

Note:

Layer 1 video, share, and Debugger chat are available on trace.molar.it. The full five-ribbon scrubber, DOM/Source modes, and Layer 2 replay modal are on app.molar.it/trace. Both surfaces share the same Trace API.

Architecture overview

┌─────────────────────────────────────────────────────────────────┐
│                     LAYER 1 — visual replay                      │
│  Video + events + (full build) rrweb DOM + five ribbons         │
│  Scrubber sync · share · export · inspector tabs                │
└───────────────────────────────┬─────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────┐
│              LAYER 2 — deterministic re-exec (API shipped)       │
│  Replay from step N · Clone snapshot restore · HAR replay       │
│  Step status diff (parent vs child)                             │
└───────────────────────────────┬─────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────┐
│                  DEBUGGER AI (chat shipped)                    │
│  Run-scoped chat · tool calls · Layer 2 trigger · MCP tools   │
└─────────────────────────────────────────────────────────────────┘

Trace pairs all three layers with Clones (stateful third-party simulators): Clone state appears in the inspector, the Debugger AI runs in the dashboard, and Layer 2 replay restores from Clone snapshots.

Layer 1 — rrweb DOM replay

Layer 1 is the visual replay viewer — the default experience when you open any trace.

Capture stack

Source	Wire format	Purpose
Playwright trace	`trace.playwright.zip`	Action attribution, Source viewer mode
rrweb v2 (injected)	`dom.mutation`, `dom.snapshot` in NDJSON	Smooth DOM scrubbing
Playwright video	`video.webm`	Video viewer mode
Network listeners	`network.request/response`	NETWORK ribbon
Console listeners	`console.log/error`	CONSOLE ribbon
Clone polling	`clone.state`	Clones inspector tab

Playwright snapshots are coarse (per action). rrweb captures every DOM mutation between actions. Both are required: Playwright for action attribution, rrweb for smooth scrubbing.
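To make the capture table concrete, here is a minimal sketch of routing NDJSON events to ribbons by event kind. It assumes each NDJSON line is a JSON object with a `kind` field. That field name, the `OTHER` bucket, and the sample events are illustrative assumptions, not the documented wire schema.

```python
import json
from collections import defaultdict

# Prefix -> ribbon, taken from the capture table; other kinds land in "OTHER".
RIBBON_BY_PREFIX = {"network.": "NETWORK", "console.": "CONSOLE"}


def group_by_ribbon(ndjson_text: str) -> dict:
    """Parse NDJSON and bucket events by the ribbon their kind maps to."""
    groups = defaultdict(list)
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        event = json.loads(line)
        kind = event["kind"]  # assumed field name
        ribbon = next(
            (r for prefix, r in RIBBON_BY_PREFIX.items() if kind.startswith(prefix)),
            "OTHER",
        )
        groups[ribbon].append(event)
    return dict(groups)


sample = (
    '{"kind": "network.request", "seq": 1}\n'
    '{"kind": "console.error", "seq": 2}\n'
    '{"kind": "dom.mutation", "seq": 3}\n'
)
groups = group_by_ribbon(sample)
```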

rrweb integration

Library: rrweb 2.0.x (MIT, upstream — no fork)
Player: @posthog/rrweb-player (MIT), modified for Molar time-axis controls
Postprocess: trace-postprocess worker extracts rrweb-events.ndjson.zst for the viewer

Privacy masking applies at capture time via scenario frontmatter: mask_inputs, mask_text, block selectors.

Scrubber synchronization

One monotonic time axis spans the full run. Video currentTime, rrweb player position, Network tab filter, Console tab filter, and AGENT ribbon all read the same scrubMs pointer.
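One way to picture the shared pointer: each pane keeps its events sorted by offset and, for a given scrubMs, shows everything at or before that offset. The sketch below is illustrative only. The `Pane` class and the sample offsets are hypothetical, and it assumes every event carries a millisecond offset from run start.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Pane:
    """Hypothetical pane (Network tab, Console tab, ...) with sorted event offsets."""
    name: str
    offsets_ms: list[int] = field(default_factory=list)  # sorted, ms since run start

    def visible_count(self, scrub_ms: int) -> int:
        # Number of events at or before the shared pointer.
        return bisect_right(self.offsets_ms, scrub_ms)


def sync(panes: list[Pane], scrub_ms: int) -> dict[str, int]:
    """Every pane reads the same scrub_ms, so all views stay on one time axis."""
    return {p.name: p.visible_count(scrub_ms) for p in panes}


panes = [
    Pane("network", [100, 1200, 2400]),
    Pane("console", [50, 2500]),
]
state = sync(panes, scrub_ms=1200)
```

Because every pane derives its view from the same number, seeking the scrubber never needs pane-to-pane coordination.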

A failure diamond marks the first failed step. Playback runs at 0.5×–4× speed for walkthroughs.

Layer 1 limitations (by design):

Post-mortem only — no code editing
No deterministic re-run — use Layer 2
No PR creation — use Mender hand-off

Full viewer layout: Trace detail.

Layer 2 — deterministic replay with Clones

Layer 2 answers one question without re-running steps 1…N−1: if I fix the code at step N, does the rest of the run pass?

The workflow

Open a failed trace at Trace detail
Select step N in the sidebar (or press the N key to jump to the failure)
Press R or click Replay — slide-in panel opens
Edit scenario file and/or source files inline (resolved via GitHub App)
Click Re-run — Layer 2 worker executes the replay pipeline
Child trace streams into side-by-side diff view
If fix confirmed → Promote to fix → Mender hand-off

What gets restored

Nondeterminism	Capture	Replay
Clone seeds	`run.start.seeds.{clone}`	`POST /_clone/seed`
Clone state at N−1	`clone.snapshot` events	`POST /_clone/snapshot/restore`
Virtual clock	`clock.advance` per clone	`POST /_clone/clock/set`
RNG (Node/browser)	`scenario.event`	`MOLAR_REPLAY_RNG_SEED` env
Non-clone HTTP	Playwright HAR	`page.route()` from recorded responses
Browser state	rrweb full snapshot	cookies, localStorage, sessionStorage
User input	Scenario file	Played from step N onward
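The "Non-clone HTTP" row amounts to answering each outgoing request from the recording instead of the network. Below is a minimal sketch of that lookup, assuming the standard HAR layout (`log.entries[].request` and `.response`). Matching on method plus exact URL is an assumption made for illustration, and the sample HAR is made up.

```python
def build_har_index(har: dict) -> dict[tuple[str, str], dict]:
    """Map (method, url) -> recorded response; the first recording of a pair wins."""
    index: dict[tuple[str, str], dict] = {}
    for entry in har["log"]["entries"]:
        req, resp = entry["request"], entry["response"]
        key = (req["method"].upper(), req["url"])
        index.setdefault(key, {
            "status": resp["status"],
            "body": resp.get("content", {}).get("text", ""),
        })
    return index


def replay_response(index: dict, method: str, url: str):
    """Return the recorded response, or None if the request was never captured."""
    return index.get((method.upper(), url))


sample_har = {"log": {"entries": [{
    "request": {"method": "GET", "url": "https://example.test/api/plan"},
    "response": {"status": 200, "content": {"text": '{"plan": "pro"}'}},
}]}}
index = build_har_index(sample_har)
hit = replay_response(index, "get", "https://example.test/api/plan")
miss = replay_response(index, "GET", "https://example.test/api/other")
```

How a miss should be handled during replay (abort, pass through, or fail the step) is not specified here, so the sketch simply returns None.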

Layer 2 worker pipeline

The trace-layer2 queue (Hatchet worker) executes:

Validate source trace is hot tier (or trigger cold restore)
Resolve code_sha → Docker image or fresh build
Restore each Clone from snapshot at step N−1
Set virtual clock to step N timestamp
Re-seed RNG
Configure Playwright network interception from HAR
Restore browser state from rrweb snapshot
Apply user edits (scenario diff + source diffs)
Start Playwright at step N (skip 1…N−1 via step reporter)
Stream new trace → child row with parent_trace_id set
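The Clone-facing steps of this pipeline can be sketched as an ordered request plan. The endpoint paths come from the "What gets restored" table. The payload field names (`step`, `timestamp_ms`, `seed`), the `CloneTarget` type, the base URL, and the position of the seed call are assumptions for illustration, since request bodies and exact ordering are not documented here.

```python
from dataclasses import dataclass


@dataclass
class CloneTarget:
    name: str       # e.g. "stripe"
    base_url: str   # where the Clone listens (hypothetical)
    seed: int       # value captured under run.start.seeds.{clone}


def build_restore_plan(clones: list[CloneTarget], from_step: int, step_ts_ms: int):
    """Return (method, url, payload) tuples: restores first, then clocks, then seeds."""
    # State is restored as of step N-1 ...
    plan = [("POST", f"{c.base_url}/_clone/snapshot/restore", {"step": from_step - 1})
            for c in clones]
    # ... then each virtual clock is set to the step N timestamp.
    plan += [("POST", f"{c.base_url}/_clone/clock/set", {"timestamp_ms": step_ts_ms})
             for c in clones]
    # Seeding last is an assumption; the source does not state where it belongs.
    plan += [("POST", f"{c.base_url}/_clone/seed", {"seed": c.seed}) for c in clones]
    return plan


plan = build_restore_plan(
    [CloneTarget("stripe", "http://stripe-clone.local", seed=42)],
    from_step=7,
    step_ts_ms=2400,
)
```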

Poll status via GET /v1/replays/{id} or the Replay inspector tab.
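Polling can be a simple loop with a timeout. The sketch below takes the HTTP call as an injected function so it stays independent of any client library. The response field `status` and the terminal values `succeeded` and `failed` are assumptions, as the response schema is not shown here.

```python
import time
from typing import Callable


def wait_for_replay(
    fetch_status: Callable[[], dict],   # e.g. wraps GET /v1/replays/{id}
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the replay reaches a terminal state or the timeout elapses."""
    terminal = {"succeeded", "failed"}  # assumed status values
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status()
        if body.get("status") in terminal:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("replay did not finish in time")
        sleep(interval_s)


# Usage with a fake backend that finishes on the third poll.
responses = iter([{"status": "queued"}, {"status": "running"}, {"status": "succeeded"}])
result = wait_for_replay(lambda: next(responses), sleep=lambda _: None)
```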

Side-by-side diff view

Route: /traces/{sourceId}/replay/{replayId} (child trace with parent_trace_id)

Shipped today: step status diff (parent vs child) in the Diff panel.
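A step status diff of this kind reduces to comparing two step-to-status maps from step N onward. The sketch below is a minimal illustration. The status strings (`passed`, `failed`, `skipped`, `missing`) and the dict-based input shape are assumptions, not the Diff panel's actual data model.

```python
def diff_step_status(parent: dict[int, str], child: dict[int, str], from_step: int):
    """Compare per-step status between the source trace and its replay.

    Steps before from_step are skipped in the replay, so they are not compared.
    """
    rows = []
    for step in sorted(s for s in parent if s >= from_step):
        before, after = parent[step], child.get(step, "missing")
        rows.append({
            "step": step,
            "source": before,
            "replay": after,
            "changed": before != after,
        })
    return rows


parent = {6: "passed", 7: "failed", 8: "skipped"}
child = {7: "passed", 8: "passed"}
rows = diff_step_status(parent, child, from_step=7)
fixed = [r["step"] for r in rows if r["source"] == "failed" and r["replay"] == "passed"]
```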

Target tabs:

Tab	Shows
Overview	Fix summary, stats (steps fixed, network changes, duration delta)
Steps	Per-step status source vs replay
Network	Method, URL, status deltas — red on mismatch
DOM	rrweb subtree diffs after step N
Clone state	JMESPath-aware JSON diff per clone
Console	Lines unique to each side
Patch	Unified diff of applied source edits

Single scrubber controls both viewers.

Layer 2 constraints

Constraint	Detail
Hot tier required	Cold restore takes 30–90s (Glacier Instant Retrieval)
Buildable code	Docker image per SHA or Dockerfile at repo root
Wall-clock in app code	Requires `@molar/time` middleware — warning banner if absent
Real entropy in app code	Requires `@molar/random` middleware

These constraints surface as UI banners before replay starts. Today the API returns 409 when the source trace is in the cold tier.

Debugger AI

The Debugger is a chat panel docked in the Debugger inspector tab. It has read-only access to one trace's artifacts — it explains, diffs, and (with confirmation) triggers Layer 2. It does not open PRs.

First message

Every new chat on a failed trace auto-generates a forensic summary from:

Failure signature
Last assertion event
Last network error (status ≥ 400)
Mender classification (if exists)
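Two of these inputs are "latest event matching a condition" lookups. Here is a minimal sketch, assuming events are dicts with `seq`, `kind`, and (for responses) `status`. The `assertion` kind name and the sample events are hypothetical.

```python
def last_matching(events, predicate):
    """Return the highest-seq event satisfying predicate, or None."""
    matches = [e for e in events if predicate(e)]
    return max(matches, key=lambda e: e["seq"]) if matches else None


events = [
    {"seq": 10, "kind": "network.response", "status": 200},
    {"seq": 11, "kind": "network.response", "status": 404},
    {"seq": 12, "kind": "assertion", "passed": False},  # hypothetical kind name
]

# Last network error: a response with status >= 400.
last_net_error = last_matching(
    events, lambda e: e["kind"] == "network.response" and e.get("status", 0) >= 400
)
last_assertion = last_matching(events, lambda e: e["kind"] == "assertion")
```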

Green (passed) traces: Debugger tab available but collapsed by default.

Example transcript

Engineer: why did step 7 fail?

Debugger: Step 7 was "Click 'Upgrade to Pro'." The click succeeded — dom.mutation at +1.2s, network.request to POST /api/checkout/upgrade returned 200. But navigation to /billing?upgraded=true never fired. The Stripe Clone dispatched customer.subscription.created at +2.4s, yet your webhook returned 404. Config at playwright.config.ts:42 points to /api/stripe/webhook; the handler lives at /api/webhooks/stripe (renamed in PR #4521). Cite: (seq:847) (seq:852).

Engineer: show me what happens with the correct URL

Debugger: I can Layer 2 replay from step 7 with the Stripe Clone reconfigured. ~25s. Proceed?

Tool surface

Implemented as in-process MCP tools via Claude Agent SDK:

Tool	Purpose	Cost class
`read_artifact(kind, step_id?)`	NDJSON slice by event kind	Cheap
`query_trace(filter)`	DSL: `kind=network.response AND status>=400`	Cheap
`diff_runs(other_id, dimension)`	Network / DOM / clone / assertion diff	Medium
`query_clone_state(clone, step, path)`	JMESPath into clone snapshot	Cheap
`run_layer2_replay(from_step, edits)`	Side-effecting — requires confirmation	Expensive
`summarize_mender_analysis()`	Read existing Mender row	Cheap

Deliberately excluded: open_pull_request, edit_file — Mender owns write actions.
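The `query_trace` example filter (`kind=network.response AND status>=400`) suggests a small conjunctive DSL. The sketch below evaluates that shape of expression against event dicts. The grammar (AND-only, six comparison operators, integer or string values) is an assumption inferred from the single example, not the documented DSL.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
       "=": operator.eq, ">": operator.gt, "<": operator.lt}
# Two-character operators are listed first so ">=" is not read as ">".
CLAUSE = re.compile(r"^\s*([\w.]+)\s*(>=|<=|!=|=|>|<)\s*(.+?)\s*$")


def parse_filter(expr: str):
    """Parse 'field OP value AND ...' into (field, op_func, value) triples."""
    clauses = []
    for part in re.split(r"\s+AND\s+", expr):
        m = CLAUSE.match(part)
        if not m:
            raise ValueError(f"bad clause: {part!r}")
        field, op, raw = m.groups()
        value = int(raw) if raw.lstrip("-").isdigit() else raw
        clauses.append((field, OPS[op], value))
    return clauses


def matches(event: dict, clauses) -> bool:
    """True only if the event satisfies every clause."""
    for field, op, value in clauses:
        if field not in event:
            return False
        try:
            if not op(event[field], value):
                return False
        except TypeError:  # e.g. ordering a str against an int
            return False
    return True


events = [
    {"seq": 1, "kind": "network.response", "status": 200},
    {"seq": 2, "kind": "network.response", "status": 404},
    {"seq": 3, "kind": "console.error"},
]
clauses = parse_filter("kind=network.response AND status>=400")
hits = [e["seq"] for e in events if matches(e, clauses)]
```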

Citations

Every Debugger claim must cite (seq:N) event references or (file:line) source locations. Citation chips are keyboard-focusable; click jumps scrubber to that event.

Validated by prompt unit tests in CI.
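A citation check of this kind can be approximated with two regular expressions. The sketch below extracts `(seq:N)` and `(file:line)` references and flags text that has neither. The patterns and function names are illustrative assumptions, not the actual CI tests.

```python
import re

# (seq:N) event references and (file:line) source locations.
SEQ_RE = re.compile(r"\(seq:(\d+)\)")
FILE_RE = re.compile(r"\(([\w./-]+):(\d+)\)")


def extract_citations(text: str):
    """Return ([seq numbers], [(path, line)]) found in a Debugger message."""
    seqs = [int(n) for n in SEQ_RE.findall(text)]
    # FILE_RE also matches "(seq:N)", so drop those from the file list.
    files = [(path, int(line)) for path, line in FILE_RE.findall(text) if path != "seq"]
    return seqs, files


def has_citation(text: str) -> bool:
    seqs, files = extract_citations(text)
    return bool(seqs or files)


message = "Webhook returned 404 (seq:847) (seq:852); see (playwright.config.ts:42)."
seqs, files = extract_citations(message)
```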

Model and budget

Setting	Default
Model	Claude Opus 4.7
Cheap mode	Claude Haiku 4.5 (short queries)
Org budget	Configurable cents/month in Settings → Debugger
Hard cutoff	402 at 100% cap — chat input disabled
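The hard cutoff is a threshold check on spend against the cap. Below is a minimal sketch, assuming spend and cap are tracked in integer cents. The function name and the 200 success code are illustrative.

```python
def chat_status(spent_cents: int, cap_cents: int) -> int:
    """Return the HTTP status a chat request would get under the hard cutoff."""
    if spent_cents >= cap_cents:
        return 402  # at or above 100% of the cap: chat input is disabled
    return 200      # assumed success code for illustration
```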

BYOK path: Settings → Debugger → Anthropic API key or Bedrock/Vertex endpoint.

SSE streams responses: POST /v1/chats/{id}/messages → type=done on completion.
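A client consumes this stream by reading SSE `data:` lines until a message with type=done arrives. The sketch below parses an already-decoded line stream. It assumes each `data:` payload is a JSON object with a `type` field. The `delta` type and `text` field are hypothetical placeholders for whatever chunk shape the API actually emits.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event (a blank line ends an event)."""
    buf: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buf:
                yield "\n".join(buf)
                buf = []
        elif line.startswith("data:"):
            value = line[5:]
            # SSE strips at most one leading space after the colon.
            buf.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored here.


def collect_reply(lines: Iterable[str]) -> str:
    """Concatenate text chunks until a message with type == "done" arrives."""
    chunks = []
    for payload in iter_sse_data(lines):
        msg = json.loads(payload)
        if msg.get("type") == "done":
            break
        chunks.append(msg.get("text", ""))
    return "".join(chunks)


stream = [
    'data: {"type": "delta", "text": "Step 7 "}\n', "\n",
    'data: {"type": "delta", "text": "failed."}\n', "\n",
    'data: {"type": "done"}\n', "\n",
]
reply = collect_reply(stream)
```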

MCP for coding agents

Trace exposes capabilities via MCP at https://mcp.molar.cloud (OAuth 2.1 + PKCE).

Tools

MCP tool	REST equivalent	Purpose
`list_traces(filters)`	`GET /v1/traces`	Discover runs
`get_trace(id)`	`GET /v1/traces/{id}`	Metadata + signed URLs
`read_trace_slice(id, kind, step)`	`GET /v1/traces/{id}/events`	Event slice
`diff_traces(a, b, dimension)`	Debugger `diff_runs`	Semantic diff
`trigger_replay(id, from_step, edits)`	`POST /v1/traces/{id}/replay`	Layer 2 (confirmation required)

Scopes

Scope	Allows
`mcp:trace:read`	list, get, read events
`mcp:debug`	Debugger context attachment
`mcp:run`	Layer 2 replay trigger
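The scope table implies a per-tool authorization check. In the sketch below, the mapping covers only the tools whose scope the table states directly. `diff_traces` is left unmapped because its required scope is not given here, and denying unmapped tools is an assumption.

```python
# Tool -> required scope, for the tools the scope table covers explicitly.
REQUIRED_SCOPE = {
    "list_traces": "mcp:trace:read",
    "get_trace": "mcp:trace:read",
    "read_trace_slice": "mcp:trace:read",
    "trigger_replay": "mcp:run",
}


def is_allowed(tool: str, granted: set[str]) -> bool:
    """Allow a call only if the tool is mapped and its scope was granted."""
    required = REQUIRED_SCOPE.get(tool)
    return required is not None and required in granted


read_only = {"mcp:trace:read"}
can_read = is_allowed("get_trace", read_only)
can_replay = is_allowed("trigger_replay", read_only)
```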

Cursor / Claude Code setup

{
  "mcpServers": {
    "molar": {
      "url": "https://mcp.molar.cloud",
      "auth": "oauth"
    }
  }
}

Agent workflow example

CI fails → check output includes short_id
Agent calls get_trace("xY9zQ2mNp") → failure summary + signed URLs
Agent asks Debugger via molar_debug_run — "why did step 7 fail?"
Engineer patches locally
Agent calls trigger_replay("xY9zQ2mNp", from_step=7, edits={…})
Human reviews diff → Promote to Mender

Solo dev local path: PAT on standalone Cartographer MCP (development only).

Comparison matrix

Capability	Layer 1	Layer 2	Debugger
Watch what happened	✓	—	Explains
Scrub DOM/video	✓	—	Cites events
Edit and re-run	—	✓	Triggers with confirm
Clone state diff	View only	✓ side-by-side	`query_clone_state`
Open fix PR	—	Promote button	Never (Mender)
IDE access	Export NDJSON	MCP trigger	MCP chat

Related pages

Page	When
Trace detail	Viewer layout and ribbons
Mender hand-off	After Layer 2 confirms fix
Integrations	MCP OAuth, GitHub App for source reads
API reference	REST + MCP full spec
Workers & ingestion	Postprocess and Layer 2 workers
