Troubleshooting & FAQ

Common Trace issues — missing traces, cold tier replay, Debugger budget, encryption key errors, and Playwright comparison FAQ.

Frequently asked questions

What does Trace add beyond a Playwright trace zip?

Trace captures continuous forensic ribbons — video, DOM, network, console, and agent thoughts on one playhead — with cloud aggregation, signed permalinks, Debugger AI, Layer 2 replay with Clone state restore, and MCP for coding agents. Every run still bundles trace.playwright.zip for compatibility with npx playwright show-trace.

What is Layer 2 replay?

Layer 2 lets you step backward through a failure, apply a test or scenario fix, and validate from that step — restoring Clone state and browser storage when needed — without re-running the entire suite. A child trace is captured and diffed side-by-side with the source.

How does Trace connect to Mender?

When Guard confirms a regression, the failing trace opens in Trace for forensic replay. Debugger cites events on the five-ribbon timeline; when you confirm the fix (often after Layer 2), Promote to fix PR hands the diff to Mender, which drafts a pull request with regression tests. A human always approves the merge.

Can Trace help with CI failure triage?

Yes. Trace ingests Playwright failures from CI (native capture or zip ingest), groups evidence on a navigable timeline, and exposes trace data to agents via MCP. Failure signature clusters are available via API filter (?signature=).
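As a rough illustration of the signature filter, the sketch below builds a list URL for one cluster. The host name and the /v1/traces list path are assumptions made for the example (only the ?signature= parameter comes from the text above), and the signature value is a made-up placeholder.

```python
from urllib.parse import urlencode

# Assumption: hypothetical API host; substitute your own deployment's base URL.
BASE_URL = "https://api.example.invalid"


def traces_by_signature_url(signature: str, base_url: str = BASE_URL) -> str:
    """Build a trace list URL filtered to a single failure signature.

    Assumption: traces are listed at /v1/traces; check the API reference.
    """
    if not signature:
        raise ValueError("signature must be non-empty")
    # urlencode escapes any characters that are unsafe in a query string.
    return f"{base_url}/v1/traces?{urlencode({'signature': signature})}"


if __name__ == "__main__":
    # "a1b2c3" is a placeholder, not a real failure_signature value.
    print(traces_by_signature_url("a1b2c3"))
```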

Does Trace record real user sessions?

No. Trace captures synthetic test runs (Guard, Cartographer agent, CI). For production-shaped monitoring, use Guard scheduled checks — still synthetic, not RUM.

What is a failure signature cluster?

A stable hash (failure_signature) derived from the failing assertion, the dominant network error, and the normalized stack. Traces that share a signature within 24 hours appear together in the cluster strip, for example webhook-404 × 7.
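To make the idea of a stable signature concrete, here is a minimal sketch of how such a hash could be derived. It is not Trace's actual algorithm: the normalization rule, the SHA-256 choice, the separator, and the 16-character truncation are all assumptions for illustration.

```python
import hashlib
import re


def normalize_stack(stack: str) -> str:
    """Strip ':<digits>' suffixes (line and column numbers, and incidentally
    ports) so stacks that differ only in positions compare equal."""
    lines = [re.sub(r":\d+(:\d+)?", "", line.strip()) for line in stack.splitlines()]
    return "\n".join(line for line in lines if line)


def failure_signature(assertion: str, network_error: str, stack: str) -> str:
    """Hash the three inputs named in the text into one short, stable ID.

    Assumption: SHA-256 truncated to 16 hex characters; the real format may differ.
    """
    parts = [assertion.strip(), network_error.strip(), normalize_stack(stack)]
    # A unit separator keeps the field boundaries unambiguous.
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
```

Two failures with the same assertion and network error, whose stacks differ only in line numbers, hash to the same value under this scheme and would land in the same cluster.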

How secure are share links?

Share links use unguessable 9-character shortId tokens (~51 bits). Default visibility is org-only; public-read is opt-in and carries an expiry. Treat forwarded links like capability secrets; the default masking reduces PII exposure.

What does pinned mean?

Pinned traces skip storage tier-down (hot → warm → cold). Pin traces manually with F on app.molar.it/trace. Unpinned traces follow org retention policy.


Troubleshooting

Server won't start: TRACE_ENCRYPTION_KEY

Symptom:

TRACE_ENCRYPTION_KEY is required and must be stable

Fix:

export TRACE_ENCRYPTION_KEY=$(openssl rand -hex 32)
npm run dev

Key must be 64 hex characters (32 bytes). Docker/Helm: mount secret into pod env.
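A quick pre-flight check can catch a malformed key before the server starts. The sketch below validates the 64-hex-character rule stated above and generates a key equivalent to `openssl rand -hex 32`. The function names are hypothetical, and accepting uppercase hex is an assumption; the server may be stricter.

```python
import re
import secrets

# 64 hex characters encode exactly 32 bytes.
_HEX_64 = re.compile(r"[0-9a-fA-F]{64}")


def is_valid_trace_key(value: str) -> bool:
    """Return True if value is exactly 64 hex characters (no whitespace)."""
    return _HEX_64.fullmatch(value) is not None


def generate_trace_key() -> str:
    """Generate 32 random bytes as lowercase hex, like `openssl rand -hex 32`."""
    return secrets.token_hex(32)


if __name__ == "__main__":
    key = generate_trace_key()
    print(len(key), is_valid_trace_key(key))
```

Generate the key once and store it in your secret manager. Regenerating it on every start would violate the "must be stable" requirement in the error message.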


Trace missing after Guard run

Note:

If a Guard check did not produce a trace row, upload via the ingest API or Cartographer capture.

Check               Action
Run still running?  Wait for run.end + postprocess (~30s)
Worker health       Help page → Hatchet workers online
Wrong org?          Verify org switcher matches Guard project org
Internal token      Self-hosted only: internal_service_token matches ingest webhook

Trace stuck in queued or running

  • queued: Ingest was accepted but capture never started. Check the Guard runner logs.
  • running (for hours): Zombie run. Cancel the source job; an admin abort endpoint is planned.
  • Postprocess backlog: Check the Redis queue depth on the Help page.

Layer 2 replay returns 409

Cause: The source trace is in the cold tier, or in the warm tier without full blobs.

Fix:

POST /v1/traces/{id}/restore

Wait for eta_seconds, refresh the trace detail until tier=hot, then retry the replay.
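The restore-then-retry flow can be scripted. The sketch below is one way to do it, with the HTTP calls injected as plain callables so it stays independent of any client library. It assumes that the restore response carries eta_seconds and that the detail response carries tier, as the text above suggests; the 5-second poll interval and the function names are hypothetical.

```python
import time
from typing import Callable


def wait_until_hot(
    trace_id: str,
    request_restore: Callable[[str], dict],  # stands in for POST /v1/traces/{id}/restore
    fetch_trace: Callable[[str], dict],      # stands in for GET /v1/traces/{id}
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 20,
) -> bool:
    """Request a restore, then poll the trace detail until tier == "hot".

    Returns True once the trace is hot, False if max_polls is exhausted.
    """
    restore = request_restore(trace_id)
    # Wait out the server's estimate before the first poll.
    sleep(float(restore.get("eta_seconds", 0)))
    for _ in range(max_polls):
        if fetch_trace(trace_id).get("tier") == "hot":
            return True
        sleep(5.0)  # assumption: a fixed 5-second poll interval
    return False
```

Once the function returns True, retry the Layer 2 replay. If it returns False, the restore is taking longer than expected and is worth raising with support.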


DOM player empty or "0 mutations"

Cause                   Fix
Postprocess incomplete  Wait; check rrweb-events.ndjson.zst in Artifacts
Playwright-only ingest  Re-run with native @molar/trace-capture
Privacy block selector  Review scenario privacy; the page may be fully blocked
Cold tier               Restore to hot

Debugger returns budget exceeded

Org monthly Debugger cap reached (Settings → Debugger → Budget).

  • Enable cheap mode (Haiku-class model)
  • Raise cap (Admin)
  • BYO Anthropic key on Team+ plans
  • Wait for monthly reset

429 on chat endpoint includes budget details.


Blob fetch 403

Caller org does not own a trace referencing that SHA-256. Blobs are global dedup — authorization is always via trace ownership, never hash alone.
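The rule is easier to see in code. The sketch below models it with hypothetical in-memory records: a blob is readable only if some trace owned by the caller's org references its hash. It illustrates the policy only and is not the service's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class TraceRecord:
    """Hypothetical record: which org owns a trace and which blobs it references."""
    org_id: str
    blob_hashes: FrozenSet[str]


def can_fetch_blob(caller_org: str, sha256: str, traces: Iterable[TraceRecord]) -> bool:
    """Allow the fetch only via ownership of a trace that references the blob.

    Knowing the hash alone is never sufficient, because blobs are
    deduplicated across orgs.
    """
    return any(t.org_id == caller_org and sha256 in t.blob_hashes for t in traces)
```

Under this model, a caller who learns a hash from another org's trace is still refused, which corresponds to the 403 described above.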


API 400 on list filters

Invalid status, tier, or kind values return structured validation errors. Use documented enums only — see API reference.

Example: status=flaky returns 400 (use status=failed).
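Validating filters client-side avoids a round trip that would only return a 400. The sketch below checks parameters against allow-lists. The tier values are the ones named on this page, while the status set is deliberately partial (only values this page mentions) and should be replaced with the enums from the API reference; the kind enum is omitted because this page does not list its values.

```python
from typing import Dict, Set

# Assumption: illustrative allow-lists only; the API reference is authoritative.
ALLOWED_FILTERS: Dict[str, Set[str]] = {
    "tier": {"hot", "warm", "cold"},            # tiers named on this page
    "status": {"queued", "running", "failed"},  # partial: values seen on this page
}


def invalid_filters(
    params: Dict[str, str],
    allowed: Dict[str, Set[str]] = ALLOWED_FILTERS,
) -> Dict[str, str]:
    """Return {param: value} for every value outside its allow-list.

    Parameters without an allow-list are passed through unchecked.
    """
    return {k: v for k, v in params.items() if k in allowed and v not in allowed[k]}
```

An empty result means the request is safe to send as far as these lists know; a non-empty one names each offending parameter.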


Share link no longer works

  • Link past share_expires_at
  • Trace deleted or archived
  • Visibility flipped to org-only (public URL revoked)
  • Typo in shortId

Create a new share from trace detail.


rrweb iframe blank (CSP)

The parent page's Content-Security-Policy may block iframe scripts. The Trace embed at /r/{shortId}/embed ships compatible CSP headers, but the host's CSP may override them on custom domains.


Clones panel shows "No state"

  • Scenario did not register Clones for this run
  • Capture ran before Clone admin endpoints were ready
  • Step boundary before first Clone interaction

Verify Clones integration in scenario .molar.md and Integrations.


AGENT ribbon empty

Run was not Cartographer-sourced — plain Playwright tests have no agent.thought events. Expected behavior.


Presigned URL expired

signed_urls TTL default 3600s. Refetch GET /v1/traces/{id} to renew before downloading large artifacts.
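For long downloads it helps to decide up front whether the URLs are still fresh. The sketch below tracks when the detail response was fetched and flags a refetch shortly before the default 3600-second TTL runs out. The 300-second safety margin and the function name are assumptions, as is the idea that the TTL counts from the moment of the fetch.

```python
import time
from typing import Optional

SIGNED_URL_TTL_SECONDS = 3600  # default TTL stated above


def needs_refetch(
    fetched_at: float,
    now: Optional[float] = None,
    ttl: float = SIGNED_URL_TTL_SECONDS,
    safety_margin: float = 300.0,  # assumption: refresh 5 minutes early
) -> bool:
    """Return True when signed URLs fetched at `fetched_at` (epoch seconds)
    are within `safety_margin` of expiry, or already past it."""
    current = time.time() if now is None else now
    return current - fetched_at >= ttl - safety_margin
```

Call it before each large artifact download; when it returns True, refetch GET /v1/traces/{id} and use the renewed URLs.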


Standalone vs production behavior differs

Standalone Trace uses a local JSON store and simplified workers. Molar Cloud adds SSO, org isolation, Hatchet postprocess, and multi-region storage. See Quick start.


Performance

Symptom           Tip
Slow detail load  Large trace: check total_size_bytes; warm tier may omit video
Scrubber lag      >50k events: filter Events tab by kind
SSE disconnects   Network proxy timeout: client reconnects with banner

Performance budgets are enforced in CI for the standalone Trace package.


Getting help

Channel            Use for
Help page (/help)  Worker status, integration health
support@molar.it   Account, billing, security reports
Support            What to include in a ticket

Include: trace short_id, org slug, approximate created_at, and whether Guard or manual ingest.


Error code index

Platform-wide HTTP and quota errors: Error codes.

Trace-specific:

Code                      Trace context
trace_not_hot             Layer 2 on cold/warm trace
debugger_budget_exceeded  Monthly LLM cap
blob_not_in_manifest      SHA not referenced by trace
share_expired             Public link past TTL