Trace List

Navigate, filter, and triage traces — saved views, failure signature clusters, columns, and the on-call workflow.

The Traces list (/traces) is Trace's primary working surface. Every ingested run appears as a searchable row with status, failure signature, source metadata, and retention tier.

Open it: trace.molar.it/traces

Note:

The trace list includes a cursor-paginated table, status filter, pin indicator, and failure signature column. For cluster-style triage, filter by signature via the API (?signature=).


Page layout

┌─────────────────────────────────────────────────────────────────┐
│ Header: workspace name · Ingest · Compare two                   │
├─────────────────────────────────────────────────────────────────┤
│ Saved views: All · Failed·24h · Pinned · Running · …           │
├─────────────────────────────────────────────────────────────────┤
│ Search bar · Scenario filter · More filters · Export CSV        │
├─────────────────────────────────────────────────────────────────┤
│ [Failure signature clusters strip — when Failed·24h view active]│
├─────────────────────────────────────────────────────────────────┤
│ Trace table (paginated, cursor-based)                           │
└─────────────────────────────────────────────────────────────────┘

Click any row to open Trace detail. Pin/unpin from the first column without opening the trace.


Table columns

| Column | Source field | Notes |
| --- | --- | --- |
| Pin | traces.pinned | Toggle blocks tier-down; failures auto-pin by policy |
| Status | traces.status + tier | Pills: queued, running, passed, failed, error, aborted |
| Scenario | scenarios.title | Links to scenario detail; primary grouping dimension |
| Failure signature | traces.failure_signature | Cluster hash; shows ×N when multiple traces share sig |
| Source | guard_runs.run_type | PR #, branch, scheduler, manual, shadow_diff |
| Duration | traces.duration_ms | Wall-clock run time |
| Code | traces.code_sha | Git commit / build hash (7-char display) |
| Trace ID | traces.short_id | 9-char base62 permalink slug |
| When | traces.created_at | Relative time ("22m ago") |

Hover a row for storage tooltip: NDJSON size, blob count, dedup ratio, retention tier.
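The display conventions in the column table (a 7-character hash for Code, a relative label for When) can be sketched as follows. This is a minimal illustration, not Trace's actual rendering code: the function names and the bucket boundaries for seconds, minutes, hours, and days are assumptions.

```python
from datetime import datetime, timedelta


def short_sha(code_sha: str) -> str:
    """Return the 7-character display form of a commit or build hash."""
    return code_sha[:7]


def relative_time(created_at: datetime, now: datetime) -> str:
    """Render a timestamp as a coarse relative label such as '22m ago'.

    Assumption: the s/m/h/d buckets below are illustrative, not documented.
    """
    seconds = max(0, int((now - created_at).total_seconds()))
    if seconds < 60:
        return f"{seconds}s ago"
    if seconds < 3600:
        return f"{seconds // 60}m ago"
    if seconds < 86400:
        return f"{seconds // 3600}h ago"
    return f"{seconds // 86400}d ago"


# Example: a trace created 22 minutes before "now".
example_now = datetime(2024, 1, 1, 12, 0)
example_label = relative_time(example_now - timedelta(minutes=22), example_now)
print(example_label)  # 22m ago
```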

Status pills

| Status | Meaning | Typical next action |
| --- | --- | --- |
| running | Live capture in progress | Watch SSE stream on detail page |
| failed | Assertion or step failure | Open detail → Debugger |
| passed | Clean run | Archive or compare against failure |
| queued | Ingest accepted, postprocess pending | Wait ~2s |
| error | Worker/capture fault | Check Workers & ingestion |
| aborted | User or CI cancel | Review partial trace |

Tier badge (hot, warm, cold) appears alongside status. Cold traces show a Restore action before Layer 2 replay.


Saved views

Built-in saved views appear as tabs above the filter bar:

| View ID | Label | Filter logic |
| --- | --- | --- |
| all | All | No status filter; entire org history (paginated) |
| failed24 | Failed · 24h | status=failed AND created_at > now()-24h |
| pinned | Pinned | pinned=true |
| running | Running | status=running (live captures) |
| needs-l2 | Awaiting L2 replay | Failed traces with no child replay yet |
| this-pr | PR # | Traces from the current PR context (dynamic) |

The nav badge on Traces reflects the Failed · 24h count — your on-call signal.
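The filter logic of the built-in views can be read as predicates over a trace row. The sketch below is illustrative only: the row shape is assumed, `has_l2_child` is a hypothetical field standing in for "has a child replay", and the dynamic this-pr view is left out because its context is not described here.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List

Trace = dict  # assumed row shape: status, created_at, pinned, has_l2_child


def built_in_views(now: datetime) -> Dict[str, Callable[[Trace], bool]]:
    """Map each built-in view ID to a predicate over a trace row."""
    cutoff = now - timedelta(hours=24)
    return {
        "all": lambda t: True,
        "failed24": lambda t: t["status"] == "failed" and t["created_at"] > cutoff,
        "pinned": lambda t: bool(t["pinned"]),
        "running": lambda t: t["status"] == "running",
        # Assumption: has_l2_child is a hypothetical flag for a child replay.
        "needs-l2": lambda t: t["status"] == "failed"
        and not t.get("has_l2_child", False),
    }


def apply_view(view_id: str, traces: List[Trace], now: datetime) -> List[Trace]:
    """Return the traces that the given built-in view would show."""
    predicate = built_in_views(now)[view_id]
    return [t for t in traces if predicate(t)]
```

Counting the rows returned for `failed24` gives the same number the nav badge is described as showing.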

Custom saved views

Click Save filter to persist the current combination of:

  • Search query
  • Scenario dropdown
  • Status / tier / source filters from More filters
  • Date range

Saved views are per-user and sync across sessions. Org admins can publish team views visible to all members (Settings → Views).


Failure signature clusters

When the Failed · 24h view is active, a cluster strip appears above the table. Each card shows:

| Field | Example | Purpose |
| --- | --- | --- |
| Signature | webhook-404 | Stable hash of failure shape |
| Count | ×7 | Traces sharing this signature in the window |
| Last seen | 22m ago | Recency for prioritization |

Click a cluster card to filter the table by that signature. Signatures group failures that look the same even across different scenarios — e.g. a misconfigured webhook URL affecting checkout and upgrade flows.
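The cluster cards amount to a group-by on the failure signature, regardless of scenario. The following sketch shows one way to derive the Signature, Count, and Last seen fields from a list of failed traces; the function name, the row shape, and the sort order (largest cluster first) are assumptions, not documented behavior.

```python
from datetime import datetime
from typing import Dict, List


def cluster_failures(traces: List[dict]) -> List[dict]:
    """Group traces by failure_signature into card-like summaries."""
    clusters: Dict[str, dict] = {}
    for t in traces:
        sig = t.get("failure_signature")
        if not sig:
            continue  # traces without a signature do not form a cluster
        card = clusters.setdefault(
            sig, {"signature": sig, "count": 0, "last_seen": t["created_at"]}
        )
        card["count"] += 1
        card["last_seen"] = max(card["last_seen"], t["created_at"])
    # Assumption: cards are ordered by count (descending), then by name.
    return sorted(clusters.values(), key=lambda c: (-c["count"], c["signature"]))


# Two scenarios sharing one signature collapse into a single card.
example_traces = [
    {"failure_signature": "webhook-404", "scenario": "checkout", "created_at": datetime(2024, 1, 1, 11, 0)},
    {"failure_signature": "webhook-404", "scenario": "upgrade", "created_at": datetime(2024, 1, 1, 11, 38)},
    {"failure_signature": "auth-expired-token", "scenario": "login", "created_at": datetime(2024, 1, 1, 9, 0)},
    {"failure_signature": None, "scenario": "login", "created_at": datetime(2024, 1, 1, 8, 0)},
]
cards = cluster_failures(example_traces)
```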

Common signature patterns:

| Signature | Typical root cause |
| --- | --- |
| webhook-404 | Stripe/webhook route mismatch |
| stripe-discount-inversion | Coupon applied as multiplier not discount |
| sendgrid-rate-limit | Email clone throttling |
| auth-expired-token | Session TTL in test setup |
| deadlock-postgres-rls | RLS policy blocking test user |

Clusters feed Slack alert rules (Settings → Integrations), e.g. alert #qa-alerts when the webhook-404 count is ≥ 5 in 1 hour.
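A threshold rule of this kind (count ≥ 5 within 1 hour) can be thought of as a sliding-window count over failure timestamps for one signature. The sketch below is an assumption about how such a rule might be evaluated, not the product's implementation; `should_alert` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Iterable


def should_alert(
    failure_times: Iterable[datetime],
    now: datetime,
    threshold: int = 5,
    window: timedelta = timedelta(hours=1),
) -> bool:
    """Return True when at least `threshold` failures fall inside the window.

    The window is the closed interval [now - window, now].
    """
    recent = sum(1 for ts in failure_times if now - window <= ts <= now)
    return recent >= threshold
```

For example, five webhook-404 failures in the last hour would trip the rule, while four recent failures plus one from 90 minutes ago would not.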


Search and filters

The search bar matches against:

  • short_id (e.g. xY9zQ2mNp)
  • Full UUID
  • Scenario title and slug
  • Failure signature
  • code_sha

Search uses Postgres tsvector on flattened summary.json.

Scenario filter

Dropdown lists all active scenarios with 30-day trace counts. Filter to one scenario to spot regressions in shape, not just volume.

More filters panel

| Filter | Values |
| --- | --- |
| Status | passed, failed, running, error, aborted |
| Tier | hot, warm, cold |
| Source type | pr, schedule, manual, shadow_diff, cartographer |
| Date range | Last 1h, 24h, 7d, 30d, custom |
| Has Mender attempt | yes / no |
| Has Layer 2 child | yes / no |
| Pinned only | toggle |

Combine filters with saved views. Example: Failed · 24h + scenario checkout-stripe + needs-l2.

Export CSV

Export the current filtered result set (up to 10,000 rows) with all columns plus total_size_bytes, blob_count, and guard_run_id.
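To illustrate the export shape, here is a small sketch that writes a filtered result set to CSV, appends the three extra fields, and stops at the 10,000-row cap. The header names for the table columns are assumptions (the real export may label them differently), and `export_csv` is a hypothetical helper rather than part of any Trace client.

```python
import csv
import io
from itertools import islice
from typing import Iterable, List

MAX_EXPORT_ROWS = 10_000  # cap stated for the export
EXTRA_COLUMNS = ["total_size_bytes", "blob_count", "guard_run_id"]


def export_csv(rows: Iterable[dict], table_columns: List[str]) -> str:
    """Serialize up to MAX_EXPORT_ROWS rows as CSV text.

    Missing fields are written as empty cells; unknown fields are ignored.
    """
    fieldnames = table_columns + EXTRA_COLUMNS
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for row in islice(rows, MAX_EXPORT_ROWS):
        writer.writerow(row)
    return buf.getvalue()
```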


Pagination

The list uses cursor pagination via GET /v1/traces?cursor=…. Default page size: 50 rows.

The footer shows a range such as "Showing 1–50 of 1,240" with Prev / Next controls. Deep links preserve filter state in query params:

/traces?view=failed24&scenario=checkout-stripe&sig=webhook-404
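A deep link like the one above is a path plus URL-encoded filter state. The sketch below builds and parses such links with Python's standard `urllib.parse`; the helper names are hypothetical, and only the `view`, `scenario`, and `sig` parameters shown in the example link are taken from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_traces_link(**filters: str) -> str:
    """Build a /traces deep link from filter values, skipping empty ones."""
    params = [(key, value) for key, value in filters.items() if value]
    query = urlencode(params)
    return "/traces" + (f"?{query}" if query else "")


def parse_traces_link(link: str) -> dict:
    """Recover the filter state from a /traces deep link."""
    return dict(parse_qsl(urlsplit(link).query))


link = build_traces_link(view="failed24", scenario="checkout-stripe", sig="webhook-404")
print(link)  # /traces?view=failed24&scenario=checkout-stripe&sig=webhook-404
```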

Bulk actions

Select multiple rows (checkbox column) to:

| Action | RBAC | Effect |
| --- | --- | --- |
| Pin selected | Member+ | Block tier-down on all selected |
| Unpin selected | Member+ | Allow tier-down per retention policy |
| Archive | Owner | Soft-delete; removes from default views |
| Compare | Member+ | Open /traces/compare with A/B pickers pre-filled |

Pinning is the primary retention control. Failed runs auto-pin by default; bulk unpin before archival cleanup.
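The RBAC column in the bulk actions table can be modeled as a minimum role per action. The sketch below assumes a linear role ordering (viewer < member < admin < owner); only "Member+" and "Owner" come from the table, so the `viewer` and `admin` ranks and the action keys are illustrative assumptions.

```python
# Assumption: roles form a simple linear hierarchy.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2, "owner": 3}

# Minimum role per bulk action, following the RBAC column.
MIN_ROLE = {
    "pin": "member",
    "unpin": "member",
    "archive": "owner",
    "compare": "member",
}


def can_perform(role: str, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`."""
    return ROLE_RANK[role] >= ROLE_RANK[MIN_ROLE[action]]


def allowed_actions(role: str) -> list:
    """List the bulk actions available to a role, in table order."""
    return [action for action in MIN_ROLE if can_perform(role, action)]
```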


Triage workflow

Recommended on-call flow when Guard reports a regression:

1 — Land on Failed · 24h

Open /traces?view=failed24. Scan the cluster strip for spikes — one signature with ×7 beats seven unrelated failures.

2 — Pick the canonical trace

Within a cluster, prefer:

  • Most recent failure on the same PR as the regression
  • Trace with highest Mender confidence (if Mender already classified)
  • Longest duration (more evidence)
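One possible reading of these preferences, as a sketch: if any trace in the cluster belongs to the regressing PR, take the most recent of those; otherwise fall back to the highest Mender confidence, with longer duration as the tie-breaker. The priority order and the `pr_number` and `mender_confidence` field names are assumptions made for illustration.

```python
from datetime import datetime
from typing import List, Optional


def pick_canonical(cluster: List[dict], pr_number: Optional[int] = None) -> dict:
    """Pick one representative trace from a non-empty cluster."""
    on_pr = [
        t for t in cluster
        if pr_number is not None and t.get("pr_number") == pr_number
    ]
    if on_pr:
        # Preference 1: most recent failure on the same PR.
        return max(on_pr, key=lambda t: t["created_at"])
    # Preferences 2 and 3: Mender confidence, then duration.
    return max(
        cluster,
        key=lambda t: (t.get("mender_confidence") or 0.0, t["duration_ms"]),
    )


example_cluster = [
    {"short_id": "t1", "pr_number": 42, "created_at": datetime(2024, 1, 1, 10, 0),
     "mender_confidence": 0.4, "duration_ms": 1200},
    {"short_id": "t2", "pr_number": 42, "created_at": datetime(2024, 1, 1, 11, 0),
     "mender_confidence": None, "duration_ms": 900},
    {"short_id": "t3", "pr_number": None, "created_at": datetime(2024, 1, 1, 11, 30),
     "mender_confidence": 0.9, "duration_ms": 3000},
]
with_pr = pick_canonical(example_cluster, pr_number=42)
without_pr = pick_canonical(example_cluster)
```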

3 — Open and jump to failure

Detail page opens at the failing step. Press N to jump to next failure if multiple exist.

4 — Ask Debugger

Press D or open Debugger tab. First auto-message summarizes the failure. Ask "why did step N fail?" before reading raw events.

5 — Compare against last green

On the Overview tab, the last green link opens the compare view against the most recent passing trace on the same scenario and branch.

6 — Layer 2 or Mender

7 — Pin and share

Pin the canonical trace for the incident. Share a public-read link (/r/{shortId}) in Slack or Linear for stakeholders who lack dashboard access.
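Since the share path is derived from the trace's short ID (a 9-character base62 slug), building it can be sketched as a validation plus a format step. The helper name and the choice to reject malformed IDs are assumptions; only the /r/{shortId} pattern and the slug format come from this page.

```python
import re

# 9 characters from the base62 alphabet (digits, upper- and lowercase letters).
SHORT_ID_RE = re.compile(r"[0-9A-Za-z]{9}")


def public_read_path(short_id: str) -> str:
    """Return the /r/{shortId} path for a well-formed short ID."""
    if not SHORT_ID_RE.fullmatch(short_id):
        raise ValueError(f"not a 9-char base62 short ID: {short_id!r}")
    return f"/r/{short_id}"


share_path = public_read_path("xY9zQ2mNp")
print(share_path)  # /r/xY9zQ2mNp
```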


Empty states

| Condition | Message | Action |
| --- | --- | --- |
| No traces ever | "Nothing here yet" | Ingest or connect Guard |
| Filters too narrow | "Adjust your filters" | Clear filters link |
| Postprocess pending | Row shows queued | Wait ~2s; refresh |

API equivalent

curl "$TRACE_API/v1/traces?status=failed&from=24h&limit=50" \
  -H "Authorization: Bearer $MOLAR_API_KEY"

Response shape: { "items": [TraceSummary…], "next_cursor": "…" }.

MCP: list_traces({ status: "failed", from: "24h" }) — see API reference.
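Given the response shape above, walking the full result set means following `next_cursor` until it runs out. The sketch below keeps the HTTP call abstract: `fetch_page` is a hypothetical callable that takes a cursor (or None for the first page) and returns one decoded response. It also assumes the last page has a missing, null, or empty `next_cursor`, which this page does not state.

```python
from typing import Callable, Iterator, Optional


def iter_traces(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield TraceSummary items across pages by following next_cursor.

    `max_pages` is a safety bound so a misbehaving cursor cannot loop forever.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if not cursor:
            break  # assumption: no cursor means this was the last page
```

In practice `fetch_page` would wrap a request like the curl call above, passing the cursor as the `cursor` query parameter.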


| Page | When |
| --- | --- |
| Trace detail | Open a row: viewer, ribbons, inspector |
| Debugger & replay | After identifying the canonical failure |
| Mender hand-off | Fix confirmed; ship PR |
| Integrations | Slack cluster alerts, GitHub check links |