Shadow-prod diff

Parallel production and Clone runs, semantic diff, enabling shadow_prod, and the review UI.

Shadow-prod diff is Guard's unique capability: run the same scenario against real production and a Clones replica in parallel, then semantically diff the observations.

The alert is not "your app broke" — it is "your model of the third-party world is no longer correct." Stripe shipped a new webhook field. Twilio changed callback format. Clerk added a JWT claim. Your application code may be fine; your Clones (and tests) are stale.

Pitch line: We don't just test your code. We test that your model of the third-party world is still correct.

Dashboard: app.molar.it/dashboard/guard → Shadow-diff review · Product page: guard.molar.it


Why shadow-prod exists

Traditional monitoring answers: did the user journey complete?

Shadow-prod answers: did production behave the same way our isolated test environment expects?

| Failure mode | PR gating catches? | Prod synthetics catch? | Shadow-prod catches? |
|---|---|---|---|
| App regression (broken button) | | | Sometimes |
| Stripe webhook payload shape change | ✗ (Clone still old shape) | Maybe (if assertion checks field) | |
| Twilio MessageStatus format drift | | Maybe | |
| Clerk new required JWT claim | | Maybe | |
| S3 new error response shape | | Maybe | |

This is only possible because Molar owns Clones — stateful fakes for Stripe, Twilio, Email, Clerk, S3 with deterministic IDs and virtual clocks.


How it works

On each scheduled check tick where shadow_prod = true:

                    ┌──► Real prod run (synthetic user, money-flow blocked)
Cron tick ──────────┤
                    └──► Clones replica (deterministic seed per 5-min window)
                              │
                              ▼
                    Compare RunObservation JSONB
                              │
                              ▼
              Divergence? → shadow_diff incident + review UI

Parallel runs

  1. Real prod — synthetic user, full journey against production URL. Money-flow middleware ensures no real charges. Same signals as Production monitoring (X-Synthetic-Source, __MOLAR_SYNTHETIC__, etc.).

  2. Clones replica — same scenario steps against Clones bundle. Seed = sha256(scenarioId || timestamp/300s) — stable within a 5-minute window so paired runs compare apples to apples.
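The seed formula above can be sketched in TypeScript. This is a minimal illustration, not the Guard implementation: it assumes `||` means plain string concatenation and that the timestamp is in Unix seconds, floored into 300-second windows. The name `shadowSeed` is hypothetical.

```typescript
import { createHash } from "node:crypto";

const WINDOW_SECONDS = 300; // 5-minute window, per the formula above

// Hypothetical helper: not part of the molar-guard API.
function shadowSeed(scenarioId: string, unixSeconds: number): string {
  // Every timestamp inside the same 5-minute window maps to the same index.
  const windowIndex = Math.floor(unixSeconds / WINDOW_SECONDS);
  // Assumption: "||" is plain string concatenation of the two parts.
  return createHash("sha256").update(`${scenarioId}${windowIndex}`).digest("hex");
}

const seedEarly = shadowSeed("stripe-subscription-upgrade", 1700000100);
const seedLate = shadowSeed("stripe-subscription-upgrade", 1700000199);
console.log(seedEarly === seedLate); // true: both timestamps fall in the same window
```

Because the seed depends only on the scenario ID and the window index, a prod run and a Clone run started a few seconds apart still share a seed, as long as they do not straddle a window boundary.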

Both runs emit a RunObservation document:

{
  "response_bodies": { "step_3": "semanticHashOfBody", "step_7": "…" },
  "webhook_events_fired": [
    { "provider": "stripe", "event_type": "customer.subscription.updated", "payload_semantic_hash": "abc123" }
  ],
  "dom_state_per_step": [{ "step": 7, "role_tree_hash": "…" }],
  "headers_per_step": ["…"]
}
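For readers consuming this document in code, the shape can be described with TypeScript types. The field names below mirror the JSON example; the element types are inferred from that example rather than taken from a published schema, and `differingSteps` is a hypothetical helper added only to show the types in use.

```typescript
// Field names mirror the JSON example; element types are inferred, not confirmed.
interface WebhookEventObservation {
  provider: string;
  event_type: string;
  payload_semantic_hash: string;
}

interface DomStateObservation {
  step: number;
  role_tree_hash: string;
}

interface RunObservation {
  response_bodies: Record<string, string>; // step key -> semantic hash of the body
  webhook_events_fired: WebhookEventObservation[];
  dom_state_per_step: DomStateObservation[];
  headers_per_step: unknown[]; // element shape is elided in the example
}

// Hypothetical helper: list the step keys whose response-body hashes differ.
function differingSteps(prod: RunObservation, clone: RunObservation): string[] {
  const keys = Array.from(
    new Set(Object.keys(prod.response_bodies).concat(Object.keys(clone.response_bodies)))
  );
  return keys.filter((k) => prod.response_bodies[k] !== clone.response_bodies[k]).sort();
}
```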

Semantic diff engine

The diff compares prod vs Clone observations:

| Layer | What differs |
|---|---|
| Response bodies | Semantic JSON diff per step; ignores volatile fields (id, created, *_at, request_id, ETags, Date headers) |
| Webhook events | Did Stripe in prod fire a field the Clone doesn't model? |
| DOM state | Accessibility role-tree hash per step |
| Headers | Normalized header comparison per step |

Customers can extend the ignore allowlist with additional known-volatile fields.
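One way to picture the response-body layer is to drop the volatile keys, sort what remains, and hash the result. The sketch below does that under stated assumptions: the key list comes from the ignore lists on this page, `*_at` is treated as a suffix match, and the names `normalize` and `semanticHash` are hypothetical. Header-level items (ETags, Date) are out of scope because they do not appear in bodies.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Volatile body keys from this page's ignore lists; "*_at" is handled as a suffix.
const VOLATILE_KEYS = new Set(["id", "created", "updated", "request_id", "trace_id"]);

function isVolatile(key: string, extraIgnored: ReadonlySet<string>): boolean {
  return VOLATILE_KEYS.has(key) || key.endsWith("_at") || extraIgnored.has(key);
}

// Drop volatile keys and sort the rest so key order does not affect the hash.
function normalize(value: Json, extraIgnored: ReadonlySet<string>): Json {
  if (Array.isArray(value)) return value.map((v) => normalize(v, extraIgnored));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value).sort()) {
      if (!isVolatile(key, extraIgnored)) out[key] = normalize(value[key], extraIgnored);
    }
    return out;
  }
  return value;
}

// Hypothetical helper: hash of the normalized body, usable as a per-step fingerprint.
function semanticHash(body: Json, extraIgnored: ReadonlySet<string> = new Set<string>()): string {
  return createHash("sha256").update(JSON.stringify(normalize(body, extraIgnored))).digest("hex");
}
```

Two bodies that differ only in `id` or `created_at` hash identically under this scheme, while a changed `amount` produces a different hash. The `extraIgnored` parameter stands in for the customer-configurable allowlist.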

Alert threshold

Default: significant divergence when any of:

  • 1+ field differs in a webhook payload
  • New field appears in prod webhook not present in Clone
  • DOM role-tree distance > 5%

Threshold is configurable per scheduled check in alert_policy.
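The default rule reads as a simple "any of" check. A minimal sketch, assuming the diff engine has already produced counts and a distance; the `DiffSummary` shape and its field names are illustrative, not Guard's actual data model:

```typescript
// Hypothetical summary of one prod-vs-Clone comparison; field names are illustrative.
interface DiffSummary {
  changedWebhookFields: number; // fields whose values differ in a webhook payload
  newProdWebhookFields: number; // fields present in prod but absent from the Clone
  domRoleTreeDistance: number; // fraction in [0, 1]
}

const DEFAULT_DOM_DISTANCE_THRESHOLD = 0.05; // "> 5%" from the defaults above

function isSignificantDivergence(
  d: DiffSummary,
  domThreshold: number = DEFAULT_DOM_DISTANCE_THRESHOLD
): boolean {
  return (
    d.changedWebhookFields >= 1 ||
    d.newProdWebhookFields >= 1 ||
    d.domRoleTreeDistance > domThreshold
  );
}
```

Note that the DOM comparison is strict: a distance of exactly 5% does not breach the default threshold.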

On breach: open a shadow_diff incident (run_type: shadow_diff in guard_runs).


Enable shadow_prod

Shadow-prod is configured per scheduled check, not globally.

Scenario frontmatter

---
id: stripe-subscription-upgrade
schedule:
  cron: "*/5 * * * *"
  regions: [us-east-1, eu-west-1, ap-south-1]
  shadow_prod: true
---

CLI / API

When creating a schedule:

pnpm molar-guard schedule create stripe-subscription-upgrade \
  --cron "*/5 * * * *" \
  --shadow-prod

Or via REST:

POST /v1/scheduled_checks
Content-Type: application/json

{
  "scenarioId": "…",
  "cronExpr": "*/5 * * * *",
  "regions": ["us-east-1", "eu-west-1"],
  "shadowProd": true,
  "alertPolicy": {
    "shadowDiff": true,
    "slackWebhookUrl": "https://hooks.slack.com/services/…"
  }
}
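The same request can be assembled in TypeScript. This sketch only builds the arguments for `fetch` and sends nothing. The base URL and the bearer-token header are assumptions, since this page does not document the API host or the auth scheme; `buildCreateScheduledCheck` is a hypothetical name.

```typescript
interface ScheduledCheckInput {
  scenarioId: string;
  cronExpr: string;
  regions: string[];
  shadowProd: boolean;
  alertPolicy: { shadowDiff: boolean; slackWebhookUrl?: string };
}

// Build the fetch() arguments without performing any network call.
function buildCreateScheduledCheck(baseUrl: string, apiToken: string, input: ScheduledCheckInput) {
  return {
    url: new URL("/v1/scheduled_checks", baseUrl).toString(),
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`, // assumption: auth scheme is not documented here
      },
      body: JSON.stringify(input),
    },
  };
}
```

A caller would destructure the result and pass it on, for example `const { url, init } = buildCreateScheduledCheck(baseUrl, token, input); await fetch(url, init);`, with `baseUrl` and `token` supplied from your own configuration.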

Dashboard

app.molar.it/dashboard/guard → Monitoring → Schedules → edit check → enable Shadow-prod diff.

When to enable

| Enable shadow_prod | Reason |
|---|---|
| ✓ Scenarios with Stripe/Twilio/Clerk webhooks | Catch vendor API drift early |
| ✓ Billing and auth flows | High cost of silent model mismatch |
| ✗ Static marketing pages | No third-party contract to diff |
| ✗ Scenarios without Clones surfaces | Nothing to compare against |

Requires Clones enabled for the org (molar-guard.config.ts → clones.enabled: true, or a platform Clones connection).


Use cases (real examples)

Stripe webhook schema change

Production Stripe fires customer.subscription.updated with new field subscription.trial_settings.end_behavior. Your Clone still models the old payload. Shadow-prod alerts within one 5-minute tick — before a code deploy, before a prod synthetic assertion might fail.
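Detecting a field that exists in prod but not in the Clone amounts to comparing the sets of field paths in the two payloads. A minimal sketch with illustrative payloads (not real Stripe events); `leafPaths` and `fieldsOnlyInProd` are hypothetical names, and arrays are treated as leaves for brevity:

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Collect the dotted path of every leaf in a JSON payload (arrays count as leaves here).
function leafPaths(value: Json, prefix: string = ""): string[] {
  if (value !== null && typeof value === "object" && !Array.isArray(value)) {
    const paths: string[] = [];
    for (const key of Object.keys(value)) {
      paths.push(...leafPaths(value[key], prefix ? `${prefix}.${key}` : key));
    }
    return paths;
  }
  return [prefix];
}

// Paths that appear in the prod payload but not in the Clone payload.
function fieldsOnlyInProd(prod: Json, clone: Json): string[] {
  const known = new Set(leafPaths(clone));
  return leafPaths(prod).filter((p) => !known.has(p)).sort();
}

// Illustrative payloads only; the field value is a placeholder.
const clonePayload: Json = { subscription: { status: "active" } };
const prodPayload: Json = {
  subscription: { status: "active", trial_settings: { end_behavior: "example" } },
};
console.log(fieldsOnlyInProd(prodPayload, clonePayload));
// ["subscription.trial_settings.end_behavior"]
```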

Twilio callback format

MessageStatus callback adds a field or changes enum values. Prod observation differs from Clone; incident type shadow_diff.

Clerk JWT claims

Clerk adds a required claim to session tokens. Prod auth path returns different JWT shape than Clone.

S3 error responses

AWS changes error XML/JSON shape for a bucket policy edge case. Response body semantic hash diverges.


Review UI

When a shadow diff fires, engineers review side-by-side observations before suppressing or updating Clones.

Route: /shadow-diff/[runId] in app.molar.it/dashboard/guard

What you see

| Panel | Content |
|---|---|
| Prod observation | Response hashes, webhook events, DOM trees per step |
| Clone observation | Same structure from parallel run |
| Diff highlights | Added/removed/changed fields in webhooks; JSON path diffs in bodies |
| Webhook field diff | Stripe/Twilio/Clerk-specific field-level highlights |
| DOM role-tree summary | Distance metric and changed nodes |

Actions

| Action | Effect |
|---|---|
| Approve | Acknowledge expected vendor change; optionally snooze rule |
| Reject | Escalate; the Clone or scenario likely needs an update |
| Snooze | Suppress alerts for duration (maintenance, known rollout) |
| Create incident | Link to operational incident workflow |
| Update ignore list | Add field to semantic diff allowlist (persistent vendor addition) |

The review UX is a governed diff approval flow — you approve expected vendor drift before updating baselines or ignore lists.


Shadow-diff incidents

Shadow diffs create incidents distinct from failure-rate incidents:

| Field | Value |
|---|---|
| Type | shadow_diff |
| guard_runs.run_type | shadow_diff |
| shadow_diff JSONB | Full diff payload on the run row |

Alert routing

Set slackWebhookUrl, pagerDutyRoutingKey, or webhookUrl in the scheduled check's alertPolicy. Shadow-diff incidents use the same channels as consecutive-failure alerts — use a dedicated check or webhook endpoint if you want #vendor-drift separate from outage pages.

Mender interaction

Shadow-diff incidents are usually not classified as product_bug. Mender may classify them as scenario_bug (update the Clone or scenario expectations) or skip the fix PR. Human review in the shadow-diff UI remains the primary workflow.


Observation storage and replay

Each shadow run stores:

  • artifacts_s3_prefix — screenshots, HAR, trace (same as other run types)
  • shadow_diff JSONB — structured diff result
  • Link to paired prod and Clone run IDs

From Runs detail, filter run_type=shadow_diff to audit historical drift.


Configuration reference

Ignore volatile fields (global default)

Ignored in semantic JSON diff unless overridden:

  • id, created, updated, *_at
  • request_id, trace_id
  • ETags, Date headers

Per-check overrides

Suppress known-benign webhook diffs via suppressedWebhookDiffs in alertPolicy:

{
  "suppressedWebhookDiffs": [
    {
      "provider": "stripe",
      "eventType": "customer.subscription.updated",
      "field": "subscription.metadata.internal_note",
      "diffType": "added"
    }
  ]
}
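Conceptually, each suppression rule removes matching entries from the list of webhook diffs before the alert threshold is evaluated. The sketch below assumes exact matching on all four keys; whether Guard supports wildcards or partial matches is not stated here, and `applySuppressions` is a hypothetical name.

```typescript
interface WebhookDiff {
  provider: string;
  eventType: string;
  field: string;
  diffType: string; // "added" in the example above; other values are not listed on this page
}

// A suppression rule carries the same four keys as the JSON example above.
type SuppressionRule = WebhookDiff;

// Assumption: a rule matches only when all four keys are equal.
function matchesRule(diff: WebhookDiff, rule: SuppressionRule): boolean {
  return (
    diff.provider === rule.provider &&
    diff.eventType === rule.eventType &&
    diff.field === rule.field &&
    diff.diffType === rule.diffType
  );
}

// Keep only the diffs that no rule suppresses.
function applySuppressions(diffs: WebhookDiff[], rules: SuppressionRule[]): WebhookDiff[] {
  return diffs.filter((d) => !rules.some((r) => matchesRule(d, r)));
}
```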

See Configuration for the full alertPolicy schema.


Operational playbook

1. Vendor announced API change

  1. Expect shadow-diff alert near rollout
  2. Open review UI → confirm field is expected
  3. Approve + add to ignore list OR update Clones bundle version
  4. Cartographer/regenerate scenarios if assertions need new fields

2. Unexpected diff

  1. Treat as potential prod/Clone skew or undiscovered vendor change
  2. Reject → open incident, notify platform team
  3. Run manual scenario: pnpm molar-guard run <slug> --base-url https://prod…
  4. Compare Clone session in Clones dashboard

3. False positive storm after deploy

  1. Snooze 24h while investigating
  2. Check if deploy changed response shapes intentionally
  3. Update scenario or Clone seed if test data drifted

Relationship to PR gating and prod monitoring

  • PR gating never runs shadow-prod (Clones-only is faster and safer pre-merge)
  • Prod monitoring can run with shadow_prod: false — journey pass/fail only
  • Shadow-prod adds the parallel Clone arm on the same cron

One scenario file; enable shadow on schedules that touch external vendors.

