Mender auto-fix

How Mender triages failures, scores confidence, opens fix PRs, and learns from your accept/reject signals.

Mender is Guard's AI remediation layer. When a scenario fails — on a PR, in production, or during shadow-prod — Mender triages the failure, classifies root cause, and optionally opens a fix PR against your application repository. Mender targets your product code, not just the test file.

Human review is always required. Mender never merges its own PRs.

Product page: guard.molar.it#mender


The Mender loop

failed run
    → triage (classify + confidence score)
    → branch by classification
        flake     → retry 3× with backoff
        env       → alert Molar ops
        scenario_bug → self-heal PR against .molar/scenarios/
        product_bug  → fix-PR agent → sandbox validation → GitHub PR
    → human merge (or reject → RLHF signal)
    → Guard re-runs on fix PR → green check

Triage stage

Every failed run enters triage before any code is written. Misclassifying a flake as product_bug annoys engineers; misclassifying a product bug as flake is dangerous.

Classifier inputs

| Input | Purpose |
| --- | --- |
| Error message + stack trace | Primary signal |
| Last 3 screenshots + 5s pre-failure video | Visual context |
| Network HAR + console log | Request/response failures |
| Clone state diff | Did a webhook fire? DB row change? |
| Last 10 commits to stack files | Recent regression suspect |
| 30-day flake history | Same scenario flaky before? |

Classifier output

{
  "classification": "product_bug",
  "confidence": 0.87,
  "reasoning": "POST /api/subscriptions/upgrade reads req.body.priceId but field renamed in PR #4521",
  "suspected_files": ["app/api/billing/route.ts"],
  "suspected_pr": "https://github.com/org/repo/pull/4521"
}
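
If you consume this payload in your own tooling, it can help to validate it before acting on it. The following is a minimal sketch, not Mender's own code: the `TriageResult` type and `parseTriageResult` helper are hypothetical names, the field set is inferred from the example above, and treating `suspected_pr` as optional is an assumption.

```typescript
// Hypothetical types inferred from the example payload; not an official schema.
type Classification = "flake" | "env" | "scenario_bug" | "product_bug";

interface TriageResult {
  classification: Classification;
  confidence: number; // documented as a 0..1 score
  reasoning: string;
  suspected_files: string[];
  suspected_pr?: string; // assumed optional
}

const CLASSIFICATIONS: readonly string[] = ["flake", "env", "scenario_bug", "product_bug"];

// Parse a JSON string and reject anything that does not match the expected shape.
function parseTriageResult(raw: string): TriageResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("triage output must be a JSON object");
  }
  const d = data as Record<string, unknown>;
  if (typeof d.classification !== "string" || !CLASSIFICATIONS.includes(d.classification)) {
    throw new Error("unknown classification");
  }
  if (typeof d.confidence !== "number" || !(d.confidence >= 0 && d.confidence <= 1)) {
    throw new Error("confidence must be a number in 0..1");
  }
  if (typeof d.reasoning !== "string") {
    throw new Error("reasoning must be a string");
  }
  const files = d.suspected_files;
  if (!Array.isArray(files) || !files.every((f) => typeof f === "string")) {
    throw new Error("suspected_files must be an array of strings");
  }
  if (d.suspected_pr !== undefined && typeof d.suspected_pr !== "string") {
    throw new Error("suspected_pr must be a string when present");
  }
  const result: TriageResult = {
    classification: d.classification as Classification,
    confidence: d.confidence,
    reasoning: d.reasoning,
    suspected_files: files as string[],
  };
  if (typeof d.suspected_pr === "string") {
    result.suspected_pr = d.suspected_pr;
  }
  return result;
}
```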

Actions by classification

| Classification | Action |
| --- | --- |
| flake | Re-run 3× (5s, 15s, 45s backoff). If 2/3 pass → mark flaky, no alert |
| env | Re-run on different worker. Still failing → alert Molar on-call, not customer |
| scenario_bug | Self-heal locators → PR against .molar/scenarios/ when locator drift is detected |
| product_bug | Hand off to fix-PR agent |
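
The branching above can be pictured as a small dispatch function. This is an illustrative sketch only: the `Action` shapes and the names `actionFor` and `isFlaky` are hypothetical, and reading "2/3 pass" as "at least two of the three re-runs pass" is an assumption.

```typescript
type Classification = "flake" | "env" | "scenario_bug" | "product_bug";

// Hypothetical action descriptors, one per row of the table above.
type Action =
  | { kind: "retry"; backoffMs: number[] }
  | { kind: "rerun_on_other_worker"; onFailure: "alert_molar_on_call" }
  | { kind: "self_heal_pr"; targetDir: string }
  | { kind: "fix_pr_agent" };

// Map a triage classification to the next step in the loop.
function actionFor(classification: Classification): Action {
  switch (classification) {
    case "flake":
      return { kind: "retry", backoffMs: [5000, 15000, 45000] };
    case "env":
      return { kind: "rerun_on_other_worker", onFailure: "alert_molar_on_call" };
    case "scenario_bug":
      return { kind: "self_heal_pr", targetDir: ".molar/scenarios/" };
    case "product_bug":
      return { kind: "fix_pr_agent" };
  }
}

// Assumption: a scenario is marked flaky when at least two re-runs pass.
function isFlaky(rerunPassed: boolean[]): boolean {
  return rerunPassed.filter(Boolean).length >= 2;
}
```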

Confidence scores

Confidence is a 0..1 score stored on every mender_attempts row. Use it in the dashboard Mender inbox to prioritize review:

| Confidence | Suggested handling |
| --- | --- |
| ≥ 0.85 | High: review patch quickly |
| 0.60 – 0.84 | Medium: verify triage reasoning |
| < 0.60 | Low: likely needs human investigation first |

In suggestive mode, low-confidence fixes appear as comments or dashboard suggestions only — not auto-PRs.
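
For scripts that sort or filter attempts outside the dashboard, the bands above reduce to two thresholds. A minimal sketch; the function name `confidenceBand` is hypothetical, and scores between 0.84 and 0.85 are assumed to fall in the medium band.

```typescript
type ConfidenceBand = "high" | "medium" | "low";

// Bucket a 0..1 confidence score into the review bands described above.
function confidenceBand(score: number): ConfidenceBand {
  if (!(score >= 0 && score <= 1)) {
    throw new RangeError("confidence must be within 0..1");
  }
  if (score >= 0.85) return "high";
  if (score >= 0.6) return "medium";
  return "low";
}
```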


Fix-PR stage (product bugs)

Step 1 — Context gathering

The agent assembles:

  • Failed scenario file (verbatim)
  • Assertion + stack + screenshots + DOM diff vs last green run
  • HAR of failed run
  • Suspected file contents (via GitHub API)
  • CONTRIBUTING.md, .cursorrules, AGENTS.md if present
  • Repository map (tree-sitter symbol index)

Step 2 — Patch generation

  • Agent loop: up to 20 LLM calls per attempt (hard cap against token snowball)
  • Tools: read_file, grep, apply_diff, run_scenario_in_sandbox, get_repo_map
  • Patch format: unified diff (aider-style applier)
  • Validation: Run failing scenario in Docker sandbox with Clones — not prod, not staging. Max 3 outer iterations.
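
The two caps in this step can be pictured as nested budgets. The sketch below is an assumption-heavy illustration, not the agent's real implementation: it treats the 20-call cap as shared across the whole attempt, models each LLM call as a synchronous `llmStep` callback that returns a patch or `null`, and uses the hypothetical names `runFixAttempt`, `AttemptBudget`, and `AttemptOutcome`.

```typescript
interface AttemptBudget {
  maxLlmCalls: number; // hard cap per attempt
  maxOuterIterations: number; // generate -> validate rounds
}

const DEFAULT_BUDGET: AttemptBudget = { maxLlmCalls: 20, maxOuterIterations: 3 };

type AttemptOutcome =
  | { status: "validated"; patch: string; llmCalls: number; iterations: number }
  | { status: "budget_exhausted" | "validation_failed"; llmCalls: number; iterations: number };

// llmStep: one LLM call; returns a candidate patch once the agent has one, else null.
// validateInSandbox: runs the failing scenario against the patch (sandbox only).
function runFixAttempt(
  llmStep: () => string | null,
  validateInSandbox: (patch: string) => boolean,
  budget: AttemptBudget = DEFAULT_BUDGET,
): AttemptOutcome {
  let llmCalls = 0;
  for (let iteration = 1; iteration <= budget.maxOuterIterations; iteration++) {
    let patch: string | null = null;
    // Keep calling the model until it yields a patch or the call budget runs out.
    while (patch === null && llmCalls < budget.maxLlmCalls) {
      llmCalls++;
      patch = llmStep();
    }
    if (patch === null) {
      return { status: "budget_exhausted", llmCalls, iterations: iteration };
    }
    if (validateInSandbox(patch)) {
      return { status: "validated", patch, llmCalls, iterations: iteration };
    }
  }
  return { status: "validation_failed", llmCalls, iterations: budget.maxOuterIterations };
}
```

A real agent loop would be asynchronous; the synchronous callbacks here only keep the control flow easy to follow.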

Step 3 — PR creation

| Field | Value |
| --- | --- |
| Branch | molar/mender/fix-{scenarioSlug}-{shortSha} |
| Author | molar-mender[bot] |
| Title prefix | [Molar Mender] |
| Labels | molar-mender, auto-generated, needs-review |
| Reviewer | git blame author of last change to suspected file |
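
To recognize or predict Mender branches in CI filters, the branch pattern above can be reproduced with a small helper. This is a sketch under assumptions: the slug rules (lowercase, non-alphanumerics collapsed to hyphens) and the 7-character short SHA are guesses, and `slugify` and `menderBranchName` are hypothetical names.

```typescript
// Assumed slug rules: lowercase, runs of non-alphanumerics become a single hyphen.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build a branch name following molar/mender/fix-{scenarioSlug}-{shortSha}.
function menderBranchName(scenarioName: string, commitSha: string): string {
  if (!/^[0-9a-f]{7,40}$/i.test(commitSha)) {
    throw new Error("expected a hex commit SHA of 7 to 40 characters");
  }
  const shortSha = commitSha.slice(0, 7).toLowerCase(); // 7 chars is an assumption
  return `molar/mender/fix-${slugify(scenarioName)}-${shortSha}`;
}
```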

PR body includes:

  • Failing scenario link + region + timestamp
  • Suspected cause (LLM reasoning)
  • Inline diff
  • Local verify command: molar run <slug> --against=this-branch
  • Sandbox validation result
  • Token usage + model metadata
  • Reject this fix link (RLHF)

Step 4 — Loop closure

Push branch → PR webhook → Guard runs affected scenarios → green check → human merges.


Mender modes

Configure per repo (molar-guard.config.ts) or per scheduled check:

| Mode | Behavior | Recommended for |
| --- | --- | --- |
| Suggestive (default for new customers) | Post patch as PR comment or dashboard "Apply" button | First 30–90 days |
| Aggressive | Auto-open fix PR on every confirmed product_bug | Teams with high Mender merge rate |
| Off | Triage + alert only; no code written | Regulated paths, manual-only shops |

// molar-guard.config.ts
export default {
  mender: {
    mode: "suggestive",
    forbiddenPaths: [
      "app/api/auth/**",
      "**/payment*",
      "**/security*",
      "migrations/**",
    ],
  },
};
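
To preview which files a `forbiddenPaths` list would cover, you can approximate the matching locally. The sketch below is not Mender's matcher: it assumes common glob semantics (`*` stays within one path segment, `**` crosses segments, `**/` also matches the repository root), and `globToRegExp` and `isForbidden` are hypothetical helpers.

```typescript
// Convert a glob with * and ** into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === "*") {
      if (glob.charAt(i + 1) === "*") {
        if (glob.charAt(i + 2) === "/") {
          re += "(?:.*/)?"; // "**/" matches zero or more leading directories
          i += 2;
        } else {
          re += ".*"; // trailing "**" matches anything, including slashes
          i += 1;
        }
      } else {
        re += "[^/]*"; // "*" matches within a single path segment
      }
    } else {
      re += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(`^${re}$`);
}

// True when the repo-relative path matches any forbidden pattern.
function isForbidden(path: string, patterns: string[]): boolean {
  return patterns.some((pattern) => globToRegExp(pattern).test(path));
}

const forbiddenPaths = ["app/api/auth/**", "**/payment*", "**/security*", "migrations/**"];
console.log(isForbidden("app/api/auth/login/route.ts", forbiddenPaths));
console.log(isForbidden("app/api/billing/route.ts", forbiddenPaths));
```

Under these assumed semantics the first path is covered by `app/api/auth/**` and the second matches none of the patterns.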

Suggestive mode actions

  • PR-time failure: Patch preview on the failing PR comment with Approve / Reject / Edit
  • Production failure: Dashboard incident panel → Apply fix → converts suggestion to real GitHub PR
  • API: POST /v1/mender/:id/apply

Safety & governance

| Control | Detail |
| --- | --- |
| Never self-merge | Branch protection enforced |
| Forbidden paths | Default: **/payment*, **/auth*, **/security*, **/migrations/*; triage alerts but no patch |
| Allowed paths | Optional allowlist to narrow scope further |
| Audit log | Every attempt in mender_attempts (prompt, model, outcome, PR URL); exportable |
| Token budget | Plan-dependent; see app.molar.it/billing |
| Circuit breaker | Max 3 Mender attempts per scenario per 24h |

Use BYOK for Mender inference — customer keys bill upstream directly; Molar-hosted caps do not apply.
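
The circuit breaker limit can be reasoned about as a per-scenario rate limit. A minimal in-memory sketch, assuming a sliding 24-hour window (the source does not say whether the window slides or resets at a fixed time); the class name `MenderCircuitBreaker` is hypothetical.

```typescript
const WINDOW_MS = 24 * 60 * 60 * 1000; // 24h
const MAX_ATTEMPTS = 3;

class MenderCircuitBreaker {
  // scenario id -> timestamps (ms) of attempts inside the current window
  private attempts = new Map<string, number[]>();

  // Returns true and records the attempt if the scenario is under its limit.
  tryAcquire(scenarioId: string, nowMs: number): boolean {
    const recent = (this.attempts.get(scenarioId) ?? []).filter(
      (t) => nowMs - t < WINDOW_MS,
    );
    const allowed = recent.length < MAX_ATTEMPTS;
    if (allowed) {
      recent.push(nowMs);
    }
    this.attempts.set(scenarioId, recent);
    return allowed;
  }
}
```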


Reject and accept signals (learning loop)

Reject

To reject a fix, close the PR without merging, click "Reject this fix" in the PR body, or call the API:

curl -X POST "https://api.molar.it/v1/mender/$ATTEMPT_ID/reject" \
  -H "Authorization: Bearer $MOLAR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Wrong root cause: auth middleware issue, not billing"}'

Reasons feed RLHF for prompt improvement. Optional taxonomy: wrong file, wrong fix, intentional change, flake misclassified.
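
The same rejection can be sent from a script. The sketch below only builds the request so it can be inspected or passed to `fetch`; the helper name `buildRejectRequest` is hypothetical, and the `Content-Type: application/json` header is an assumption about what the API expects.

```typescript
interface RejectRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Build the POST /v1/mender/:id/reject request shown in the curl example.
function buildRejectRequest(attemptId: string, apiKey: string, reason: string): RejectRequest {
  return {
    url: `https://api.molar.it/v1/mender/${encodeURIComponent(attemptId)}/reject`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json", // assumed; the body is JSON
      },
      body: JSON.stringify({ reason }),
    },
  };
}
```

To send it, destructure the result and call `fetch(url, init)`, then check the response status before treating the rejection as recorded.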

Accept

When a Mender PR merges, Guard stores the final diff (including human edits). The delta between Mender's original patch and the merged code is a high-value training signal.


Mender dashboard

At app.molar.it/dashboard/guardMender (or Fixbots in combined bundle):

Inbox list:

  • Filter: pending review, classification, outcome, repo
  • Columns: classification, confidence, scenario, PR URL, token cost, outcome

Attempt detail:

  • Triage rationale + evidence attachments
  • Diff preview (GET /v1/mender/:id/preview)
  • Reject with reason
  • Apply (suggestive → real PR)
  • Budget + circuit breaker status

MCP tools for agents

Standalone Guard MCP (stdio transport) is available for self-hosted deployments:

| Tool / resource | Action |
| --- | --- |
| molar://incidents/{orgId} | Open incidents resource |
| molar://orgs/{orgId}/runs/{runId}/trace | Run trace resource |
| molar://scenarios/{id}/healthcheck | Scenario health resource |
| molar_rerun | Re-enqueue a failed run |
| molar_mender_reject | Send RLHF rejection |
| molar_ack_incident | Acknowledge incident |
| molar_suppress_incident | Suppress incident |
| molar_list_scenarios | List scenarios for org |

The unified 22-tool Molar MCP connector (mcp.molar.cloud) proxies additional tools via the Molar API. See Webhooks & API.


Model routing

Mender uses separate models per capability (configurable in Settings → AI):

| Capability | Typical model |
| --- | --- |
| mender_triage | Fast text (Haiku-class) |
| mender_fix_agent | Opus/Sonnet-class |
| self_heal_vision | Vision-capable VLM |

Default is Molar-hosted; BYOK supported per BYOK docs.


What Mender is not

  • Not a merge bot — humans approve every fix
  • Not silent test healing — scenario changes always arrive as PRs
  • Not a replacement for code review — patches can be wrong; confidence scores guide priority
  • Not allowed on forbidden paths — payment/auth/security/migrations alert only
