Mender auto-fix
Mender is Guard's AI remediation layer. When a scenario fails — on a PR, in production, or during shadow-prod — Mender triages the failure, classifies root cause, and optionally opens a fix PR against your application repository. Mender targets your product code, not just the test file.
Human review is always required. Mender never merges its own PRs.
Product page: guard.molar.it#mender
The Mender loop
failed run
  → triage (classify + confidence score)
  → branch by classification
      flake        → retry 3× with backoff
      env          → alert Molar ops
      scenario_bug → self-heal PR against .molar/scenarios/
      product_bug  → fix-PR agent → sandbox validation → GitHub PR
  → human merge (or reject → RLHF signal)
  → Guard re-runs on fix PR → green check
Triage stage
Every failed run enters triage before any code is written. Misclassifying a flake as product_bug annoys engineers; misclassifying a product bug as flake is dangerous.
Classifier inputs
| Input | Purpose |
|---|---|
| Error message + stack trace | Primary signal |
| Last 3 screenshots + 5s pre-failure video | Visual context |
| Network HAR + console log | Request/response failures |
| Clone state diff | Did a webhook fire? DB row change? |
| Last 10 commits to stack files | Recent regression suspect |
| 30-day flake history | Same scenario flaky before? |
Classifier output
{
  "classification": "product_bug",
  "confidence": 0.87,
  "reasoning": "POST /api/subscriptions/upgrade reads req.body.priceId but field renamed in PR #4521",
  "suspected_files": ["app/api/billing/route.ts"],
  "suspected_pr": "https://github.com/org/repo/pull/4521"
}
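A consumer of this payload may want to validate it before acting on it. The following is a minimal sketch, not Mender's own schema: the field names come from the sample above, while the type names, the `parseTriageResult` helper, and the choice to treat `suspected_pr` as optional are assumptions made for illustration.

```typescript
// Hypothetical types; only the field names are taken from the sample output.
type MenderClassification = "flake" | "env" | "scenario_bug" | "product_bug";

interface TriageResult {
  classification: MenderClassification;
  confidence: number; // 0..1
  reasoning: string;
  suspected_files: string[];
  suspected_pr?: string; // assumed optional: not every failure traces to a PR
}

const KNOWN_CLASSIFICATIONS: string[] = ["flake", "env", "scenario_bug", "product_bug"];

// Parses raw JSON and throws if it does not look like a triage result.
function parseTriageResult(raw: string): TriageResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("triage output must be a JSON object");
  }
  const o = data as Record<string, unknown>;
  const classification = o.classification;
  if (typeof classification !== "string" || KNOWN_CLASSIFICATIONS.indexOf(classification) === -1) {
    throw new Error("unknown classification");
  }
  const confidence = o.confidence;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    throw new Error("confidence must be a number in 0..1");
  }
  const reasoning = o.reasoning;
  if (typeof reasoning !== "string") {
    throw new Error("reasoning must be a string");
  }
  const files = o.suspected_files;
  if (!Array.isArray(files) || !files.every((f) => typeof f === "string")) {
    throw new Error("suspected_files must be an array of strings");
  }
  const result: TriageResult = {
    classification: classification as MenderClassification,
    confidence,
    reasoning,
    suspected_files: files as string[],
  };
  if (typeof o.suspected_pr === "string") {
    result.suspected_pr = o.suspected_pr;
  }
  return result;
}
```

Rejecting unknown classifications up front keeps a malformed payload from silently falling through to a default action.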
Actions by classification
| Classification | Action |
|---|---|
| flake | Re-run 3× (5s, 15s, 45s backoff). If 2/3 pass → mark flaky, no alert |
| env | Re-run on different worker. Still failing → alert Molar on-call, not customer |
| scenario_bug | Self-heal locators → PR against .molar/scenarios/ when locator drift is detected |
| product_bug | Hand off to fix-PR agent |
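The routing in the table above can be sketched as a pure function. This is an illustrative model only: the type and function names are hypothetical, and the "still_failing" verdict is an assumption, since the table does not say what happens when fewer than two of the three re-runs pass.

```typescript
type FailureClass = "flake" | "env" | "scenario_bug" | "product_bug";

// Hypothetical action shapes mirroring the rows of the table.
type TriageAction =
  | { kind: "rerun_with_backoff"; delaysMs: number[] }
  | { kind: "rerun_on_other_worker" }
  | { kind: "self_heal_pr"; targetDir: string }
  | { kind: "fix_pr_agent" };

function actionFor(classification: FailureClass): TriageAction {
  switch (classification) {
    case "flake":
      // 5s, 15s, 45s backoff between the three re-runs.
      return { kind: "rerun_with_backoff", delaysMs: [5000, 15000, 45000] };
    case "env":
      return { kind: "rerun_on_other_worker" };
    case "scenario_bug":
      return { kind: "self_heal_pr", targetDir: ".molar/scenarios/" };
    case "product_bug":
      return { kind: "fix_pr_agent" };
  }
}

// Applies the "2/3 pass → mark flaky" rule to the re-run results.
function flakeVerdict(rerunPassed: boolean[]): "flaky" | "still_failing" {
  const passes = rerunPassed.filter((passed) => passed).length;
  return passes >= 2 ? "flaky" : "still_failing";
}
```

For example, `flakeVerdict([true, false, true])` yields `"flaky"`, which per the table means no alert is raised.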
Confidence scores
Confidence is a 0..1 score stored on every mender_attempts row. Use it in the dashboard Mender inbox to prioritize review:
| Confidence | Suggested handling |
|---|---|
| ≥ 0.85 | High — review patch quickly |
| 0.60 – 0.84 | Medium — verify triage reasoning |
| < 0.60 | Low — likely needs human investigation first |
In suggestive mode, low-confidence fixes appear as comments or dashboard suggestions only — not auto-PRs.
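If you consume confidence scores outside the dashboard, the bands above can be expressed as a small helper. This is a sketch under assumptions: `confidenceBand` is a hypothetical name, and because the table leaves a gap between 0.84 and 0.85, the code treats everything from 0.60 up to (but not including) 0.85 as medium.

```typescript
type ConfidenceBand = "high" | "medium" | "low";

// Maps a 0..1 confidence score to the suggested-handling band.
function confidenceBand(confidence: number): ConfidenceBand {
  if (!(confidence >= 0 && confidence <= 1)) {
    throw new RangeError("confidence must be within 0..1");
  }
  if (confidence >= 0.85) return "high"; // review patch quickly
  if (confidence >= 0.6) return "medium"; // verify triage reasoning
  return "low"; // likely needs human investigation first
}
```

The sample triage output with `confidence: 0.87` would land in the high band.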
Fix-PR stage (product bugs)
Step 1 — Context gathering
The agent assembles:
- Failed scenario file (verbatim)
- Assertion + stack + screenshots + DOM diff vs last green run
- HAR of failed run
- Suspected file contents (via GitHub API)
- CONTRIBUTING.md, .cursorrules, AGENTS.md if present
- Repository map (tree-sitter symbol index)
Step 2 — Patch generation
- Agent loop: up to 20 LLM calls per attempt (hard cap against token snowball)
- Tools: read_file, grep, apply_diff, run_scenario_in_sandbox, get_repo_map
- Patch format: unified diff (aider-style applier)
- Validation: Run failing scenario in Docker sandbox with Clones — not prod, not staging. Max 3 outer iterations.
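The two limits in this step (20 LLM calls, 3 outer iterations) can be modeled as a bounded loop. The sketch below is not Mender's implementation: the hook interface is hypothetical, and it assumes that all outer iterations of one attempt share a single 20-call budget, which the text does not state explicitly.

```typescript
const MAX_LLM_CALLS_PER_ATTEMPT = 20;
const MAX_OUTER_ITERATIONS = 3;

// Hypothetical callbacks standing in for the LLM agent and the Docker sandbox.
interface AttemptHooks {
  proposePatch: (feedback: string | null, callsLeft: number) => { diff: string; llmCalls: number };
  validateInSandbox: (diff: string) => { passed: boolean; log: string };
}

type AttemptOutcome =
  | { status: "validated"; diff: string; llmCallsUsed: number; iterations: number }
  | { status: "gave_up"; reason: "llm_budget" | "max_iterations"; llmCallsUsed: number; iterations: number };

function runAttempt(hooks: AttemptHooks): AttemptOutcome {
  let used = 0;
  let iterations = 0;
  let feedback: string | null = null;
  while (iterations < MAX_OUTER_ITERATIONS) {
    const callsLeft = MAX_LLM_CALLS_PER_ATTEMPT - used;
    if (callsLeft <= 0) {
      return { status: "gave_up", reason: "llm_budget", llmCallsUsed: used, iterations };
    }
    const proposal = hooks.proposePatch(feedback, callsLeft);
    if (proposal.llmCalls > callsLeft) {
      throw new Error("proposePatch exceeded its LLM call budget");
    }
    used += proposal.llmCalls;
    iterations += 1;
    const result = hooks.validateInSandbox(proposal.diff);
    if (result.passed) {
      return { status: "validated", diff: proposal.diff, llmCallsUsed: used, iterations };
    }
    // Feed the sandbox log into the next iteration.
    feedback = result.log;
  }
  return { status: "gave_up", reason: "max_iterations", llmCallsUsed: used, iterations };
}
```

Checking the budget before each iteration, rather than after, is what makes the cap hard: an attempt stops before it can spend call 21.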
Step 3 — PR creation
| Field | Value |
|---|---|
| Branch | molar/mender/fix-{scenarioSlug}-{shortSha} |
| Author | molar-mender[bot] |
| Title prefix | [Molar Mender] |
| Labels | molar-mender, auto-generated, needs-review |
| Reviewer | git blame author of last change to suspected file |
PR body includes:
- Failing scenario link + region + timestamp
- Suspected cause (LLM reasoning)
- Inline diff
- Local verify command: molar run <slug> --against=this-branch
- Sandbox validation result
- Token usage + model metadata
- Reject this fix link (RLHF)
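To make the PR conventions concrete, here is a small sketch that assembles the metadata fields from the table. The function and interface names are hypothetical, the 7-character short SHA is an assumption (the source only says shortSha), and the text after the title prefix is a placeholder summary.

```typescript
interface MenderPrDraft {
  branch: string;
  author: string;
  title: string;
  labels: string[];
  reviewers: string[];
}

function buildMenderPrDraft(input: {
  scenarioSlug: string;
  commitSha: string;
  summary: string;
  blameAuthor: string; // git blame author of the last change to the suspected file
}): MenderPrDraft {
  // Assumption: shortSha is the first 7 characters of the commit SHA.
  const shortSha = input.commitSha.slice(0, 7);
  return {
    branch: `molar/mender/fix-${input.scenarioSlug}-${shortSha}`,
    author: "molar-mender[bot]",
    title: `[Molar Mender] ${input.summary}`,
    labels: ["molar-mender", "auto-generated", "needs-review"],
    reviewers: [input.blameAuthor],
  };
}
```

A predictable branch and label scheme like this makes Mender PRs easy to target with branch protection rules or CODEOWNERS filters.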
Step 4 — Loop closure
Push branch → PR webhook → Guard runs affected scenarios → green check → human merges.
Mender modes
Configure per repo (molar-guard.config.ts) or per scheduled check:
| Mode | Behavior | Recommended for |
|---|---|---|
| Suggestive (default for new customers) | Post patch as PR comment or dashboard "Apply" button | First 30–90 days |
| Aggressive | Auto-open fix PR on every confirmed product_bug | Teams with high Mender merge rate |
| Off | Triage + alert only; no code written | Regulated paths, manual-only shops |
// molar-guard.config.ts
export default {
  mender: {
    mode: "suggestive",
    forbiddenPaths: [
      "app/api/auth/**",
      "**/payment*",
      "**/security*",
      "migrations/**",
    ],
  },
};
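To reason about which files a forbiddenPaths list covers, you can approximate the matching locally. The sketch below assumes conventional glob semantics (`*` stays within one path segment, `**` crosses directories); how Mender itself evaluates these patterns is not specified here, and the helper names are hypothetical.

```typescript
// Converts a glob to a RegExp. Assumed semantics, not Mender's actual matcher:
//   "**/" matches zero or more leading directories
//   "**"  matches anything, including "/"
//   "*"   matches anything except "/"
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.slice(i, i + 3) === "**/") {
      source += "(?:.*/)?";
      i += 3;
    } else if (glob.slice(i, i + 2) === "**") {
      source += ".*";
      i += 2;
    } else if (glob.charAt(i) === "*") {
      source += "[^/]*";
      i += 1;
    } else {
      // Escape regex metacharacters in literal path characters.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + source + "$");
}

function isForbiddenPath(filePath: string, patterns: string[]): boolean {
  return patterns.some((pattern) => globToRegExp(pattern).test(filePath));
}
```

Under these semantics `**/payment*` matches files whose final segment starts with "payment" (for example lib/payment.ts) but not files nested inside a payments/ directory, so a pattern such as `**/payments/**` may be worth adding if that matters for your repository.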
Suggestive mode actions
- PR-time failure: Patch preview on the failing PR comment with Approve / Reject / Edit
- Production failure: Dashboard incident panel → Apply fix → converts suggestion to real GitHub PR
- API: POST /v1/mender/:id/apply
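A client for the Mender attempt endpoints might build its requests as follows. This sketch assumes the apply and preview endpoints share the base URL (https://api.molar.it) and bearer-token auth shown in the reject example elsewhere on this page; the helper name and return shape are hypothetical, and nothing is sent over the network.

```typescript
type MenderAttemptAction = "apply" | "reject" | "preview";

interface MenderRequestSpec {
  url: string;
  method: "GET" | "POST";
  headers: { Authorization: string };
}

// Builds (but does not send) a request for one Mender attempt.
function buildMenderRequest(
  attemptId: string,
  action: MenderAttemptAction,
  apiKey: string
): MenderRequestSpec {
  return {
    // Assumed base URL, taken from the reject curl example.
    url: `https://api.molar.it/v1/mender/${encodeURIComponent(attemptId)}/${action}`,
    // preview is a read (GET); apply and reject change state (POST).
    method: action === "preview" ? "GET" : "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

The returned spec can be handed to any HTTP client, for example `fetch(spec.url, { method: spec.method, headers: spec.headers })`.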
Safety & governance
| Control | Detail |
|---|---|
| Never self-merge | Branch protection enforced |
| Forbidden paths | Default: **/payment*, **/auth*, **/security*, **/migrations/* — triage alerts but no patch |
| Allowed paths | Optional allowlist to narrow scope further |
| Audit log | Every attempt in mender_attempts — prompt, model, outcome, PR URL; exportable |
| Token budget | Plan-dependent — see app.molar.it/billing |
| Circuit breaker | Max 3 Mender attempts per scenario per 24h |
Use BYOK for Mender inference — customer keys bill upstream directly; Molar-hosted caps do not apply.
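The circuit breaker in the table above (max 3 attempts per scenario per 24h) can be pictured as a per-scenario sliding window. This is an illustrative sketch, not Mender's implementation: the class name is hypothetical, and it assumes a rolling 24-hour window rather than a calendar day.

```typescript
const MAX_ATTEMPTS_PER_WINDOW = 3;
const WINDOW_MS = 24 * 60 * 60 * 1000;

class MenderCircuitBreaker {
  // Attempt timestamps (ms) per scenario; a prototype-less object avoids key collisions.
  private attempts: Record<string, number[]> = Object.create(null);

  // Records the attempt and returns true if allowed; returns false when the breaker is open.
  tryAcquire(scenarioId: string, nowMs: number): boolean {
    const previous = this.attempts[scenarioId] || [];
    // Keep only attempts that are still inside the rolling window.
    const recent = previous.filter((t) => nowMs - t < WINDOW_MS);
    const allowed = recent.length < MAX_ATTEMPTS_PER_WINDOW;
    if (allowed) {
      recent.push(nowMs);
    }
    this.attempts[scenarioId] = recent;
    return allowed;
  }
}
```

Passing the clock in as `nowMs` keeps the breaker deterministic and easy to test; blocked attempts are not recorded, so they do not extend the lockout.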
Reject and accept signals (learning loop)
Reject
Close PR without merging, click Reject this fix in PR body, or call:
curl -X POST "https://api.molar.it/v1/mender/$ATTEMPT_ID/reject" \
  -H "Authorization: Bearer $MOLAR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"reason": "Wrong root cause — auth middleware issue, not billing"}'
Reasons feed RLHF for prompt improvement. Optional taxonomy: wrong file, wrong fix, intentional change, flake misclassified.
Accept
When a Mender PR merges, Guard stores the final diff (including human edits). The delta between Mender's original patch and merged code is high-value training signal.
Mender dashboard
At app.molar.it/dashboard/guard → Mender (or Fixbots in the combined bundle):
Inbox list:
- Filter: pending review, classification, outcome, repo
- Columns: classification, confidence, scenario, PR URL, token cost, outcome
Attempt detail:
- Triage rationale + evidence attachments
- Diff preview (GET /v1/mender/:id/preview)
- Reject with reason
- Apply (suggestive → real PR)
- Budget + circuit breaker status
MCP tools for agents
Standalone Guard MCP (stdio transport) is available for self-hosted deployments:
| Tool / resource | Action |
|---|---|
| molar://incidents/{orgId} | Open incidents resource |
| molar://orgs/{orgId}/runs/{runId}/trace | Run trace resource |
| molar://scenarios/{id}/healthcheck | Scenario health resource |
| molar_rerun | Re-enqueue a failed run |
| molar_mender_reject | Send RLHF rejection |
| molar_ack_incident | Acknowledge incident |
| molar_suppress_incident | Suppress incident |
| molar_list_scenarios | List scenarios for org |
The unified 22-tool Molar MCP connector (mcp.molar.cloud) proxies additional tools via the Molar API. See Webhooks & API.
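For agents talking to the standalone server directly, MCP messages are JSON-RPC 2.0 requests: `resources/read` for the resources above and `tools/call` for the tools. The sketch below builds such messages; the helper names are hypothetical, and the `runId` argument passed to molar_rerun is an assumed parameter name, since the tool's input schema is not documented here.

```typescript
interface McpRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { [key: string]: unknown };
}

// Fills the molar://orgs/{orgId}/runs/{runId}/trace template.
function traceResourceUri(orgId: string, runId: string): string {
  return `molar://orgs/${encodeURIComponent(orgId)}/runs/${encodeURIComponent(runId)}/trace`;
}

function readResourceRequest(id: number, uri: string): McpRequest {
  return { jsonrpc: "2.0", id, method: "resources/read", params: { uri } };
}

function callToolRequest(id: number, name: string, args: { [key: string]: unknown }): McpRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// On the stdio transport, each message is a single line of JSON.
function toStdioFrame(request: McpRequest): string {
  return JSON.stringify(request) + "\n";
}
```

For example, `toStdioFrame(callToolRequest(1, "molar_rerun", { runId: "run_9" }))` produces one newline-terminated line to write to the server's stdin; in practice an MCP client library would also handle the initialization handshake before any of these calls.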
Model routing
Mender uses separate models per capability (configurable in Settings → AI):
| Capability | Typical model |
|---|---|
| mender_triage | Fast text (Haiku-class) |
| mender_fix_agent | Opus/Sonnet-class |
| self_heal_vision | Vision-capable VLM |
Inference is Molar-hosted by default; BYOK is supported (see the BYOK docs).
What Mender is not
- Not a merge bot — humans approve every fix
- Not silent test healing — scenario changes always arrive as PRs
- Not a replacement for code review — patches can be wrong; confidence scores guide priority
- Not allowed on forbidden paths — payment/auth/security/migrations alert only
Next
- GitHub App setup — Mender triggers from PR failures
- Production monitoring — Mender on prod incidents
- Security — Mender governance and prompt-injection controls
- Troubleshooting — circuit breaker and budget issues