From silent transfer fabrication to cert-confirmed in ten iterations
A 156-line Python script drafted bank transfers, swallowed errors, and could mark transfers SUCCEEDED without ever talking to a real bank. It went through ten iterations of architect.validate and four architect.certify second-pass reviews on the production MCP. Validate is the first-pass, principle-by-principle review; certify is a second-pass adversarial review that runs only after validate aligns and looks for production blockers the principle pass couldn't see. The cert reviewer caught a different load-bearing failure mode every time: false-COMPLETE via still-AWAITING tasks, false-COMPLETE despite stale FAILED, stubbed bank in audit, money-at-risk projection error, and a blocker not durable across passes. Iter 10 is the first run where validate AND cert both pass cleanly.
Key Facts
- Validator iterations: 10 prod-MCP runs
- Cert calls: 4 second-pass reviews
- Score trajectory: 0/F → 100/A
- Cert outcome: confirmed_production_ready
- Doctrine compliance: 10 / 10 principles aligned
Validator + cert trajectory
Ten iterations, every run ID public
Each iteration is a real prod-MCP architect.validate call. Iter 4, 5, and 6 then each took an architect.certify second-pass review, and each review downgraded the score from 100/A to 74/C/emerging because the cert reviewer caught a different production-blocking failure mode on each pass. Iter 7-9 then regressed on validate as the cert-layer fixes unmasked deeper gaps. Iter 10 is the first run where validate AND cert both pass cleanly: the cert reviewer found zero missed blockers.
Cert mandate value
Each cert call caught what the prior fix unmasked
First-pass validate alone returned aligned three times in a row (Iter 4-6). Each time, the second-pass cert reviewer surfaced a production-blocking failure mode the first-pass review couldn't see. Each fix to a cert finding exposed a NEW failure mode that only became reachable because the prior fix existed. This is the load-bearing argument for layered second-pass certification.
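The gating between the two passes can be sketched in a few lines. This is a minimal illustration, not the Blueprint MCP's actual interface: `ReviewResult`, `layered_review`, and the verdict labels `validate_failed` and `downgraded_by_cert` are hypothetical names, and the two reviewers are passed in as plain callables. Only `confirmed_production_ready` is taken from the case study.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReviewResult:
    """Outcome of one review pass (hypothetical shape)."""
    aligned: bool                                   # True when the pass found no blockers
    findings: List[str] = field(default_factory=list)


def layered_review(
    validate: Callable[[str], ReviewResult],
    certify: Callable[[str], ReviewResult],
    source: str,
) -> str:
    """Return a coarse verdict for one iteration."""
    first = validate(source)
    if not first.aligned:
        # The second pass only runs after the first pass aligns.
        return "validate_failed"
    second = certify(source)
    if second.findings:
        # An aligned first pass is downgraded when the second pass finds blockers.
        return "downgraded_by_cert"
    return "confirmed_production_ready"
```

Under this reading, Iter 4-6 correspond to the middle branch and Iter 10 to the last one.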
Principle scorecard
Every flagged principle, Iter1 vs Iter10, at a glance
Five principles across the trajectory fired as production_blocker / high_risk on at least one validate or cert pass; the Iter10 run closes all of them. The narrative below walks through each; this table is the scannable summary.
Refactor scope
Iter1 ungoverned vs Iter10 cert-confirmed
Numbers verbatim from the package source. The original 156-line script had no audit, no run_id, no approval, no idempotency, no recovery. Iter 10 is ~2,384 lines because the cert reviewer surfaced a different load-bearing failure on each cert call; every line earned its place against a specific finding.
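To make two of the missing properties concrete, here is a minimal sketch of idempotent submission with a run-scoped audit trail. It is not the case study's code: `idempotency_key`, `TransferLedger`, the event names, and the in-memory storage are all assumptions made for illustration, and a real implementation would persist both the key map and the audit log.

```python
import hashlib
import json
from typing import Callable, Dict, List


def idempotency_key(transfer: Dict[str, str]) -> str:
    """Stable key: the same transfer details always hash to the same value."""
    canonical = json.dumps(transfer, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TransferLedger:
    """In-memory stand-in for a durable store (assumption for this sketch)."""

    def __init__(self) -> None:
        self.submitted: Dict[str, str] = {}      # idempotency key -> bank reference
        self.audit: List[Dict[str, str]] = []    # append-only audit trail

    def _log(self, run_id: str, key: str, event: str) -> None:
        self.audit.append({"run_id": run_id, "key": key, "event": event})

    def submit(
        self,
        transfer: Dict[str, str],
        send: Callable[[Dict[str, str]], str],
        run_id: str,
    ) -> str:
        key = idempotency_key(transfer)
        if key in self.submitted:
            # Same transfer seen before: return the earlier reference, send nothing.
            self._log(run_id, key, "duplicate_skipped")
            return self.submitted[key]
        self._log(run_id, key, "submitting")
        reference = send(transfer)               # errors propagate; nothing is swallowed
        self.submitted[key] = reference
        self._log(run_id, key, "submitted")
        return reference
```

Hashing a canonical JSON form means key order in the input dict does not change the key, so a retried transfer with the same details is recognised as a duplicate.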
Before / After
Validator output
What the cert reviewer found
The Blueprint MCP ran architect.validate ten times and architect.certify four times against this code base. Five principle-level production blockers were identified across the trajectory, each one a path for an irreversible bank transfer to fire under conditions the operator never authorised.
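Two of the failure modes named in the summary, false-COMPLETE via still-AWAITING tasks and false-COMPLETE despite a stale FAILED, both come down to how task statuses roll up into a run status. The sketch below shows one conservative roll-up. The status names AWAITING, SUCCEEDED, FAILED, and COMPLETE come from the case study; `IN_PROGRESS`, the enum layout, and the `roll_up` function are assumptions for illustration.

```python
from enum import Enum
from typing import Iterable


class TaskStatus(Enum):
    AWAITING = "AWAITING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


class RunStatus(Enum):
    COMPLETE = "COMPLETE"
    IN_PROGRESS = "IN_PROGRESS"   # hypothetical label for "not finished yet"
    FAILED = "FAILED"


def roll_up(statuses: Iterable[TaskStatus]) -> RunStatus:
    """Report COMPLETE only when every task has positively succeeded."""
    tasks = list(statuses)
    if any(s is TaskStatus.FAILED for s in tasks):
        # A FAILED task, however old, always wins over later successes.
        return RunStatus.FAILED
    if not tasks or any(s is not TaskStatus.SUCCEEDED for s in tasks):
        # Anything still AWAITING (or an empty run) is not complete.
        return RunStatus.IN_PROGRESS
    return RunStatus.COMPLETE
```

The design choice is that COMPLETE is the last branch reached, never a default: every other condition has to be ruled out first.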
How each blocker was resolved
What the iterations fixed
Each iteration closed at least one production blocker. Iter 10 is the first run where every flagged principle reaches aligned and the second-pass cert reviewer finds zero missed blockers.
Re-validation result
Iter 10: architect.certify confirmed production_ready
Iter 10's implementation was re-validated to 100/A/production_ready, then certified in the same prod-MCP session. The cert outcome was confirmed_production_ready with zero certification_findings. Verbatim summary from the cert reviewer: "the code visibly implements durable reconciliation blocking for ambiguous bank handoffs, explicit cancellation-pending states, signed approval/hash checks, audit/inbox inspectability, and no specific missed production-blocking crash, silent wrong-result path, or trust-boundary bypass is evidenced."
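As an illustration of what "durable reconciliation blocking for ambiguous bank handoffs" can mean in practice, the sketch below records a blocker on disk when a handoff times out, so that a later pass (even in a fresh process) refuses to resend until someone reconciles. Everything here is hypothetical: `BlockerStore`, `hand_off`, `ReconciliationRequired`, the JSON file format, and the choice of `TimeoutError` as the ambiguous signal are assumptions, not the case study's implementation.

```python
import json
from pathlib import Path
from typing import Callable, Set


class ReconciliationRequired(Exception):
    """Raised when a transfer's outcome at the bank is unknown."""


class BlockerStore:
    """File-backed set of blocked transfer IDs (a real system would use a
    transactional store and atomic writes; this is only a sketch)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> Set[str]:
        if not self.path.exists():
            return set()
        return set(json.loads(self.path.read_text(encoding="utf-8")))

    def add(self, transfer_id: str) -> None:
        blocked = self.load() | {transfer_id}
        self.path.write_text(json.dumps(sorted(blocked)), encoding="utf-8")


def hand_off(transfer_id: str, send: Callable[[str], str], store: BlockerStore) -> str:
    """Send a transfer unless an earlier ambiguous handoff blocks it."""
    if transfer_id in store.load():
        # A previous pass never learned the outcome: do not send again.
        raise ReconciliationRequired(transfer_id)
    try:
        return send(transfer_id)
    except TimeoutError:
        # Ambiguous: the bank may or may not have acted. Persist the blocker
        # before surfacing the error so it survives across passes.
        store.add(transfer_id)
        raise ReconciliationRequired(transfer_id) from None
```

Because the blocker lives in the store rather than in process memory, a second pass that constructs a new `BlockerStore` over the same path still sees it.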
Calculated ROI
Same metrics, same calculator powering every case study
Derived deterministically from this case study's profile (10 iterations, irreversible-financial blast radius, autonomous workflow, under compliance) via /lib/case-study-roi.ts. Numbers directly comparable to the other case studies.