Governed

Alignment confirmed with the doctrine.

Agent Architecture Review, Validation snapshot

Evaluated 12 May 2026 against the AI Design Blueprint doctrine

Production-ready

Status: Aligned

100/100

Grade A

10 aligned

Blueprint Readiness measures doctrine alignment, not runtime correctness. A production-ready verdict means the architecture embodies the 10 principles; it does not run your tests or check your types. Layer it on top of your test suite, not in place of it.

Per-principle verdicts

The submission is an autonomous background validation/onboarding workflow with durable job state, explicit approval and blocker states, typed untrusted-code envelopes, audit metadata, retry paths, and abort steering. The current Iter14 changes address the prior P10 gap by adding a post-validation abort re-check before persistence/completion and preserving failed onboarding state for step-aware recovery.

Iteration history

5 prior runs on this artifact. Each run_id opens its own readiness review.

Scores can move up or down between iterations: the validator's reasoning is not strictly deterministic, so the same artifact can score differently across runs. The per-principle deltas below show the substantive change.

When	Score	Tier	Run ID
12 May 2026 (this run)	100 / A	Production-ready	3ac16b20…
12 May 2026	74 / C	Emerging	0e49f888…
12 May 2026	74 / C	Emerging	4760459e…
12 May 2026	74 / C	Emerging	9caf9385…
12 May 2026	74 / C	Emerging	093809b5…
12 May 2026	98 / A	Production-ready	4128f700…

Certified production-ready · Attempt 1/3

The first-pass production_ready verdict is confirmed: the code shows durable job states, explicit terminal/blocked/aborted handling, and retry paths, and the review found no specific missed defect that would currently expose a real user to silently wrong results, a crash, or a trust-boundary bypass.

Per-principle findings

10 principles evaluated. Verdict, severity, evidence and recommendation for each.

aligned

Design for delegation rather than direct manipulation

Delegation is represented as assignment of work rather than manual execution: `approve()` records approval and hands off to `run_cohort_validate`, while `_execute_with_job()` autonomously performs cloning, selection, bundling, validation, audit construction, and persistence. Scope and constraints are explicit through `repo_url`, `ValidationContext(repository=namespace, files=...)`, `select_agentic_surface()`, byte/file limits, and job lifecycle fields. This maintains the prior aligned verdict.

aligned

Ensure that background work remains perceptible

Background work remains perceptible through persistent `CohortValidationJob` records with `status`, `queued_at`, per-step timestamps (`cloning_started_at`, `selecting_started_at`, `bundling_started_at`, `validating_started_at`), `terminal_at`, `failure_kind`, `safe_display_message`, `retry_eligible`, and `abort_requested`. `_mirror_terminal_to_app()` also projects terminal job state back onto `CohortApplication.onboarding_state`. This maintains the prior aligned verdict.
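As a rough illustration of the record described above, the following sketch models a job with per-step timestamps. It is not the reviewed `CohortValidationJob`: the class name `JobRecordSketch`, the `TIMED_STEPS` tuple, and the `start_step()` helper are hypothetical, and only the field names come from the review text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Step names inferred from the *_started_at fields named in the review (assumption).
TIMED_STEPS = ("cloning", "selecting", "bundling", "validating")


@dataclass
class JobRecordSketch:
    """Hypothetical stand-in for CohortValidationJob, for illustration only."""

    status: str = "queued"
    queued_at: Optional[datetime] = None
    cloning_started_at: Optional[datetime] = None
    selecting_started_at: Optional[datetime] = None
    bundling_started_at: Optional[datetime] = None
    validating_started_at: Optional[datetime] = None
    terminal_at: Optional[datetime] = None
    failure_kind: Optional[str] = None
    safe_display_message: Optional[str] = None
    retry_eligible: bool = False
    abort_requested: bool = False

    def start_step(self, step: str, now: Optional[datetime] = None) -> None:
        """Record that a step began, so an observer can see where the job is."""
        if step not in TIMED_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.status = step
        setattr(self, f"{step}_started_at", now or datetime.now(timezone.utc))
```

Because each step writes its own timestamp, a reader of the record can tell both the current step and how long earlier steps took without access to logs.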

aligned

Align feedback with the user’s level of attention

Feedback is calibrated by separating routine progress (`queued`, `cloning`, `selecting`, `bundling`, `validating`) from intervention-worthy outcomes (`blocked`, `failed`, `aborted`) and by recording concise user/operator messages in `safe_display_message` and `onboarding_failure_reason`. Diagnostic detail is kept in the audit object and `_summarize_validate_log()` rather than pushed into primary state. This maintains the prior aligned verdict.
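One way to picture the split between routine progress and intervention-worthy outcomes is a small classifier over the status values named above. This is a sketch under assumptions: the function `attention_level()` and the level names `ambient`, `interrupt`, and `summary` are hypothetical and do not appear in the reviewed code.

```python
# Status groups taken from the review text; the level names are assumptions.
ROUTINE_STATUSES = frozenset({"queued", "cloning", "selecting", "bundling", "validating"})
INTERVENTION_STATUSES = frozenset({"blocked", "failed", "aborted"})


def attention_level(status: str) -> str:
    """Map a job status to how loudly it should be surfaced."""
    if status in INTERVENTION_STATUSES:
        return "interrupt"  # needs a person to look at it
    if status in ROUTINE_STATUSES:
        return "ambient"  # visible on request, no notification
    return "summary"  # anything else, e.g. a completed job
```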

aligned

Apply progressive disclosure to system agency

The workflow uses progressive disclosure: primary state is compact (`CohortApplication.onboarding_state`, `CohortValidationJob.status`, `failure_kind`), while deeper inspection is available in the persisted `audit_object` with `source`, `selection`, `validate`, and `job` sections. `selected_files`, `skipped_during_read`, `bundle_sha256`, `envelope_hash`, and summarized validation logs are available without overwhelming the main lifecycle state. This maintains the prior aligned verdict.

aligned

Replace implied magic with clear mental models

The system exposes a clear mental model with named lifecycle states and failure categories: `STEPS`, `TERMINAL_STATUSES`, `FAILURE_KINDS`, onboarding states such as `firebase_user_failed`, `sign_in_link_failed`, `approval_email_failed`, `validation_queued`, and `validation_complete`, plus explicit comments documenting the uncancellable single LLM-call limitation in `_execute_with_job()`. The untrusted-code boundary is also made explicit through `BOUNDARY_HEADER`, `BOUNDARY_CONTRACT_VERSION`, and `ENVELOPE_ADVISORY`. This maintains the prior aligned verdict.

aligned

Expose meaningful operational state, not internal complexity

Operational state is user/action relevant rather than raw internals: application-level state is mirrored to `validation_complete` or `validation_failed`, while job-level state uses meaningful labels like `queued`, `blocked`, `failed`, `aborted`, and `completed`. Low-level diagnostic data such as file hashes, commit SHA, selected/skipped files, and usage/log summaries is reserved for the audit payload instead of being the primary status surface. This maintains the prior aligned verdict.

aligned

Establish trust through inspectability

Inspectability is supported by a typed file envelope and audit trail: `build_file_envelope()` canonicalizes files and computes `envelope_hash`; `wrap_bundle_with_boundary()` marks the bundle as untrusted input; `_execute_with_job()` records `commit_sha`, `selected_files` with content hashes, `bundle_sha256`, `skipped_during_read`, `latency_ms`, `log_signals`, `boundary_contract_version`, and `envelope_schema` into `result_dict['audit']`. This provides traceability from validation result back to source material and selection decisions. This maintains the prior aligned verdict.
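The canonicalize-then-hash idea behind `envelope_hash` can be sketched with the standard library. The real `build_file_envelope()` was not shown in this review, so the function name `build_envelope_sketch()`, the entry layout, and the choice of SHA-256 over sorted, compact JSON are all assumptions.

```python
import hashlib
import json
from typing import Dict


def build_envelope_sketch(files: Dict[str, str]) -> dict:
    """Hypothetical envelope builder: per-file hashes plus one hash over the whole set."""
    entries = [
        {
            "path": path,
            "sha256": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        }
        # Sorting by path makes the result independent of input order.
        for path, content in sorted(files.items())
    ]
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return {
        "files": entries,
        "envelope_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }
```

With a deterministic serialization, the same set of files yields the same hash regardless of the order in which they were read, which is what lets an auditor match a validation result to its exact input.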

aligned

Make hand-offs, approvals, and blockers explicit

Approvals, handoffs, and blockers are explicit. `approve()` requires operator confirmation unless `--yes` is supplied, then performs discrete external handoffs to Firebase, sign-in-link generation, email, and validation, each with separate failure states (`firebase_user_failed`, `sign_in_link_failed`, `approval_email_failed`, `validation_failed`). Validation blockers are classified via `mark_blocked()` with concrete `failure_kind` values such as `invalid_repo_url`, `no_supported_language`, `selector_rejected`, and `read_failed`. This maintains the prior aligned verdict.
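The pattern of discrete handoffs with one failure state each might look like the sketch below. The helper `run_handoffs()` is hypothetical; only the state names are taken from the review, and treating `validation_queued` as the success state of the chain is an assumption.

```python
from typing import Callable, List, Tuple

# Each handoff is paired with the state to record if it fails.
Handoff = Tuple[str, Callable[[], None]]


def run_handoffs(handoffs: List[Handoff]) -> str:
    """Run handoffs in order and stop at the first failure, returning its named state."""
    for failure_state, action in handoffs:
        try:
            action()
        except Exception:
            # Later handoffs are not attempted, so the state says exactly where it stopped.
            return failure_state
    return "validation_queued"  # assumed success state once all handoffs succeed
```

Returning a distinct state per handoff is what makes step-aware recovery possible: a retry can resume from the failed step instead of repeating the ones that already succeeded.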

aligned

Represent delegated work as a system, not merely as a conversation

Delegated work is represented as a structured system rather than a conversation: `CohortValidationJob` models the workflow, dependencies, step timestamps, terminal status, retry eligibility, abort intent, selected counts, bundle hash, commit SHA, and validation run linkage. `_execute_with_job()` advances this state machine across cloning, selecting, bundling, validating, persistence, completion, and failure handling. This maintains the prior aligned verdict.

P10

aligned

Optimise for steering, not only initiating

Steering primitives are now present for both validation and onboarding recovery. `request_abort()` persists `abort_requested=True`; `_execute_with_job()` checks for abort before validation and, in the Iter14 change, refreshes the job and re-checks `job.abort_requested` immediately after `asyncio.run(validate_code_against_principles(...))` returns and before creating `UserValidationRun` or calling `mark_completed()`, discarding validator output on abort. `retry_failed_validation_job()` requires the latest job to be terminal and `retry_eligible`, while `retry_onboarding_handoff()` preserves the failed state (`firebase_user_failed`, `sign_in_link_failed`, or `approval_email_failed`) and clears…
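The post-validation abort re-check described above can be sketched as follows. This is a minimal illustration, not the reviewed `_execute_with_job()`: the names `JobSketch`, `run_validation_step()`, and the `refresh` callback are hypothetical, and a plain callback stands in for reloading the job from storage.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class JobSketch:
    """Hypothetical stand-in for the persisted job."""

    abort_requested: bool = False
    status: str = "validating"
    result: Optional[dict] = None


def run_validation_step(
    job: JobSketch,
    validate: Callable[[], Awaitable[dict]],
    refresh: Callable[[JobSketch], JobSketch],
) -> JobSketch:
    """Check for abort before and after a validation call that cannot be cancelled."""
    job = refresh(job)
    if job.abort_requested:
        job.status = "aborted"
        return job

    result = asyncio.run(validate())  # single call with no cancellation point

    # Re-read the job: an abort may have been requested while validation ran.
    job = refresh(job)
    if job.abort_requested:
        job.status = "aborted"  # validator output is discarded, nothing is persisted
        return job

    job.result = result
    job.status = "completed"
    return job
```

The second check matters because the validation call cannot be interrupted once started; without it, an abort requested mid-call would be silently overridden by a completed result.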

Embed in your README

Two embeddable variants: a small flat shield and a richer score card.

Score card (recommended)

[![Blueprint Readiness Score card](https://aidesignblueprint.com/api/badge/run/3ac16b20-88b8-4448-a4f6-5aa738b2919b/card.svg)](https://aidesignblueprint.com/en/readiness-review/3ac16b20-88b8-4448-a4f6-5aa738b2919b)

Flat badge

[![Blueprint Readiness Score](https://aidesignblueprint.com/api/badge/run/3ac16b20-88b8-4448-a4f6-5aa738b2919b.svg)](https://aidesignblueprint.com/en/readiness-review/3ac16b20-88b8-4448-a4f6-5aa738b2919b)

Baseline and iteration details

Baseline: used · Doctrine: same doctrine · Race: checked clear

Iteration delta

1 closed this pass · 0 reopened · 0 high-risk findings still open

Improvements (1)

P10 · Optimise for steering, not only initiating · needs_changes → aligned

Rubric: 2026-05-04


Run ID: 3ac16b20-88b8-4448-a4f6-5aa738b2919b · Results expire after 90 days

Run by agents. Governed by humans. Validated by the AI Design Blueprint.