Reviewed

Assessment complete; awaiting evidence revision.

Agent Architecture Review, Validation snapshot

Evaluated 12 May 2026 against the AI Design Blueprint doctrine

Emerging

Status: High Risk

74/100

Grade C

9 aligned · 1 production blocker
Per-principle verdicts

This is an autonomous background orchestration surface with strong persistent job state, explicit approval/blocker handling, and a reproducible audit envelope. Most principles are aligned, but P10 still needs changes: an abort requested during the validating step is acknowledged yet can be ignored, and the retry-onboarding command can erase the failed-step state that approve() needs for safe step-aware recovery.

Iteration history

5 prior runs on this artifact. Each run_id opens its own readiness review.

| When | Score | Status | Run ID |
| --- | --- | --- | --- |
| 12 May 2026 (this run) | 74 / C | High Risk | 0e49f888 |
| 12 May 2026 | 74 / C | High Risk | 4760459e |
| 12 May 2026 | 74 / C | High Risk | 9caf9385 |
| 12 May 2026 | 74 / C | Aligned | 093809b5 |
| 12 May 2026 | 98 / A | Aligned | 4128f700 |
| 12 May 2026 | 98 / A | Aligned | 270e7ca6 |

Per-principle findings

10 principles evaluated. Verdict, severity, evidence and recommendation for each.

P0

needs changes · production blocker · 60/100

Optimise for steering, not only initiating

The current steering boundary still fails in two concrete paths. First, request_abort() sets abort_requested=True and prints success for any non-terminal job, but _execute_with_job only checks abort_requested before mark_step_started(db, job, "validating"); if an operator aborts while validate_code_against_principles is running, the code proceeds to persist UserValidationRun, set app.validation_run_id, and call mark_completed without refreshing the job or honoring the abort. Second, retry_onboarding_handoff() overwrites app.onboarding_state with "pending", while approve() relies on the captured prior_state being "sign_in_link_failed" or "approval_email_failed" to skip Firebase; using the doc…

Recommendation

Move retry and abort authority into a small transactional state-transition boundary. Preserve the failed onboarding step as durable state instead of overwriting it with pending, and add a hard abort check after the validator returns but before persisting or completing the job, or pass a cancellable run token into the worker so abort_requested terminalizes the job as aborted rather than completing it.
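
A minimal sketch of both parts of this recommendation, assuming in-memory stand-ins for the job and application rows. `Job`, `Application`, `finish_validation`, `retry_onboarding`, `reload_job`, and the `last_failed_step` field are hypothetical names chosen for illustration and are not taken from the reviewed code; only the state strings come from the finding above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Failed onboarding states that the finding says approve() inspects.
RESUMABLE_FAILURES = {"sign_in_link_failed", "approval_email_failed"}


@dataclass
class Job:  # hypothetical stand-in for the persisted job row
    status: str = "validating"
    abort_requested: bool = False
    validation_run_id: Optional[str] = None


@dataclass
class Application:  # hypothetical stand-in for the application row
    onboarding_state: str
    last_failed_step: Optional[str] = None  # assumed extra durable field


def finish_validation(
    job: Job,
    run_validator: Callable[[], str],
    reload_job: Callable[[Job], None],
) -> Job:
    """Run the validator, then re-check abort before persisting or completing."""
    run_id = run_validator()
    reload_job(job)  # refresh so an abort written mid-validation is visible
    if job.abort_requested:
        job.status = "aborted"  # terminalize as aborted; no run id is stored
        return job
    job.validation_run_id = run_id
    job.status = "completed"
    return job


def retry_onboarding(app: Application) -> Application:
    """Reset to pending while keeping the failed step for step-aware recovery."""
    if app.onboarding_state in RESUMABLE_FAILURES:
        app.last_failed_step = app.onboarding_state
    app.onboarding_state = "pending"
    return app
```

In a real worker the refresh and the terminal write would presumably share one transaction, so an abort cannot land between the check and the completion write; the sketch only shows the ordering of the check.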

P0

aligned

Design for delegation rather than direct manipulation

Delegation is modeled as assigned work rather than manual execution: approve() gates the applicant handoff, then run_cohort_validate.run creates a persistent CohortValidationJob and _execute_with_job performs cloning, selecting, bundling, validating, and persistence. The user/operator supplies intent through application fields such as repo_url and approval state while the workflow executes the operational steps.

P0

aligned

Ensure that background work remains perceptible

Background work is made perceptible through durable CohortValidationJob fields: status, queued_at, cloning_started_at, selecting_started_at, bundling_started_at, validating_started_at, terminal_at, failure_kind, retry_eligible, and abort_requested. _mirror_terminal_to_app also maps terminal job outcomes back to CohortApplication.onboarding_state so the applicant-level record remains inspectable after the worker exits.
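
As a rough illustration of how per-step timestamps keep background work perceptible, the sketch below records a `<step>_started_at` value each time a step begins. `JobProgress` and `start_step` are hypothetical names, and storing the timestamps in a dictionary is an assumption made for brevity; the reviewed model exposes them as separate fields.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional

# Step names as listed in the finding above.
STEPS = ("cloning", "selecting", "bundling", "validating")


@dataclass
class JobProgress:  # hypothetical stand-in for the durable job record
    status: str = "queued"
    started_at: Dict[str, datetime] = field(default_factory=dict)


def start_step(
    job: JobProgress, step: str, now: Optional[datetime] = None
) -> JobProgress:
    """Set the visible status and stamp when the step began."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    job.status = step
    job.started_at[f"{step}_started_at"] = now or datetime.now(timezone.utc)
    return job
```

An operator reading the record can then see both the current step and when it began, without access to the worker process.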

P0

aligned

Align feedback with the user’s level of attention

Feedback is calibrated by layer: routine progress is represented by concise states such as validation_queued, cloning, validating, completed, blocked, failed, and aborted, while elevated attention paths carry failure_kind, safe_display_message, retry_eligible, and onboarding_failure_reason. _summarize_validate_log deliberately compresses validator logs into high-level signals instead of exposing noisy internals by default.

P0

aligned

Apply progressive disclosure to system agency

The code uses progressive disclosure: primary flow state is exposed through CohortApplication.onboarding_state and CohortValidationJob.status, while deeper evidence is stored in audit_object under source, selection, validate, and job. The detailed result is persisted in UserValidationRun.result_json, and _summarize_validate_log avoids dumping raw validation logs into the main status path.

P0

aligned

Replace implied magic with clear mental models

The workflow gives a clear mental model through explicit state and boundary names: TERMINAL_STATUSES, FAILURE_KINDS, onboarding states such as firebase_user_failed and approval_email_failed, and operator messages in retry_failed_validation_job explaining non-terminal and non-retryable conditions. The validation bundle is also labeled with BOUNDARY_HEADER and ENVELOPE_ADVISORY, making the untrusted-code boundary explicit.

P0

aligned

Expose meaningful operational state, not internal complexity

Operational state is expressed in actionable terms rather than raw implementation detail: queued, cloning, selecting, bundling, validating, completed, blocked, failed, and aborted. mark_blocked, mark_failed, and _mirror_terminal_to_app translate tool and dependency failures into failure_kind, safe_display_message, retry_eligible, and onboarding_failure_reason fields that support operator action.

P0

aligned

Establish trust through inspectability

Inspectability is supported by a typed reproducibility envelope and audit trail. build_file_envelope records envelope_schema, boundary_contract, advisory, file metadata, and envelope_hash; _build_implementation_context records each selected file's path, byte_size, and sha256; audit_object stores commit_sha, selected_files, skipped_during_read, bundle_sha256, envelope_hash, validator latency, log_signals, and job_id in UserValidationRun.result_json.
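
The kind of envelope described here can be sketched as follows, assuming the hash is taken over a canonical JSON serialization of the file metadata. The function names, the `example-v1` schema label, and the serialization choice are assumptions for illustration; the reviewed `build_file_envelope` may differ in all of these.

```python
import hashlib
import json
from typing import Dict


def file_entry(path: str, data: bytes) -> dict:
    """Record the path, size, and content hash of one selected file."""
    return {
        "path": path,
        "byte_size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def build_envelope(files: Dict[str, bytes]) -> dict:
    """Build a reproducible envelope: same files in, same envelope_hash out."""
    # Sort by path so input ordering does not change the hash.
    entries = [file_entry(path, files[path]) for path in sorted(files)]
    body = {"envelope_schema": "example-v1", "files": entries}  # assumed label
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    envelope_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {**body, "envelope_hash": envelope_hash}
```

Because the entries are sorted and the JSON is serialized with fixed separators and key order, two runs over identical file contents yield the same `envelope_hash`, which is what lets a reviewer confirm that a stored result refers to the same inputs.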

P0

aligned

Make hand-offs, approvals, and blockers explicit

Approvals, handoffs, and blockers are explicit in the main lifecycle. approve() requires an affirmative confirmation unless the operator passes --yes, reject() records a rejection_reason, external handoff failures are typed as firebase_user_failed, sign_in_link_failed, or approval_email_failed, and validation blockers are persisted through mark_blocked with failure_kind and safe_display_message. retry_failed_validation_job also refuses non-terminal or non-retryable jobs with a specific operator-facing explanation.
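
The retry refusal described above can be illustrated with a small gate that returns a decision together with an operator-facing reason. `check_retry` is a hypothetical name, the message wording is invented, and the membership of `TERMINAL` is an assumption inferred from the statuses listed in this review rather than copied from `TERMINAL_STATUSES`.

```python
from typing import Tuple

# Assumed terminal statuses, inferred from the states named in this review.
TERMINAL = frozenset({"completed", "blocked", "failed", "aborted"})


def check_retry(status: str, retry_eligible: bool) -> Tuple[bool, str]:
    """Return (allowed, explanation) for a retry request on a job."""
    if status not in TERMINAL:
        return False, (
            f"Job is still {status}; wait for it to finish "
            "or abort it before retrying."
        )
    if not retry_eligible:
        return False, f"Job ended as {status} and is not marked retry-eligible."
    return True, "Job can be retried."
```

Returning the reason alongside the decision keeps the blocker explicit: the operator learns why the retry was refused instead of seeing a silent no-op.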

P0

aligned

Represent delegated work as a system, not merely as a conversation

Delegated work is represented as a structured system rather than a message stream. CohortValidationJob links application_id, user_id, repo_url, status, step timestamps, retry_count, abort_requested, validation_run_local_id, commit_sha, bundle_sha256, and selection counts; the execution loop is separated from the persisted UserValidationRun result and audit metadata.

Embed in your README

Two embeddable variants: a small flat shield and a richer score card.

Score card (recommended)

[![Blueprint Readiness Score card](https://aidesignblueprint.com/api/badge/run/0e49f888-1b71-4cb2-bb86-52327681b997/card.svg)](https://aidesignblueprint.com/en/readiness-review/0e49f888-1b71-4cb2-bb86-52327681b997)

Flat badge

[![Blueprint Readiness Score](https://aidesignblueprint.com/api/badge/run/0e49f888-1b71-4cb2-bb86-52327681b997.svg)](https://aidesignblueprint.com/en/readiness-review/0e49f888-1b71-4cb2-bb86-52327681b997)
Baseline and iteration details
Baseline: used · Doctrine: same doctrine · Race: checked clear
Rubric: 2026-05-04 · Grade limited by 0 high-risk findings

Run ID: 0e49f888-1b71-4cb2-bb86-52327681b997 · Results expire after 90 days

Run by agents. Governed by humans. Validated by the AI Design Blueprint.