Governed

Alignment with the doctrine confirmed.

Agent Architecture Review, Validation Snapshot

Evaluated on 12 May 2026 against the AI Design Blueprint doctrine

Production-ready

Status: Aligned

100/100

Grade A

10 aligned
Verdicts by principle

The submission is an autonomous background validation/onboarding workflow with durable job state, explicit approval and blocker states, typed untrusted-code envelopes, audit metadata, retry paths, and abort steering. The current Iter14 changes address the prior P10 gap by adding a post-validation abort re-check before persistence/completion and preserving failed onboarding state for step-aware recovery.

Iteration history

5 previous runs on this artifact. Each run_id opens its own readiness review.

| When | Score | Status | Run ID |
| --- | --- | --- | --- |
| 12 May 2026 (this run) | 100 / A | Aligned | 3ac16b20 |
| 12 May 2026 | 74 / C | High risk | 0e49f888 |
| 12 May 2026 | 74 / C | High risk | 4760459e |
| 12 May 2026 | 74 / C | High risk | 9caf9385 |
| 12 May 2026 | 74 / C | Aligned | 093809b5 |
| 12 May 2026 | 98 / A | Aligned | 4128f700 |
Certified production-ready · Attempt 1/3

The first-pass production_ready verdict is confirmed: the code shows durable job states, explicit terminal/blocked/aborted handling, and retry paths, and the review found no specific missed defect that would currently cause silent wrong results, a crash, or a trust-boundary bypass for a real user.

Findings by principle

10 principles evaluated. Verdict, severity, evidence, and recommendation for each.

P0

Aligned

Design for delegation rather than direct manipulation

Delegation is represented as assignment of work rather than manual execution: `approve()` records approval and hands off to `run_cohort_validate`, while `_execute_with_job()` autonomously performs cloning, selection, bundling, validation, audit construction, and persistence. Scope and constraints are explicit through `repo_url`, `ValidationContext(repository=namespace, files=...)`, `select_agentic_surface()`, byte/file limits, and job lifecycle fields. This maintains the prior aligned verdict.

P0

Aligned

Ensure that background work remains perceptible

Background work remains perceptible through persistent `CohortValidationJob` records with `status`, `queued_at`, per-step timestamps (`cloning_started_at`, `selecting_started_at`, `bundling_started_at`, `validating_started_at`), `terminal_at`, `failure_kind`, `safe_display_message`, `retry_eligible`, and `abort_requested`. `_mirror_terminal_to_app()` also projects terminal job state back onto `CohortApplication.onboarding_state`. This maintains the prior aligned verdict.
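The job record described above can be pictured as a small data class. This is a minimal sketch, not the reviewed implementation: the field names are taken from the finding, while the types, defaults, and the `start_step()` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Steps that have a matching "<step>_started_at" timestamp in the finding.
TIMED_STEPS = ("cloning", "selecting", "bundling", "validating")


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class CohortValidationJob:
    # Field names come from the review text; types and defaults are assumed.
    status: str = "queued"
    queued_at: datetime = field(default_factory=_now)
    cloning_started_at: Optional[datetime] = None
    selecting_started_at: Optional[datetime] = None
    bundling_started_at: Optional[datetime] = None
    validating_started_at: Optional[datetime] = None
    terminal_at: Optional[datetime] = None
    failure_kind: Optional[str] = None
    safe_display_message: Optional[str] = None
    retry_eligible: bool = False
    abort_requested: bool = False

    def start_step(self, step: str) -> None:
        """Hypothetical helper: enter a step and stamp its start time."""
        if step not in TIMED_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.status = step
        setattr(self, f"{step}_started_at", _now())
```

Because every step leaves a timestamp behind, an operator can tell how far a background job got without reading logs.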

P0

Aligned

Align feedback with the user’s level of attention

Feedback is calibrated by separating routine progress (`queued`, `cloning`, `selecting`, `bundling`, `validating`) from intervention-worthy outcomes (`blocked`, `failed`, `aborted`) and by recording concise user/operator messages in `safe_display_message` and `onboarding_failure_reason`. Diagnostic detail is kept in the audit object and `_summarize_validate_log()` rather than pushed into primary state. This maintains the prior aligned verdict.
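The split between routine progress and intervention-worthy outcomes can be sketched as a simple status classification. The status strings are the ones named in the finding; the `needs_attention()` and `primary_message()` helpers are hypothetical and only illustrate the idea.

```python
from typing import Optional

# Status names as listed in the finding above.
ROUTINE_STATUSES = frozenset(
    {"queued", "cloning", "selecting", "bundling", "validating"}
)
ATTENTION_STATUSES = frozenset({"blocked", "failed", "aborted"})


def needs_attention(status: str) -> bool:
    """Return True only for outcomes a human should act on."""
    return status in ATTENTION_STATUSES


def primary_message(status: str, safe_display_message: Optional[str]) -> Optional[str]:
    """Hypothetical helper: surface a short message only when attention is needed.

    Routine progress returns None, so diagnostic detail stays in the audit data.
    """
    if not needs_attention(status):
        return None
    return safe_display_message or f"Validation {status}"
```

In this sketch a status such as `completed` falls into neither set and produces no message, which keeps successful runs quiet.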

P0

Aligned

Apply progressive disclosure to system agency

The workflow uses progressive disclosure: primary state is compact (`CohortApplication.onboarding_state`, `CohortValidationJob.status`, `failure_kind`), while deeper inspection is available in the persisted `audit_object` with `source`, `selection`, `validate`, and `job` sections. `selected_files`, `skipped_during_read`, `bundle_sha256`, `envelope_hash`, and summarized validation logs are available without overwhelming the main lifecycle state. This maintains the prior aligned verdict.

P0

Aligned

Replace implied magic with clear mental models

The system exposes a clear mental model with named lifecycle states and failure categories: `STEPS`, `TERMINAL_STATUSES`, `FAILURE_KINDS`, onboarding states such as `firebase_user_failed`, `sign_in_link_failed`, `approval_email_failed`, `validation_queued`, and `validation_complete`, plus explicit comments documenting the uncancellable single LLM-call limitation in `_execute_with_job()`. The untrusted-code boundary is also made explicit through `BOUNDARY_HEADER`, `BOUNDARY_CONTRACT_VERSION`, and `ENVELOPE_ADVISORY`. This maintains the prior aligned verdict.

P0

Aligned

Expose meaningful operational state, not internal complexity

Operational state is user/action relevant rather than raw internals: application-level state is mirrored to `validation_complete` or `validation_failed`, while job-level state uses meaningful labels like `queued`, `blocked`, `failed`, `aborted`, and `completed`. Low-level diagnostic data such as file hashes, commit SHA, selected/skipped files, and usage/log summaries is reserved for the audit payload instead of being the primary status surface. This maintains the prior aligned verdict.

P0

Aligned

Establish trust through inspectability

Inspectability is supported by a typed file envelope and audit trail: `build_file_envelope()` canonicalizes files and computes `envelope_hash`; `wrap_bundle_with_boundary()` marks the bundle as untrusted input; `_execute_with_job()` records `commit_sha`, `selected_files` with content hashes, `bundle_sha256`, `skipped_during_read`, `latency_ms`, `log_signals`, `boundary_contract_version`, and `envelope_schema` into `result_dict['audit']`. This provides traceability from validation result back to source material and selection decisions. This maintains the prior aligned verdict.
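One way to get a stable `envelope_hash` is to canonicalize the file list before hashing it. The sketch below assumes a `build_file_envelope()` that takes a path-to-content mapping; the actual signature and envelope schema in the reviewed code are not shown in the finding, so the parameter shape and the `files`, `sha256`, and `bytes` keys are assumptions.

```python
import hashlib
import json
from typing import Dict


def build_file_envelope(files: Dict[str, str]) -> dict:
    """Sketch: canonicalize files and derive a deterministic envelope hash."""
    entries = []
    # Sorting by path makes the result independent of input ordering.
    for path, content in sorted(files.items()):
        raw = content.encode("utf-8")
        entries.append(
            {
                "path": path,
                "sha256": hashlib.sha256(raw).hexdigest(),
                "bytes": len(raw),
            }
        )
    # Compact, key-sorted JSON gives one byte representation per file set.
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    envelope_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"files": entries, "envelope_hash": envelope_hash}
```

With this approach the same set of files always yields the same hash, so an audit record can be checked against the source material later.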

P0

Aligned

Make hand-offs, approvals, and blockers explicit

Approvals, handoffs, and blockers are explicit. `approve()` requires operator confirmation unless `--yes` is supplied, then performs discrete external handoffs to Firebase, sign-in-link generation, email, and validation, each with separate failure states (`firebase_user_failed`, `sign_in_link_failed`, `approval_email_failed`, `validation_failed`). Validation blockers are classified via `mark_blocked()` with concrete `failure_kind` values such as `invalid_repo_url`, `no_supported_language`, `selector_rejected`, and `read_failed`. This maintains the prior aligned verdict.
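The blocker classification can be illustrated with a `mark_blocked()` that accepts only known failure kinds. The four `failure_kind` values are quoted from the finding; the `Job` class, the function signature, and the rejection of unknown kinds are assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Blocker kinds named in the finding; the real set may be larger.
BLOCKER_KINDS = frozenset(
    {"invalid_repo_url", "no_supported_language", "selector_rejected", "read_failed"}
)


@dataclass
class Job:
    # Minimal stand-in for the job record, not the reviewed model.
    status: str = "queued"
    failure_kind: Optional[str] = None
    safe_display_message: Optional[str] = None


def mark_blocked(job: Job, failure_kind: str, message: str) -> Job:
    """Sketch: record an explicit, classified blocker on the job."""
    if failure_kind not in BLOCKER_KINDS:
        # Refusing unknown kinds keeps the vocabulary of blockers closed.
        raise ValueError(f"unknown failure_kind: {failure_kind}")
    job.status = "blocked"
    job.failure_kind = failure_kind
    job.safe_display_message = message
    return job
```

A closed set of kinds means every blocked job can be explained by a reason the operator already knows.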

P0

Aligned

Represent delegated work as a system, not merely as a conversation

Delegated work is represented as a structured system rather than a conversation: `CohortValidationJob` models the workflow, dependencies, step timestamps, terminal status, retry eligibility, abort intent, selected counts, bundle hash, commit SHA, and validation run linkage. `_execute_with_job()` advances this state machine across cloning, selecting, bundling, validating, persistence, completion, and failure handling. This maintains the prior aligned verdict.

P0

Aligned

Optimise for steering, not only initiating

Steering primitives are now present for both validation and onboarding recovery. `request_abort()` persists `abort_requested=True`; `_execute_with_job()` checks for abort before validation and, in the Iter14 change, refreshes the job and re-checks `job.abort_requested` immediately after `asyncio.run(validate_code_against_principles(...))` returns and before creating `UserValidationRun` or calling `mark_completed()`, discarding validator output on abort. `retry_failed_validation_job()` requires the latest job to be terminal and `retry_eligible`, while `retry_onboarding_handoff()` preserves the failed state (`firebase_user_failed`, `sign_in_link_failed`, or `approval_email_failed`) and clears…
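The post-validation abort re-check described above can be sketched as follows. This is a simplified model under stated assumptions: an in-memory `STORE` stands in for the database, `validate_stub()` stands in for `validate_code_against_principles(...)`, and `execute()` stands in for the relevant part of `_execute_with_job()`. None of these bodies come from the reviewed code.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    job_id: str
    status: str = "validating"
    abort_requested: bool = False
    result: Optional[dict] = None


STORE: Dict[str, Job] = {}  # stand-in for persistent job storage


def request_abort(job_id: str) -> None:
    STORE[job_id].abort_requested = True


def refresh(job_id: str) -> Job:
    # A real implementation would reload the row from the database.
    return STORE[job_id]


async def validate_stub(job_id: str, abort_during_call: bool) -> dict:
    await asyncio.sleep(0)  # the single validation call cannot be cancelled
    if abort_during_call:
        request_abort(job_id)  # simulates an operator aborting mid-call
    return {"score": 100}


def execute(job_id: str, abort_during_call: bool = False) -> Job:
    job = refresh(job_id)
    if job.abort_requested:  # check before validation
        job.status = "aborted"
        return job
    output = asyncio.run(validate_stub(job_id, abort_during_call))
    job = refresh(job_id)
    if job.abort_requested:  # re-check after validation returns
        job.status = "aborted"  # validator output is discarded
        return job
    job.result = output  # persist only when no abort was requested
    job.status = "completed"
    return job
```

The second check matters because an abort that arrives while the validator is running would otherwise be overwritten by a completed result.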

Add to your README

Two embeddable variants: a small badge and a richer card.

Score card (recommended)

Blueprint Readiness Score card
[![Blueprint Readiness Score card](https://aidesignblueprint.com/api/badge/run/3ac16b20-88b8-4448-a4f6-5aa738b2919b/card.svg)](https://aidesignblueprint.com/en/readiness-review/3ac16b20-88b8-4448-a4f6-5aa738b2919b)

Flat badge

Blueprint Readiness Score badge
[![Blueprint Readiness Score](https://aidesignblueprint.com/api/badge/run/3ac16b20-88b8-4448-a4f6-5aa738b2919b.svg)](https://aidesignblueprint.com/en/readiness-review/3ac16b20-88b8-4448-a4f6-5aa738b2919b)
Baseline and iteration details
Baseline: used · Doctrine: same doctrine · Race: checked clear

Iteration delta

Improvements (1)

P10 · Optimise for steering, not only initiating · needs_changes → aligned
Rubric: 2026-05-04

Run ID: 3ac16b20-88b8-4448-a4f6-5aa738b2919b · Results expire after 90 days

Run by agents. Governed by humans. Validated by the AI Design Blueprint.