Governed

Alignment confirmed with the doctrine.

Agent Architecture Review, Validation snapshot

Evaluated 7 May 2026 against the AI Design Blueprint doctrine

Production-ready

Status: Aligned

100/100

Grade A

10 aligned

Blueprint Readiness measures doctrine alignment, not runtime correctness. A production-ready verdict means the architecture embodies the 10 principles; it does not run your tests or check your types. Layer it on top of your test suite, not in place of it.

Per-principle verdicts

The submission is an autonomous governed-agent workflow with persistent run state, approval gates, steering commands, leases, and an auditable event/evidence ledger. The Iter4 changes add load-bearing primitives rather than wrappers: LeaseLost now aborts pause waits, a pre-executor lease/state fence prevents actions under stolen/reaped leases, and audit verification is bidirectional across redaction markers and evidence rows. All applicable principles are aligned.

Iteration history

1 prior run on this artifact. Each run_id opens its own readiness review.

Scores can move up or down between iterations: the validator's reasoning is not strictly deterministic, so the same artifact can score differently across runs. The per-principle deltas below show the substantive change.

| When | Score | Tier | Run ID |
| --- | --- | --- | --- |
| 7 May 2026 (this run) | 100 / A | Production-ready | ffcc637e |
| 7 May 2026 | 74 / C | Emerging | 37cc23be |

Certified production-ready · Attempt 1/3

Confirmed: the visible Iter4 code addresses the prior high-risk seams through LeaseLost propagation during pause waits, a pre-executor lease fence, and bidirectional audit/evidence verification. I found no missed production blocker that can be tied to specific code.

Per-principle findings

10 principles evaluated. Verdict, severity, evidence and recommendation for each.

P1

aligned

Design for delegation rather than direct manipulation

Delegation is explicit: `GovernedActionDispatcher.dispatch` executes bounded `GovernedAction` objects through the single `perform_action` tool seam, while `GovernedPolicy.is_action_permitted` and `permitted_action_scope` constrain authority. Users/operators govern the work through durable pause/resume/abort commands drained in `GovernedRunHooks._drain_steering`, rather than manually executing each step.

P2

aligned

Ensure that background work remains perceptible

Background work remains perceptible through persistent state and heartbeat primitives: `_checkpoint` calls `heartbeat(...)` before work and emits `heartbeat` events, `_wait_for_resume` continues heartbeating while paused, and `reap_stale_runs` delegates to `reap_stale_leases` for watchdog recovery. `status()` and `timeline()` expose current state and event history after the user returns.

P3

aligned

Align feedback with the user’s level of attention

Feedback is calibrated by event type and attention need: routine progress is throttled via `heartbeat_every_n_turns`, while high-attention conditions produce explicit events such as `scope_violation`, `approval_binding_mismatch`, `tool_call_failed`, pause/resume transitions, and abort messages. The submitted unchanged CLI summary notes event de-duplication by `(kind, seq)` for attention-required announcements, preserving signal without noisy repeats.
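
The calibration described above can be illustrated with a minimal sketch. It is not the submission's code: the `Announcer` class, the `should_announce` method, and the `ATTENTION_KINDS` set are hypothetical names, and only `heartbeat_every_n_turns`, the event kinds, and the `(kind, seq)` de-duplication key come from the finding.

```python
from dataclasses import dataclass, field

# Event kinds the finding names as high-attention; grouping them in a set is an assumption.
ATTENTION_KINDS = frozenset(
    {"scope_violation", "approval_binding_mismatch", "tool_call_failed"}
)


@dataclass(frozen=True)
class Event:
    kind: str
    seq: int
    message: str = ""


@dataclass
class Announcer:
    """Hypothetical helper: decides whether an event should reach the operator."""

    heartbeat_every_n_turns: int = 5
    _seen: set = field(default_factory=set)

    def should_announce(self, event: Event, turn: int) -> bool:
        if event.kind in ATTENTION_KINDS:
            key = (event.kind, event.seq)
            if key in self._seen:
                return False  # same (kind, seq) already announced
            self._seen.add(key)
            return True
        if event.kind == "heartbeat":
            # Routine progress is throttled to every n-th turn.
            return turn % self.heartbeat_every_n_turns == 0
        return False
```

In this sketch a repeated `scope_violation` with the same `seq` is announced once, while heartbeats surface only on every n-th turn.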

P4

aligned

Apply progressive disclosure to system agency

The observability layer separates default status from deeper inspection: `status()` returns a compact `RunStatus` with `state`, `is_terminal`, `latest_message`, and final/failure digests, while `timeline()` exposes detailed event rows and `evidence()` exposes sidecar evidence when deeper audit is required. This is progressive disclosure rather than forcing raw ledger detail into the primary status path.
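
As a rough illustration of this split, the sketch below derives a compact status from the latest event and keeps the full rows behind a separate call. `RunLedger`, `EventRow`, and `TERMINAL_STATES` are hypothetical; the field names on `RunStatus` follow the finding, and the digest fields are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Assumption: only the terminal states named in this review are listed here.
TERMINAL_STATES = frozenset({"ABORTED_BY_USER", "FAILED"})


@dataclass(frozen=True)
class EventRow:
    seq: int
    kind: str
    state: str
    message: str
    data: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class RunStatus:
    state: str
    is_terminal: bool
    latest_message: str


class RunLedger:
    """Hypothetical in-memory ledger with a compact view and a detailed view."""

    def __init__(self) -> None:
        self._events: List[EventRow] = []

    def append(self, kind: str, state: str, message: str, data=None) -> None:
        row = EventRow(len(self._events), kind, state, message, data or {})
        self._events.append(row)

    def status(self) -> RunStatus:
        # Default view: only what an operator needs at a glance.
        if not self._events:
            raise ValueError("run has no events yet")
        last = self._events[-1]
        return RunStatus(last.state, last.state in TERMINAL_STATES, last.message)

    def timeline(self) -> List[EventRow]:
        # Deeper view: every event row, requested explicitly.
        return list(self._events)
```

The design choice being illustrated is that `status()` never returns raw event data, so callers must opt in to detail through `timeline()`.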

P5

aligned

Replace implied magic with clear mental models

The code replaces magic with explicit operating rules: the `perform_action` tool description lists permitted action names and states that submission-capable actions require operator approval, `ScopeViolation` makes out-of-scope execution fail closed, `ApprovalBindingMismatch` explains approval/action drift, and `on_handoff` transitions to `FAILED` with `reason_kind: unsupported_handoff`.

P6

aligned

Expose meaningful operational state, not internal complexity

Operational state is represented with user-relevant run states rather than only internal mechanics: transitions use `RunState.IN_PROGRESS`, `PAUSED`, `ABORTED_BY_USER`, and `FAILED`, while `status()` returns `state`, `is_terminal`, and `latest_message`. Lower-level details such as `entry_hash`, evidence digests, and raw sidecar rows are reserved for `timeline()`, `evidence()`, and `verify_audit()`.

P7

aligned

Establish trust through inspectability

Inspectability is backed by concrete audit primitives. `verify_audit` replays the hash chain using `prev_hash` and `_recompute_entry_hash`, validates genesis `policy_digest` against `runs.policy_digest`, recomputes `sha256(ev.value_json)` for each evidence row, requires marker digest agreement, derives `required_markers` by scanning every redacted marker in every `event.data`, and rejects unless `seen_markers == set(required_markers.keys())`. Delta: this improves the prior P7 finding by adding the bidirectional marker/evidence reconciliation that was previously missing.
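
A minimal sketch of the checks listed above, under stated assumptions: the event and evidence row shapes, the `redacted_marker`/`sha256` marker format, the all-zero genesis hash, and the helper names `recompute_entry_hash`, `append_event`, and `find_markers` are all hypothetical. The genesis `policy_digest` check is left out. It is meant only to show hash-chain replay plus bidirectional marker/evidence reconciliation, not the submission's `verify_audit`.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumption: fixed prev_hash for the first entry


def recompute_entry_hash(prev_hash, seq, kind, data):
    # Canonical JSON so the same content always hashes to the same value.
    payload = json.dumps(
        {"prev_hash": prev_hash, "seq": seq, "kind": kind, "data": data},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(events, kind, data):
    prev = events[-1]["entry_hash"] if events else GENESIS_PREV
    seq = len(events)
    entry_hash = recompute_entry_hash(prev, seq, kind, data)
    events.append({"seq": seq, "kind": kind, "data": data,
                   "prev_hash": prev, "entry_hash": entry_hash})


def find_markers(value):
    """Recursively collect {marker_id: digest} from redaction markers."""
    found = {}
    if isinstance(value, dict):
        if "redacted_marker" in value and "sha256" in value:
            found[value["redacted_marker"]] = value["sha256"]
        for child in value.values():
            found.update(find_markers(child))
    elif isinstance(value, list):
        for child in value:
            found.update(find_markers(child))
    return found


def verify_audit(events, evidence_rows):
    prev = GENESIS_PREV
    required_markers = {}
    for ev in events:
        if ev["prev_hash"] != prev:
            return False  # chain link broken
        expected = recompute_entry_hash(prev, ev["seq"], ev["kind"], ev["data"])
        if expected != ev["entry_hash"]:
            return False  # entry content was altered
        prev = ev["entry_hash"]
        required_markers.update(find_markers(ev["data"]))

    seen_markers = set()
    for row in evidence_rows:
        digest = hashlib.sha256(row["value_json"].encode("utf-8")).hexdigest()
        if required_markers.get(row["marker"]) != digest:
            return False  # orphan evidence row or digest disagreement
        seen_markers.add(row["marker"])

    # Every marker in the ledger must have evidence, and vice versa.
    return seen_markers == set(required_markers.keys())
```

The final set comparison is what makes the check bidirectional: a marker without an evidence row fails it, and an evidence row without a marker is rejected earlier in the loop.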

P8

aligned

Make hand-offs, approvals, and blockers explicit

Approvals, handoffs, and blockers are explicit. `evaluate_approval` and `wait_for_decision` gate governed actions, `compute_action_digest` binds the approved action to the executable action, and `ApprovalBindingMismatch` stops drift. Unsupported handoffs are not silently followed: `on_handoff` records a `FAILED` transition with from/to agent names and raises `UnsupportedHandoff`. Scope and lease failures emit typed events such as `scope_violation` and `lease_fence_failed`.
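
The approval binding can be pictured with the following sketch. The names `compute_action_digest` and `ApprovalBindingMismatch` appear in the finding, but their bodies here are assumptions, as are the dictionary shape of an action and the `check_binding` helper.

```python
import hashlib
import hmac
import json


class ApprovalBindingMismatch(Exception):
    """Raised when the action to execute differs from the approved one."""


def compute_action_digest(action: dict) -> str:
    # Assumption: an action is a JSON-serialisable dict; keys are sorted so
    # that field order does not change the digest.
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_binding(approved_digest: str, action: dict) -> None:
    """Hypothetical gate: refuse to run anything the operator did not approve."""
    actual = compute_action_digest(action)
    if not hmac.compare_digest(approved_digest, actual):
        raise ApprovalBindingMismatch(
            f"approved {approved_digest[:12]} but got {actual[:12]}"
        )
```

Under these assumptions, an operator approves the digest of a specific action, and any later change to the action's name or arguments causes `check_binding` to raise before execution.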

P9

aligned

Represent delegated work as a system, not merely as a conversation

Delegated work is modeled as a system: the submission uses persistent `run_id` state, a run ledger, `state_transition` events, evidence sidecars, leases, watchdog reaping, steering commands, and a single governed tool seam. `timeline()` returns structured event records with `seq`, `kind`, `state`, `message`, `data`, and `entry_hash`, so execution is inspectable as stateful orchestration rather than merely as a chat transcript.

P10

aligned

Optimise for steering, not only initiating

Steering is supported by durable safe-boundary controls. `GovernedRunHooks._drain_steering` handles `abort` and `pause`, `_wait_for_resume` accepts `resume` or `abort` while continuing heartbeats, and transitions use `expected_worker_id` to prevent phantom writes. The Iter4 changes close the previous steering race: `LeaseLost` in `_wait_for_resume` now raises `AbortRequested` instead of returning silently, and `GovernedActionDispatcher.dispatch` performs a `heartbeat(...)` lease/state fence immediately before `await self._executor(action)`, recording `lease_fence_failed` and raising `ScopeViolation` if the lease is invalid. Delta: this improves the prior P10 finding by addressing both the sw…
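
The pre-executor fence can be sketched as follows. This is an illustration under assumptions, not the submission's dispatcher: `LeaseStore` is a hypothetical in-memory stand-in for the lease table, `Dispatcher` is a simplified analogue of `GovernedActionDispatcher`, and the event shape is invented for the example.

```python
import asyncio


class LeaseLost(Exception):
    pass


class ScopeViolation(Exception):
    pass


class LeaseStore:
    """Hypothetical in-memory lease table tracking the current owner."""

    def __init__(self, owner: str) -> None:
        self.owner = owner

    def heartbeat(self, worker_id: str) -> None:
        # Renewing doubles as a fence: it fails if another worker owns the lease.
        if self.owner != worker_id:
            raise LeaseLost(f"lease held by {self.owner}, not {worker_id}")


class Dispatcher:
    def __init__(self, leases, worker_id, executor, events) -> None:
        self._leases = leases
        self._worker_id = worker_id
        self._executor = executor
        self._events = events

    async def dispatch(self, action):
        try:
            # Fence immediately before the side effect, not earlier in the turn.
            self._leases.heartbeat(self._worker_id)
        except LeaseLost as exc:
            self._events.append({"kind": "lease_fence_failed", "action": action})
            raise ScopeViolation("lease no longer held; refusing to execute") from exc
        return await self._executor(action)
```

Placing the check directly before `await self._executor(action)` narrows the window in which a reaped or stolen lease could still produce a side effect; it does not remove the window entirely unless the executor itself is also fenced.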

Embed in your README

Two embeddable variants: a small flat shield and a richer score card.

Score card (recommended)

[![Blueprint Readiness Score card](https://aidesignblueprint.com/api/badge/run/ffcc637e-1280-4f34-8fed-57c7359d7466/card.svg)](https://aidesignblueprint.com/en/readiness-review/ffcc637e-1280-4f34-8fed-57c7359d7466)

Flat badge

[![Blueprint Readiness Score](https://aidesignblueprint.com/api/badge/run/ffcc637e-1280-4f34-8fed-57c7359d7466.svg)](https://aidesignblueprint.com/en/readiness-review/ffcc637e-1280-4f34-8fed-57c7359d7466)
Baseline and iteration details
Baseline: used · Doctrine: same doctrine · Race: checked clear

Iteration delta

2 closed this pass · 0 reopened · 0 high-risk findings still open

Improvements (2)

P7 · Establish trust through inspectability · needs_changes → aligned
P10 · Optimise for steering, not only initiating · needs_changes → aligned
Rubric: 2026-05-04

Run ID: ffcc637e-1280-4f34-8fed-57c7359d7466 · Results expire after 90 days

Run by agents. Governed by humans. Validated by the AI Design Blueprint.