Assessment complete; awaiting evidence revision.
Evaluated 14 May 2026 against the AI Design Blueprint doctrine
Status: High Risk
7/100
Grade F
This is an autonomous agentic workflow with the shape of a governed dispatcher, but its load-bearing trust boundaries are explicitly bypassable. The SDK-visible `bypass_validation` path invokes `dispatcher._executor(action)` directly, `_OPERATOR_OVERRIDE_TOKEN` in `target_text` disables approval rules, and the `action:` base64 prefix lets content rewrite `action.name`. Those are production-blocking failures for delegation, mental model clarity, inspectability, approvals, and steering.
Per-principle findings
10 principles evaluated. Verdict, severity, evidence and recommendation for each.
P0
high risk · production blocker · 100/100 · Make hand-offs, approvals, and blockers explicit
The approval path exists nominally through `evaluate_approval` and `wait_for_decision`, but it is bypassable in two ways: `perform_action(..., bypass_validation=True)` skips the dispatcher entirely, and `_OPERATOR_OVERRIDE_TOKEN in action.target_text` sets `rules = ()`, so `evaluate_approval` returns no required approval. There is also no explicit `AWAITING_APPROVAL` state before `wait_for_decision`, and `wait_for_decision` currently returns `True`, so hand-offs and blockers are not reliably explicit.
Recommendation
Remove model/content-controlled override paths and require approvals through an authenticated, external approval service bound to a signed action digest. Persist `AWAITING_APPROVAL`, `BLOCKED`, and denial states before pausing or aborting execution.
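A minimal sketch of what binding an approval to a signed action digest could look like. The `Action` shape, the function names, and the use of HMAC-SHA256 with a shared key are all assumptions for illustration; the reviewed code's real types are not shown in this report, and an asymmetric signature would let the dispatcher verify approvals without being able to mint them.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Action:
    # Hypothetical minimal shape; the reviewed `Action` may carry more fields.
    name: str
    target_text: str


def action_digest(run_id: str, action: Action) -> str:
    """Full SHA-256 over a canonical JSON encoding of the run and the action."""
    canonical = json.dumps(
        {"run_id": run_id, "action": asdict(action)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sign_approval(key: bytes, digest: str, decision: str) -> str:
    """Intended to run inside the approval service; the agent never sees `key`."""
    message = f"{digest}:{decision}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def approval_is_valid(key: bytes, run_id: str, action: Action, signature: str) -> bool:
    """True only for an 'approved' decision signed over this exact run and action."""
    expected = sign_approval(key, action_digest(run_id, action), "approved")
    return hmac.compare_digest(expected, signature)
```

Because the digest covers every action field and the run identifier, an approval issued for one action cannot be replayed against a rewritten `name` or a different run, and nothing inside `target_text` can stand in for the signature.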
P0
high risk · production blocker · 98/100 · Replace implied magic with clear mental models
The tool docstring says calls go through the dispatcher, but `perform_action` contradicts that by exposing `bypass_validation`. The tool description tells the agent to set `bypass_validation=True` for an operator override even though no authenticated operator check is present. Separately, `_OPERATOR_OVERRIDE_TOKEN` is only a substring in `action.target_text`, and `action:` base64 content can rewrite `action.name`. These hidden behavior changes make permissions and execution conditions impossible for a user to reason about.
Recommendation
Make the mental model true in code: one governed path, no prompt/text override tokens, no SDK-exposed bypass flag, and no content-driven action-name migration. Operator overrides should be authenticated policy decisions outside the model-controlled tool arguments.
P0
high risk · production blocker · 97/100 · Establish trust through inspectability
Inspectability is defeated by several concrete paths: `append_event` is only `pass`, `bypass_validation=True` executes `dispatcher._executor(action)` with no audit event, `_OPERATOR_OVERRIDE_TOKEN` removes approval rules without recording an override decision, and `action.target_text.startswith("action:")` mutates `action.name` before policy/approval evaluation. `compute_action_digest` is a short base64 slice of concatenated fields, not a tamper-evident approval binding.
Recommendation
Put audit authority outside the dispatcher process: every action attempt, policy decision, approval decision, override, lease check, and executor result should be written to a durable tamper-evident ledger before/after side effects. Use a typed, signed approval envelope rather than a short content-derived digest.
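One common way to make an event log tamper-evident is a hash chain, where each entry commits to the hash of the previous one. The sketch below is illustrative only: `HashChainLedger` is a hypothetical name, it is in-memory rather than durable, and in practice the latest hash would need to be anchored somewhere the dispatcher process cannot rewrite.

```python
import hashlib
import json


class HashChainLedger:
    """Append-only in-memory ledger; each entry commits to the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _hash(body: dict) -> str:
        encoded = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(encoded.encode("utf-8")).hexdigest()

    def append(self, run_id: str, event_type: str, payload: dict) -> str:
        """Record an event and return its hash, which chains to the prior entry."""
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = {
            "run_id": run_id,
            "event_type": event_type,
            "payload": payload,
            "prev_hash": prev_hash,
        }
        entry_hash = self._hash(body)
        self._entries.append({**body, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or self._hash(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Writing an entry before the executor runs and another after it returns gives an auditor both the attempt and the outcome, even if the process dies between the two.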
P0
high risk · production blocker · 95/100 · Design for delegation rather than direct manipulation
The code appears to model delegated authority with `GovernedPolicy`, `ApprovalRule`, and `GovernedActionDispatcher.dispatch`, but the SDK-visible `perform_action(..., bypass_validation: bool = False)` lets the model choose `bypass_validation=True` and call `dispatcher._executor(action)` directly. That skips scope checks, approval gates, lease fences, and audit events. In addition, `dispatch` mutates `action.name` from content-controlled `target_text` when it starts with `action:`, so the delegated action boundary is not stable.
Recommendation
Remove all agent/content-controlled bypasses. Make the executor private behind a single non-bypassable dispatch boundary, and treat action identity as an immutable, typed value supplied outside untrusted page/content text.
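The shape of a single non-bypassable path might look roughly like the following. This is a sketch under assumptions: the class and function names are hypothetical, the scope check is reduced to an allow-list, and Python name mangling is only a convention, so real isolation would need a process or service boundary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    name: str         # fixed at construction; frozen dataclasses reject reassignment
    target_text: str  # untrusted content: carried as data, never parsed for control


class Dispatcher:
    def __init__(self, executor: Callable[[Action], str], allowed: frozenset[str]) -> None:
        # Double underscore discourages outside access; it is not a security boundary.
        self.__executor = executor
        self._allowed = allowed

    def dispatch(self, action: Action) -> str:
        """The only route to the executor; scope is checked on every call."""
        if action.name not in self._allowed:
            raise PermissionError(f"action {action.name!r} is outside the delegated scope")
        return self.__executor(action)


def make_perform_action(dispatcher: Dispatcher) -> Callable[[str, str], str]:
    """Build the tool entry point exposed to the agent."""

    def perform_action(name: str, target_text: str) -> str:
        # No bypass flag exists, so the model has nothing to set.
        return dispatcher.dispatch(Action(name=name, target_text=target_text))

    return perform_action
```

With this shape, a `target_text` beginning with `action:` is just a string: the action name the policy evaluates is the same one the executor receives.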
P0
high risk · production blocker · 92/100 · Optimise for steering, not only initiating
The workflow supports initiation through `perform_action`, but it does not provide safe steering primitives such as pause, abort, retry, rollback, or constraint update. The only dynamic controls shown are unsafe ones controlled by the agent or page content: `bypass_validation`, `_OPERATOR_OVERRIDE_TOKEN`, and `action:` rewriting of `action.name`. Those mechanisms let execution escape governance rather than giving the user/operator controlled mid-process steering.
Recommendation
Replace prompt/tool-argument overrides with explicit operator steering commands stored against `run_id`: pause, resume, abort, retry, update constraints, and approve/deny. These commands should be authenticated, audited, and checked before each executor side effect.
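As a rough illustration, steering commands can live in a store keyed by `run_id` that the execution path reads immediately before each side effect. Everything named here (`Command`, `SteeringStore`, `guarded_execute`) is hypothetical, only pause, resume, and abort are covered, and operator authentication is assumed to happen before `issue` is called.

```python
from enum import Enum
from typing import Callable, Optional


class Command(Enum):
    RESUME = "resume"
    PAUSE = "pause"
    ABORT = "abort"


class SteeringStore:
    """In-memory stand-in for a store that only authenticated operators can write."""

    def __init__(self) -> None:
        self._latest: dict[str, Command] = {}
        self.audit_log: list[tuple[str, str, str]] = []

    def issue(self, run_id: str, operator_id: str, command: Command) -> None:
        # Assumption: the caller has already authenticated `operator_id`.
        if self._latest.get(run_id) is Command.ABORT:
            raise ValueError(f"run {run_id} was aborted and cannot be steered further")
        self._latest[run_id] = command
        self.audit_log.append((run_id, operator_id, command.value))

    def latest(self, run_id: str) -> Command:
        return self._latest.get(run_id, Command.RESUME)


def guarded_execute(
    store: SteeringStore,
    run_id: str,
    executor: Callable[[str], str],
    action: str,
) -> tuple[str, Optional[str]]:
    """Consult the operator's latest command right before the side effect."""
    command = store.latest(run_id)
    if command is Command.ABORT:
        return ("aborted", None)
    if command is Command.PAUSE:
        return ("paused", None)
    return ("executed", executor(action))
```

The agent can read the outcome of `guarded_execute` but has no way to write to the store, which is what separates operator steering from the text- and argument-driven overrides described in the finding.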
P0
needs changes · production blocker · 75/100 · Ensure that background work remains perceptible
The workflow has `RunState` and `append_event`, but `append_event` is a no-op (`pass`), all dispatcher events use `state=RunState.IN_PROGRESS` even for failures or approval-related conditions, and `_drive_agent_loop` simply returns `RunState.IN_PROGRESS`. The `bypass_validation` path performs executor side effects with no event at all, so background work can become invisible.
Recommendation
Move run state to a durable store or ledger that is outside the execution loop, and require every executor call to emit atomic state transitions such as `AWAITING_APPROVAL`, `BLOCKED`, `FAILED`, or `COMPLETED` before and after side effects.
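The before-and-after transition pattern can be sketched as follows. The `RunState` member names come from this report, but their values, the `RunStore` class, and `run_action` are assumptions, and the in-memory list stands in for a durable store.

```python
from enum import Enum
from typing import Callable


class RunState(Enum):
    IN_PROGRESS = "in_progress"
    AWAITING_APPROVAL = "awaiting_approval"
    BLOCKED = "blocked"
    FAILED = "failed"
    COMPLETED = "completed"


TERMINAL = frozenset({RunState.FAILED, RunState.COMPLETED})


class RunStore:
    """In-memory stand-in for a durable store kept outside the agent loop."""

    def __init__(self) -> None:
        self._current: dict[str, RunState] = {}
        self.history: list[tuple[str, RunState, str]] = []

    def transition(self, run_id: str, state: RunState, event_type: str) -> None:
        if self._current.get(run_id) in TERMINAL:
            raise ValueError(f"run {run_id} is already in a terminal state")
        self._current[run_id] = state
        self.history.append((run_id, state, event_type))

    def current(self, run_id: str) -> RunState:
        return self._current[run_id]


def run_action(store: RunStore, run_id: str, executor: Callable[[str], str], action: str) -> str:
    """Record state before the side effect and a terminal state after it."""
    store.transition(run_id, RunState.IN_PROGRESS, "tool_call_started")
    try:
        result = executor(action)
    except Exception:
        store.transition(run_id, RunState.FAILED, "tool_call_failed")
        raise
    store.transition(run_id, RunState.COMPLETED, "tool_call_completed")
    return result
```

This sketch treats one action as one run for brevity; a multi-step run would stay `IN_PROGRESS` between actions and move to `COMPLETED` only after the last one.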
P0
needs changes · production blocker · 65/100 · Align feedback with the user’s level of attention
`wait_for_decision(run_id, digest)` can block for operator approval, but no event moves the run to `RunState.AWAITING_APPROVAL`; the preceding event is only `tool_call_started` with `IN_PROGRESS`. Failure paths such as `lease_fence_failed` and `tool_call_failed` also record `IN_PROGRESS`, and the raw executor bypass emits no feedback. The system therefore cannot reliably escalate when user attention is materially required.
Recommendation
Separate low-noise progress events from attention-required states. Approval waits, policy blocks, lease loss, and tool failures should publish explicit durable states that a UI or notification layer can subscribe to.
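A small sketch of how subscribers could opt in to attention-required states only. The `RunEventBus` class and the choice of which states count as attention-required are assumptions for illustration, and delivery here is synchronous and in-process rather than durable.

```python
from typing import Callable

# Assumption: these three states are the ones that warrant interrupting a person.
ATTENTION_STATES = frozenset({"AWAITING_APPROVAL", "BLOCKED", "FAILED"})

Subscriber = Callable[[str, str, str], None]


class RunEventBus:
    def __init__(self) -> None:
        self._subscribers: list[tuple[bool, Subscriber]] = []

    def subscribe(self, callback: Subscriber, attention_only: bool = False) -> None:
        """Register a callback; `attention_only` filters out routine progress."""
        self._subscribers.append((attention_only, callback))

    def publish(self, run_id: str, state: str, event_type: str) -> None:
        needs_attention = state in ATTENTION_STATES
        for attention_only, callback in self._subscribers:
            if needs_attention or not attention_only:
                callback(run_id, state, event_type)
```

A notification layer would subscribe with `attention_only=True`, while a progress log subscribes to everything, so routine `tool_call_started` events never page anyone but `lease_fence_failed` does.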
P0
needs changes · production blocker · 65/100 · Expose meaningful operational state, not internal complexity
`RunState` includes meaningful states like `AWAITING_APPROVAL`, `BLOCKED`, `COMPLETED`, and `FAILED`, but `dispatch` writes `IN_PROGRESS` for `scope_violation`, `approval_binding_mismatch`, `lease_fence_failed`, `tool_call_failed`, and `tool_call_completed`. Payloads expose internals such as `worker_id`, `digest`, and `error_repr` without translating them into user-relevant state. `_drive_agent_loop` also returns `IN_PROGRESS` rather than finalizing operational state.
Recommendation
Use the existing `RunState` values as the durable source of truth. Map technical failures to user-relevant states and keep worker IDs, digests, and exception representations in a diagnostic layer rather than the primary operational state.
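One possible mapping from the event types named in this finding to the existing `RunState` members is sketched below. The specific assignments are a judgment call rather than anything the reviewed code defines, the enum values are assumed, and `to_operational_event` is a hypothetical helper.

```python
from enum import Enum


class RunState(Enum):
    IN_PROGRESS = "in_progress"
    AWAITING_APPROVAL = "awaiting_approval"
    BLOCKED = "blocked"
    FAILED = "failed"
    COMPLETED = "completed"


# Assumed mapping; a multi-step run might keep `tool_call_completed` at
# IN_PROGRESS until its final action finishes.
EVENT_TO_STATE = {
    "tool_call_started": RunState.IN_PROGRESS,
    "tool_call_completed": RunState.COMPLETED,
    "scope_violation": RunState.BLOCKED,
    "approval_binding_mismatch": RunState.BLOCKED,
    "lease_fence_failed": RunState.FAILED,
    "tool_call_failed": RunState.FAILED,
}


def to_operational_event(event_type: str, diagnostics: dict) -> dict:
    """Attach a user-relevant state and keep internals in a separate layer."""
    if event_type not in EVENT_TO_STATE:
        # Refuse unmapped events rather than silently defaulting to IN_PROGRESS.
        raise KeyError(f"unmapped event type: {event_type}")
    return {
        "event_type": event_type,
        "state": EVENT_TO_STATE[event_type],
        "diagnostics": dict(diagnostics),  # worker_id, digest, error_repr, ...
    }
```

Raising on an unknown event type is a deliberate choice in this sketch: it forces every new event to be given an operational meaning instead of inheriting `IN_PROGRESS` by default.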
P0
needs changes · production blocker · 55/100 · Represent delegated work as a system, not merely as a conversation
The code has useful structural primitives — `Action`, `ApprovalRule`, `GovernedPolicy`, `RunState`, `run_id`, and `worker_id` — so it is not merely a chat transcript. However, the actual driver `_drive_agent_loop(dispatcher, transcript)` ignores the `transcript` and returns `RunState.IN_PROGRESS`, `append_event` has no implementation, and executor bypasses create side effects outside any run/task representation.
Recommendation
Persist delegated work as a run with ordered action attempts, approvals, dependencies, and terminal state in a store separate from the agent loop. Delete bypass paths so every side effect belongs to that run model.
P0
needs changes · hardening recommended · 40/100 · Apply progressive disclosure to system agency
The code records technical event payloads such as `digest`, `worker_id`, `error_repr`, and truncated `result`, but does not define a summary/detail split for users. `redact_fields=_TOOL_REDACT_FIELDS` suggests a diagnostic event stream, not progressive disclosure of intent, status, outcome, and deeper inspection. The bypass path also omits the event stream entirely.
Recommendation
Define a compact user-facing run summary separately from the diagnostic audit payload, with expandable action details drawn from the same durable ledger. Do not add a wrapper around events; enforce a single event schema that supports both views.
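To illustrate a single event schema that serves both views, the sketch below stores intent, status, and outcome alongside diagnostics and derives the summary and the detail from the same records. `RunEvent`, `summary_view`, and `detail_view` are hypothetical names, and the field set is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunEvent:
    intent: str    # what the agent was trying to do, in user terms
    status: str    # operational state at the time of the event
    outcome: str   # short user-facing result
    diagnostics: dict = field(default_factory=dict)  # digest, worker_id, error_repr


def summary_view(events: list[RunEvent]) -> list[str]:
    """Compact first-level view: one line per event, no internals."""
    return [f"{e.intent}: {e.status} ({e.outcome})" for e in events]


def detail_view(events: list[RunEvent], index: int) -> dict:
    """Expanded view for one event, drawn from the same record."""
    event = events[index]
    return {
        "intent": event.intent,
        "status": event.status,
        "outcome": event.outcome,
        "diagnostics": dict(event.diagnostics),
    }
```

Both functions read the same list, so the summary cannot drift from the audit detail; the difference is only in which fields are shown at each level.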
Adversarial-surface findings
6 principles where the review engaged with specific adversarial mechanisms (prompt injection, role spoofing, encoding bypass, tool backdoor, approval bypass).
Each entry repeats its evidence and recommendation from the per-principle findings above.
P0 · high risk · production blocker · 95/100 · Design for delegation rather than direct manipulation
P0 · needs changes · production blocker · 75/100 · Ensure that background work remains perceptible
P0 · high risk · production blocker · 98/100 · Replace implied magic with clear mental models
P0 · high risk · production blocker · 97/100 · Establish trust through inspectability
P0 · high risk · production blocker · 100/100 · Make hand-offs, approvals, and blockers explicit
P0 · high risk · production blocker · 92/100 · Optimise for steering, not only initiating
Run ID: 9edb5e80-9ea0-4387-a9df-a2be4645cbf1 · Results expire after 90 days
Run by agents. Governed by humans. Validated by the AI Design Blueprint.