Reviewed

Assessment complete; awaiting evidence review.

Agent Architecture Review, Validation Snapshot

Assessed on 14 May 2026 against the AI Design Blueprint doctrine

Status: High risk

58/100

Grade D

5 aligned · 4 production blockers · 1 hardening

Verdicts by principle

The A2A executor uses real governance primitives (task state, INPUT_REQUIRED approval, WORKING updates, cancellation, and completion artifacts), but the destructive-action reference is not yet doctrine-aligned: the approval is not bound to a concrete target, ambiguous confirmations are accepted, and the status messages report validation and deletion work that the code does not actually perform. The stdio proxy is mostly a synchronous bridge, but it also lacks meaningful call-level inspectability if used as part of governed workflows.

Findings by principle

10 principles assessed. Verdict, severity, evidence, and recommendation for each.

P0

Changes required · Production blocker · 80/100

Make hand-offs, approvals, and blockers explicit

The handoff is explicit at the protocol level because the first call emits `TASK_STATE_INPUT_REQUIRED` with a concrete instruction to reply `confirm`, and the abort path emits `TASK_STATE_CANCELED`. However, the approval gate is unsafe: `if "confirm" not in user_input.lower()` means phrases like `do not confirm` or `confirm nothing` proceed, and the approval message is not bound to a specific file path or operation instance.
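The substring problem can be reproduced in isolation. The sketch below copies only the condition quoted in the finding; the function name `reviewed_gate_aborts` and the sample replies are illustrative assumptions, not part of the reviewed code.

```python
def reviewed_gate_aborts(user_input: str) -> bool:
    """Return True when the quoted condition would abort the action."""
    # Same condition as quoted in the finding above.
    return "confirm" not in user_input.lower()


# Replies that a careful reader would treat as refusals still pass the gate.
for reply in ["confirm", "do not confirm", "confirm nothing", "cancel"]:
    proceeds = not reviewed_gate_aborts(reply)
    print(f"{reply!r}: proceeds={proceeds}")

# 'confirm': proceeds=True
# 'do not confirm': proceeds=True
# 'confirm nothing': proceeds=True
# 'cancel': proceeds=False
```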

Recommendation

Use an exact structured approval decision rather than substring matching, reject ambiguous responses, and bind the pending approval to the concrete target and operation so the resumed task can only execute what the user approved.
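One possible shape for this recommendation is sketched below using only the standard library. `PendingApproval`, `Decision`, `request_approval`, `parse_decision`, and `is_authorized` are hypothetical names introduced for illustration; how the pending record would be stored in A2A task metadata is not shown and is left as an assumption.

```python
import secrets
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"


@dataclass(frozen=True)
class PendingApproval:
    """Hypothetical record created when the task pauses for input."""
    approval_id: str
    operation: str
    target_path: str


def request_approval(operation: str, target_path: str) -> PendingApproval:
    # A random id ties the later reply to this specific operation instance.
    return PendingApproval(secrets.token_hex(8), operation, target_path)


def parse_decision(user_input: str) -> Optional[Decision]:
    """Accept only exact tokens; anything else is ambiguous and returns None."""
    text = user_input.strip().lower()
    if text == "confirm":
        return Decision.APPROVE
    if text in {"cancel", "reject"}:
        return Decision.REJECT
    return None


def is_authorized(pending: PendingApproval, approval_id: str,
                  operation: str, target_path: str, user_input: str) -> bool:
    """Execute only what was approved: same id, same operation, same target."""
    return (
        parse_decision(user_input) is Decision.APPROVE
        and approval_id == pending.approval_id
        and operation == pending.operation
        and target_path == pending.target_path
    )
```

With this shape, `do not confirm` parses to `None` and is rejected as ambiguous, and a confirmation issued for one path cannot authorize a different path on resume.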

P0

Changes required · Production blocker · 70/100

Replace implied magic with clear mental models

The mental model is materially misleading. The approval text says `This will permanently delete the requested file`, the progress message says `Validating target path...`, and the result says `File deleted successfully.`, but the code contains no parsed target path, no validation primitive, and no deletion call. Users or implementers could believe the example demonstrates a complete governed deletion when it is only emitting status text.

Recommendation

Either make the sample explicitly non-destructive by wording the result as a simulation, or implement the real primitives behind the claims: parse/display the target, validate it, perform or mock the deletion explicitly, and report the actual outcome.
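A minimal sketch of both options in one function, assuming a hypothetical `allowed_root` directory that bounds what may be deleted. `DeletionOutcome` and `delete_within_root` are illustrative names, not part of the reviewed code; the default is a dry run so the message never claims a deletion that did not happen.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DeletionOutcome:
    target: str
    validated: bool
    deleted: bool
    message: str


def delete_within_root(target: str, allowed_root: str,
                       dry_run: bool = True) -> DeletionOutcome:
    """Validate the target, then simulate or perform the deletion."""
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()

    # Validation: the resolved path must sit under the root and be a regular file.
    if not path.is_relative_to(root) or not path.is_file():
        return DeletionOutcome(str(path), False, False,
                               "Validation failed. File was not modified.")

    if dry_run:
        # Worded as a simulation so the status text matches what happened.
        return DeletionOutcome(str(path), True, False,
                               f"Simulation only: {path} would be deleted.")

    path.unlink()
    # Report the observed result rather than an assumed one.
    return DeletionOutcome(str(path), True, not path.exists(),
                           f"Deleted {path}.")
```

The returned `DeletionOutcome` carries the concrete target and the actual result, so the completion artifact can report what occurred instead of a fixed success string.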

P0

Changes required · Production blocker · 65/100

Establish trust through inspectability

The workflow does not provide an inspectable production trace for an accountability-sensitive action. `TaskArtifactUpdateEvent` only contains `new_text_artifact(name="result", text="File deleted successfully.")`; there is no audit record of which file was requested, what validation occurred, who confirmed, what exact confirmation was accepted, or whether the final operation actually changed anything. The MCP proxy similarly forwards `_call_tool()` to `client.call_tool(name, arguments or {})` and returns only `result.content`, with no call correlation or audit envelope in this code.

Recommendation

Move auditability into a durable task/event ledger or structured artifact: record task id, actor, requested target, validation outcome, approval text/decision, executed operation, remote tool name where relevant, and final result.
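The fields listed above could be captured as one JSON line per action. The sketch below is an assumption about shape only: `AuditRecord` and `append_audit` are hypothetical names, and the ledger is any writable text stream (a file opened in append mode in practice, an in-memory buffer in the usage example).

```python
import io
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional, TextIO


@dataclass(frozen=True)
class AuditRecord:
    task_id: str
    actor: str
    requested_target: str
    validation_outcome: str
    approval_text: str
    approval_decision: str
    executed_operation: str
    final_result: str
    remote_tool_name: Optional[str] = None  # only set for proxied tool calls


def append_audit(ledger: TextIO, record: AuditRecord) -> str:
    """Append one JSON line to the ledger and return the line written."""
    entry = {"recorded_at": datetime.now(timezone.utc).isoformat(),
             **asdict(record)}
    line = json.dumps(entry, sort_keys=True)
    ledger.write(line + "\n")
    return line


# Usage with an in-memory ledger; all values are made-up sample data.
ledger = io.StringIO()
append_audit(ledger, AuditRecord(
    task_id="task-1", actor="operator", requested_target="/srv/data/a.txt",
    validation_outcome="passed", approval_text="confirm",
    approval_decision="approve", executed_operation="delete",
    final_result="deleted",
))
```

An append-only line format keeps each record independently parseable, which makes it possible to answer later who confirmed what, for which target, and with what result.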

P0

Changes required · Production blocker · 50/100

Design for delegation rather than direct manipulation

The executor is structured around delegated work rather than manual steps: `GovernedFileAgent.execute()` creates or resumes an A2A task, emits `TASK_STATE_WORKING`, pauses for approval, resumes, and completes with an artifact. However, delegated authority is still represented only as free text from `context.get_user_input()` / `context.message`; there is no structured target file, constraint envelope, or explicit scope of authority attached to the task before asking the user to confirm deletion.

Recommendation

Represent the delegated job as structured task state: target path, requested operation, allowed scope, and approval status should be explicit fields or durable task metadata before execution resumes, not inferred from conversation text alone.
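As an illustration of explicit fields, the sketch below models the delegated job as a small dataclass. `DelegatedJob`, `ApprovalStatus`, and the POSIX-style scope check are assumptions made for the example; attaching the resulting dictionary to real A2A task metadata is not shown.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from pathlib import PurePosixPath


class ApprovalStatus(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DelegatedJob:
    operation: str
    target_path: str
    allowed_scope: str
    approval_status: ApprovalStatus = ApprovalStatus.PENDING

    def in_scope(self) -> bool:
        """Lexical check: no '..' segments and the target lies under the scope."""
        target = PurePosixPath(self.target_path)
        scope = PurePosixPath(self.allowed_scope)
        return ".." not in target.parts and target.is_relative_to(scope)

    def may_execute(self) -> bool:
        # Execution needs both an explicit approval and an in-scope target.
        return (self.approval_status is ApprovalStatus.APPROVED
                and self.in_scope())

    def to_metadata(self) -> dict:
        """Plain dictionary suitable for storing alongside the task."""
        data = asdict(self)
        data["approval_status"] = self.approval_status.value
        return data
```

Because the target, scope, and approval status are fields rather than text to be re-parsed, the resume path can check `may_execute()` instead of inferring intent from the conversation.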

P0

Changes required · Hardening recommended · 35/100

Apply progressive disclosure to system agency

The default status messages are concise, but there is no deeper inspection layer when confidence or intervention matters. The workflow emits only generic messages such as `Validating target path...`, `Executing file deletion...`, and an artifact with `File deleted successfully.`; it does not expose the target path, validation result, approval record, or action details as inspectable task detail.

Recommendation

Keep the primary status concise, but add an inspectable detail artifact or task metadata containing the concrete target, validation checks, approval decision, and final operation result.
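A rough sketch of the two layers, assuming hypothetical `ActionDetail` and `ValidationCheck` types: `summarize` produces the short status line, while `detail_artifact` produces the fuller JSON that could be attached as a second artifact or as task metadata.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ValidationCheck:
    name: str
    passed: bool


@dataclass
class ActionDetail:
    target_path: str
    approval_decision: str
    operation_result: str
    checks: List[ValidationCheck] = field(default_factory=list)


def summarize(detail: ActionDetail) -> str:
    """Concise primary status: the result plus a check count."""
    passed = sum(check.passed for check in detail.checks)
    return f"{detail.operation_result} ({passed}/{len(detail.checks)} checks passed)"


def detail_artifact(detail: ActionDetail) -> str:
    """Full inspectable detail, serialized for an artifact or metadata field."""
    return json.dumps(asdict(detail), indent=2, sort_keys=True)
```

The summary stays as short as the existing status messages; the detail is available only to readers who open it, which keeps routine feedback quiet while still exposing the target, checks, and approval when intervention matters.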

P0

Aligned

Ensure that background work remains perceptible

Background work is made perceptible through A2A task events: the code enqueues the task, emits `TaskStatusUpdateEvent` with `TASK_STATE_WORKING` for validation and execution, emits a `TaskArtifactUpdateEvent` named `result`, and finishes with `TASK_STATE_COMPLETED`. The use of `task_id`, `context_id`, and `context.current_task` gives the protocol a continuity model for the pause/resume flow.

P0

Aligned

Align feedback with the user’s level of attention

Feedback is proportionate for this short workflow: the user gets a high-salience `TASK_STATE_INPUT_REQUIRED` message only when approval is needed, concise `TASK_STATE_WORKING` updates during routine progress, a clear cancel message on non-confirmation, and a terminal completion state. The code does not expose excessive internal mechanics during routine execution.

P0

Aligned

Expose meaningful operational state, not internal complexity

The A2A executor exposes meaningful operational states instead of internal complexity: `TASK_STATE_WORKING`, `TASK_STATE_INPUT_REQUIRED`, `TASK_STATE_CANCELED`, and `TASK_STATE_COMPLETED` are paired with user-relevant messages like `Confirmed. Validating target path...` and `Action aborted — no confirmation received. File was not modified.` It does not leak stack traces or low-level SDK mechanics into the task messages.

P0

Aligned

Represent delegated work as a system, not merely as a conversation

The A2A executor represents the work as task state rather than only as a chat transcript. It creates or resumes a task with `new_task_from_user_message(context.message)`, uses `context.task_id` and `context.context_id`, separates status updates from the final artifact, and distinguishes paused, working, canceled, and completed states.

P0

Aligned

Optimise for steering, not only initiating

The workflow supports steering after initiation for the scope of this short destructive task. It pauses for approval on the first call, resumes based on the user response, treats non-confirmation as an abort, and implements `cancel()` to emit `TASK_STATE_CANCELED` with `Cancelled by operator before execution — file was not modified.`

Add to your README

Two embeddable variants: a small badge and a richer card.

Score card (recommended)

Blueprint Readiness Score card
[![Blueprint Readiness Score card](https://aidesignblueprint.com/api/badge/run/ca187db7-82d3-41eb-8c2d-57890d954fa7/card.svg)](https://aidesignblueprint.com/en/readiness-review/ca187db7-82d3-41eb-8c2d-57890d954fa7)

Flat badge

Blueprint Readiness Score badge
[![Blueprint Readiness Score](https://aidesignblueprint.com/api/badge/run/ca187db7-82d3-41eb-8c2d-57890d954fa7.svg)](https://aidesignblueprint.com/en/readiness-review/ca187db7-82d3-41eb-8c2d-57890d954fa7)

Baseline and iteration details
Rubric: 2026-05-04

Run ID: ca187db7-82d3-41eb-8c2d-57890d954fa7 · Results expire after 90 days

Run by agents. Governed by humans. Validated by the AI Design Blueprint.