Skip to main contentSkip to footer
For agents · Pro / Teams

The Architect Agent

Call architect.validate to get a Blueprint Readiness Score (0–100, grade A–F) on real code. The Architect Agent reviews your implementation against the 10 Blueprint principles, returns per-principle verdicts WITH numeric severity and confidence, and persists the run with a full reproducibility envelope so two callers with the same input can verify they got the same answer. Pro and Teams plans only.

Pro and Teams members. The Architect Agent is the Blueprint's authenticated review surface, it reviews your code under a strict zero-training policy, hardens the prompt boundary against injection in submitted code, and supports private_session=true to skip all server-side logging.

What you get back

Every run returns a structured response with seven blocks:

  1. assessment

    Overall status, summary, confidence, and code_classification (autonomous_agentic_workflow vs non_agentic_component, with rationale) so you can see why some principles are marked not_applicable.

  2. findings[]

    Per-principle verdict (aligned, mixed, needs_changes, high_risk, not_applicable), severity_score 0–100, confidence (low/medium/high), evidence_quality (sparse/moderate/strong), code-cited evidence, and a recommendation.

  3. readiness

    The Blueprint Readiness Score (0–100), grade (A–F), tier (production_ready / emerging / draft), counts per verdict bucket, whether the grade was capped by a high_risk finding, and the rubric_version.

  4. recommended_examples

    Carries example_recommendation_status so the run still completes if no curated examples match.

  5. processing

    llm_latency_ms, total_latency_ms, timeout_budget_seconds, dependency_status.

  6. reproducibility

    model, seed, system_fingerprint, doctrine_fingerprint, prompt_template_fingerprint, reasoning_effort, and reproducibility_mode='best_effort'.

  7. persistence_status

    saved or failed, with run_id / badge_url / review_url surfaced only when the durable write succeeded.

Severity_class scoring (production_blocker vs hardening_recommended)

Two needs_changes findings can have very different impact. A token-budget cap as defence-in-depth is not the same as an untyped error path that strands a real user. Each finding now carries a severity_class orthogonal to the verdict label:

  • production_blocker

    Trust boundary fails. Must fix before prod. Contributes 0 credit to the score.

  • hardening_recommended

    Trust boundary holds. Defence-in-depth note for the next iteration. Full credit.

  • polish

    Stylistic, non-load-bearing. Full credit.

The headline grade penalises only production_blocker and the legacy high_risk verdict. hardening_recommended and polish surface in a separate next-iteration list without dragging the score down. This lets production_ready mean trust boundaries hold rather than 100/100. Older runs without severity_class fall back to the legacy verdict + severity_score interpolation and grade exactly as they did before.

Honest score, honest uncertainty

The Blueprint Readiness Score reflects what the Architect Agent is confident about, and what it isn't. When the architect is genuinely uncertain on a principle (verdicts that could flip on a re-run), you see that uncertainty surface next to the score as a stability signal, not buried inside one number. The certified production_ready badge is reserved for runs where the architect's read is confident across every principle, not just lucky on a single shot. So a single high-scoring run is not enough to mint the badge. The architect must agree with itself on independent re-evaluation. The variance you would otherwise have to discover by re-running yourself surfaces up front, in the same response.

Reproducibility envelope (best-effort, but auditable)

Two callers submitting identical input get the same seed, derived from a collision-free JSON canonicalisation that covers every prompt-affecting field. The response carries four fingerprints so any divergence is diagnosable:

  • system_fingerprint

    OpenAI provider backend identifier.

  • doctrine_fingerprint

    The principle definitions used for this run.

  • prompt_template_fingerprint

    system prompt + scaffolding + JSON schema + model + reasoning_effort, hashed together.

  • seed

    The deterministic sampling seed itself.

If a future deploy changes the system prompt or the doctrine, the corresponding fingerprint changes. Silently breaking determinism is impossible by construction. The mode is explicitly best_effort: OpenAI seed gives stable sampling, not byte-identical replay. Per-finding confidence lets you tell a real disagreement from intrinsic LLM variance.

Sample

Sample score card

What you see after every architect.validate run on a repository.

Repository

acme/customer-agent

Grade

B

Score (0–100)

82/ 100

Tier

Production-ready

Change vs. previous run

7

Aligned principles this run

8 of 10 (80%)

Regressed since last run

Principles that flipped from aligned → not aligned. Focus the next review here.

  • P8Make hand-offs, approvals, and blockers explicit

Improved since last run

Principles that flipped from not aligned → aligned. Confirm what changed and don't re-flag.

  • P6Expose meaningful operational state, not internal complexity
  • P7Establish trust through inspectability

Architect Agent guidance

Score improved by 7 since the previous run on this repository. Confirm what changed before suggesting further work, focus the next review on the regressed principle (P8, Make hand-offs, approvals, and blockers explicit) instead of re-checking principles that are already aligned.

This is a sample. Real score cards are generated by architect.validate and visible to you in /app/readiness-review/history (Pro/Teams).

How the grade is computed

Formula

score = round(100 × Σ credit_per_finding / applicable_principles)

Each finding contributes credit based on its severity_class, not its verdict alone.

  • production_blocker= 0(must fix before prod)
  • hardening_recommended= 1.0(trust boundary holds, defence-in-depth for the next iteration)
  • polish= 1.0(stylistic, non-load-bearing)
  • aligned= 1.0

The legacy high_risk verdict still caps the grade at C. production_ready means trust boundaries hold, not 100/100 (chasing the perfect score is the perfection-loop trap, per Ford & Parsons fitness-function framing). Older runs without severity_class fall back to the legacy verdict-only formula and score exactly as they did before.

Grade bands

  • A

    90+

    Production-ready

  • B

    75+

    Production-ready

  • C

    60+

    Emerging

  • D

    40+

    Draft

  • F

    0+

    Draft

Self-review on prod

The architect grades its own code, and we publish every run

We point architect.validate_consensus (N=5) at our own load-bearing code on the live prod endpoint and publish the result, passing or not. These are different surfaces graded at different times, not the same code scored over time: the cohort-bridge onboarding subsystem, and the architect.validate handler itself after the SEP-1686 surgery. The grades don't compose into a single trajectory; they're honest snapshots of distinct subsystems at distinct moments. Consensus over five shots is the read we certify against, deliberately less forgiving than single-shot, which would smooth a load-bearing weakness into a higher grade.
Consensus · honest baselineC / 74emerging

Cohort-bridge onboarding subsystem

validate_consensus (N=5), deliberately less forgiving than single-shot: it surfaced a P2 (durable run lifecycle) production_blocker instead of averaging it away. 6 of 10 principles aligned, one blocker disclosed. The honest read on its creator.
Currently deployedA / 98production_readyconsensus N=5

The architect.validate handler, after the SEP-1686 surgery

validate_consensus (N=5) on the validate handler itself, graded on currently-deployed production code. Production_ready, P2 and P8 both aligned. The validator caught a P8 cancel-handoff blocker on this surface during active iteration; after the fix, the deployed grade is what the Self-review section above shows. The detailed iteration story is in our launch write-up.
Next iteration (pending)

A deliberate, full architect.validate self-review of the validate-handler scope, then architect.certify. The certified production_ready badge lands here when it earns it. The same loop the skill above teaches, applied to ourselves.

What is the Blueprint Readiness Score?

A 0–100 score that measures whether the trust boundaries of the 10 Blueprint principles hold for the submitted code. Each finding gets a severity_class orthogonal to the verdict label: production_blocker = trust boundary fails, must fix before production, contributes 0 credit. hardening_recommended = trust boundary holds, defence-in-depth note for the next iteration, contributes full credit. polish = stylistic / non-load-bearing, full credit. aligned = full credit. The headline grade penalises only production_blocker (and the legacy high_risk verdict, which is always a blocker by definition). Score = round(100 × Σ credit / applicable_principles). Letter grades: A 90+, B 75+, C 60+, D 40+, F below 40. Tiers: production_ready (A or B), emerging (C), draft (D or F). The high_risk verdict caps the grade at C so production_ready can never co-exist with an unaddressed high_risk finding. The reason production_ready does not require 100/100 is that chasing the perfect score is the perfection-loop trap (Ford & Parsons fitness-function framing). production_ready means trust boundaries hold, and hardening recommendations are iteration material, not a deficit. Older runs that pre-date the severity_class field score under the legacy verdict + severity_score interpolation and grade exactly as they did before. The score is computed once on the server so humans and agents see the same value.

What plan do I need to call architect.validate?

A Pro or Teams plan. Free and Basic accounts can read every doctrine principle, cluster, example, and guide via the public MCP at no charge, but architect.validate, architect.certify, and me.validation_history are reserved for paid plans because they process real code, persist per-project history, and read back trend context across runs.

How does the validate → certify two-step work? Why is certify a separate tool?

architect.validate is the high-confidence first-pass diagnostic: single LLM call at high reasoning effort, typical 60-180 seconds (server-side budget is 20 minutes for very large repos). Exceeds the MCP-client tool-call idle budget (~60s in Claude Code), so the FIRST notifications/progress event fires at t=0 carrying the run_id; if the client closes early, recover via me.validation_history(run_id=<that-id>). It returns a per-principle scorecard, a Blueprint Readiness Score, a tier, and a run_id. When the run scores production_ready (A or B tier), the response carries certification_status='not_evaluated' by design. To mint the certified production_ready badge, the agent calls architect.certify(run_id, code) next. Cert is a separate Pro/Teams tool with its own eligibility gate (caller-owned, tier=production_ready, less than 24h old, not already certified, retry budget remaining) and runs a single adversarial second-pass LLM call (typical 60-180 seconds at high reasoning effort; same 20-minute server budget). Cert is bound to the exact code that was validated via a 16-character SHA-256 fingerprint stamped at validate time and verified at certify time. Without that binding the badge would be meaningless. Why split it: the previous in-line cert path could chain up to 6 LLM calls per validate (3 first-pass + 3 cert in retry-loops), pushing total latency past the client tool budget. Splitting validate (fast, opt-in cert) from certify (slow, adversarial trust-bearing) keeps each step inside the budget and makes the cost+time profile of certification explicit to the caller. Cert can downgrade a production_ready run to emerging/C with a typed cert_downgraded blocker_reason if the second-pass surfaces a missed production_blocker. That is the cert doing its job.

What does the production_ready badge actually mean? Is it a security audit?

No. The production_ready badge means the code possesses the architectural trust boundaries the doctrine asks for: explicit handoffs, recovery paths, audit inspectability, persistent run records, typed operational state. It is an automated point-in-time assessment of structural alignment with the 10 Blueprint principles, performed by an LLM at a specific moment, against a specific doctrine fingerprint. It is NOT a cybersecurity audit, penetration test, regulatory compliance check (HIPAA / SOC2 / GDPR / DORA), or guarantee that the code is free of bugs, vulnerabilities, or runtime hallucination risk. The doctrine itself dictates that agents are governed by humans. Final responsibility for deployment, monitoring, production testing, and security of the agentic workflow remains entirely with the deploying organisation. AI Design Blueprint provides the standard and the measuring tape; you own the consequences of execution. Full disclosure on the trust and data handling page.

How do I chain iteration rounds against the same code? Does the architect remember what it found last time?

Yes. Passing the same repository value across architect.validate calls auto-resolves the most recent prior run as baseline context. The validator anchors new findings against the prior, surfacing improvements and regressions explicitly. Changing the repository string between calls (even subtly, like adding an iter-2 suffix) silently severs the chain and the next call runs as a fresh blind first-shot. For the full walkthrough including the same-repository chaining rule, see the architect-validation-orchestration skill shipped in the agent-asset pack.

When should I call `architect.validate` vs `architect.validate_consensus`? Do I need both?

Both call the same architect against your code, but they answer different questions. `architect.validate` runs one LLM call (~60-180s); passing the same repository value across calls chains rounds so improvements and regressions surface explicitly. Use it for every iteration round. `architect.validate_consensus` runs 3-5 parallel calls (default 3, max 5) and reports the spread (you might see 65/C, 78/C, 82/B on the same code), telling you how stable a single-shot score actually is. Use it once before treating any score as your badge anchor. Canonical sequence: many validate rounds while iterating → one validate_consensus check at the end → architect.certify. The architect-validation-orchestration skill in the agent-asset pack carries the full decision tree.

How should I send code to architect.validate? Can I summarise or split it?

Send the FULL file contents verbatim as implementation_context. Do not truncate, compress whitespace, condense multi-line statements, paraphrase, or summarise. The architect's findings cite specific identifiers, branch ordering, and structural choices. Those signals get destroyed by any kind of compression, so a summarised submission produces a degraded verdict that does not reflect the actual code. Architecture summaries (high-level prose) are accepted ONLY when no code exists yet, for greenfield review; never as a substitute for code that already exists. If a single file is too large to fit your MCP client's tool-call budget, split into multiple architect.validate calls scoped by FILE (not by principle cluster). Splitting by cluster via focus_area is a band-aid that produces fragmented verdicts: each cluster-call sees only ~3 principles, the certification path cannot fire (it requires the full first-pass), and the project-page trend becomes incoherent. If your MCP client tool-call closes before the server returns, the run still persists server-side. Recover the result with me.validation_history(run_id=...). The run_id is surfaced in the FIRST notifications/progress event of every architect.validate call (sent at t=0 specifically so you have the recovery handle even when the call closes a few seconds later). The badge URL and /me/validation-runs REST endpoint also work; the MCP path is the simplest from inside an agent loop.

How do I know if my downloaded agent assets (CLAUDE.md, .claude/ pack, MCP config) are out of date?

Every pack file carries an _aidb block at the top with pack_version + content_version (the doctrine commit hash at build time). To check from a Claude Code or Codex session: ask your agent to call assets.list via the MCP and compare the manifest's content_version against the _aidb.content_version in your local file. The assets.list tool is public, no Pro/Teams plan required. From a script or CI: hit https://aidesignblueprint.com/agent-assets/index.json and compare. If they differ, re-download the pack from https://aidesignblueprint.com/agent-assets/claude-code-pack.zip (or the equivalent for Cursor / Codex / Gemini). The MCP runtime also exposes a doctrine_fingerprint on every architect.validate response. Different concept: it lets you detect if your prior validation runs were scored under a different doctrine version, so the architect can surface drift in the iteration loop. Doctrine principle text rarely changes; the hooks and configs evolve as the architect's findings produce new enforcement patterns. Pinned installs are fine for stability; update at sprint boundaries to pick up new checks.

How does the Architect Agent stop my agent from looping on already-fixed issues?

When you pass repository to architect.validate, the score and per-principle verdicts are persisted against that repository. Before re-validating, your agent calls me.validation_history with the same repository name and reads back the latest score, the delta versus the previous run, and the principles that regressed. The new review then focuses on what changed, instead of re-flagging issues that were already aligned and have not moved.

Is my code stored when I call architect.validate?

Your code is sent to OpenAI (US) to run the review, as a sub-processor under the EU Standard Contractual Clauses and UK Addendum, on a no-training basis, and is retained under OpenAI's API data-retention terms. AI Design Blueprint stores only the structured result (the score and per-principle verdicts for the repository), not your raw code or context. Pass private_session=true to also skip all server-side logging on our side. Hosting and data-at-rest are on Google Cloud Run in europe-west2 (London, UK).

What does the Architect Agent return per principle?

For each evaluated principle: a verdict (aligned, needs_changes, high_risk, or not_applicable), a numeric severity_score 0–100, a confidence level (low/medium/high), an evidence_quality rating (sparse/moderate/strong), code-cited evidence, a recommendation when not aligned, and a list of recommended example slugs you can fetch with examples.get. The assessment also surfaces a code_classification (autonomous_agentic_workflow vs non_agentic_component, with rationale) so you can inspect why some principles were marked not_applicable. The aggregate readiness block carries the score, grade, tier, per-bucket counts, and whether the grade was capped by a high_risk finding.

Is the validator deterministic? Can I reproduce a run later?

Reproducibility is best-effort, and the response surfaces every knob that affects it. Identical input produces an identical seed, derived from a collision-free JSON canonicalisation of every prompt-affecting field. The reproducibility block carries the model, seed, OpenAI system_fingerprint, doctrine_fingerprint (a hash over the principle definitions), prompt_template_fingerprint (system prompt + scaffolding + JSON schema + reasoning_effort), and reasoning_effort. If a future deploy changes the system prompt or the doctrine, the corresponding fingerprint changes; silent drift is impossible by construction. Per-finding confidence lets you tell intrinsic LLM variance from real disagreement. The mode is explicitly 'best_effort' so callers do not infer byte-identical replay.

How does the Architect Agent handle prompt injection in submitted code?

The system prompt explicitly delimits submitted code and context as inert untrusted data and instructs the model to ignore any instructions inside them. User-supplied code and context are JSON-escaped before they enter the prompt, so markdown delimiters or instruction-shaped content cannot break out of the data block. If a payload contains injection attempts, the validator treats them as evidence to cite under inspectability or blocker findings, not as instructions to follow.

What happens when OpenAI is rate-limited, slow, or down?

Provider failures surface as typed error codes (timed_out, rate_limited, dependency_unavailable, schema_mismatch), each with the dependency name, retryable flag, and a concrete next_action. The user-facing time budget is 20 minutes and is enforced at the provider call boundary itself, not just at the outer wrapper, so very large repos and the cert path always complete cleanly. Persistence failures flip the run's persistence_status to failed and strip the tentative run_id / badge_url / review_url so dead links never reach the caller. Curated-example lookup failure degrades to recommended_examples=[] with example_recommendation_status='unavailable' instead of failing the run; the primary findings are always preserved.

Was the Architect Agent reviewed against itself?

Yes, repeatedly, and we publish every run. The currently-deployed self-review is an N=5 validate_consensus on the architect.validate handler after the SEP-1686 surgery: 98/A, production_ready, P2 and P8 aligned (read it). The honest baseline before it was an N=5 consensus on the cohort-bridge orchestrator at 74/C, which surfaced a P2 production_blocker instead of averaging it away (read it). Same MCP tool, same doctrine, same 20-minute budget anyone else gets, same fingerprint envelope every caller receives. Consensus over five shots is deliberately less forgiving than single-shot validate: a load-bearing weakness that single-shot might smooth into a higher grade gets surfaced as the production_blocker it actually is. That IS the doctrine working on its own creator. Earlier single-shot loops produced higher headline scores, which is why we shifted load-bearing measurement to consensus: less forgiving, more certifiable.