Engineering

How Ready Solutions AI Builds

The agentic Claude Code pipeline that researches, writes, validates, and ships every page on this site, described at the level of what it guarantees, not how it is implemented.

AI-assisted writing has a small number of well-known failure modes that are recoverable as one-offs and structurally catastrophic at corpus volume. A single drafted post can ship any of them silently; the longer the corpus runs, the harder they get to catch. Prose-only quality control does not hold past roughly twenty posts. The fix is architectural, not editorial.

What follows describes a pipeline at the level of what each layer guarantees, never the recipe that implements it. I will not ask you to treat this page passing its own checks as evidence the system works. The evidence sits elsewhere and is checkable without me; §Verify says where.

1,600 lines / eng / day delivery

First-party measure from inside one engagement, not an externally audited benchmark. The honest version of a proof point names its own limits.

The Pipeline

Each layer answers a problem the layer beside it cannot. The order below mirrors lifetime and write authority: long-lived state at the base, deterministic checks above it, the synchronous boundary above those, judgment subagents above the boundary, the single writer at the top, and provenance recorded after the write. The pipeline narrows: many readers, one writer.

Knowledge base

The pipeline's long-lived memory: corpus index, metric registry, source-credibility scheme, audit log.

Continuity. A number, claim, or source that one page established is visible to the next, so the corpus stays internally consistent instead of drifting page by page.

Prevents: Per-page drift; contradictions across the corpus.
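As an illustration of the continuity guarantee, here is a minimal sketch of a metric registry that reports a contradiction when a later page asserts a different value for a number an earlier page established. The class name, the metric key, and the page paths are hypothetical; the site's actual registry is not published, so this shows the shape of the idea and nothing more.

```python
from dataclasses import dataclass, field


@dataclass
class MetricRegistry:
    """Hypothetical in-memory stand-in for a corpus-wide metric registry."""

    # metric key -> (value, page that first established it)
    values: dict[str, tuple[str, str]] = field(default_factory=dict)

    def assert_metric(self, key: str, value: str, page: str) -> list[str]:
        """Record a metric, or report a contradiction with an earlier page."""
        if key not in self.values:
            self.values[key] = (value, page)
            return []
        known_value, first_page = self.values[key]
        if known_value != value:
            return [f"{page}: {key}={value} contradicts {first_page}: {key}={known_value}"]
        return []


registry = MetricRegistry()
first = registry.assert_metric("blog_post_count", "99", "/engineering")
repeat = registry.assert_metric("blog_post_count", "99", "/about")
drift = registry.assert_metric("blog_post_count", "101", "/blog")  # one contradiction
```

The first page to state a number becomes the reference, so a later page can agree with it or be flagged, but cannot silently diverge.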

Deterministic validators

Scripts that re-check what a model cannot be trusted to get right every time: prose rules, claim canonicalization, numeric reproduction, cross-source contradiction, quote integrity.

Reproducibility. The same input yields the same verdict with no model in the loop, and the count is self-auditing: a build step tallies the inventory under a documented inclusion rule and fails the build if the number printed on this page diverges from the files it counts.

Prevents: Fabricated stats; quote drift; cross-source contradiction.
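Quote integrity is the easiest of these checks to picture. The sketch below assumes a quote passes only if its words appear unchanged in the source text once whitespace and curly quotes are normalized. The function names and the sample source sentence are invented for illustration; the real validator logic is private.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and unify curly quotes so only wording differences remain."""
    text = text.replace("\u201c", '"').replace("\u201d", '"').replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()


def quote_is_verbatim(quote: str, source_text: str) -> bool:
    """Return True only if the quoted words appear unchanged in the source."""
    return normalize(quote) in normalize(source_text)


# Made-up source text, used only to exercise the check.
source = "Subagents run in their own\ncontext window and return results to the parent."
ok = quote_is_verbatim("return results to the parent", source)
drifted = quote_is_verbatim("return their results to the parent", source)
```

No model is involved, so the same quote and source always produce the same verdict.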

Pre-write hook

A synchronous veto that fires before any write reaches disk.

A forbidden pattern cannot land even by accident. This is the layer where prose instruction stops being trustworthy. A rule that has to hold every time, with no exception, is enforced at the boundary rather than left to the author to remember.

Prevents: Em-dashes, AI-tell tokens, banned phrases landing in committed prose.
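A minimal sketch of a synchronous veto, assuming the check is a set of regular expressions run against the proposed text before anything is stored. The pattern list is illustrative, the `sink` dictionary stands in for the disk, and none of the names below come from the site's actual hook.

```python
import re

# Illustrative deny-list; the site's real pattern set is not published.
FORBIDDEN = {
    "em-dash": re.compile("\u2014"),
    "banned phrase": re.compile(r"\b(delve into|in today's fast-paced world)\b", re.IGNORECASE),
}


def veto_reasons(proposed_text: str) -> list[str]:
    """Return one reason per forbidden pattern found; an empty list allows the write."""
    return [name for name, pattern in FORBIDDEN.items() if pattern.search(proposed_text)]


def guarded_write(path: str, proposed_text: str, sink: dict[str, str]) -> bool:
    """Write only when no pattern matches. `sink` is a hypothetical stand-in for disk."""
    if veto_reasons(proposed_text):
        return False  # the veto happens before anything is stored
    sink[path] = proposed_text
    return True


disk: dict[str, str] = {}
clean = guarded_write("post.md", "Validators run first, then the write.", disk)
blocked = guarded_write("post.md", "Validators run first \u2014 then the write.", disk)
```

The author never has to remember the rule. The blocked text never reaches `sink`, so there is nothing to clean up afterwards.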

Read-only specialist subagents

Bounded workers, each scoped to one kind of judgment: topic research, counter-evidence, fact-checking, cross-reference, brand voice, coherence.

Containment. A subagent investigates and returns structured findings, but it cannot change the artifact, so authority never propagates out to a spoke.

Prevents: Subagent overreach; uncontained errors from one judgment lane affecting another.

Orchestrating skill

The parent that routes the phases, presents the human review gates, and owns every disk write.

Single writer. There is exactly one writer, so no read-only subagent can mutate the artifact, and write-side concerns live in one place rather than scattered across the spokes.

Prevents: Scattered write logic; uncoordinated mutations.
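The read-only spokes and the single writer can be sketched together. The assumption here is that each subagent receives an immutable view of the artifact and returns structured findings, while one orchestrator object holds the only mutable reference. `Finding`, `fact_check`, and `Orchestrator` are hypothetical names, not the site's components.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Mapping


@dataclass(frozen=True)
class Finding:
    lane: str
    message: str


# A subagent sees a read-only view of the artifact and can only return findings.
Subagent = Callable[[Mapping[str, str]], list[Finding]]


def fact_check(artifact: Mapping[str, str]) -> list[Finding]:
    """Toy judgment lane: flag an uncited-claim marker in the body."""
    if "[citation needed]" in artifact["body"]:
        return [Finding("fact-checking", "uncited claim in body")]
    return []


class Orchestrator:
    """The only component holding a mutable reference to the artifact."""

    def __init__(self, artifact: dict[str, str]) -> None:
        self._artifact = artifact

    def review(self, subagents: list[Subagent]) -> list[Finding]:
        view = MappingProxyType(self._artifact)  # item assignment on the view raises TypeError
        return [finding for agent in subagents for finding in agent(view)]

    def write(self, key: str, value: str) -> None:
        self._artifact[key] = value


orchestrator = Orchestrator({"body": "Posts ship daily [citation needed]."})
findings = orchestrator.review([fact_check])
```

In this sketch containment is structural: a subagent that tried to assign into the view would raise an error rather than change the artifact.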

Provenance trust chain

A machine-readable record stamped on each artifact: model version, skill version, knowledge-base SHA, validator pass log, visual-QA sub-block.

Replay. The gates are deny-by-default: a claim must trace to a source, the disclosure must stay at shape level, and the provenance block must be present, or the artifact does not publish.

Prevents: Unaudited artifacts; non-replayable runs.
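A minimal sketch of a deny-by-default publish gate, assuming an artifact is a dictionary with a `claims` list and a `provenance` block. The required keys are the field names shown in the provenance receipt on this page; the artifact layout and function name are assumptions. The shape-level disclosure gate is left out because the page gives no mechanical definition for it.

```python
# Field names taken from the provenance receipt on this page.
REQUIRED_PROVENANCE_KEYS = {
    "model_version", "skill_version", "kb_sha",
    "validation_pass_log", "published_at", "visual_qa",
}


def may_publish(artifact: dict) -> tuple[bool, list[str]]:
    """Deny by default: publish only if every gate affirmatively passes."""
    failures = []
    claims = artifact.get("claims", [])
    unsourced = [claim["text"] for claim in claims if not claim.get("source")]
    if unsourced:
        failures.append(f"unsourced claims: {unsourced}")
    missing = REQUIRED_PROVENANCE_KEYS - set(artifact.get("provenance", {}))
    if missing:
        failures.append(f"provenance missing: {sorted(missing)}")
    return (not failures, failures)


# An empty artifact is refused, because nothing has been affirmatively shown.
empty_ok, empty_failures = may_publish({})
```

The default answer is no. An artifact earns a yes only by supplying every piece the gates look for.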

What the provenance trust chain records

Recording where an artifact came from is a settled idea in other domains: the W3C PROV family standardizes provenance generically, the NIST AI 600-1 Generative AI Profile addresses content provenance for generative systems, and PROV-AGENT formalizes provenance for agentic workflows specifically. The stamp exists so that agentic work is gated by deterministic validators rather than ad-hoc review. Its scope is narrow: billable, production agentic delivery, where the AI authoring trust chain relies on deny-by-default gates rather than advisory review.

The Practice

Read-only spokes, one writer.

Specialist subagents (see also subagent-orchestration) provide independent coverage per judgment lane; InfoQ's editorial coverage of Claude Code subagents confirms independent practitioners adopt the same read-only pattern. The pre-write hook is the synchronous veto that enforces prose rules at the boundary, not in an advisory prompt that a confident-looking context can override.

How a change gets shipped

Brainstorm. Intent + 2-3 approaches explored before any code is written.
Spec. Approved design, committed to docs/, before tasks are decomposed.
Writing-plans. Bite-sized tasks with explicit files-to-touch + tests, sized for a subagent with zero context.
Subagent-driven dev. Fresh implementer subagent per task, followed by spec-compliance review and code-quality review.
Finish branch. PR open, CI gate green, squash-merge, branch cleanup.

Each phase loads a domain procedure as an on-demand skill (Anthropic docs) only when a task calls for it, instead of holding every procedure in one always-on prompt. On-demand discipline keeps the active context narrow and the procedure library wide.

Session-end retros write structured improvement entries to a backlog. Recurring failure modes earn a registered key and accumulate evidence under one entry; a key with enough recurrence promotes to a structural fix. The loop is the practice that keeps the practice improving.
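The retro loop can be pictured as a keyed backlog with a promotion rule. The sketch below assumes a fixed recurrence threshold of three; the page only says "enough recurrence", so that number, the class name, and the sample keys are all hypothetical.

```python
from collections import defaultdict

PROMOTION_THRESHOLD = 3  # assumed value; the page does not state one


class Backlog:
    """Hypothetical backlog: evidence accumulates under one registered key per failure mode."""

    def __init__(self) -> None:
        self.evidence: dict[str, list[str]] = defaultdict(list)

    def record(self, key: str, note: str) -> None:
        self.evidence[key].append(note)

    def promotable(self) -> list[str]:
        """Keys with enough recurrence to justify a structural fix."""
        return sorted(
            key for key, notes in self.evidence.items()
            if len(notes) >= PROMOTION_THRESHOLD
        )


backlog = Backlog()
for session in ("s1", "s2", "s3"):
    backlog.record("stale-count-in-prose", f"seen in {session}")
backlog.record("broken-anchor", "seen in s2")
ready = backlog.promotable()
```

Keeping all evidence under one key is what makes recurrence countable. Free-text retro notes would hide the pattern.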

See also: claude-code-hook (the term covers the capability; this section covers the choice between capabilities).

The Catch-Net

Most failure modes are caught at two or more layers rather than one, so a single missed flag at the judgment layer does not ship the page.

Eleven failure modes by catch layer. Most modes are caught at two or more layers.

37 validators in scripts/, generated from data/validator-inventory.json at every build. The inclusion rule, quoted from the inventory file: scripts/validate-*.mjs + scripts/audit-*.mjs + scripts/monitor-*.mjs, excluding *.test.mjs. The figure of record is the one regenerated from the codebase inventory at every build, not a number typed into prose, which is why a stale count can't quietly persist here.
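The inclusion rule quoted above is concrete enough to sketch. The example applies the three include globs and the one exclude glob to a list of file names and raises an error if the printed count diverges. It is written in Python for illustration, although the validators are `.mjs` scripts. The function names and sample file names are hypothetical, and the real build step is not published.

```python
from fnmatch import fnmatchcase

# Globs quoted from the inclusion rule on this page.
INCLUDE = ("validate-*.mjs", "audit-*.mjs", "monitor-*.mjs")
EXCLUDE = ("*.test.mjs",)


def count_validators(filenames: list[str]) -> int:
    """Count file names that match an include glob and no exclude glob."""
    return sum(
        1 for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in INCLUDE)
        and not any(fnmatchcase(name, pattern) for pattern in EXCLUDE)
    )


def check_printed_count(filenames: list[str], printed: int) -> None:
    """Fail loudly when the number printed in prose diverges from the inventory."""
    actual = count_validators(filenames)
    if actual != printed:
        raise ValueError(f"page says {printed} validators, inventory has {actual}")


# Invented file names, used only to exercise the rule.
sample = [
    "validate-prose.mjs", "validate-prose.test.mjs",
    "audit-links.mjs", "monitor-ratio.mjs", "build.mjs",
]
counted = count_validators(sample)
```

Because the count is derived from file names under a written rule, a reader with repository access could recompute it independently.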

Authority Engine

I monitor my own self-referential-edge ratio: live R = 16.10 versus threshold 40. The entity graph is self-asserted, and naming that bounds the trust claim; §Verify says where the boundary sits.

Conflating the entity graph with the URL hierarchy creates the breadcrumb bug. The entity graph is hub-and-spoke; the URL hierarchy is flat; lateral cross-links carry discovery, never parental authority.

GEO/SEO citability

Every glossary term in the rebuild track carries a load-bearing OPINION in its dek and a trade-off claim in its whyItMatters: the canonical Anthropic / W3C / NIST docs leave the trade-off niche open, and the rebuild claims it explicitly. An S2-6 probe confirms a retrieval LLM extracts the rebuild's framing as distinct from the upstream doc.

Of 11 terms, 4 have been rebuilt to this shape. Entity @ids follow the Model Context Protocol architecture and Schema.org conventions; the graph is self-asserted but machine-readable.

The Proof: the blog-post engine

This section names what is in production today: the structural shape, the counts, the gates, the provenance receipt. The full design rationale, the principles that drove the design, and the lessons that travel beyond this corpus live in the case study.

The architecture is the artifact: the blog output is incidental; what travels are the lessons. From the case study at /case-studies/2026-05-08-ai-authoring-pipeline/.

Phases are the temporal flow; layers (per the case study) are the architectural stack. They are orthogonal axes of the same system.

Phase 1

Research

Every H2 has bound Tier-1/2 sources before drafting.

Catches: Thesis built on confirming evidence only (counter-evidence as a first-class lane).

Phase 2

Outline

Angle, thesis, audience tier, and source binding are locked before a sentence is drafted.

Catches: Unsourced H2 gaps surface at QG1.

Phase 3

Draft + Phase 3b visual

Claims trace to bound sources; visuals reviewed before publish.

Catches: Em-dashes and banned AI tells, stopped synchronously at the pre-write hook.

Phase 4

Validate

Every hard claim is verified at two independent layers.

Catches: Cross-source contradiction, citation gaps, AI-tell density, thesis-spine drift.

Phase 5

Publish

Every post carries its replayable validation receipt.

Catches: Drift between what was validated and what shipped.

QG1 / QG2 + report-and-proceed

QG1 is the research sufficiency check between Phase 1 and Phase 2: it flags unsourced outline gaps, undersourced H2s, and missing experiential anchors. QG2 is the validation reconciliation between Phase 4 and Phase 5: it synthesizes all five validator outputs into a Severity-2 Disposition Table where every Sev-2 routes to addressed-by-rewrite, fixed-directly, or deferred. Silence is a defect. In autonomous mode the gates are report-and-proceed: BLOCKING findings always surface; non-blocking deferred entries land in the QG2 report for review after publish.
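The "silence is a defect" rule at QG2 lends itself to a small check. The sketch assumes each finding is a dictionary with an `id`, a numeric `severity`, and an optional `disposition`. The three disposition names come from this page; the finding layout and the function name are assumptions.

```python
# Disposition names taken from the QG2 description on this page.
ALLOWED_DISPOSITIONS = {"addressed-by-rewrite", "fixed-directly", "deferred"}


def silent_sev2(findings: list[dict]) -> list[str]:
    """Return ids of Sev-2 findings that lack one of the three allowed dispositions."""
    return [
        finding["id"] for finding in findings
        if finding["severity"] == 2
        and finding.get("disposition") not in ALLOWED_DISPOSITIONS
    ]


# Invented findings, used only to exercise the rule.
findings = [
    {"id": "F1", "severity": 2, "disposition": "fixed-directly"},
    {"id": "F2", "severity": 2, "disposition": None},
    {"id": "F3", "severity": 3},
    {"id": "F4", "severity": 2, "disposition": "wontfix"},
]
silent = silent_sev2(findings)
```

A Sev-2 with no disposition and a Sev-2 with an unrecognized one are treated the same way: neither counts as handled.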

The provenance receipt

provenance:
  model_version: <Which model and revision wrote the post.>
  skill_version: <Git SHA of the blog-post skill at publish time.>
  kb_sha: <Git SHA of the knowledge base at publish time.>
  validation_pass_log: <Path to the per-post coverage telemetry JSON.>
  published_at: <ISO-8601 UTC publish timestamp.>
  visual_qa: <Sub-block with status, reviewed, unresolved_defects from the Phase 3b loop.>

Replay is structural, not narrative. Future-you can reproduce, rerun, or invalidate any post deterministically.
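One way to read "invalidate any post deterministically" is to compare each post's stamped `kb_sha` against the current knowledge-base SHA and list the posts that no longer match. That invalidation rule is an assumption made for this sketch, as are the function name and the sample slugs and SHAs. Only the `kb_sha` field name comes from the receipt above.

```python
def needs_revalidation(receipts: dict[str, dict[str, str]], current_kb_sha: str) -> list[str]:
    """List posts whose receipt is missing or was stamped against a different knowledge-base SHA."""
    return sorted(
        slug for slug, receipt in receipts.items()
        if receipt.get("kb_sha") != current_kb_sha
    )


# Invented receipts, used only to exercise the rule.
receipts = {
    "post-a": {"kb_sha": "abc123"},
    "post-b": {"kb_sha": "def456"},
    "post-c": {},  # no stamp at all is treated as stale
}
stale = needs_revalidation(receipts, "abc123")
```

The decision uses only recorded fields, so two runs over the same receipts give the same list.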

The pipeline runs five research lanes plus five validation lanes per post: a 5+5 shape. The blog corpus is 99 posts today, all gate-green per the required CI check. For the trade-offs of stage-gated production agentic delivery at scale, see the AWS Prescriptive Guidance on operationalizing agentic AI.

See also: subagent-orchestration (the term covers context isolation per worker; this section covers how a specific composition produces a publishable post).

Verify

Authority on a page like this goes only as far as an outsider can check it.

The verifiability boundary. Two columns are mutually exclusive, not overlapping: an outsider can check the left; the right rests on internal discipline.

Regenerating those numbers at every build is what keeps them honest on my end; from the outside, with the repository private, you still take the count and the stamp on faith.

One thing this section does not claim is that this page passing its own validation proves the pipeline works. That would be circular. The pipeline runs against this page the same way it runs against every other; the standing proof is the documented architecture and the published outcomes, not a green checkmark this page awards itself.

Publishing the shape is a transparency choice, not a claim that the method is secret or that being open changes what the pipeline does. It changes what you can see, and what you can check.

Common questions

Does AI write the content on this site?

Yes, end to end, with a human review gate. Read-only specialist subagents do the judgment work, and one orchestrating skill performs every write, so no subagent changes a file on its own.

How is AI-authored content kept accurate?

Layered defense: canonical facts live in a knowledge base, deterministic validators re-check them, a fact-checking pass walks every cited claim back to its source, and a provenance stamp records the run. A claim that cannot be substantiated does not ship.

Can I see the pipeline source code?

No. The architecture is public; the recipe is not. Validator logic, agent instructions, and knowledge-base contents stay private, which is enough to confirm the design is real and not enough to clone it.

What can an outsider verify independently?

The case study, the engineering trust paper, and the structured data embedded in this page source. The validator counts and provenance stamps are internal, since the repository is private, so those you take on faith.

Is Ready Solutions AI affiliated with Anthropic?

No. Ready Solutions AI is an independent consultancy, not partnered with, affiliated with, or endorsed by Anthropic. It works in Claude, Claude Code, and the Model Context Protocol as an independent practitioner.

Sources

Tier	Source	URL	Published
1	Anthropic, Claude Code subagents documentation	code.claude.com/docs/en/sub-agents	continuous
2	InfoQ, Claude Code subagents (independent editorial)	www.infoq.com/news/2025/08/claude-code-subagents/	2025-08-19
1	Anthropic, Claude Code skills documentation	code.claude.com/docs/en/skills	continuous
1	Anthropic, Claude Code hooks documentation	code.claude.com/docs/en/hooks	continuous
1	W3C, PROV overview	www.w3.org/TR/prov-overview/	2013-04-30
1	NIST, AI 600-1 Generative AI Profile	nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf	2024-07-26
1	PROV-AGENT, provenance for agentic workflows (arXiv 2508.02866)	arxiv.org/abs/2508.02866	2025-08-04
1	Model Context Protocol, architecture	modelcontextprotocol.io/docs/learn/architecture	continuous
2	AWS Prescriptive Guidance, software delivery with agentic AI	docs.aws.amazon.com/prescriptive-guidance/latest/strategy-operationalizing-agentic-ai/software-delivery.html	continuous

Read the pipeline case study and the engineering trust paper.
