
Documentation hydration and rehydration design

Canonical design for the Forge documentation hydration pipeline: how Markdown content seeds become reviewable multi-surface proposals, how humans promote approved material into owning repositories, and how teams…

Updated 2026-07-03

Operator handbook: Documentation hydration agent

Contracts: schemas/doc_hydration_request.v1.schema.json, schemas/doc_hydration_result.v1.schema.json

Sample seed: samples/doc_hydration_seed.sample.md

Executive capsule

Forge hydration turns one well-written idea into governed documentation across the whole ecosystem — blog, architecture handbook, methodology, product guides — without letting generation decide what is true. Humans promote; the pipeline proposes, traces, and diffs. In v2 the same machinery becomes a documentation intelligence layer: registries know where every page came from, claims carry maturity and ownership, and drift between sources and published pages drives a reviewable rehydration backlog.

The problem this solves

Documentation across a multi-repo ecosystem decays in two directions at once: good ideas never reach all the surfaces where readers need them, and published pages silently drift from the sources that justified them. Hydration solves the first (one seed, many governed surfaces); rehydration and the claim/drift machinery solve the second (diff-driven refresh instead of rot).

How it fits the ecosystem

Forge Platform owns the schemas, registries, routing vocabulary, and this design. Execution belongs elsewhere: LCDL runs governed classification/drafting tasks, forge-workcells hosts the doc_hydration_worker runtime, Fleet provides the approved execution rail for build/link/scorecard jobs, and Lenses is where humans review packs and record decisions. Kitchensink's forge-autodoc emits docs_source_map.json at build time so every site exposes its lineage.

Vocabulary

Term	Meaning
Content seed	A Markdown file with YAML frontmatter that captures one idea, narrative, or field note before it is split across Forge surfaces.
Hydration	First pass: parse seed → route to surfaces → write review artifacts under `forge-platform/docs/hydration-runs/`.
Rehydration	Repeat pass on the same or updated seed: regenerate review artifacts, diff against prior run and promoted canonical docs, decide what to re-promote.
Surface	A destination family (blog, platform architecture, methodology, product handbook) with a canonical repo root.
Promotion	Human-approved copy from hydration output into a canonical repository, followed by that repo's normal build and deploy.
Trust boundary	Hydration never writes directly to destination repos in v1; all generated material is review-only until promoted.

Hydration is proposal generation. Rehydration is controlled refresh when the source of truth (seed) or the ecosystem (routing tables, surface ownership) evolves.

Design goals

Split one idea across the Forge ecosystem without collapsing Platform, SDLC, Blueprints, Lenses, LCDL, and Fleet into one monolithic doc.
Keep human judgment in the loop — routing is scored, not proved; promotion is explicit.
Preserve traceability — every promoted page should link back to hydration_source and seed metadata where practical.
Support batch packs — many related seeds (e.g. Agentic SDLC standout series) hydrate, review, and promote as a set.
Enable safe rehydration — updated seeds can refresh drafts without silently overwriting canonical narrative.

Pipeline overview

Markdown seed (.md)
       |
       v
+---------------------------+
| Normalize request         |  forge.doc_hydration_request.v1
| (frontmatter + excerpt)   |  doc-hydration-request.json
+---------------------------+
       |
       v
+---------------------------+
| Route to surfaces         |  deterministic scorer (tags, uses, title, body)
| (destination candidates)  |
+---------------------------+
       |
       v
+---------------------------+
| Generate review artifacts |  forge.doc_hydration_result.v1
|                           |  hydration-brief.md
|                           |  drafts/<surface>/<slug>.md
|                           |  promotion-checklist.md
+---------------------------+
       |
       v  (human review)
+---------------------------+
| Promote to canonical repo |  one commit per owning repo
| Build + deploy handbooks  |
+---------------------------+

v1 CLI: python3 scripts/doc_hydration_agent.py <seed.md>

v1 promotion helper (batch packs): python3 scripts/promote_hydration_pack.py

Default output directory: docs/hydration-runs/<seed-slug>/

Batch review hub example: docs/hydration-runs/forge-agentic-sdlc-standout-pack/REVIEW-INDEX.md

Content seed Markdown design

Seeds are the authoritative narrative input for hydration and rehydration. Canonical promoted pages are downstream; when in doubt, edit the seed and rehydrate.

Required shape

---
title: "Human-readable title"
status: "content-seed"
intended_uses:
  - article creation
  - blog post drafting
  - architecture documentation hydration
tags: [agentic-sdlc, governance, control-plane]
---

# Same or shortened title

Body sections...

Recommended frontmatter fields

Field	Purpose
`title`	Primary routing signal and draft headline.
`status`	`content-seed` for new material; `content-seed-revised` when rehydrating after edits.
`standout_area`	Optional series index (e.g. `01`–`14`) for batch packs.
`intended_uses`	Declares audience intent — blog, architecture, methodology, knowledge, etc.
`tags`	Ecosystem vocabulary for surface scoring.
`source_note`	Provenance (chat export, zip bundle, certification prep).
`hydration_pack`	Optional pack id (e.g. `forge-agentic-sdlc-standout-pack`) for batch runs.
`prior_hydration_run`	Optional path to previous `hydration-result.json` when rehydrating.

Recommended body sections (standout / strategy seeds)

Structured sections make promotion predictable and surface-specific trimming easier:

Section	Typical surfaces
Core thesis	All surfaces
Condensed thought	Blog, LCDL guides
Why it stands out	Blog, Lenses, Fleet
Forge ecosystem hooks	Platform architecture, Blueprints
Architecture implications	Platform, product handbooks
Blog post seed paragraph	Blog only
Risks and counterarguments	Platform, methodology
Idea evolution questions	Blueprints methodology

The promotion script (promote_hydration_pack.py) maps these sections per surface. Custom seeds should follow the same headings when targeting multiple repos.

Parser support (v1)

The CLI accepts:

Quoted and unquoted YAML-like frontmatter strings
Inline tag lists: [a, b, c]
Indented string lists under intended_uses: and similar keys
Optional standout_area, source_note, and extension fields (passed through to JSON)
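The accepted shapes above can be illustrated with a small parser. This is a minimal sketch, not the code in `scripts/doc_hydration_agent.py`: the function names `parse_seed` and `_unquote` are hypothetical, and it assumes every frontmatter line is either `key: value`, `key:` followed by indented `- item` lines, or `key: [a, b, c]`.

```python
def _unquote(value: str) -> str:
    """Strip whitespace and one pair of matching surrounding quotes."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


def parse_seed(text: str) -> tuple[dict, str]:
    """Split a seed into (frontmatter dict, body). Hypothetical helper."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("seed must start with a '---' frontmatter fence")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter fence is never closed")

    meta: dict = {}
    current_key = None
    for raw in lines[1:end]:
        stripped = raw.strip()
        if not stripped:
            continue
        # Indented "- item" lines extend the list opened by the previous key.
        if raw[0] in " \t" and stripped.startswith("- "):
            if not isinstance(meta.get(current_key), list):
                raise ValueError(f"list item without a list key: {raw!r}")
            meta[current_key].append(_unquote(stripped[2:]))
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {raw!r}")
        current_key = key.strip()
        value = value.strip()
        if not value:
            meta[current_key] = []  # items expected on the following lines
        elif value[0] == "[" and value[-1] == "]":
            # Inline list such as [a, b, c]
            meta[current_key] = [_unquote(v) for v in value[1:-1].split(",") if v.strip()]
        else:
            # Quoted or unquoted scalar; unknown keys pass through unchanged.
            meta[current_key] = _unquote(value)
    return meta, "\n".join(lines[end + 1:]).strip()
```

Because unknown keys are kept as-is, extension fields survive into whatever JSON the caller serializes from the returned dict.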

Surface routing design

Routing follows Forge source-of-truth ownership. One seed may propose multiple surfaces; each surface gets its own draft with audience-appropriate framing.

Surface key	Canonical root	When to route
`forgesdlc_blog`	`forgesdlc/blog/`	Public product narrative, positioning, adoption story
`forge_platform_architecture`	`forge-platform/docs/`	Control plane, workcells, contracts, boundaries
`blueprints_methodology`	`blueprints/sdlc/`	Reusable process and framework guidance
`forgesdlc_knowledge`	`forgesdlc/knowledge/`	Encyclopedia-style conceptual depth
`forge_lcdl_docs`	`forge-lcdl/docs/`	Governed LLM tasks, structured output, evals
`forge_lenses_docs`	`forge-lenses/docs/`	Workspace visibility, evidence, approvals
`forge_fleet_docs`	`forge-fleet/docs/`	Controlled job execution, operator runbooks

Scoring inputs (deterministic v1)

tags — keyword overlap with surface vocabularies
intended_uses — explicit hydration / blog / architecture hints
title — product vs process vs operator language
body_excerpt — first ~500 words for mechanism terms (Fleet, LCDL, Versona, etc.)
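To make the four inputs concrete, here is a sketch of a deterministic keyword-overlap scorer. The vocabularies, weights, threshold, and function names are all hypothetical placeholders; the real routing tables are owned by Forge Platform and are not reproduced here.

```python
import re

# Hypothetical vocabularies; the real ones live in the Forge routing tables.
SURFACE_VOCAB = {
    "forgesdlc_blog": {"adoption", "positioning", "narrative", "story"},
    "forge_platform_architecture": {"control-plane", "workcell", "contract", "boundary"},
    "forge_fleet_docs": {"fleet", "job", "runbook", "operator"},
}
# Hypothetical weight per scoring input.
WEIGHTS = {"tags": 3, "intended_uses": 2, "title": 2, "body_excerpt": 1}


def _tokens(text: str) -> set[str]:
    """Lowercase word tokens; hyphenated terms stay whole."""
    return set(re.findall(r"[a-z0-9][a-z0-9-]*", text.lower()))


def score_surfaces(meta: dict, body: str, excerpt_words: int = 500) -> dict[str, int]:
    """Weighted count of vocabulary hits per surface. No randomness, no model call."""
    excerpt = " ".join(body.split()[:excerpt_words])
    signals = {
        "tags": _tokens(" ".join(meta.get("tags", []))),
        "intended_uses": _tokens(" ".join(meta.get("intended_uses", []))),
        "title": _tokens(meta.get("title", "")),
        "body_excerpt": _tokens(excerpt),
    }
    return {
        surface: sum(WEIGHTS[name] * len(tokens & vocab) for name, tokens in signals.items())
        for surface, vocab in SURFACE_VOCAB.items()
    }


def candidates(scores: dict[str, int], threshold: int = 3) -> list[str]:
    """Surfaces at or above the threshold, best first, ties broken by name."""
    return sorted((s for s, v in scores.items() if v >= threshold),
                  key=lambda s: (-scores[s], s))
```

The same seed always yields the same scores, which is what lets a reviewer compare routing between a hydration run and a later rehydration run.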

Status outcomes

`hydration-result.json` status	Meaning
`drafted`	One or more destination candidates with drafts
`needs_review`	Seed too meta/broad (index, README, compendium) — human decides manually
`partial`	Some surfaces drafted, others blocked
`failed`	Parse or contract error

Meta documents (pack index, README, compendium) often land in needs_review with zero candidates — by design, not failure.
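The status table can be read as a small decision function. The sketch below is an interpretation, not the agent's actual logic: `derive_status` and its parameters are hypothetical, and mapping a run with blocked surfaces but no drafts to `needs_review` is an assumption the table does not settle.

```python
def derive_status(drafted: list[str], blocked: list[str], parse_error: bool = False) -> str:
    """Map routing outcomes to a hydration-result status (hypothetical helper)."""
    if parse_error:
        return "failed"          # parse or contract error
    if drafted and blocked:
        return "partial"         # some surfaces drafted, others blocked
    if drafted:
        return "drafted"         # one or more candidates with drafts
    # Zero candidates (or, by assumption, blocked-only): a human decides.
    return "needs_review"
```

A pack index or README that routes nowhere falls through to `needs_review`, matching the note that this outcome is by design.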

Human-oriented draft pattern

Every hydration-brief.md and generated draft uses this order so material stays readable before exposing architecture depth:

User problem — what the reader is struggling to explain or decide
Outcome — what clarity or decision the doc should enable
Forge mechanism — which Platform / SDLC / product pieces apply
Trust boundary — what automation must not claim; where humans approve
Next action — CTA: read, assess, adopt pattern, open Lenses, run Fleet job, etc.

Artifacts per hydration run

File	Schema / type	Role
`doc-hydration-request.json`	`forge.doc_hydration_request.v1`	Normalized seed metadata
`hydration-result.json`	`forge.doc_hydration_result.v1`	Routing, status, draft paths
`hydration-brief.md`	Prose	Human summary, rationale, risks
`drafts/<surface>/<slug>.md`	Markdown	Review-only surface drafts
`promotion-checklist.md`	Prose	Pre-promotion gate

For batch packs, add:

File	Role
`REVIEW-INDEX.md`	Reading order and links to all seed runs
`_batch-summary.jsonl`	Machine-readable batch status
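A reviewer-side completeness check over a run folder could look like the sketch below. The helper name `missing_artifacts` is hypothetical, and it assumes `hydration-result.json` carries a top-level `status` key; drafts are only required when that status is `drafted` or `partial`, since `needs_review` runs may have zero candidates.

```python
import json
from pathlib import Path

# File names taken from the per-run artifact table.
REQUIRED_FILES = [
    "doc-hydration-request.json",
    "hydration-result.json",
    "hydration-brief.md",
    "promotion-checklist.md",
]


def missing_artifacts(run_dir) -> list[str]:
    """Return the expected artifacts that are absent from a hydration run folder."""
    run = Path(run_dir)
    missing = [name for name in REQUIRED_FILES if not (run / name).is_file()]

    result_path = run / "hydration-result.json"
    if result_path.is_file():
        # Assumption: the result document exposes its status as a top-level key.
        status = json.loads(result_path.read_text(encoding="utf-8")).get("status")
        has_drafts = any(run.glob("drafts/*/*.md"))
        if status in {"drafted", "partial"} and not has_drafts:
            missing.append("drafts/<surface>/<slug>.md")
    return missing
```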

Promotion design

Promotion is always manual in v1 (optionally scripted after human approval).

Steps

Read hydration-brief.md and promotion-checklist.md.
Pick the draft that matches surface ownership (do not promote blog tone into Blueprints verbatim).
Copy or merge into the canonical path in the owning repo (not forge-platform hydration-runs).
Add frontmatter traceability when useful:
hydration_source: <pack-or-run-id>
status: promoted (or leave as handbook default)
Update surface index (blog/index.md, docs/standout/index.md, etc.).
Run the repo build (build-site.py, build-handbook.py, …).
Bump website submodule pointers and deploy.

One commit per repo

Commit promotion changes separately in each owning repo. Never bundle changes for forgesdlc, blueprints, forge-platform, and product repos into a single commit.

Promoted layout (standout pack convention)

Surface	Promoted path pattern
Blog	`forgesdlc/blog/<slug>.md`
Platform	`forge-platform/docs/standout/<slug>.md`
Blueprints	`blueprints/sdlc/methodologies/forge/standout/<slug>.md`
LCDL	`forge-lcdl/docs/guides/standout/<slug>.md`
Lenses	`forge-lenses/docs/forge/standout/<slug>.md`
Fleet	`forge-fleet/docs/learn-101/standout/<slug>.md`
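The path patterns above, combined with the one-commit-per-repo rule, can be sketched as a resolver that groups targets by owning repo. This is illustrative only: `promoted_path` and `group_by_repo` are hypothetical names, and treating the first path segment as the owning repo is an assumption drawn from the table.

```python
from pathlib import PurePosixPath

# Directory part of each promoted path pattern from the table.
PROMOTED_ROOTS = {
    "Blog": "forgesdlc/blog",
    "Platform": "forge-platform/docs/standout",
    "Blueprints": "blueprints/sdlc/methodologies/forge/standout",
    "LCDL": "forge-lcdl/docs/guides/standout",
    "Lenses": "forge-lenses/docs/forge/standout",
    "Fleet": "forge-fleet/docs/learn-101/standout",
}


def promoted_path(surface: str, slug: str) -> PurePosixPath:
    """Resolve <root>/<slug>.md for a surface; reject slugs that could escape the root."""
    if not slug or "/" in slug or slug in {".", ".."}:
        raise ValueError(f"invalid slug: {slug!r}")
    return PurePosixPath(PROMOTED_ROOTS[surface]) / f"{slug}.md"


def group_by_repo(targets: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (surface, slug) targets by owning repo so each repo gets its own commit."""
    by_repo: dict[str, list[str]] = {}
    for surface, slug in targets:
        path = promoted_path(surface, slug)
        # Assumption: the first path segment names the owning repo.
        by_repo.setdefault(path.parts[0], []).append(str(path))
    return by_repo
```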

Rehydration workflow

Rehydration applies when:

The seed Markdown changed (new claims, revised honesty, updated hooks)
Routing policy changed (new surface, renamed canonical root)
Canonical docs drifted from seed and you want a fresh diff baseline
A batch pack gets a v2 drop (e.g. new standout areas 15–16)

Rehydration procedure

Edit the seed (or add a new seed file); set status: content-seed-revised and optional prior_hydration_run.
Re-run hydration into a new or same run folder:
python3 scripts/doc_hydration_agent.py path/to/seed.md \
  --out-dir docs/hydration-runs/<pack>/<NN-seed-slug>
Diff new hydration-result.json and drafts against:
Previous run in docs/hydration-runs/…
Promoted files in canonical repos (git diff, or compare hydration_source blocks)
Review hydration-brief.md for routing changes (new surfaces, dropped candidates).
Promote selectively: only the changed sections, or a full replace per surface. Do not blindly overwrite human edits in canonical repos without review.
Rebuild and deploy affected handbooks only.
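The diff step in the procedure above can be approximated with the standard library when `git diff` is not convenient, for example when the draft and the promoted page live in different repos. A minimal sketch; `draft_diff` is a hypothetical helper, not part of the v1 scripts.

```python
import difflib


def draft_diff(promoted: str, draft: str,
               promoted_name: str = "canonical", draft_name: str = "draft") -> str:
    """Unified diff from the promoted page to the new draft; empty string if identical."""
    return "".join(
        difflib.unified_diff(
            promoted.splitlines(keepends=True),
            draft.splitlines(keepends=True),
            fromfile=promoted_name,
            tofile=draft_name,
        )
    )
```

An empty result means the surface needs no re-promotion; anything else goes to a human, in line with the rule that rehydration never auto-promotes.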

Rehydration rules

Rule	Rationale
Seeds win for narrative intent	Canonical docs are published views; seeds hold the evolving story.
Canonical docs win for repo-specific facts	URLs, commands, schema ids, and build steps must match the owning repo.
Never auto-promote on rehydrate in v1	Same trust boundary as initial hydration.
Keep `hydration_source` stable across promotes	Enables tracing which pack/run produced a page.
Batch rehydrate via `REVIEW-INDEX.md`	Review series coherence before piecemeal promotion.

When not to rehydrate

Typo fixes in one canonical page only → edit canonical repo directly.
Pure handbook template/nav churn → kitchensink / generator fix, not seed rehydration.
Emergency factual correction → patch canonical doc; backport to seed if the seed is still active.

Claim lifecycle (v2)

Hydration v2 treats individual claims — capability facts, architecture directions, positioning statements, process guidance — as governed objects in docs-governance/claim_registry.jsonl (forge.doc_claim.v1).

States

State	Meaning	Who moves it
`candidate`	Extracted by `scripts/extract_claims.py` or an LCDL task; not yet trusted	extractor
`needs_owner_review`	Flagged for a named owner (e.g. overclaim risk detected)	planner / reviewer
`accepted`	Owner agrees the claim is directionally true	owner
`canonical_fact`	Backed by T0 evidence; may support published pages	owner
`deprecated`	No longer true; pages citing it must rehydrate	owner
`conflicted`	Two sources disagree; publication blocked pending a conflict report	planner
`blocked`	Explicitly barred from publication	owner

Rules

Claim extraction is deterministic first (sentence heuristics over capability verbs and maturity words); LCDL classification may refine claim_type and maturity_signal but never sets canonical_fact.
Every hydration run emits claim-inventory.json (forge.claim_inventory.v1) and route-confidence.json alongside the v1 artifacts.
A page's frontmatter claim_ids ties it to registry claims; deprecating a claim puts every citing page on the rehydration backlog.
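The "who moves it" column and the deprecation rule can be sketched as a guard plus a lookup. The record fields (`state`, `claim_id`) and the function names are assumptions for illustration; the actual `forge.doc_claim.v1` schema is not shown on this page.

```python
# Who may move a claim INTO each state, per the States table.
MOVERS = {
    "candidate": {"extractor"},
    "needs_owner_review": {"planner", "reviewer"},
    "accepted": {"owner"},
    "canonical_fact": {"owner"},
    "deprecated": {"owner"},
    "conflicted": {"planner"},
    "blocked": {"owner"},
}


def move_claim(claim: dict, new_state: str, actor: str) -> dict:
    """Return a copy of the claim in new_state, or raise if the actor may not set it."""
    if new_state not in MOVERS:
        raise ValueError(f"unknown claim state: {new_state!r}")
    if actor not in MOVERS[new_state]:
        # For example, an extractor or LLM task can never set canonical_fact.
        raise PermissionError(f"{actor!r} may not move a claim to {new_state!r}")
    return {**claim, "state": new_state}


def pages_to_rehydrate(claim_id: str, page_claims: dict[str, list[str]]) -> list[str]:
    """Pages whose frontmatter claim_ids cite the given claim.

    page_claims maps a page path to its claim_ids list (hypothetical shape).
    """
    return sorted(page for page, ids in page_claims.items() if claim_id in ids)
```

After an owner moves a claim to `deprecated`, the pages returned by `pages_to_rehydrate` are the ones that would land on the rehydration backlog.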

Drift triggers (v2)

scripts/detect_drift.py emits docs-governance/drift_report.json (forge.doc_rehydration_diff.v1) and appends non-info findings to docs-governance/rehydration_backlog.jsonl. Triggers:

Finding	Comparison	Severity
`seed_page_drift`	Seed body hash + section headings vs promoted page	info
`maturity_claim_mismatch`	Registry maturity vs page maturity wording; deprecated/conflicted claims still on pages	warning–critical
`stale_review`	Registry `last_reviewed` vs source file mtime	warning
`missing_source_file`	Registry entry vs disk	critical
`unregistered_page` / `orphan_registry_entry`	`scripts/build_content_registry.py --check`	warning
`external_claim_expired`	T3/T4 claims past `review_by`	warning

Rehydration in v2 is diff-driven: a rehydration run consumes backlog entries and emits claim_delta.json plus diff_against_canonical.md, so the reviewer sees exactly what changed and why before any promotion.
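One possible reading of the `seed_page_drift` comparison is sketched below. It is an interpretation, not the logic of `scripts/detect_drift.py`: it assumes the seed body hash was recorded at promotion time (the `promoted_seed_hash` argument is hypothetical) and reports seed headings that no longer appear on the promoted page.

```python
import hashlib
import re

_FRONTMATTER = re.compile(r"\A---\n.*?\n---\n", re.S)
_HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.M)


def body_of(markdown: str) -> str:
    """Markdown without its leading frontmatter block."""
    return _FRONTMATTER.sub("", markdown, count=1).strip()


def body_hash(markdown: str) -> str:
    return hashlib.sha256(body_of(markdown).encode("utf-8")).hexdigest()


def headings(markdown: str) -> list[str]:
    return _HEADING.findall(body_of(markdown))


def seed_page_drift(seed_md: str, page_md: str, promoted_seed_hash: str):
    """Return an info-level finding dict, or None when nothing drifted."""
    page_headings = set(headings(page_md))
    missing = [h for h in headings(seed_md) if h not in page_headings]
    seed_changed = body_hash(seed_md) != promoted_seed_hash
    if not seed_changed and not missing:
        return None
    return {
        "finding": "seed_page_drift",
        "severity": "info",
        "seed_changed": seed_changed,
        "headings_missing_from_page": missing,
    }
```

Since surface drafts are trimmed on purpose, a missing heading is only a hint for the reviewer, which fits the `info` severity in the table.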

Trust boundary and risks

Generated material may:

Overstate implementation readiness (defined vs demonstrated vs vision)
Route process guidance into product blog surfaces
Route product claims into Blueprints methodology
Omit required cross-links or build steps
Invent claims not present in the seed

promotion-checklist.md exists to force explicit checks before any promote or re-promote.

Knowledge events and planning (v2)

Change signals from registered sources land in an append-only ledger (docs-governance/events/*.jsonl, forge.knowledge_event.v1). Three shipped adapters feed it: adapter_markdown_repo.py (git-diff watcher over T0 repos), adapter_design_doc.py (design-wiki status transitions: accepted → rehydrate, superseded/rejected → deprecate + rehydrate), and adapter_external_note.py (manual T3/T4 capture with observation date and auto-expiry). In addition, adapter_search_log.py imports search misses. The quarantine gate is absolute: events from unregistered or paused sources are recorded as QUARANTINED and never planned.

scripts/hydration_planner.py turns un-planned events plus the claim and content registries into reviewable forge.doc_hydration_plan.v1 records with an 11-decision enum (no_action … block_due_to_conflict). When registered sources disagree, the planner emits block_due_to_conflict and writes docs-governance/conflicts/<id>.md; nothing drafts while a conflict is open.
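The quarantine gate described above can be sketched as a filter that records every event but only forwards events from active registered sources. Field names (`source_id`, `disposition`), the `ACCEPTED` label, and the shape of the source registry are hypothetical; only `QUARANTINED` comes from this page.

```python
def gate_events(events: list[dict], sources: dict[str, str]) -> tuple[list[dict], list[dict]]:
    """Split events into (ledger_records, plannable).

    sources maps a registered source id to "active" or "paused" (assumed shape);
    an id missing from the mapping is treated as unregistered.
    """
    ledger, plannable = [], []
    for event in events:
        if sources.get(event.get("source_id")) == "active":
            record = {**event, "disposition": "ACCEPTED"}  # hypothetical label
            plannable.append(record)
        else:
            # Unregistered or paused: still recorded, never handed to the planner.
            record = {**event, "disposition": "QUARANTINED"}
        ledger.append(record)
    return ledger, plannable
```

Keeping quarantined events in the ledger preserves the append-only audit trail while guaranteeing the planner never sees them.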

Workcell path (v2, shipped)

The doc_hydration_worker workcell is implemented in forge-workcells:

It consumes a forge.workcell_request.v1 (launch_pack.seed_path, optional target_surface, persona, use_llm).
Deterministic claim extraction runs by default. With use_llm, the governed forge-lcdl tasks doc_claim_extraction and doc_risk_classification (plus doc_draft_expansion for drafts) run instead. The LLM classifies and drafts but never decides truth: all claims stay candidate.
It emits hydration-brief.md, claim-inventory.json (forge.claim_inventory.v1), and a forge.workcell_result.v1 carrying the caller's frun_* id with status: needs_approval.
forge-lenses presents review packs read-only at GET /api/doc-hydration/review-packs; the reviewer decision manifest is the approval record (writeback is a later increment — vision).
Docs build / link-check / scorecard runs execute on forge-fleet as template-only docker_argv jobs (docs/examples/doc-hydration/ in the fleet repo).

See the workcell catalog for the envelope contract and promotion gates.

Telemetry and the adaptive loop (v2)

Per-site artifacts (link_report.json, orphan_pages.json, stale_claims.json, persona_coverage.json) come from scripts/site_docs_artifacts.py; scripts/docs_scorecard.py aggregates them into a weighted 8-dimension scorecard with build-to-build deltas (printed in the deploy summary). Scorecard regressions, expired claims, link rot, drift findings, and agent_retrieval_eval.py misses all append to docs-governance/rehydration_backlog.jsonl automatically.
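The weighted aggregation and build-to-build delta can be sketched as follows. The dimension names and weights used with it are placeholders supplied by the caller; this page does not list the eight real dimensions, and the function names are hypothetical.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-dimension scores."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"scores missing dimensions: {sorted(missing)}")
    return sum(scores[dim] * w for dim, w in weights.items()) / total


def scorecard_delta(prev: dict[str, float], curr: dict[str, float],
                    weights: dict[str, float]) -> dict:
    """Overall delta plus the dimensions that regressed since the previous build."""
    regressions = sorted(dim for dim in weights if curr[dim] < prev[dim])
    return {
        "overall_delta": weighted_score(curr, weights) - weighted_score(prev, weights),
        "regressions": regressions,
    }
```

Reporting regressions per dimension matters because an improved overall score can hide a dimension that got worse, and it is the regressions that feed the backlog.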

scripts/hydration_cycle.py is the one-command PDCA loop: ingest → resolve → plan (including archive proposals for pages unreviewed > 365 days) → report (quality gates G1–G8, scorecard, retrieval eval). The cycle only proposes: it never promotes, drafts, or archives on its own. Operating rhythm: docs-governance/OPERATING-CADENCE.md.
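The archive-proposal rule in the plan step (pages unreviewed for more than 365 days) reduces to a date comparison. A minimal sketch; `archive_proposals` and the mapping of page path to last review date are hypothetical, and the function only proposes, consistent with the cycle never archiving on its own.

```python
from datetime import date, timedelta


def archive_proposals(last_reviewed: dict[str, date], today: date,
                      max_age_days: int = 365) -> list[str]:
    """Pages whose last review is strictly more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(page for page, reviewed in last_reviewed.items() if reviewed < cutoff)
```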

Who this is for

Architects and platform engineers evaluating Forge (evaluate stage). Executives can stop after the core thesis; agents should honor the agent contract in the page frontmatter.

Evidence and maturity

Maturity: defined — this page records design intent captured from the Agentic SDLC standout analysis and reviewed for the platform handbook. Where behavior is shipped and observable it is called out explicitly; everything else should be read as direction, not commitment. Evidence trail: the hydration pack seed, the content registry entry for this content_id, and the claim registry entries linked via claim_ids.

How to use this page

Evaluating Forge? Read the core thesis and ecosystem hooks, then continue with the standout index.
Designing against the platform? Verify boundaries in the platform reference architecture before implementation.
Automating? Use the frontmatter agent_contract and cite content_id in generated output.

Documentation hydration agent — how to run the CLI
Workcell contracts
Platform reference architecture
Agentic SDLC standout index — example promoted pack
Hydration review index — example batch run

Agent contract

Allowed actions

summarize this page for evaluation questions
cross-reference claims via claim_ids and source_refs

Safe to infer

the architectural boundary described here is the intended design

Do not infer

production readiness beyond the stated maturity level
specific vendor, customer, or benchmark commitments

Key artifacts

forge-platform/docs-governance/content_registry.yaml

Machine-readable guidance from this page's frontmatter: what automated consumers may do, infer, and must not infer.
