<!-- linear: KLAIR-2711, KLAIR-2712, KLAIR-2713, KLAIR-2714, KLAIR-2715 -->
## Summary
- Rewrites the 5 typed narrative-emitting Board Doc generators (Prior Quarter Review, GM Commentary, Product Detail, Other Products, CF Plan) as LLM-first under a shared SectionRefreshContext contract.
- Drops the deterministic spec.user_commentary / spec.product_commentary reads that were dead in the 4.0 entry path and forced everything through the generate_custom_section fallback.
- Round-2 review-fix pass (874adc6a9): exception-handling contract lifted to the dispatcher (retry loop + retry-exhaustion preservation + typed retryable set), prompt-fragment dedup to prompts.py, narrative-subsection parser exact-match fix.
## Why it's needed
The 4.0 entry path (BU + quarter + doc → brainlift → DocumentEditor) collapsed the 10-step wizard into 2 screens. Every wizard step that authored content into spec.user_commentary / spec.bu_mips (evaluate_prior_goals, current_quarter_goals, gm_commentary phase, per-product commentary, BU MIPs review) is now unreachable from the 4.0 FE.
But 5 of the 9 narrative-emitting generators still read those fields as their primary content source. They return empty strings for every 4.0 session — both fresh-from-template AND clone-from-prior. The B8.2 LLM-fallback in _regenerate_section papers over this for sections the user actively regenerates, but doesn't fix:
- First-publish path: initial generate_all_sections run produces empty narrative sections that the user has to regenerate by hand.
- Mixed-bucket generators (generate_product_detail, generate_cf_plan): table halves render, narrative halves silently empty.
- Architectural debt: two inert content-source fields linger in DocumentSpec, and every "fix" we add without the architectural cut is another safety net layered on dead code.
See klair-api/budget_bot/board_doc/BACKLOG.md § B9 for the full audit, migration plan, and per-generator scope.
## Changes
Foundation (new in section_generators.py):
- SectionRefreshContext NamedTuple — current_content / findings_block / full_doc_block, all defaulting to "".
- 4 section-specific system prompts (_PQR_SYSTEM, _GM_COMMENTARY_SYSTEM, _PRODUCT_NARRATIVE_SYSTEM, _MINOR_NARRATIVE_SYSTEM) sharing the D2.7 "organized skepticism applied politely" disposition framing; CF Plan keeps its existing prompts.CF_PLAN_SYSTEM.
- B9_OUTPUT_RULES + B9_SCOPE_DISCIPLINE shared prompt fragments in prompts.py (single source of truth across all 5 B9 generators including CF_PLAN_SYSTEM).
- _build_b9_user_message shared user-message assembler with canonical block ordering (header → brainlift → data → existing content → findings → full doc → user focus).
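
The context contract and block ordering above can be sketched roughly as follows. This is a minimal illustration, not the code in section_generators.py: the function name build_user_message, its parameters, the block labels, and the choice to skip empty blocks are all assumptions made for the example. Only the three SectionRefreshContext fields, their "" defaults, and the block order come from this PR.

```python
from typing import NamedTuple


class SectionRefreshContext(NamedTuple):
    """Optional refresh inputs; every field defaults to the empty string."""

    current_content: str = ""
    findings_block: str = ""
    full_doc_block: str = ""


def build_user_message(
    header: str,
    brainlift: str,
    data_block: str,
    context: SectionRefreshContext = SectionRefreshContext(),
    user_focus: str = "",
) -> str:
    """Join the blocks in canonical order (hypothetical helper)."""
    # Order: header, brainlift, data, existing content, findings, full doc, user focus.
    # The labels are illustrative; the real prompt wording is not shown in this PR.
    blocks = [
        ("", header),
        ("BRAINLIFT", brainlift),
        ("DATA", data_block),
        ("EXISTING CONTENT", context.current_content),
        ("FINDINGS", context.findings_block),
        ("FULL DOCUMENT", context.full_doc_block),
        ("USER FOCUS", user_focus),
    ]
    parts = []
    for label, body in blocks:
        body = body.strip()
        if not body:
            continue  # assumption: empty blocks are omitted rather than rendered
        parts.append(f"## {label}\n{body}" if label else body)
    return "\n\n".join(parts)
```

Because every context field defaults to "", a caller with no refresh context can pass nothing and still get a well-formed message containing only the header, brainlift, and data blocks.
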
Generators rewritten LLM-first (5):
| Generator | Scope | Linear |
|---|---|---|
| generate_prior_quarter_review | full LLM rewrite, reference impl | KLAIR-2711 |
| generate_gm_commentary | full LLM rewrite | KLAIR-2712 |
| generate_product_detail | tables stay deterministic, narrative subsection LLM-drafted via _draft_product_narrative (with already-built tables threaded through, no double-compute) | KLAIR-2713 |
| generate_minor_products_summary | ARR table stays, narrative subsection LLM-drafted via _draft_minor_narrative | KLAIR-2714 |
| generate_cf_plan | approved-MIPs branch unchanged; no-MIPs branch swaps dead goals_review read for the standard B9 context | KLAIR-2715 |
Dispatcher (generate_section) — round-2 retry / fallback contract:
- Accepts an optional context: SectionRefreshContext | None kw-only arg.
- _B9_CONTEXT_AWARE set routes the context through only to B9-aware generators; pre-B9 generators (FINANCIALS, MIPS, CUSTOM, EXEC_SUMMARY) silently bypass.
- 2-attempt retry loop catches only the typed retryable set: anthropic.APIError, httpx.HTTPError, asyncio.TimeoutError, ValueError (programming bugs like TypeError propagate so stack traces surface them).
- Empty / whitespace LLM result raises ValueError("Generator returned empty markdown") → retries → exhaustion-path preservation.
- On retry exhaustion: preserves context.current_content when non-empty (defends the operator's draft from destructive overwrite), else falls back to the existing placeholder. Always reports SectionResult(success=False) on failure so monitoring sees real LLM failures — round-1 reported success=True here, masking failures.
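
The retry and exhaustion contract described above could look something like the sketch below. It is an illustration under stated assumptions: TransportError stands in for anthropic.APIError and httpx.HTTPError so the example needs no third-party imports, and generate_with_retry, PLACEHOLDER, and the SectionResult fields are hypothetical names rather than the dispatcher's real signature.

```python
import asyncio
from typing import Awaitable, Callable, NamedTuple


class TransportError(Exception):
    """Stand-in for anthropic.APIError / httpx.HTTPError (assumption)."""


# Only these exception types are retried; anything else propagates.
RETRYABLE = (TransportError, asyncio.TimeoutError, ValueError)
MAX_ATTEMPTS = 2
PLACEHOLDER = "_Section could not be generated._"  # hypothetical placeholder text


class SectionResult(NamedTuple):
    markdown: str
    success: bool


async def generate_with_retry(
    generator: Callable[[], Awaitable[str]],
    current_content: str = "",
) -> SectionResult:
    for _attempt in range(MAX_ATTEMPTS):
        try:
            markdown = await generator()
            if not markdown.strip():
                # Empty or whitespace-only output counts as a retryable failure.
                raise ValueError("Generator returned empty markdown")
            return SectionResult(markdown, success=True)
        except RETRYABLE:
            continue  # a TypeError or similar bug is not caught here
    # Retries exhausted: keep the existing draft if there is one,
    # otherwise use the placeholder. Failure is always reported.
    fallback = current_content if current_content.strip() else PLACEHOLDER
    return SectionResult(fallback, success=False)
```

Keeping the retryable set narrow is the point of the design: a transient API or timeout error gets a second attempt, while a programming bug raises immediately with its stack trace intact.
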
Caller updates (wizard_orchestrator._regenerate_section):
- Builds SectionRefreshContext from session.generated_sections + _focused_section_findings_block + _full_doc_block and passes it through generate_section — only for B9-context-aware sections (guard on _B9_CONTEXT_AWARE), so non-B9 regenerations skip the non-trivial full-doc-block assembly.
- Deletes the GM-Commentary _draft_gm_commentary + defensive reconciliation special case (now redundant — the LLM-first generator handles its own drafting and there's no spec.user_commentary["gm_narrative"] read to reconcile against).
- Retains the B8.2 LLM-fallback branch pending B9.7 (Phase 2 deletion).
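
The caller-side guard can be pictured as below. This is a sketch only: build_refresh_context and its callable parameters are hypothetical, and the lowercase section IDs are assumed (only prior_quarter_review appears verbatim in the server logs quoted in this PR). The idea it shows is that the full-doc block is assembled lazily, so non-B9 regenerations never pay for it.

```python
from typing import Callable, Mapping, NamedTuple, Optional


class SectionRefreshContext(NamedTuple):
    current_content: str = ""
    findings_block: str = ""
    full_doc_block: str = ""


# Assumed string IDs for the five B9-aware sections.
B9_CONTEXT_AWARE = frozenset(
    {
        "prior_quarter_review",
        "gm_commentary",
        "product_detail",
        "minor_products_summary",
        "cf_plan",
    }
)


def build_refresh_context(
    section_id: str,
    generated_sections: Mapping[str, str],
    findings_block_for: Callable[[str], str],
    full_doc_block: Callable[[], str],
) -> Optional[SectionRefreshContext]:
    """Return a context for B9-aware sections, or None for everything else."""
    if section_id not in B9_CONTEXT_AWARE:
        # Guard first, so the expensive block builders are never invoked.
        return None
    return SectionRefreshContext(
        current_content=generated_sections.get(section_id, ""),
        findings_block=findings_block_for(section_id),
        full_doc_block=full_doc_block(),
    )
```
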
Tests:
- tests/board_doc/test_b9_narrative_generators.py — 25 tests across the 5 generators + the dispatcher-level retry/exhaustion contract + the _extract_prior_narrative_subsection parser (including a regression pin for the Cloud-vs-CloudSense substring cross-contamination caught in round-2 review).
- Updated pre-B9 deterministic-echo tests in test_wizard_orchestrator.py to match the new propagation contract.
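
The Cloud-vs-CloudSense regression pinned above comes down to matching a heading by equality rather than by substring. The sketch below illustrates that idea; it is not the real _extract_prior_narrative_subsection. The function name extract_subsection, the markdown heading format, and the rule that a subsection ends at the next heading of any level are all assumptions made for the example.

```python
import re

_HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*$")


def extract_subsection(markdown: str, product_name: str) -> str:
    """Return the body under the heading whose text equals product_name."""
    body = []
    capturing = False
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            if capturing:
                break  # the next heading ends the subsection
            # Exact comparison: a substring test such as
            # `product_name in heading` would let "Cloud" match "CloudSense".
            capturing = match.group(1) == product_name
            continue
        if capturing:
            body.append(line)
    return "\n".join(body).strip()
```

A test pinned on two products whose names share a prefix is enough to catch a regression back to substring matching.
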
## Breaking changes
None for callers — the generators keep their (section, data, spec) positional signature. The new context is kw-only with a None default.
Behaviour change: in the rare case where a session is still on a pre-4.0 wizard run that populated spec.user_commentary["gm_narrative"] / spec.user_commentary["goals_review"] / spec.product_commentary[name], regenerating those sections now drafts via LLM instead of echoing the wizard-authored text verbatim. The wizard-authored content still reaches the LLM through the full_doc_block context (the cloned-doc body carries it), so it grounds the refresh but is no longer reproduced word for word.
## Test plan
### Automated (run locally; all green on this branch)
- [x] uv run ruff format + uv run ruff check on changed files — clean.
- [x] uv run pyright on changed modules — 0 errors, 0 warnings.
- [x] uv run pytest tests/board_doc/test_b9_narrative_generators.py tests/board_doc/test_strip_leading_duplicate_heading.py tests/board_doc/test_wizard_orchestrator.py tests/board_doc/test_review_checks.py -q — 244 passed after round-2 review-fix commit (874adc6a9).
- [x] tests/board_doc/test_b9_narrative_generators.py — 25 tests covering all 5 rewritten generators + the dispatcher-level retry/exhaustion contract + the prior-narrative subsection parser. Reproduce with: uv run pytest tests/board_doc/test_b9_narrative_generators.py -v.
### Manual smoke (recommended before merging — touches the FE flow)
One smoke check is enough: PQR exercises the full B9 LLM-first contract end-to-end (SectionRefreshContext build → typed-generator dispatch → _PQR_SYSTEM prompt → LLM call → duplicate-heading strip → result render). The other 4 generators share the same machinery and are already pinned by the automated suite, so a passing PQR check is strong evidence that B9.2 / B9.3 / B9.4 / B9.5 behave the same way.
Setup once:
cd klair-api && uv run fast_endpoint.py
Then in klair-client/:
pnpm dev
Pick Skyvera, Q1 2026, paste a brainlift URL, open the resulting doc in the editor.
Smoke check:
- [ ] B9.1 — Prior Quarter Review: regenerate the PRIOR_QUARTER_REVIEW section. Pre-B9 this returned empty on cloned-from-prior sessions and triggered the generate_custom_section fallback in _regenerate_section. Expected post-B9: non-empty grounded markdown drafted via _PQR_SYSTEM prompt, with goal-by-goal evaluation of the prior quarter anchored on real numbers from build_key_metrics_block, followed by a brief bridge into the current quarter. Inspect server log for Generating section: prior_quarter_review followed by LLM usage — (the direct LLM call), NOT the B8.2 fallback line typed generator for ... returned 0 chars — falling back to generate_custom_section.
### What to look for in logs
- Server INFO log on a successful B9 regenerate: Generating section: <id> → LLM usage — input: X, output: Y → Generated section <id>: N chars. The LLM usage line confirms the direct LLM call landed instead of the deterministic echo path.
- Server INFO log on a B8.2 fallback (should be rare post-B9): regenerate_section: typed generator for ... returned 0 chars — falling back to generate_custom_section. Seeing this for PRIOR_QUARTER_REVIEW / GM_COMMENTARY / PRODUCT_DETAIL / MINOR_PRODUCTS_SUMMARY / CF_PLAN means the LLM call inside the typed generator failed AND current_content was empty (cold-start failure) — the dispatcher-level fallback ladder did its job, and the B8.2 branch is the second-line safety net pending B9.7.
- Server WARNING log on transient LLM failure with retry: Section <title> (<id>) failed on attempt 1 (<ExcType>) — retrying. Followed by either a successful retry (Generated section ...) or, after the second failure, ... exhausted retries + ... retries exhausted — preserving N chars of current_content (B9 fallback contract). The user sees the prior draft preserved instead of an empty section, and SectionResult.success=False propagates so monitoring sees the failure.
## Follow-ups
- B9.6 (KLAIR-2716): generate_mips LLM-first rewrite. It still returns empty when spec.bu_mips is unset and falls back through generate_custom_section via the retained B8.2 branch.
- B9.7: remove the B8.2 LLM-fallback branch from _regenerate_section now that typed generators handle drafting themselves.
- B9.8: deprecate spec.user_commentary / spec.bu_mips fields entirely (Phase 3 of the migration plan — needs a DDB-migration cycle).
- B9.9: rename the feedback-override storage so it no longer lives under user_commentary.
- B7.10 / B7.11 / B7.12 ([KLAIR-2761](https://linear.app/builder-team/issue/KLAIR-2761) / [KLAIR-2762](https://linear.app/builder-team/issue/KLAIR-2762) / [KLAIR-2763](https://linear.app/builder-team/issue/KLAIR-2763)): batch finding-addressal + per-finding user context + scope filter on ReviewPanel — Address-with-Claire UX overhaul filed during this PR's smoke test; bundles naturally with B7.6 for a future Phase B follow-up PR.