## Screenshots
<img width="1917" height="943" alt="image" src="https://github.com/user-attachments/assets/7782fe79-5ddf-48f8-ade9-b99341230c2c" />
## Summary
First per-product Review Agent check (C3.3 — Engineering/Product cost vs the 9.5% Trilogy benchmark), bundled with the Address-with-Claire fixes (B7.5 / B7.5b) so the new finding shape doesn't ship into a known-broken affordance.
Linear: [KLAIR-2644](https://linear.app/builder-team/issue/KLAIR-2644)
## Why it's needed
C3.3 is the gold-standard manual implementation that anchors the C3 family. The C3.4-C3.9 siblings (KLAIR-2647 → KLAIR-2652) are tagged drone-eligible and will follow this module's structure as their pattern reference — so getting the module shape, naming, helpers, tests, and section-id resolution right *here* is what makes the drone runs cheap and predictable later. This PR is also the cost-attribution baseline for the drone POC: hand-rolled paired-session bookends are recorded in the commit message (~20 minutes start-to-PR) so the C3.4 drone run can be compared against a known-good human baseline.
B7.5 / B7.5b address a regression the C3 rail would otherwise ship into: review findings tagged to the Financials section (which is tables-only — no narrative slot) silently no-op when the user clicks "Address with Claire", because the existing path would call regenerate_section on a tables-only section and the generator would re-emit the same tables. Shipping C3.3 without B7.5/B7.5b would leave the primary CTA broken on every Financials-targeted C3.3 finding. Bundled here on the user's call ([thread context](https://linear.app/builder-team/issue/KLAIR-2644)) to avoid the staggered regression window.
## Changes
### C3.3 — Engineering/Product per-product benchmark check
- klair-api/budget_bot/board_doc/review_checks/engineering_product_benchmark.py (new)
- Reads the Total → Engineering/Product row from the Benchmark by Product tab.
- Emits one finding per product column whose actual exceeds 9.5%.
- Verdict bands: pass ≤ 9.5%; warning in (9.5%, 14.5%]; critical > 14.5%.
- Rollup column (col 3 — `<BU> Consolidated`) → targets SectionType.FINANCIALS; other product columns → SectionType.PRODUCT_DETAIL with product_name (falls through to doc-level when the spec has no dedicated product section).
- All-products-pass emits a single BU-level pass finding so the scorecard reflects the check ran.
- Skip semantics distinguish "tab not loaded" / "row missing upstream" / "every product cell blank".
- klair-api/budget_bot/board_doc/review_checks/_helpers.py
- New shared helpers benchmarks_row_for(table, section, category) and benchmark_product_columns(header), designed so C3.4-C3.9 (drone-eligible) can fan out without re-inventing the indexing plumbing.
- klair-api/budget_bot/board_doc/review_checks/__init__.py — registers C3.3 with required_data=(BENCHMARK_BY_PRODUCT,).
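The verdict banding above can be sketched as follows. This is an illustrative reconstruction, not the module's actual code — the names (`Verdict`, `verdict_for`) are hypothetical; only the thresholds (pass ≤ 9.5%, warning inclusive up to +5pp, critical beyond) come from this PR:

```python
from enum import Enum

# Thresholds from this PR: pass <= 9.5%, warning in (9.5%, 14.5%], critical > 14.5%.
BENCHMARK_PCT = 9.5
WARNING_CEILING_PCT = BENCHMARK_PCT + 5.0  # 14.5 — the inclusive warning upper edge


class Verdict(Enum):
    PASS = "pass"
    WARNING = "warning"
    CRITICAL = "critical"


def verdict_for(actual_pct: float) -> Verdict:
    """Band a product's Engineering/Product cost % against the benchmark."""
    if actual_pct <= BENCHMARK_PCT:
        return Verdict.PASS
    if actual_pct <= WARNING_CEILING_PCT:
        return Verdict.WARNING
    return Verdict.CRITICAL
```

Note the inclusive upper edge: 14.5 exactly is still a warning, which is the "+5pp warning edge / +5.01pp critical step" boundary pinned in the tests below.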
### B7.5 — Tables-only awareness for Claire
- klair-api/budget_bot/board_doc/models.py
- New TABLES_ONLY_SECTION_TYPES = {FINANCIALS, CF_PLAN} constant + is_tables_only_section_type() helper, kept right next to the SectionType enum so additions can't drift.
- klair-api/budget_bot/board_doc/wizard_orchestrator.py
- Section inventory in the system prompt now marks tables-only sections with [TABLES-ONLY].
- New ## Section Authoring Constraints block teaches Claire (a) not to call regenerate_section on tables-only targets and (b) to route narrative remediation to PQR / MIPs / GM Commentary / per-product sections in that preference order.
- New _full_doc_findings_block renders a cross-section findings digest of every open actionable finding *outside* the focused section (one line each, severity-worst-first, capped at 4K chars / 40 lines). Pairs with the existing _focused_section_findings_block so cross-section asks no longer drop findings on the floor.
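The two mechanisms above can be sketched together — the tables-only classification and the capped digest. Everything here is an illustrative mirror, not the actual implementation: the trimmed `SectionType` enum, the `render_digest` name, and the finding dict shape are assumptions; `TABLES_ONLY_SECTION_TYPES`, the 40-line / 4K-char caps, and worst-severity-first ordering come from this PR:

```python
from enum import Enum


class SectionType(Enum):
    # Trimmed for illustration; the real enum lives in budget_bot/board_doc/models.py.
    FINANCIALS = "financials"
    CF_PLAN = "cf_plan"
    PQR = "pqr"
    GM_COMMENTARY = "gm_commentary"
    PRODUCT_DETAIL = "product_detail"


TABLES_ONLY_SECTION_TYPES = {SectionType.FINANCIALS, SectionType.CF_PLAN}


def is_tables_only_section_type(section_type: SectionType) -> bool:
    return section_type in TABLES_ONLY_SECTION_TYPES


# Digest capping as described: one line per finding, worst severity first,
# truncated at 40 lines / 4096 chars (limits taken from the PR text).
MAX_LINES, MAX_CHARS = 40, 4096
SEVERITY_ORDER = {"critical": 0, "warning": 1}


def render_digest(findings: list[dict]) -> str:
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER.get(f["severity"], 99))
    lines: list[str] = []
    used = 0
    for finding in ordered[:MAX_LINES]:
        line = f"- [{finding['severity']}] {finding['summary']}"
        if used + len(line) + 1 > MAX_CHARS:
            break
        lines.append(line)
        used += len(line) + 1  # +1 for the joining newline
    return "\n".join(lines)
```

The point of the cap is that a doc-wide digest stays a bounded prompt cost regardless of how many open findings a document accumulates.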
### B7.5b — FE polish for Address-with-Claire
- klair-client/src/screens/BoardDoc/utils/tablesOnlySectionTypes.ts (new) — FE mirror of the BE constant, kept in lockstep via the comment contract.
- klair-client/src/screens/BoardDoc/DocumentEditorPage.tsx
- handleAddressWithClaire now skips setFocusedSectionId + scrollToSection when the finding's target section is tables-only. The chat seed still carries the finding's section_id so Claire's prompt context anchors correctly server-side.
### Tests
- klair-api/tests/board_doc/test_engineering_product_benchmark.py (new — 15 tests): per-product fan-out, all four verdict boundaries (including the inclusive +5pp warning edge and the +5.01pp critical step), three skip paths, rollup vs per-product section-id resolution, and supporting-data shape.
- klair-api/tests/board_doc/test_b7_5_tables_only_and_doc_wide_findings.py (new — 21 tests): tables-only classification (including the parametrized "all writeable types stay writeable" guard), system-prompt directive ordering + omission-when-no-tables-only edge case, and the cross-section digest's focused-exclusion / severity-ordering / truncation semantics.
- klair-api/tests/board_doc/test_review_endpoint.py — _make_populated_data_package seeds BENCHMARK_BY_PRODUCT so C3.3 runs cleanly on the happy-path endpoint tests; the four affected assertions (findings count, expected check_id set, skipped-checks list, partial-completeness ran_ids) are updated accordingly.
- klair-client/src/screens/BoardDoc/__tests__/DocumentEditorPage.spec.tsx — the existing "Address with Claire opens chat, scrolls to section" test now targets a writeable gm_commentary section; a new B7.5b test pins the no-focus-shift behavior on a financials finding.
## Breaking changes
None. New check is additive (one more entry in REGISTRY); B7.5 only appends to the system prompt; B7.5b only conditionally skips a side effect. Pre-existing sessions / chat histories / serialised findings continue to round-trip unchanged.
## Test plan
- [x] pytest tests/board_doc -q — 1480 passed, 1 deselected, 0 failed.
- [x] pnpm vitest run src/screens/BoardDoc — 266 tests passed across 22 suites (two pre-existing suites with syntax errors are unrelated to this PR).
- [x] uv run ruff format / ruff check / pyright — clean.
- [x] pnpm tsc --noEmit / eslint --max-warnings 0 on touched FE files — clean.
- [ ] Manual smoke: run a review and verify the new C3.3 finding fires.
- [ ] Manual smoke: click "Address with Claire" on a finding (including a Financials-targeted one) and verify the remediation lands in a writeable narrative section rather than silently no-op'ing.
## Cost-attribution baseline (hand-rolled, for the C3.4 drone comparison)
- Paired-session start: 2026-05-13T08:48:54-05:00
- Paired-session end: 2026-05-13T09:09:10-05:00
- Elapsed: ~20m 16s
- C3.4 (KLAIR-2647) is the first drone target. After it runs, compare drone cost / wall-clock / iteration count against this baseline to ground the "is a drone shot worth it for this shape?" decision for C3.5-C3.9.