## Screenshots
<img width="1919" height="933" alt="image" src="https://github.com/user-attachments/assets/d7c7c179-1b34-4f7b-afe9-a72b018be5a5" />
## Summary
- Ships Phase C: Budget Review Agent P&L checks — 5 new deterministic review checks (C2.3, C2.4, C2.5, C2.7, C2.8) bringing the registry from 2 → 7 checks (out of the 17-check MVP target).
- Lands the C1.9 review-tab data plumbing: wires the Top Level View - BU Plans and Benchmark by Product Google Sheets tabs end-to-end through the orchestrator into CanonicalBudgetPlan (top_level_view, benchmarks_by_product fields), unblocking the TLV-dependent checks.
- The /review endpoint auto-runs the entire registry — opening any BU's REVIEW phase and clicking "Run Review" now fires all 7 checks (or surfaces typed skip reasons when upstream data is sparse).
## Why it's needed
The Budget Review Agent MVP needs deterministic checks to fire alongside the (already-shipped) scorecard UI before the Memorial Day cut. Today only 2 of 17 checks are live (margin target, margin trajectory). This PR ships every P&L check that isn't blocked on Finance work, automating the questions Andy Price asks on every plan review:
- "Did the BU's plan deteriorate vs. last quarter's plan?" → C2.3 / C2.4 (revenue + EBITDA plan-on-plan)
- "Did the cost base flex with the top line?" → C2.5 (operating leverage)
- "Why does the BU plan disagree with FP&A's Hybrid overlay?" → C2.7 (BU-vs-Hybrid divergence)
- "Is the FY plan honest or back-loaded into Q4?" → C2.8 (hockey-stick detection)
Each check is registry-driven, isolated (per-check exceptions can't 500 the request), and emits structured findings with severity / supporting data / remediation options that the scorecard rail renders and Claire can pull into chat context.
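
For reviewers, a minimal sketch of the isolation pattern described above. `Finding`, `run_registry`, and the `"error"` severity are hypothetical stand-ins for illustration, not the names or shapes used in this PR:

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Finding:
    # Hypothetical finding shape; the real model carries more fields.
    check_id: str
    severity: str  # e.g. "pass", "warning", "critical", "skipped", "error"
    message: str
    supporting_data: dict[str, Any] = field(default_factory=dict)


def run_registry(
    registry: dict[str, Callable[[dict[str, Any]], Finding]],
    plan: dict[str, Any],
) -> list[Finding]:
    """Run every registered check; a failing check becomes a finding, not a 500."""
    findings: list[Finding] = []
    for check_id, check in registry.items():
        try:
            findings.append(check(plan))
        except Exception as exc:  # isolate each check from the others
            findings.append(
                Finding(check_id, "error", f"{type(exc).__name__}: {exc}")
            )
    return findings
```

The point of the pattern is that one check raising (for example on an unexpected sheet layout) still leaves the other findings available to the scorecard.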
## Changes
### C1.9 — Review-tab data plumbing
- models.py: add DataSourceKey.TOP_LEVEL_VIEW_BU_PLANS and BENCHMARK_BY_PRODUCT.
- data_orchestrator.py: new _fetch_top_level_view_bu_plans + _fetch_benchmark_by_product async fetchers; both treated as gsheets sources (rate-limit retries, staggering).
- canonical_plan.py: PlanFinancials.top_level_view + benchmarks_by_product populated from the package; PlanCompleteness.has_top_level_view + has_benchmarks_by_product flags; both marked optional (absence doesn't dirty missing_sources).
- wizard_orchestrator.py: friendly source descriptions for Claire's prompt context.
### C2.3 — Plan-on-plan revenue deterioration
- Reads top_level_view Total Revenue from the <BU> Overall rollup (falls back to single-section sheets like Totogi's).
- Bands: ≥ −1pp pass, [−5pp, −1pp) warning, < −5pp critical. Skips when TLV is absent or the previous plan is ≤ 0.
### C2.4 — Plan-on-plan EBITDA deterioration
- Same data path as C2.3, against the EBITDA row. Wider dead-band (≥ −2pp pass / [−10pp, −2pp) warning / < −10pp critical) to reflect EBITDA volatility on small denominators.
- Sign-aware skip when previous EBITDA ≤ 0 (% math would flip sign; absolute-dollar variant tracked separately for loss-stage BUs).
- Cross-references C2.3 in remediation copy.
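
A rough sketch of the banding shared by C2.3 and C2.4, assuming the change is computed as a percentage of the previous plan. The function name and return shape are hypothetical; only the thresholds and the skip on a non-positive previous value come from this PR:

```python
from typing import Optional, Tuple


def plan_on_plan_verdict(
    current: float,
    previous: float,
    warn_below: float,
    critical_below: float,
) -> Tuple[str, Optional[float]]:
    """Return (verdict, change) for a plan-on-plan comparison.

    warn_below / critical_below are negative thresholds, e.g. -1 / -5 for
    revenue (C2.3) and -2 / -10 for EBITDA (C2.4).
    """
    if previous <= 0:
        # Percentage math is meaningless (or sign-flipped) on a
        # non-positive baseline, so skip instead of guessing.
        return "skipped", None
    change = 100.0 * (current - previous) / previous
    if change >= warn_below:
        return "pass", change
    if change >= critical_below:
        return "warning", change
    return "critical", change


# The same -10 drop is critical for revenue but still a warning for EBITDA.
revenue = plan_on_plan_verdict(90, 100, warn_below=-1, critical_below=-5)
ebitda = plan_on_plan_verdict(90, 100, warn_below=-2, critical_below=-10)
```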
### C2.5 — Cost growth outpacing revenue decline
- Q-over-Q operating-leverage check. total_costs = revenue − EBITDA (robust to per-BU sheet line-item drift).
- leverage_gap_pp = cost_growth_pct − revenue_growth_pct: negative = costs flexing faster than revenue (good); positive = costs failing to follow revenue down.
- Bands: revenue growing/flat → pass (premise inapplicable); revenue declining + gap ≤ 0.5pp → pass; gap (0.5pp,3pp] → warning; gap > 3pp OR sign mismatch (revenue down + costs up) → critical.
- Cross-references C2.6 to scope COGS-vs-OpEx differential.
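
The C2.5 bands can be sketched as below. The function name is hypothetical, and the skip on a non-positive previous revenue or cost base is an assumption (this description does not list C2.5's skip conditions):

```python
from typing import Optional, Tuple


def leverage_verdict(
    prev_revenue: float,
    prev_ebitda: float,
    cur_revenue: float,
    cur_ebitda: float,
) -> Tuple[str, Optional[float]]:
    """Return (verdict, leverage_gap_pp) for a quarter-over-quarter pair."""
    # total_costs = revenue - EBITDA, so no cost line items are needed.
    prev_costs = prev_revenue - prev_ebitda
    cur_costs = cur_revenue - cur_ebitda
    if prev_revenue <= 0 or prev_costs <= 0:
        return "skipped", None  # assumed skip condition, not from the PR

    revenue_growth_pct = 100.0 * (cur_revenue - prev_revenue) / prev_revenue
    cost_growth_pct = 100.0 * (cur_costs - prev_costs) / prev_costs
    gap = cost_growth_pct - revenue_growth_pct

    if revenue_growth_pct >= 0:
        return "pass", gap  # revenue growing or flat: premise inapplicable
    if cost_growth_pct > 0 or gap > 3.0:
        return "critical", gap  # sign mismatch, or costs lag by more than 3pp
    if gap > 0.5:
        return "warning", gap
    return "pass", gap
```

For example, revenue falling from 100 to 90 with costs falling from 80 to 73 gives a revenue change of −10%, a cost change of −8.75%, and a gap of 1.25pp, which lands in the warning band.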
### C2.7 — BU Plan vs Hybrid Plan divergence
- Reads current_quarter_pnl col 1 (BU Plan) vs col 2 (Hybrid Plan) for Total Revenue + EBITDA.
- Headline = max(|revenue_gap_pct|, |ebitda_gap_pct|). Bands 0–2% pass, (2%,10%] warning, > 10% critical.
- Direction-aware framing ("more aggressive" vs "more conservative" — different defences).
- Skips single-column sheets (older BUs without Hybrid) and non-positive Hybrid baselines.
- Cross-references C2.3 in remediation copy.
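
A sketch of the C2.7 headline and banding. The gap formula `(BU − Hybrid) / Hybrid` is an assumption (this description only names `revenue_gap_pct` and `ebitda_gap_pct`), and the function name is hypothetical:

```python
from typing import Optional, Tuple


def divergence_verdict(
    bu_revenue: float,
    hybrid_revenue: float,
    bu_ebitda: float,
    hybrid_ebitda: float,
) -> Tuple[str, Optional[float]]:
    """Return (verdict, headline_gap_pct) for BU Plan vs Hybrid Plan."""
    if hybrid_revenue <= 0 or hybrid_ebitda <= 0:
        return "skipped", None  # non-positive Hybrid baseline

    # Assumed definition: gap relative to the Hybrid baseline.
    revenue_gap_pct = 100.0 * (bu_revenue - hybrid_revenue) / hybrid_revenue
    ebitda_gap_pct = 100.0 * (bu_ebitda - hybrid_ebitda) / hybrid_ebitda

    # The larger of the two absolute gaps drives the verdict.
    headline = max(abs(revenue_gap_pct), abs(ebitda_gap_pct))
    if headline <= 2.0:
        return "pass", headline
    if headline <= 10.0:
        return "warning", headline
    return "critical", headline
```

Taking the maximum means a plan that matches Hybrid on revenue but diverges on EBITDA still surfaces a finding.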
### C2.8 — FY trajectory coherence (hockey-stick detection)
- Reads top_level_view Current BU Plan Q1-Q4 Total Revenue. Closed-form OLS fit through (Q1, Q2, Q3) projects Q4_expected; ramp_excess_pct = (Q4_actual − Q4_expected) / Q4_expected * 100.
- Symmetric bands: |excess| ≤ 10% pass, (10%,25%] warning, > 25% critical. Detects both hockey-stick (Q4 above trend) AND reverse-hockey-stick (Q4 below trend).
- Q4-share-of-FY surfaced in supporting data as a secondary signal.
- Dependency-free 3-point fit (no numpy import).
- Cross-references C2.7 in remediation copy.
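
To show why no numpy is needed: with x fixed at 1, 2, 3, the least-squares slope reduces to (Q3 − Q1) / 2 and the fitted line passes through the mean at x = 2, so the Q4 projection is mean + (Q3 − Q1). A minimal sketch with hypothetical function names; the skip on a non-positive projection is an assumption:

```python
from typing import Optional, Tuple


def q4_expected(q1: float, q2: float, q3: float) -> float:
    """Project Q4 from an OLS line through (1, q1), (2, q2), (3, q3)."""
    mean = (q1 + q2 + q3) / 3.0
    slope = (q3 - q1) / 2.0  # Sxy / Sxx with Sxx = 2
    return mean + 2.0 * slope  # evaluate the fitted line at x = 4


def trajectory_verdict(
    q1: float, q2: float, q3: float, q4: float
) -> Tuple[str, Optional[float]]:
    """Return (verdict, ramp_excess_pct) using the symmetric bands."""
    expected = q4_expected(q1, q2, q3)
    if expected <= 0:
        return "skipped", None  # assumed guard against a bad denominator
    excess = 100.0 * (q4 - expected) / expected
    if abs(excess) <= 10.0:
        return "pass", excess
    if abs(excess) <= 25.0:
        return "warning", excess
    return "critical", excess
```

With Q1 to Q3 of 100, 110, 120 the projection is 130, so a planned Q4 of 169 is 30% above trend (critical) and a planned Q4 of 104 is 20% below trend (warning).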
### Shared infrastructure
- review_checks/_helpers.py: three new TLV helpers (tlv_overall_row, tlv_find_column, tlv_cell_value) shared by C2.3 / C2.4 / C2.8 — handle Overall-section preference, group/quarter column resolution, and cell-value parsing (currency / accounting parens / whitespace).
- review_checks/__init__.py: registry expanded to 7 entries; each carries required_data so the endpoint's extra_required top-up fetches everything any check needs even when the session's spec doesn't declare it.
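
For context on the cell-value parsing mentioned above, a sketch of the kind of normalisation involved. `parse_cell` is a hypothetical stand-in; the actual signature and behaviour of `tlv_cell_value` are not shown in this description, and handling only `$` as a currency symbol is an assumption:

```python
from typing import Optional


def parse_cell(raw: Optional[str]) -> Optional[float]:
    """Parse a sheet cell such as ' $1,234.50 ' or '(1,200)' into a float.

    Returns None for empty or non-numeric cells so callers can skip cleanly.
    """
    if raw is None:
        return None
    # Drop surrounding whitespace, currency symbol, and thousands separators.
    text = raw.strip().replace("$", "").replace(",", "").strip()
    if not text:
        return None
    # Accounting notation: parentheses mean a negative number.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1].strip()
    try:
        value = float(text)
    except ValueError:
        return None
    return -value if negative else value
```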
### Tests
- New test files: test_plan_on_plan_checks.py, test_cost_vs_revenue_trajectory.py, test_bu_vs_hybrid_divergence.py, test_fy_trajectory_coherence.py — full verdict matrix, boundary tests, skip-path coverage, JSON round-trip pins.
- Updated test_review_checks.py registry tests, test_review_endpoint.py integration tests (now exercises all 7 checks against a populated session).
## Breaking changes
None. Two new optional DataSourceKey enum members and two new optional fields on PlanFinancials / PlanCompleteness — additive only; existing callers see no behaviour change.
## Test plan
- [x] uv run ruff format clean on all touched files
- [x] uv run ruff check clean on all touched files
- [x] uv run pyright clean (0 errors, 0 warnings) on all new files
- [x] uv run pytest tests/board_doc/ — 1439 passed, 1 deselected
- [ ] Manual: open a Skyvera Q2 doc, advance to REVIEW phase, click "Run Review" — verify scorecard renders 7 findings with correct severity grouping and Address-with-Claire CTAs on criticals
## Follow-ups (out of scope for this PR)
- C2.2 (EBITDA test H1 + FY) — blocked on Finance work tracked as C1.10 (H1 target plumbing, Q3 cycle). FY-half could ship without C1.10 if scoped.
- C3.1–C3.9 — 9 per-product benchmark checks against plan.financials.benchmarks_by_product (data plumbed by this PR; checks themselves are the next epic).
- Absolute-dollar variants of C2.4 / C2.7 for loss-stage BUs (currently skip with typed reason).