<!-- CURSOR_AGENT_PR_BODY_BEGIN -->
---
linear_id: KLAIR-2760
---
## Summary
Extend read_google_doc_sections in klair-api/budget_bot/board_doc/gdoc_sync.py so a NORMAL_TEXT paragraph whose runs are entirely bold gets promoted to heading_level=2 when it matches a conservative "section label" heuristic. GM-authored docs that use bold paragraphs (instead of Heading 2 styles) for top-level section labels no longer have their overview / commentary content silently dropped on prior-quarter import.
## Why it's needed
Several GM-authored board docs use a bold NORMAL_TEXT paragraph (e.g. "Skyvera Overall:", "GM Commentary") as a section label instead of an explicit HEADING_2 style. The strict-heading parser silently dropped any content sitting under such a label because the paragraph loop only opened a new section on HEADING_*. On prior-quarter import that lost the entire overview block.
The earlier strict-heading rule was guarding against SMART-style labels ("Specific:" followed by non-bold goal text in the same paragraph) being falsely promoted. The new four-condition heuristic — bold-only runs across the whole paragraph, length ≤ 60, and either a trailing ":" or an allowlist token (overall, commentary, highlights, performance, plan) — picks up genuine GM section labels while still rejecting SMART inline-bold runs.
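
The heuristic described above can be sketched as follows. This is a minimal illustration, not the body of _is_promotable_bold_paragraph: the function name is hypothetical, the constants are copied from this description, and the paragraph shape assumes the Google Docs API layout (paragraphStyle.namedStyleType plus elements[].textRun). Stripping ":" before the allowlist comparison and skipping non-textRun elements are assumptions of the sketch.

```python
from __future__ import annotations

# Values as listed in this PR description.
_PROMOTABLE_BOLD_MAX_LEN = 60
_PROMOTABLE_BOLD_PREFIXES = frozenset(
    {"overall", "commentary", "highlights", "performance", "plan"}
)


def is_promotable_bold_paragraph_sketch(paragraph: dict) -> bool:
    """Hypothetical stand-in for the helper; illustrates the checks only."""
    style = paragraph.get("paragraphStyle", {}).get("namedStyleType", "NORMAL_TEXT")
    if style != "NORMAL_TEXT":
        return False  # real headings are handled by the heading branch

    pieces: list[str] = []
    for element in paragraph.get("elements", []):
        run = element.get("textRun")
        if run is None:
            continue  # assumption: non-text elements are skipped
        content = run.get("content", "")
        pieces.append(content)
        if not content.strip():
            continue  # whitespace-only runs (e.g. trailing "\n") do not disqualify
        if not run.get("textStyle", {}).get("bold", False):
            return False  # any visible non-bold run rejects SMART-style inline labels

    text = "".join(pieces).strip()
    if not text or len(text) > _PROMOTABLE_BOLD_MAX_LEN:
        return False

    if text.endswith(":"):
        return True
    # Case-insensitive match of any whitespace-delimited token against the allowlist.
    tokens = {token.strip(":").lower() for token in text.split()}
    return not tokens.isdisjoint(_PROMOTABLE_BOLD_PREFIXES)
```

Under these assumptions, a fully bold "Skyvera Overall:" or "GM Commentary" paragraph is accepted, while "Specific:" followed by non-bold goal text in the same paragraph is rejected by the bold-only check.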
## Changes
- klair-api/budget_bot/board_doc/gdoc_sync.py
  - New module-level constants: _PROMOTABLE_BOLD_MAX_LEN = 60, _PROMOTABLE_BOLD_PREFIXES = frozenset({"overall", "commentary", "highlights", "performance", "plan"}), PROMOTE_BOLD_PARAGRAPHS_DEFAULT = True (kill-switch).
  - New helper _is_promotable_bold_paragraph(paragraph: dict) -> bool implementing the four-condition heuristic. Whitespace-only textRuns (e.g. the trailing "\n" Google Docs appends) do not disqualify the bold check.
  - read_google_doc_sections(document_id, *, promote_bold_paragraphs: bool | None = None): new keyword-only flag that resolves to PROMOTE_BOLD_PARAGRAPHS_DEFAULT when None. The promotion branch sits between the heading branch and the body-append branch. It calls _flush_section() before opening the new section, logs at INFO once per promotion, and then executes continue so the bold text is not also appended to the new section's body.
  - Updated the stale # Content before the first heading is ignored ... comment to acknowledge promoted bold paragraphs.
- klair-api/tests/board_doc/fixtures/gdoc_bold_paragraph_headings.json (new) — three synthetic document shapes: skyvera_overall_bold_label, tologi_commentary_bold_label, smart_inline_labels.
- klair-api/tests/board_doc/test_gdoc_sync.py — 28 new tests across TestIsPromotableBoldParagraph (helper-level) and TestReadGoogleDocSectionsBoldPromotion (end-to-end through read_google_doc_sections). Coverage includes every condition's positive and negative case, the feature-flag default-on / explicit-off paths, the INFO log on promotion, and the case-insensitive allowlist match. Existing 30 tests continue to pass unchanged.
The sole production caller (wizard_orchestrator.py:8478) is unchanged — the default-ON flag propagates the new behaviour for free.
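
To make the branch ordering and the flag resolution concrete, here is a reduced sketch of a paragraph loop with a promotion branch. It is not the code in gdoc_sync.py: parse_sections_sketch, _paragraph_text, and _looks_like_bold_label are hypothetical names, the label check is cut down to "all visible runs bold and a trailing colon", and the input is assumed to be a plain list of Google Docs-style paragraph dicts rather than a fetched document.

```python
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)

PROMOTE_BOLD_PARAGRAPHS_DEFAULT = True  # kill switch named in this PR


def _paragraph_text(paragraph: dict) -> str:
    return "".join(
        e.get("textRun", {}).get("content", "") for e in paragraph.get("elements", [])
    ).strip()


def _looks_like_bold_label(paragraph: dict) -> bool:
    """Reduced stand-in for the heuristic (assumption: bold-only plus trailing colon)."""
    runs = [e["textRun"] for e in paragraph.get("elements", []) if "textRun" in e]
    visible = [r for r in runs if r.get("content", "").strip()]
    all_bold = bool(visible) and all(r.get("textStyle", {}).get("bold") for r in visible)
    return all_bold and _paragraph_text(paragraph).endswith(":")


def parse_sections_sketch(
    paragraphs: list[dict], *, promote_bold_paragraphs: bool | None = None
) -> list[dict]:
    # None means "use the module default", so callers need no changes.
    if promote_bold_paragraphs is None:
        promote_bold_paragraphs = PROMOTE_BOLD_PARAGRAPHS_DEFAULT

    sections: list[dict] = []
    current: dict | None = None

    def flush() -> None:
        nonlocal current
        if current is not None:
            current["content"] = "\n".join(current.pop("body"))
            sections.append(current)
            current = None

    for paragraph in paragraphs:
        style = paragraph.get("paragraphStyle", {}).get("namedStyleType", "NORMAL_TEXT")
        text = _paragraph_text(paragraph)

        if style.startswith("HEADING_"):  # heading branch
            flush()
            level = int(style.rsplit("_", 1)[1])
            current = {"title": text, "heading_level": level, "body": []}
            continue

        if promote_bold_paragraphs and _looks_like_bold_label(paragraph):
            flush()  # close the previous section before opening the promoted one
            logger.info("Promoting bold paragraph %r to heading_level=2", text)
            current = {"title": text, "heading_level": 2, "body": []}
            continue  # keep the label text out of the new section's body

        if current is not None and text:  # body-append branch
            current["body"].append(text)
        # Content before the first heading or promoted label is ignored.

    flush()
    return sections
```

With the flag off, the sketch reproduces the earlier failure mode: a bold label and the paragraph beneath it appear before any heading, so both are ignored and only the real heading sections survive.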
## Breaking changes
None. The new keyword-only parameter is optional, the existing positional signature still works, and existing fixtures/tests in TestParseGoogleDocSections continue to pass with default-ON because none of them include a NORMAL_TEXT bold-only paragraph that would match the heuristic.
## Test plan
cd klair-api && uv run ruff format budget_bot/board_doc/gdoc_sync.py tests/board_doc/test_gdoc_sync.py
cd klair-api && uv run ruff check budget_bot/board_doc/gdoc_sync.py tests/board_doc/test_gdoc_sync.py
cd klair-api && uv run pyright budget_bot/board_doc/gdoc_sync.py
cd klair-api && uv run pytest tests/board_doc/test_gdoc_sync.py -v
cd klair-api && uv run pytest tests/board_doc -q --timeout=120
Results:
- Format: 2 files already formatted (after one initial reformat of the test file).
- Ruff check: All checks passed!
- Pyright: 0 errors, 0 warnings, 0 informations.
- Focused suite (test_gdoc_sync.py): 58 passed (30 pre-existing + 28 new), 1 unrelated botocore deprecation warning.
- Broader board_doc sweep: 1747 passed, 8 failed, 1 deselected in 116s. The 8 failures are the same pre-existing flakes as on main (cross-checked by running the same sweep on main and observing the identical 8 names): test_review_findings::TestResolveSectionId::test_duplicate_section_type_*, test_saas_it_ops_benchmark::*, test_sales_marketing_benchmark::TestRaggedRowDriftWarning::*, test_section_crud_endpoints::TestPatchSectionCustomTransitionWarning::*, and test_support_benchmark::*. None of them are in test_gdoc_sync.py and none touch the parsing path this PR modifies.
For the verification artifact below I used a tiny /tmp/dump_promotion.py helper that imports read_google_doc_sections, mocks services.gdoc_service, loads the skyvera_overall_bold_label fixture, and prints result.sections as JSON. The helper is scaffolding only; it is not committed.
### Comment-sweep grep
- rg -n "before the first heading" klair-api/budget_bot/board_doc/gdoc_sync.py → only match is the updated comment that mentions promoted bold paragraphs (line 462).
- rg -n "HEADING_" klair-api/budget_bot/board_doc/gdoc_sync.py → all matches are in the existing heading-detection path (line 421) or the new B1.8 module docstring describing the contrast with HEADING_2 style. No stale "only HEADING_* paragraphs become sections" copy.
- rg -n "title area" klair-api/budget_bot/board_doc/gdoc_sync.py → resolves to the same updated comment (line 463).
## Verification artifact
Here is the section dict produced by read_google_doc_sections on the skyvera_overall_bold_label fixture, showing the promoted section ahead of the real H2 Goals section:
{
  "skyvera_overall": {
    "title": "Skyvera Overall:",
    "content": "Strong quarter overall, ARR up 14% versus prior year.",
    "heading_level": 2,
    "start_index": 27,
    "end_index": 99,
    "heading_start_index": 27,
    "heading_end_index": 45
  },
  "goals": {
    "title": "Goals",
    "content": "Hit $40M ARR by end of Q3 2026.",
    "heading_level": 2,
    "start_index": 99,
    "end_index": 145,
    "heading_start_index": 99,
    "heading_end_index": 105
  }
}
Note: (a) the promoted skyvera_overall section_id has non-empty content ("Strong quarter overall, ARR up 14% versus prior year."), (b) the subsequent real HEADING_2 goals section appears in the same dict, and (c) the promoted entry carries heading_level: 2 metadata identical to the real H2.
## Out of scope
- No changes to wizard_orchestrator.py — default-ON propagates the new behaviour to the sole production caller without per-callsite churn.
- No changes to the section-id slug generation (_make_section_id) — promoted bold paragraphs use the same slug pipeline as real H2 headings.
- No backfill of already-imported sessions whose prior-quarter doc was missing top sections — the heuristic applies forward, from the next call to read_google_doc_sections after merge.
- No env-var-driven feature flag override; PROMOTE_BOLD_PARAGRAPHS_DEFAULT is the kill switch. If a specific session needs the flag flipped off, the caller passes promote_bold_paragraphs=False explicitly. An env-var bridge is a future B1.x ticket if it earns its keep.
- No changes to detect_external_changes, sync_to_google_doc, clone_google_doc, or export_google_doc_as_docx in gdoc_sync.py — this PR touches only the read/parse path.
- No fix to the pre-existing except Exception at klair-api/budget_bot/board_doc/gdoc_sync.py:383 (now :482 after this PR) inside detect_external_changes. That swallow-and-re-raise pattern was outside this PR's scope; surfacing here as a follow-up so it isn't silently fixed.
## Spec note
The drone spec text in condition 5 says "Prefix match is on the first whitespace-delimited token", but both the listed fixture (tologi_commentary_bold_label — "GM Commentary" qualifying via commentary prefix) and the rubric's positive example ("GM Commentary") require a match on a non-first token. I followed the test rubric and fixture as the source of truth: the implementation matches any whitespace-delimited token, case-insensitive, against _PROMOTABLE_BOLD_PREFIXES. The required docstring (preserved verbatim) still says "first word" — flag for human review if that wording should be tightened. Both the "Skyvera Overall:" and "GM Commentary" positive cases pass, and "Some Random Label" is correctly rejected.
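
The difference between the two readings of the spec can be shown with a small comparison. Both functions are hypothetical illustrations written for this note (neither is the shipped helper), and stripping a trailing ":" before the comparison is an assumption of the sketch.

```python
_PROMOTABLE_BOLD_PREFIXES = frozenset(
    {"overall", "commentary", "highlights", "performance", "plan"}
)


def matches_first_token(text: str) -> bool:
    """Literal reading of the spec: only the first whitespace-delimited token counts."""
    tokens = text.split()
    return bool(tokens) and tokens[0].rstrip(":").lower() in _PROMOTABLE_BOLD_PREFIXES


def matches_any_token(text: str) -> bool:
    """Reading implied by the fixture and rubric: any token may match."""
    return any(
        token.rstrip(":").lower() in _PROMOTABLE_BOLD_PREFIXES for token in text.split()
    )


for label in ("GM Commentary", "Skyvera Overall:", "Some Random Label"):
    print(label, matches_first_token(label), matches_any_token(label))
```

Under the first-token reading "GM Commentary" would not match the allowlist, which is why the any-token reading was needed for the tologi_commentary_bold_label fixture to pass.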
<!-- CURSOR_AGENT_PR_BODY_END -->