## Demo
Live, read-only validation of the runner against real QuickBooks + Redshift. The runner was swept twice over the entire Aerie selector (all 51 schools, SY 2025-26) ~11 hours apart, and the two runs together are the proof: run 1 flagged 5 real open-quarter sync-staleness cases and gated (exit 1); run 2, after the warehouse caught up overnight, came back penny-exact across all 51 schools (exit 0). Re-runnable and idempotent, exactly as designed. Every figure below is captured verbatim from a real run — nothing hand-edited.
### Run 2 — current state: clean sweep ✅ (2026-06-23T05:12 UTC)
python run_local.py --school all --sy 2025-26 (default redshift_reader source). All 51 rows are ✅ across all five periods; the body is elided (reproducible by re-running):
period=SY 2025-26 aerie_source=redshift_reader schools=51 mismatches=0School 2025-Q3 2025-Q4 2026-Q1 2026-Q2 SY2025/26 Overall worst Δ
Alpha Anywhere Center ✅ ✅ ✅ ✅ ✅ ✅ 0.00
… (49 rows elided — all ✅) …
Waypoint Academy ✅ ✅ ✅ ✅ ✅ ✅ 0.00
51 schools, 0 mismatches → EXIT 0 (all match)
- All 51 schools reconcile to the penny across every period, including the still-open 2026-Q2.
- Zero rows carry advisory tags — nothing stale left to flag.
### Run 1 — baseline ~11h earlier: gate fires on 5 open-quarter staleness cases (2026-06-22T17:57 UTC)
Same command, earlier snapshot. Trimmed to the 5 mismatch rows; the 46 matching rows are elided:
period=SY 2025-26 aerie_source=redshift_reader schools=51 mismatches=5school quarter qb_net aerie_net Δnet
Alpha Woodlands 2026-Q2 134994.74 105077.96 29916.78
Alpha School LLC 2026-Q2 3537069.76 3522025.18 15044.58
Alpha Roswell 2026-Q2 58495.40 57893.57 601.83
Alpha Tampa 2026-Q2 12153.61 11877.67 275.94
Alpha Southlake 2026-Q2 58149.89 58041.69 108.20
51 schools, 5 mismatches → EXIT 1 (mismatches)
- All 5 mismatches were isolated to 2026-Q2 — the still-open quarter. Every closed quarter (Q3'25, Q4'25, Q1'26) was clean for all 51 schools.
- Income matched exactly everywhere; the drift was expense-only. Non-zero exit on drift works (gate-ready).
### Why run 2 went green — sync latency, not drift
Re-running ~11h later, every one of the 5 gaps closed to exactly $0.00. The convergence proves the diagnosis: the warehouse was lagging live QB on operating-overhead accounts in the open quarter, and the Aerie side rose to meet QB ground truth:
| School | Run 1 Δ | Run 2 Δ | QB net (R1 → R2) | Aerie net (R1 → R2) |
|--------|--------:|--------:|------------------|---------------------|
| Alpha Woodlands | $29,916.78 | $0.00 | 134,994.74 → 134,994.74 *(unchanged)* | 105,077.96 → 134,994.74 |
| Alpha School LLC | $15,044.58 | $0.00 | 3,537,069.76 → 3,566,750.68 | 3,522,025.18 → 3,566,750.68 |
| Alpha Roswell | $601.83 | $0.00 | 58,495.40 → 58,536.04 | 57,893.57 → 58,536.04 |
| Alpha Southlake | $108.20 | $0.00 | 58,149.89 → 58,972.21 | 58,041.69 → 58,972.21 |
| Alpha Tampa | $275.94 | $0.00 | 12,153.61 → 12,153.61 *(unchanged)* | 11,877.67 → 12,153.61 |
For Woodlands and Tampa the QB ground truth was byte-identical across both runs — only the Aerie redshift figure moved up to match. That's a clean confirmation that the gaps were current-quarter sync latency, not a tool defect or a real accounting discrepancy.
### Per-school root cause — read-only account-level drill (from run 1)
For each flagged school, an import-and-call drill diffed live QB vs the synced warehouse per GL account. Every run-1 gap localized to a handful of operating-overhead accounts where the warehouse was stale on the most recent month(s) — precisely the accounts that converged in run 2:
| School | Run 1 Δ | Account(s) | Warehouse vs live QB (at run 1) |
|--------|--------:|-----------|----------------------|
| Alpha Roswell | $601.83 | 62400 Utilities | staging had Apr+May, June missing |
| Alpha Southlake | $108.20 | 62400 Utilities | all 3 months present, June value short |
| Alpha Tampa | $275.94 | 62300 Repairs + 62400 Utilities | both missing June |
| Alpha School LLC | $15,044.58 | 63220 Workshops + 64120 Software | Workshops a $19 June stub vs $14.7k in QB |
| Alpha Woodlands | $29,916.78 | 62200 Rent | staging had only May; Apr+Jun absent |
The deepest case, Alpha Woodlands / 62200 Rent, was a single account: live QB had Apr+May+Jun rent ($134,994.74) while the run-1 warehouse snapshot held only May ($105,077.96) — the full $29,916.78 gap. Run 2 shows that exact account caught up. Income matched throughout, the schools resolve cleanly (shared realms, no dedicated entity → no AERIE-434 surface), and every closed quarter tied to the penny in both runs.
### Transaction-level proof — which postings closed each gap, and when they were entered
One level below the account drill, the individual QuickBooks lines (staging_education.quickbooks_purchases / quickbooks_bills, which carry qb_created_time = when a line was keyed into QB) pin the latency to specific postings. Every run-1 gap is a handful of operating-overhead lines that were entered into QuickBooks on Jun 21 night / Jun 22 — after the sync window run 1's snapshot came from — even though the expenses are dated early/mid-June:
| School (raw QB class) | Account | txn_date | Amount | Entered into QB (qb_created_time) | Vendor |
|---|---|---|---:|---|---|
| Alpha Woodlands (Alpha The Woodlands) | 62200 Rent | Jun 15 | $29,916.78 | Jun 21 23:08 | 2000 WP Holdings |
| Alpha Roswell (Alpha Atlanta) | 62400 Utilities | Jun 3–4 | $111.40 + $247.17 + $243.26 = $601.83 | Jun 21 22:31–22:32 | Georgia Power, Colonial Energy |
| Alpha Tampa | 62300 Repairs + 62400 Utilities | Jun 9 | $173.91 + $102.03 = $275.94 | Jun 21 22:33–22:34 | Republic Services, TECO |
| Alpha Southlake | 62400 Utilities | Jun 1 | $108.20 | Jun 21 22:31 | Atmos Energy |
| Alpha School LLC | 63220 Workshops | Jun 19 | $14,751.43 | Jun 22 10:07 | Anvil Education |
Each per-account sum ties to the penny against the run-1 Δnet (Roswell $601.83, Tampa $275.94, Southlake $108.20, Woodlands $29,916.78). For Alpha School LLC the $14,751.43 Anvil bill is the bulk of the $15,044.58 gap (remainder = 64120 Software); the $19.26 Dollar Tree "stub" run 1 saw was entered Jun 13 — before the snapshot — which is exactly why the account read as a "$19 June stub vs $14.7k in QB".
These lines first reached quickbooks_pl_monthly with a warehouse updated_at of 2026-06-23 04:03–04:09 UTC — this morning's load, landing between run 1 and run 2. The pipeline runs cron(0 4 * * ? *) (daily at 04:00 UTC); the postings were keyed into QB only on Jun 21 night / Jun 22, after the prior day's sync window, so run 1's snapshot predated them and the overnight sync captured them for run 2. Incurred-in-June expenses, data-entered ~3 weeks later, ingested on the next daily sync — current-quarter data-entry latency, now confirmed at the transaction level rather than inferred. (Read-only throughout: quickbooks_purchases / quickbooks_bills queried via the explore-qb drill.)
> One resolution note surfaced by the drill: in Redshift staging the two split campuses sit under their raw QB class labels — Roswell under Alpha Atlanta, Woodlands under Alpha The Woodlands — and are only canonicalized to their display names inside Aerie (campus-name-aliases.ts). The runner's equivalent-class union already handles this; direct warehouse queries must use the raw labels.
### The fix in this branch — frozen-stale-closed-month is now date-aware
Before, _is_closed_quarter checked only the YYYY-Qn label *shape*, so it tagged the current quarter as a closed/stale month. Now it compares the quarter's end date to today. In run 1 the open-quarter rows correctly carried only additive-only-orphan (not frozen-stale-closed-month); in run 2, with the staleness gone, no row carries either tag:
run 1 (Alpha Roswell, 2026-Q2): ['additive-only-orphan'] ← open quarter NOT mislabeled stalerun 2 (Alpha Roswell, 2026-Q2): [] ← gap closed, nothing to tag
Most at risk from this change: tag_mismatch (called by run_local.py) and the diff → periods import. Verified — the today kwarg is optional (existing callers unchanged), tagging stays wrapped so it can never break a run, and both the scoped diff suite and the full runner suite pass:
tests/test_diff.py 12 passed (open vs closed quarter boundary covered)full runner suite 71 passed (+ 429 backoff cap-exhaustion test)
---
## Overview
A new on-demand, read-only CLI runner — pipelines/runners/qb-aerie-pl-reconciliation/ — that reconciles live QuickBooks P&L (ground truth) against the Aerie "Actual vs Model (Schools)" page output, per (school, quarter), for every school in the Aerie selector. It is re-runnable anytime and exits non-zero on drift so it can gate a job later.
This formalizes the manual /explore-qb cross-check that surfaced AERIE-434 (a dedicated entity's unclassed Not Specified rows dropped between Redshift and the rendered page — Q1'26 tuition showed $307,500 vs the true $446,807). What was a one-off manual check for a single school is now systematic and repeatable across all ~60 schools.
Linear: [SURTR-223 — Re-runnable QB↔Aerie P&L reconciliation across all schools (Actual vs Model)](https://linear.app/builder-team/issue/SURTR-223)
The comparison is QB ↔ Aerie end-to-end (QB → sync → Redshift → Aerie reader → page), so both reader-layer and sync-layer bugs are caught. It does not localize which layer drifted — a flagged school is the signal to investigate.
## Specs (all implemented + tested)
| # | Spec | What it does |
|---|------|--------------|
| 01 | foundation-resolution-and-periods | School/entity resolution (3-realm + dedicated-whole-realm + alias union over core_education.map_campus_qb_entities), selector parity (seed minus Aerie EXCLUDED_SCHOOLS), SY→quarter periods, and the shared ReconRecord contract (Decimal money). |
| 02 | qb-ground-truth-fetcher | Live QB fetcher reusing vendored quickbooks-pl-monthly qb_client/secrets; per-school entity-set union → (school, quarter) rollup; shared-realm caching + 429 backoff; strictly read-only. Income→income, COGS+Expenses→expense, "Other" excluded. |
| 03 | aerie-black-box-fetcher | Pluggable AerieSource. DEFAULT RedshiftReaderAerieSource (replays the reader rollup over staging_education.quickbooks_pl_monthly); ConvexAerieSource is the true end-to-end path, BLOCKED ON AERIE. Every report stamps which aerie_source was used. |
| 04 | diff-engine-cli-and-report | Per-(school,quarter) diff with tolerance + advisory taxonomy tags; CLI run_local.py; stdout ✅/❌ table; JSON+CSV artifact; non-zero exit on drift; README. |
## Implementation summary
- Resolution. A school is not a single QB class. Its P&L is the union of its class_name rows in alpha (if in_alpha), its class_name rows in alpha_schools_llc (if in_alpha_schools_llc), and — if it has a dedicated_qb_account — the entire dedicated realm (including unclassed / Not Specified rows, exactly what AERIE-434 dropped). class_name matching uses the equivalent-class set C (campus label + campus-name aliases + the three seed-docstring label fixes), so split campuses roll up to one school identity matching Aerie's. The selector = seed campuses minus Aerie's EXCLUDED_SCHOOLS.
- Periods. Surtr has no SY-quarter util (fct_pl.quarter is a plain calendar YYYY-Qn), so the SY→calendar-quarter mapping is defined fresh (SY2025/26 = Q3'25, Q4'25, Q1'26, Q2'26 + SY total); --sy / --period parsing.
- QB fetcher. Reuses the vendored quickbooks-pl-monthly building blocks. Shared realms (alpha, alpha_schools_llc) are fetched once per (realm, quarter) and cached, then sliced per school by class; dedicated realms are fetched per school whole. Tokens are reused per company across quarters, fetches run sequentially, and 429s back off exponentially. Read-only — never writes to QB or any DB and never invokes the pl-monthly --full/handler path.
- Diff engine. Joins QB-side and Aerie-side ReconRecords on (school, quarter), computes Δ = qb - aerie per income/expense/net in Decimal, applies tolerance, classifies match/mismatch (recording magnitude, direction, and which half drifted), flags one-sided rows as explicit mismatches, and best-effort tags mismatches against the known discrepancy taxonomy (incl. the AERIE-434 dedicated-entity-unclassed-rows pattern). Tagging is advisory and never blocks.
- CLI / report. run_local.py orchestrates resolve → fetch QB → fetch Aerie → diff → render. Report prints a per-school ✅/❌ table and writes a JSON or CSV artifact with full per-(school, quarter) deltas under a header recording SY/period, tolerance, the aerie_source used, timestamp, school count, and mismatch count. Non-zero exit when any school exceeds tolerance.
## ⚠️ Aerie source: working DEFAULT vs BLOCKED end-to-end path
The Aerie read seam is the feature's one outstanding external dependency. There is no Aerie Convex client, deploy key, service token, or HTTP path anywhere in Surtr's pipelines/, so the Actual side is built behind a pluggable AerieSource with two backends:
- RedshiftReaderAerieSource — the interim DEFAULT, runnable from Surtr today. Replays Aerie's reader rollup over the same synced Redshift staging tables Aerie reads (staging_education.quickbooks_pl_monthly). Structural limitation: it reconciles QB ↔ *Redshift-synced-data*, not QB ↔ *rendered page*, so it will MISS reader-layer bugs — including AERIE-434 itself, the bug that motivated this feature, which lives between Redshift and the page. It still catches sync-layer drift.
- ConvexAerieSource — the TRUE end-to-end target, BLOCKED ON AERIE. Calls a read-only Aerie Convex query returning per-school Actual quarter totals, authenticated with an Aerie service token / deploy key. This is the genuine QB↔rendered-page path that catches reader-layer bugs. Aerie does not yet expose this query, so it is implemented against the target contract and fails fast if unconfigured — no silent fallback. Selected via --aerie-source convex.
Every report records which aerie_source produced the Actual side, so a reconciliation done against Redshift is never mistaken for an end-to-end (rendered-page) check. This dependency is recorded as the top item in the feature's Open Questions / Risks.
## Test coverage
68 unit tests under the runner's tests/, all green:
uv run pytest pipelines/runners/qb-aerie-pl-reconciliation/tests/
Covering:
- money / periods / resolution — Decimal money conventions, SY→quarter windowing, the 3-realm + dedicated-whole-realm + alias resolution union and selector parity.
- qb-fetcher — account_type bucketing (income vs COGS+Expenses, "Other" excluded), shared-realm (realm, quarter) caching, and 429 exponential backoff.
- aerie-reader — Redshift-reader parity with the page rollup.
- factory + convex fail-fast — --aerie-source selection and ConvexAerieSource failing fast when unconfigured.
- diff — tolerance (abs/rel, sub-$0.50 collapse), missing-side handling, and the discrepancy taxonomy including the AERIE-434 pattern.
- report + orchestration — artifact shape/header and the end-to-end wiring, including the single-school single-quarter smoke path.
## Alignment fix, self-review, CI
- Alignment fix: ConvexAerieSource's constructor rejected forwarded resolution kwargs, which would have crashed --aerie-source convex. Fixed and covered by a test.
- Self-review: no correctness issues found.
- CI: all checks green.
## How to run
Single-school single-quarter smoke check first:
cd pipelines/runners/qb-aerie-pl-reconciliationpython run_local.py --school "Alpha Anywhere Center" --period 2026-Q1
Then the full sweep across every selector school for the school year, writing an artifact:
python run_local.py --school all --sy 2025-26 --tolerance 1.00 --format table --out report.json
Read-only and idempotent. Exits non-zero if any school exceeds tolerance. Use --aerie-source convex once Aerie ships the read-only query for the true end-to-end check.
🤖 Generated with [Claude Code](https://claude.com/claude-code)