Linear: SURTR-24 (Phase 2) — follows [SURTR-23 / PR #88](https://github.com/AI-Builder-Team/Surtr/pull/88) (Phase 1: JournalEntry capture)
## Why
Phase 2 of the QB pipeline extension. Phase 1 captured JournalEntries — ~65% of alpha's P&L expense activity. Phase 2 closes the next ~25% by capturing the Purchase entity: direct credit-card / check / cash spend that doesn't flow through the AP cycle (no Bill, no BillPayment).
Combined Phase 1 + Phase 2 takes the alpha operating entity from 9% → ~99.5% P&L expense coverage at line level. Facility entities reach ~99-100% with the same code path.
Live dry-run capture for Q1 2026 across all 9 entities:
| Entity | Purchase headers |
|---|---|
| alpha | 19,671 |
| alpha_schools_llc | 219 |
| miami | 42 |
| sports_academy_78734_llc | 36 |
| sports_academy_75010_llc | 24 |
| gtschool_78626_llc | 16 |
| esports_academy_llc | 12 |
| alpha_school_28269_llc | 12 |
| alpha_school_93101_llc | 10 |
| Total | 20,042 |
alpha's 19,671 is dominated by Ramp Card spend — the same pattern that produced the original HR-Expenses 6¢ rounding observation that kicked off this whole work.
## What this PR delivers
### New table: staging_education.quickbooks_purchases
One row per Purchase line. Captures both the funding side (payment_account_id = Ramp Card / bank / etc.) and the expense side (account_id = the GL expense account being debited). Money columns are NUMERIC(18,4), picking up parse_money's precision pipeline.
credit_flag (sourced from the Purchase header's Credit bool) flips net_amount sign for refunds, so SUM(net_amount) gives the correct P&L contribution.
### Code changes
- qb_client.py — new query_purchases() using the same paginated _query_qb path as Bills/JEs
- handler.py — new _transform_to_purchase_records() with the same Phase-1 validation discipline:
- PaymentType validated against {CreditCard, Check, Cash}; invalid → log+skip whole purchase
- Header AccountRef.value (funding) validated; missing → log+skip whole purchase
- Per-line AccountRef.value (expense) validated; missing → log+skip line, others survive
- _parse_line_amount shared helper for Amount parsing (consistent with Bills/JEs)
- enumerate fallback for line_id (PK safety)
- Non-expense DetailTypes logged before skip (schema-drift visibility)
- handler.py — integrated as the 5th table_writes step
- redshift_handler.py — TABLE_CONFIGS["quickbooks_purchases"] entry matching DDL column order
- ddl/create_quickbooks_purchases.sql — DISTKEY(company_id), SORTKEY(txn_date, account_id), strict PK on (company_id, transaction_id, line_id)
- run_local.py — surfaces Purchase counts in the summary + adds a reconciliation hint for alpha Q1 2026 grouped by payment_type
- Updated module docstring at top of handler.py — the pipeline now syncs 5 tables, not 3
### Tests
- test_purchase_transform.py — 13 new tests (simple CC purchase, Credit: true refund sign-flip, Check, multi-line, ItemBasedExpenseLineDetail parity, invalid PaymentType skip, missing header AccountRef skip, missing line AccountRef skip, line_id enumerate, missing Amount → 0, optional header fields, empty Line[] skip, unparseable Amount → 0)
- test_handler.py — 5 existing TestHandler tests updated for the new 5th delete_and_insert call + query_purchases mock
## Out of scope (deliberate follow-ups — likely next tickets)
- Rename quickbooks-ap-sync → quickbooks-transactions-sync — the name is now actively misleading
- Phase 3 — Deposit / CreditCardCredit (covers the last 0.5%)
- Downstream quickbooks-core-tables UNION — makes Phase 1 + Phase 2 data visible to dashboards via fct_pl / fct_expense. This is the value-realization step.
- parse_money defensive replication to quickbooks-pl-monthly and quickbooks-expense-sync (needs shared-code infra)
## Test plan
- [x] Unit tests: 111 pass (was 98; +13 Purchase + 5 mock updates). ruff check clean, ruff format applied.
- [x] Level 1 — dry-run, all 9 entities, Q1 2026: 9/9 succeeded, 20,042 purchases captured with zero errors.
cd pipelines/runners/quickbooks-ap-syncpython3 run_local.py --quiet --start 2026-01-01 --end 2026-03-31 \
--company alpha --company alpha_schools_llc --company alpha_school_28269_llc \
--company alpha_school_93101_llc --company esports_academy_llc --company gtschool_78626_llc \
--company miami --company sports_academy_75010_llc --company sports_academy_78734_llc
- [x] Level 2 — SQL deploy: create_quickbooks_purchases.sql ran cleanly in production via renewals-pipeline/scripts/run_ddl.py.
- [x] Level 3 — smoke test (alpha 2026-05-20..21, --no-dry-run): 55 purchases / 56 lines written. Real vendors (Delta, DoorDash, CitizenM, Mason Security, Alo Yoga). By payment_type: 54 CreditCard ($168,342.87) + 1 Cash ($9,539.53). All money cols 4dp. credit_flag captured. No errors. No skipped lines.
-- Verified Level 3 query:SELECT payment_type, COUNT(*) AS lines, SUM(net_amount) AS net
FROM staging_education.quickbooks_purchases
WHERE company_id='alpha' AND txn_date BETWEEN '2026-05-20' AND '2026-05-21'
GROUP BY payment_type;
-- CreditCard 55 168342.87
-- Cash 1 9539.53
## Risk + rollback
| Change | Reversible? | Notes |
|---|---|---|
| create_quickbooks_purchases.sql | Yes — DROP TABLE staging_education.quickbooks_purchases | Empty new table at deploy, zero impact on existing data |
| Lambda code | Yes — git revert + CDK redeploy | Schema change is purely additive |
## Deploy ordering
Same as Phase 1: SQL must land before the Lambda's first run with the new code. SQL is already deployed (via the L2 step above). Once this PR merges and reaches production, CD redeploys the Lambda; the next scheduled run (or a manual invoke) starts populating quickbooks_purchases automatically via the default 30-month look-back.
🤖 Generated with [Claude Code](https://claude.com/claude-code)