Linear: [SURTR-23](https://linear.app/builder-team/issue/SURTR-23/capture-quickbooks-journalentry-lines-fix-money-precision)
## Why
The quickbooks-ap-sync pipeline currently captures only Bills, VendorCredits, and BillPayments at line level — that's ~9% of P&L expense activity on the alpha entity (Q1 2026, $1.27M of $14.28M). The other ~91% is invisible at transaction level despite the monthly aggregate (quickbooks_pl_monthly) reconciling correctly with QuickBooks.
GeneralLedger analysis across alpha, miami, and alpha_schools_llc showed the entity-type split:
| TxnType | alpha % of P&L | miami % of P&L | alpha_schools_llc % of P&L |
|---|---|---|---|
| Journal Entry | 65.3% | 8.0% | 18.4% |
| Expense (Purchase) | 24.8% | 5.9% | 12.5% |
| Bill | 8.7% | 84.5% | 67.4% |
For alpha specifically, accounts 60211/60212/60220/60250/60200 (Contracted Labor — Coaches) and 69100 (Central Factory Recharges) are 100% Journal Entries — $7M+/quarter of XO contractor payroll postings that are completely invisible to current Bills-only capture.
Adding JournalEntry capture takes overall P&L coverage from 9% → 75%. Adding Purchase (the follow-up Phase 2) gets to 99.5%.
## What this PR delivers (Phase 1)
Two logically separable changes bundled in one PR for review efficiency.
### Phase 0 — money-precision fix (prerequisite)
- New src/money.py helper: parse_money(value) -> Decimal with HALF_UP rounding at 4dp.
- Wired through every existing money field in the 3 transformers (Bills / AP / BillPayments).
- New JSON encoder in redshift_handler.py emitting Decimals as JSON numbers (via float(), safe since values are pre-quantized).
Root cause this fixes: QB sometimes returns amounts with float-representation drift (e.g. 28.879999... for $28.88). The old code passed the raw float through json.dumps → S3 → COPY, where Redshift NUMERIC(18,2) truncates the excess precision instead of rounding. Result: bills booked at $28.88 in QB stored as $28.87. Verified against Cristy Cunningham HR Expense bills where the QB report total differed by exactly 6¢ from our Redshift sum across 7 affected lines.
No column migration needed. parse_money() already quantizes to 4dp via Decimal then serializes back to a clean 2dp-friendly JSON number (28.879999... → 28.88 exactly). Redshift COPY into the existing NUMERIC(18,2) columns receives the clean value and stores it without truncation. Widening columns to NUMERIC(18,4) was over-engineering; Redshift also doesn't support inline ALTER COLUMN TYPE for numeric here, so the add/update/drop/rename dance wouldn't be worth the disruption for a non-essential safety margin.
### Phase 1 — JournalEntry pipeline extension
- New query_journal_entries() in qb_client.py using the same paginated _query_qb path as Bills/VendorCredits/BillPayments.
- New _transform_to_je_records() in handler.py emitting one row per JE line (header repeated). Captures posting_type, debit_amount/credit_amount/net_amount, account refs, class/department refs, optional entity ref, and per-line description (which carries the contractor name in XO payroll JEs).
- Integration into the per-entity loop as a fourth table_writes step.
- TABLE_CONFIGS entry for quickbooks_journal_entries.
- ddl/create_quickbooks_journal_entries.sql — new table with NUMERIC(18,4) money columns (free since it's a fresh table), DISTKEY(company_id), SORTKEY(txn_date, account_id).
- 9 new transformer tests against real QB JE shapes (balanced 2-line, XO multi-credit payroll, adjustment, entity ref, money rounding, optional fields, empty lines, non-JE-detail skip, multi-JE concat).
- Updated 5 existing handler tests to mock the new fetch + assert the 4th delete_and_insert call.
### Tooling
- New run_local.py — Level 1 dry-run driver. Mirrors the quickbooks-pl-monthly/run_local.py pattern. Calls handler() with dry_run=True by default; fetches from real QB but skips Redshift writes. Sensible env defaults including ENVIRONMENT=prod. Live-validated across all 9 entities (see Test Plan).
## Out of scope (deliberate, follow-up tickets)
- Rename quickbooks-ap-sync → quickbooks-transactions-sync (name no longer matches scope).
- Phase 2 — Capture Purchase entity (Cash/Check/CreditCard expenses; closes another ~25% of alpha P&L).
- Phase 3 — Capture Deposit / CreditCardCredit (last 0.5%).
- Downstream quickbooks-core-tables update to UNION the new JE staging table into fct_pl / fct_expense.
- Apply the same parse_money helper to quickbooks-pl-monthly and quickbooks-expense-sync (requires shared-code infrastructure since each pipeline bundles independently).
## Test plan
- [x] Unit tests: 85 pass (was 76; +9 JE + 17 money). ruff check clean, ruff format applied.
- [x] Level 1 — dry-run against real QB, all 9 entities (Q1 2026): 9/9 succeeded, 2,601 Bills + 27 VendorCredits + 2,371 BillPayments + 625 JournalEntries fetched with zero errors. Alpha alone returned 358 JE headers (~6,300 line-level rows after transformation), matching prior GL analysis.
cd pipelines/runners/quickbooks-ap-syncpython3 run_local.py --quiet # miami, default
python3 run_local.py --quiet --company alpha # large entity, ~30s
- [ ] Level 2 — SQL deploy:
- Run pipelines/runners/quickbooks-ap-sync/ddl/create_quickbooks_journal_entries.sql once. Single CREATE TABLE IF NOT EXISTS — idempotent, no impact on existing tables.
- [ ] Level 3 — post-deploy smoke test (manual Lambda invoke):
aws lambda invoke \--function-name pipeline-quickbooks-ap-sync-prod \
--cli-binary-format raw-in-base64-out \
--payload '{"run_id":"post-deploy-smoke","params":{"company_ids":["miami"],"start_date":"2026-05-01","end_date":"2026-05-15"}}' \
/tmp/out.json
- [ ] Reconciliation — after a full alpha Q1 backfill (next daily run or wider-window manual invoke):
SELECT SUM(debit_amount) - SUM(credit_amount) AS netFROM staging_education.quickbooks_journal_entries
WHERE company_id = 'alpha'
AND account_id = '211'
AND txn_date BETWEEN '2026-01-01' AND '2026-03-31';
-- Expected: 2088610.84 (matches QB GL exactly)
## Risk + rollback
| Change | Reversible? | Notes |
|---|---|---|
| create_quickbooks_journal_entries.sql | Yes — DROP TABLE | Empty new table, zero impact on existing data |
| Lambda code | Yes — git revert + CDK redeploy | New table is the only schema change |
## Deploy ordering
SQL must land before the Lambda's first run against the new code path. Given the current ~20 hours until next 03:00 UTC scheduled run, the standard sequence is safe:
1. Merge PR → CD auto-deploys Lambda (deployed but unfired — don't manually invoke)
2. Run create_quickbooks_journal_entries.sql
3. Manual Lambda invoke for smoke test
4. Reconciliation query
5. Let the next scheduled run pick up wider backfill on its own (default 30-month look-back), or trigger manual backfill
🤖 Generated with [Claude Code](https://claude.com/claude-code)