Vol. I  ·  No. 173 Established 2026  ·  AI-Generated Daily Free to Read  ·  Free to Print

The Trilogy Times

All the news that's fit to generate  —  AI • Business • Innovation
MONDAY, JUNE 22, 2026 Powered by Anthropic Claude  ·  Published on Klair Trilogy International © 2026

Cut-Rate Chinese AI Stuns Silicon Valley

DeepSeek claims it built top-tier models on the cheap and without America's best chips — and the whole AI money machine just flinched.

HANGZHOU, CHINA — A Chinese upstart called DeepSeek says it trained high-performing artificial intelligence on the cheap, without the fanciest chips, and Silicon Valley can't stop talking.

The Valley crowd calls the work "amazing and impressive." That's a peculiar tune to hear hummed about a model built far from Menlo Park, on hardware Washington tried hard to keep out of Beijing's reach.

DeepSeek wasn't a household name a month ago. The outfit came up quiet, then dropped models that go toe-to-toe with the American heavyweights. The bill, it says, was a rounding error next to what the giants burn.

Here's the rub. The going gospel held that you needed the priciest American silicon and a war chest of cash to run with the big dogs. DeepSeek says otherwise.

Word ricocheted through the markets fast. The chatter turned up in the latest Tech, Media & Telecom roundup, sharing the wire with online lender SoFi. Chipmakers felt a chill, since their sky-high prices ride on the notion that everybody needs ever-more silicon.

A word of caution from this desk before anyone swallows the whole tale. DeepSeek hasn't thrown its books wide open. Skeptics want hard proof the chips were truly second-tier and the tab truly slim.

But here's the sting keeping folks up nights. The U.S. has fenced its best chips off from China. If DeepSeek did the job without them, as it claims, those fences look a sight shorter than advertised.

Why should a working stiff care? Because the whole American AI bet leans on one idea — that the deepest pockets and the biggest machines win. A bargain-basement challenger from overseas pokes a hole clean through it.

Elsewhere on the wire, the money kept moving.

Reid Hoffman, the gent who co-founded LinkedIn, raised $24.6 million for a new shop called Manas AI. He's gunning for cancer cures alongside Siddhartha Mukherjee, the physician who wrote "The Emperor of All Maladies." That's a long road from job listings.

Out West, the story ran colder. Lucid Motors' new chief took the knife to the payroll, trimming 18% of staff to "simplify the company."

He also axed a production shift at the Arizona plant, squaring output with what buyers actually want. Translation: the electric-car dream just met the electric-car ledger.

So there's the board on a busy day. Fortunes pouring into AI, fortunes draining out of a car factory, and one fat question hanging over the lot.

The question's a beaut. Nobody's sure anymore what it really costs to win. That's the story, and I'm filing it.

What to Know About China's DeepSeek AI  ·  Tech, Media & Telecom Roundup: Market Talk  ·  Silicon Valley Is Raving About a Made-in-China AI Model

AI Investment Surges as Nvidia Backs $4B Israeli Startup and Mid-Tier Model Race Intensifies

Three data points from one week signal where the industry's center of gravity is shifting.

TEL AVIV — The AI funding machine shows no signs of decelerating. Nvidia has led a $300 million round into Israeli AI startup Decart, valuing the company at $4 billion. The round marks one of the largest single checks Nvidia's venture arm has written in the current cycle, and extends its pattern of backing infrastructure-adjacent plays that could eventually feed proprietary silicon demand.

Decart, founded in 2022, focuses on large-scale generative model training and inference optimization — territory where Nvidia's strategic interest is obvious. A $4 billion valuation on what remains a pre-revenue or early-revenue company reflects how much premium the market is placing on technical teams with credible scaling roadmaps.

Separately, LMArena raised $150 million at a $1.7 billion valuation. LMArena operates a crowdsourced model evaluation platform — essentially a structured arena where humans rank competing AI outputs — and its fundraise signals that the market now treats benchmark infrastructure as a standalone business category, not just a research utility. As the number of competing foundation models multiplies, the ability to credibly compare them becomes a choke point with real pricing power.

On the model side, Anthropic and OpenAI are both preparing mid-tier model releases on overlapping timelines. Mid-tier has become the contested ground: capable enough for enterprise deployment, priced below frontier, and fast enough for agentic workflows. Anthropic has also been pushing explicitly into financial services, publishing guidance on agent deployment for that vertical — a sector with high compliance overhead and high willingness to pay, two variables that favor whoever establishes trust first.

The week's pattern is consistent: capital is flowing toward evaluation infrastructure, geographic diversification of AI research, and vertical-specific deployment plays. Frontier model releases still generate headlines, but the money is increasingly downstream of the models themselves.

Nvidia backs Israeli AI unicorn Decart in $300 million fundi  ·  AI evaluation startup LMArena raises $150M at $1.7B valuatio  ·  Agents for financial services - Anthropic
Haiku of the Day  ·  Claude Haiku
Gold pours into void
Questions multiply faster
Than answers can form
The New Yorker Style  ·  Art Desk
The Far Side Style  ·  Art Desk
News in Brief
Federal Regulators Declare Open Season on Big Tech in 2026, With Acquihires Newly in the Crosshairs
WASHINGTON, D.C.
From Quantum Alliances to Ethical Machines: The Week AI Grew Up
CAMBRIDGE, MASSACHUSETTS — It could be argued — and preliminary evidence suggests with some conviction — that the artificial intelligence research community is experiencing what one might term, with appropriate epistemic humility, a simultaneous crisis and apotheosis of disciplinary self-consciousness.
Everything We Know Is Wrong, And Honestly? Same.
AUSTIN, TEXAS — It happened again this week.
WE HAVE MET THE BOTS AND THEY ARE US
AUSTIN, TEXAS — There is a moment in every civilization's decline when the satire writes itself so perfectly, so completely, that the satirist's only honest response is to pour a stiff drink, stare at the wall, and wonder what in the screaming hell is left to exaggerate.
The Great Forgetting
LONDON — One of the more poignant spectacles of the present moment is the sight of governments commissioning "rapid evidence reviews" on a technology that has already rearranged the furniture, taken the silver, and changed the locks.
The Builder Desk  —  AI Builder Team
📅 Week in Review  ·  Production Release

Builder Team Rewires the Financial Brain, Ships Telemetry Eyes Across Every Repo

From a live-Redshift migration that retired nightly syncs to a mercy telemetry network lighting up five codebases at once, the AI Builder Team spent seven days rebuilding the nervous system of the product.

There are weeks when a team ships features, and then there are weeks when a team changes the architecture. This was the latter. The AI Builder Team closed out seven days that touched six production systems — Aerie, Klair, Surtr, Sindri, trilogy-drones, and the central mercy harness — with a thread of ambition running through all of it: make the data live, make the intelligence visible, and make the tooling honest.

The single biggest move of the week was @ashwanth1109's live-Redshift migration in Aerie (PR #455), which ripped out the nightly-synced plMonthly worker that the Financials "Actual vs Model" surface had been leaning on and replaced it with live Convex actions querying Redshift directly. Every surface — consolidated P&L, per-school P&L, headcount, programs, facilities, drill-downs, CSV export — now speaks in real time. That is not a refactor. That is a regime change. Ashwanth doubled down on the same surface all week, adding a school-year dropdown and full period machinery to the per-school view (PR #461), fixing a dedicated QB entity inclusion bug that had been silently dropping Alpha Anywhere Center's unclassed tuition — $307,500 reported, $446,807 actual (PR #462), and itemizing QB Deposits as vendor refunds in plTransactions (PR #445). One engineer, one surface, one relentless week.

If Ashwanth owned the financials engine, @eric-tril owned the reporting layer above it. The Monthly Financial Reporting memo system inside Klair received what can only be described as a full editorial overhaul. Quarter-end memo tables now collapse vertical groups intelligently, label statements by quarter, and suppress the redundant IS-YTD column in Q1 (PR #3098). The Education investor memo was brought into full alignment with the hand-authored reference memos, gaining period-aware table structures, a new Crush AP vertical, a curated Physical Schools roster, and a correct favorable-variance convention for expense rows (PR #3090). Drift detection, value-diff drill-downs, per-section regeneration, and stale-check fingerprinting shipped across multiple PRs — the team can now see, at a glance, exactly where AI narration has aged out of sync with the current data. Eric shipped across at least a dozen Klair PRs this week. The MFR system is, structurally, a different product than it was last Monday.

The week's most far-reaching organizational story was the mercy telemetry buildout, led by @kevalshahtrilogy. Previously, mercy's per-review data (verdicts, findings, token usage, cost, approval rate) evaporated into 7-day GitHub artifacts. Keval changed that permanently: a new emit_telemetry.py harness (PR #1) now fires a telemetry record from every mercy consumer repo to a Surtr /mercy dashboard (PR #494) that stores and visualizes all of it durably. Sindri was onboarded to the central mercy reviewer this week (PR #121). Klair was onboarded (PR #3027). Aerie was onboarded (PR #397). The entire engineering org is now inside the telemetry net. Keval also shipped collapsible Google Chat cards for failure and partial alerts (PR #509), throttled PARTIAL notifications to once per pipeline per twelve hours (PR #495), and surfaced PARTIAL run status across the observer sweep and dashboard (PR #493). The on-call channel is quieter. The signal-to-noise ratio is better for it.

@sanketghia delivered two headline features this week. The passive investments surface in Klair now shows SpaceX net of estimated GP carry everywhere — $5.90B, consistent with /spacex-valuation, instead of the $6.91B gross figure that had been quietly disagreeing with it for who knows how long (PR #3097). And a daily per-team AI-spend leaderboard email launched for the Superbuilders team (PR #3086), complete with Klair design-system branding, per-provider spend splits, and fuzzy empty-group guards. Sanket also delivered team-room headcount variance analysis in the QTD BVA (PR #3069) and live analyst price targets driving the SpaceX Bull/Bear pills (PR #3046).

@benji-bizzell ran his own parallel campaign this week, shipping the school opening answers endpoint in Aerie (PR #451), a full agent-first outcomes quality-control foundation in Sindri (PR #122), editable automation rules for FO Buildout Deferral (PR #446), and a takeover of SIS staging syncs into Surtr (PR #484). He also hardened the Rhodes MCP mutation approval flow (PR #450), removed the raw filesystem fallback that had been steering the agent toward retired data mirrors (PR #454), and fixed the Drive upload approval lifecycle to be idempotent. Benji is the quiet engine of Aerie's agent layer.

Now, about the Board Doc campaign. @marcusdAIy shipped the read-path wire of the Budget Bot Google Docs add-on this week: real /review by google_doc_id, behind Google-OIDC auth, with a new GSI for the lookup (PR #3091). Functional, yes. Narrow in scope, deliberately so; the write path is explicitly parked. Asked about it, he was characteristically measured: "The read slice is the correct first cut. You validate the auth model, the GSI lookup, and the projection shape before you ever touch a write path. Anyone shipping the write path first doesn't understand the failure modes. Also, Mac, your lede this week is structurally the same as last week's." Sure, Marcus. The lede won a regional press award last year. The write path is still parked.

Notable on the horizon: a new repo, Brainlift-Platform, was created this week. No PRs yet, but its arrival alongside Sindri's agent-first outcomes foundation and Aerie's new school opening answers endpoint suggests the team is laying the structural groundwork for something larger in the agent-hosting space. The architecture is being built. Next week, watch for the first commits.

Mac's Picks: Key PRs This Week
#451 — feat(agent): add school opening answers endpoint @benji-bizzell  no labels

## Summary

- Add a synchronous school_opening agent answers endpoint at POST /v1/agent/answers

- Add the school-opening matrix/profile contracts, Buildout session-start resolution, and structured answer/QC runtime

- Document the endpoint in OpenAPI and cover public API, matrix, provider-failure, and output-quality paths with tests

## Why

We need a deployable short-term API surface for the first Agent-in-a-Box use case without taking on the larger durable agent hosting migration in this PR. The endpoint answers one bounded school-opening question using the configured prompt, skill rules, Buildout session start, inspected milestone data, and a QC pass.

## Business Value

External or internal agents can now ask Aerie for a structured school-opening answer that includes the direct answer, selected matrix cell, parent message, underlying data, methodology/worked example, evidence, and the brainlift/config context used to produce it.

## Breaking changes

None. This adds a new authenticated public API route and new contract exports.

## Test plan

- [x] PNPM_STORE_DIR=/Users/alghurab/Library/pnpm/store/v10 pnpm --filter @bran/chat lint

- [x] PNPM_STORE_DIR=/Users/alghurab/Library/pnpm/store/v10 pnpm --filter @bran/chat typecheck

- [x] PNPM_STORE_DIR=/Users/alghurab/Library/pnpm/store/v10 pnpm --filter @bran/contracts typecheck

- [x] PNPM_STORE_DIR=/Users/alghurab/Library/pnpm/store/v10 pnpm --filter @bran/chat exec vitest run lib/__tests__/agent-answer-school-opening.test.ts convex/publicApi/agentAnswersHttp.test.ts lib/public-api/__tests__/openapi.test.ts convex/publicApi/operationsHttp.test.ts convex/publicApi/admissionsDomainsHttp.test.ts convex/publicApi/financialsHttp.test.ts convex/publicApi/ontologyHttp.test.ts convex/publicApi/http.test.ts

- [x] PNPM_STORE_DIR=/Users/alghurab/Library/pnpm/store/v10 pnpm --filter @bran/contracts test -- src/school-opening-matrix.test.ts

- [x] Live dev API sweep: 5/5 answered across Austin, Houston, Edmond, Boca Raton, plus Spring 27 session-start override; bad generic scheduled-opening phrasing count was 0

Live response artifacts are saved locally at /tmp/aerie-school-opening-pr-gate-2026-06-19T05-56-49-894Z.

#455 — AERIE-421 refactor(financials): replace sync workers with live-Redshift Convex actions @ashwanth1109  no labels

## Demo

<img width="2237" height="1636" alt="image" src="https://github.com/user-attachments/assets/99f0bbaa-9ba3-4ad5-aa39-8e0496f6c9c6" />

19 June Financials batch — three changes, all on the "Actual vs Model" Financials surface.

## 1. AERIE-421 — Live-Redshift Financials migration ([#449](https://github.com/AI-Builder-Team/Aerie/pull/449))

Serve every Financials "Actual vs Model" surface (consolidated + per-school P&L, headcount, programs, facilities, unitemized, drill-downs, CSV export) live from Redshift via Convex "use node" actions, replacing the nightly-synced plMonthlyRecords / plTransactions tables.

- New chat/convex/finance/dashboards/financialLive.ts — one action per surface; each opens a one-shot Redshift connection, maps rows into the same pure reducers the synced queries used, so the two paths can't drift.

- Pure reducers extracted to ctx-free modules (consolidatedShared.ts, consolidatedReducers.ts, consolidatedDetailReducers.ts, perSchoolReducers.ts) — one source of truth, unit-tested in isolation.

- Frontend reads through a new useLiveSection hook: a one-shot action mapped into the same SectionResult shape the synced useQuery sections exposed, so the table components are unchanged (on-demand "Refresh" rather than a live subscription).

- Dropped the plMonthlyRecords + plTransactions tables and all their sync / prune / backfill code (sync worker, analytics refresh jobs, upsert/prune Convex mutations + tests) — this is the bulk of the −16k diff.

- Authorization runs requireSchoolPlAccess at the action boundary before any Redshift connection (docs/endpoint-hardening.md). Quarter/period inputs are regex-validated and inlined as literals (static SQL, no injection surface).

- Why: reading current Redshift state directly makes the dashboard immune to the additive-only sync's orphan / phantom-overage artifacts (the root cause behind the AAC / Lake Forest / Nova Bastrop overage diagnoses).

- New deps: pg@8.17.2, @types/pg@8.11.11 (exact-pinned).
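
The "regex-validated and inlined as literals" step in the authorization bullet above can be sketched roughly as follows. This is a Python illustration of the general technique, not the Convex action's actual code: the accepted quarter format (2026-Q1), the names QUARTER_RE, quarter_literal and build_query, and the table and column names are all assumptions made for the example.

```python
import re

# Assumed input format, e.g. "2026-Q1"; the PR does not state the real pattern.
# [0-9] is used instead of \d so only ASCII digits are accepted.
QUARTER_RE = re.compile(r"[0-9]{4}-Q[1-4]")


def quarter_literal(value: str) -> str:
    """Return a quoted SQL literal, or raise if the input is not a quarter."""
    if not isinstance(value, str) or QUARTER_RE.fullmatch(value) is None:
        raise ValueError(f"invalid quarter: {value!r}")
    # Safe to inline: the pattern admits only digits, '-' and 'Q'.
    return f"'{value}'"


def build_query(quarter: str) -> str:
    # Hypothetical table and column names, for illustration only.
    return (
        "SELECT account, amount FROM pl_monthly "
        f"WHERE fiscal_quarter = {quarter_literal(quarter)}"
    )
```

Because fullmatch must consume the whole string, anything beyond the expected shape (quotes, spaces, trailing newlines) is rejected before it can reach the SQL text.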

## 2. Pin "Schools – Actual vs Model" to a single period + rename page

- Remove the quarter picker — PERIOD_OPTIONS had a single entry, so it was a no-op dropdown; period is now a fixed const (dropped the dead periodOptions memo + setPeriod).

- Drop the "Jan–Jun 2026" date prefix from the Consolidated P&L subtitle (removed the now-unused subtitleLabel prop).

- Rename the page "Schools – Actual vs Model" → "Actual vs Model (Schools)".

## 3. Test fix — align stale tests with the migration

- financials-kpi-cards.test.tsx and use-consolidated-unitemized.test.tsx still mocked the old reactive queries, leaving api.finance.dashboards.financialLive undefined → 9 tests threw at render in CI.

- Re-pointed both at useLiveSection / the financialLive actions, and rewrote the unitemized test for the new single-action, server-side fan-out shape (drops the obsolete client-side useQueries dedup / per-school-partial-failure cases; adds the {schools, years} payload, all-or-nothing failure fold, loading gate, and disabled / empty-schools short-circuits).

## Testing

- pnpm typecheck, pnpm biome check, and the full vitest suite are green in CI.

- The authorized success path of the live actions needs a real Redshift connection, so it is validated manually against the dev deployment, not in CI — the auth-rejection gate and the pure mapper/reducer paths are covered by unit tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#462 — AERIE-434 fix(financials): include dedicated QB entity whole realm in per-school P&L @ashwanth1109  approved

## Demo

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/8d57277f-90f3-4059-ab9b-591543ca81f9" />

Proves the fix: the per-school P&L reader now includes a dedicated QB entity's whole realm, so Alpha Anywhere Center's unclassed (Not Specified) tuition is no longer dropped — Q1'26 tuition reads the true $446,807 instead of $307,500.

Backend — per-school P&L includes the dedicated entity's whole realm

Ran python3 /tmp/demo-aerie434.py, which executes the reader's per-school predicate (the exact WHERE the diff adds to financialLive.ts) against Redshift, OLD vs NEW:

Q1'26 50000 Tuition — Alpha Anywhere Center

OLD (class-only) = 307,500.00

NEW (class + dedicated realm) = 446,807.02 (QuickBooks: 446,807.02)

The NEW predicate matches QuickBooks to the cent.

Changed code is exercised directly: pnpm vitest run convex/finance/dashboards/financialLive.test.ts. The suite imports and calls the new plMonthlyBySchoolSql / plTransactionsBySchoolSql, asserting that both substitute every {CLASS_IN} (the replace → replaceAll guard) and emit the dedicated-entity arm:

    ✓ convex/finance/dashboards/financialLive.test.ts (30 tests)
    Test Files  1 passed (1)
    Tests  30 passed (30)

Most at risk — over-including another school's data (the new arm pulls a *whole* realm). Verified the self-identifying subquery resolves to exactly one dedicated entity, and the read pulls only Alpha Anywhere Center's rows across its three realms:

    dedicated companies resolved: ['alpha_anywhere_center_llc']

    (company_id, class_name) the NEW query pulls:
      alpha                     | Alpha Anywhere Center
      alpha_anywhere_center_llc | Alpha Anywhere Center
      alpha_anywhere_center_llc | Not Specified   <- the previously-dropped rows
      alpha_schools_llc         | Alpha Anywhere Center

No other school's data enters the result. The consolidated "All Schools" reader is untouched by this PR (see the Scope note below).

---

## Summary

On Actual vs Model (Schools), a school backed by a dedicated QuickBooks entity lost any of that entity's rows booked without a class (class_name = 'Not Specified'). The per-school live reader (financialLive.ts) scoped purely by class_name IN (canonical + aliases), so the dedicated entity's unclassed rows — real revenue/spend — were silently dropped.

Alpha Anywhere Center, Q1'26: Tuition read $307,500 instead of $446,807 (−$139,307). Food Service, Financial Aid and Sibling Discount were undercounted the same way; Stripe Fees matched to the dollar because it has no unclassed rows. Unclassed bookings continue Apr–Jun (June tuition $209k unclassed), so Q2'26 and SY totals were affected too.

## Root cause

PL_MONTHLY_SELECT_BY_SCHOOL and PL_TRANSACTIONS_SELECT_BY_SCHOOL filtered only on class_name IN ({CLASS_IN}) — no company_id arm — so a dedicated entity's rows survived only when they happened to carry the canonical class. This violates the documented *"the whole dedicated realm IS the school"* semantic. Pre-existing (the retired synced path keyed school_display_name = class_name identically); exposed now by unclassed data starting Jan 2026.

## Fix

Add a self-identifying company_id arm to both per-school reads — a dedicated entity is any non-shared company_id that carries the school's class, so no external company→school map is needed:

    WHERE class_name IN ({CLASS_IN})
       OR company_id IN (
            SELECT DISTINCT company_id FROM staging_education.quickbooks_pl_monthly
            WHERE class_name IN ({CLASS_IN}) AND company_id NOT IN ('alpha','alpha_schools_llc')
       )

The per-school reducers already filter only by year/period (never by schoolDisplayName), so fetching the extra rows is sufficient; no row-mapper change is needed. SQL building is extracted into the exported pure functions plMonthlyBySchoolSql / plTransactionsBySchoolSql so the shape is unit-tested without a Redshift connection (and to guard the replace → replaceAll substitution now that {CLASS_IN} appears twice).
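
The substitution guard described above can be sketched in Python under some loudly stated assumptions: the SELECT list, the function names (sql_literal, pl_monthly_by_school_sql) and the quoting helper are illustrative stand-ins, not the PR's TypeScript. One difference worth noting is that Python's str.replace substitutes every occurrence by default, whereas JavaScript's String.prototype.replace with a string pattern substitutes only the first; the count argument of 1 in the usage note reproduces that first-only behaviour.

```python
from typing import Sequence

# Assumption: only the WHERE clause comes from the PR; the SELECT list is made up.
PL_MONTHLY_BY_SCHOOL = """
SELECT * FROM staging_education.quickbooks_pl_monthly
WHERE class_name IN ({CLASS_IN})
   OR company_id IN (
        SELECT DISTINCT company_id FROM staging_education.quickbooks_pl_monthly
        WHERE class_name IN ({CLASS_IN})
          AND company_id NOT IN ('alpha','alpha_schools_llc')
   )
"""


def sql_literal(value: str) -> str:
    # Double any single quote so the value stays inside the literal.
    return "'" + value.replace("'", "''") + "'"


def pl_monthly_by_school_sql(class_names: Sequence[str]) -> str:
    """Build the per-school SQL, substituting every {CLASS_IN} placeholder."""
    if not class_names:
        raise ValueError("at least one class name is required")
    class_in = ", ".join(sql_literal(name) for name in class_names)
    sql = PL_MONTHLY_BY_SCHOOL.replace("{CLASS_IN}", class_in)
    # Guard: a leftover placeholder means a substitution was missed.
    if "{CLASS_IN}" in sql:
        raise AssertionError("unsubstituted {CLASS_IN} placeholder")
    return sql
```

A first-only substitution, PL_MONTHLY_BY_SCHOOL.replace("{CLASS_IN}", "'x'", 1), leaves the subquery's placeholder in place, which is the failure mode a unit test on the built string catches without a database connection.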

## Verification

Ran the old vs new reader SQL against Redshift:

| | Q1'26 50000 Tuition |
|---|--:|
| OLD (class-only) | $307,500.00 |
| NEW (class + dedicated realm) | $446,807.02 ✅ |

The dedicated-entity subquery resolves to exactly alpha_anywhere_center_llc — no other school's data is pulled in.

## Test plan

- [x] pnpm vitest run convex/finance/dashboards/financialLive.test.ts — 30 pass (4 new)

- [x] pnpm typecheck — clean

- [x] pnpm biome check — clean

- [x] Redshift reconciliation: AAC Q1'26 tuition 307,500 → 446,807

## Scope / follow-up

In scope: per-school P&L + transaction live readers (the reported symptom). Related (separate ticket): the consolidated "All Schools" rollup leaks the same money — dedicated unclassed rows bucket to a Not Specified pseudo-school, which is in EXCLUDED_SCHOOLS, so they drop from the total. A robust fix needs company→school attribution in the consolidated reducer.

---

Stacked on #461 (AERIE-433). Linear: AERIE-434.

#494 — feat(mercy): PR-review telemetry + Surtr /mercy dashboard @kevalshahtrilogy  no labels

## What & why

Mercy (the @mercy PR-review bot) produces rich per-review data — verdict, findings, downgrade reasons, and (in the Claude CLI envelope) token usage + dollar cost — but today it all evaporates into 7-day GitHub artifacts. There's no way to answer "how many reviews ran, what did they cost, what's the approval/block rate, which repos use mercy."

This adds durable telemetry end-to-end and a Surtr /mercy dashboard to visualize it.

## How it works

    mercy workflow (any repo)                Surtr (ECS) + AWS
    emit_telemetry.py ──POST(Bearer)──▶ /internal/mercy/telemetry (Hono, bearer-auth)
    (fail-open, ./.trusted)                    └─ PutItem → DynamoDB surtr_mercy_telemetry
    /mercy page ◀── tRPC mercyStats/listMercyReviews ───────┘

- Emit (scripts/pr-review/emit_telemetry.py): fail-open, stdlib-only, runs from ./.trusted so a PR author can't tamper. Assembles one record from decision.json / review_output.json / the raw CLI envelope + github.* context and POSTs it as the workflow's last step. Never blocks a review.

- Ingest (/internal/mercy/telemetry): bearer-auth Hono route mirroring the existing observer-sweep route. Idempotent on a deterministic review_id so re-runs upsert.

- Storage: new CDK-owned DynamoDB surtr_mercy_telemetry (PK review_id; GSI per-repo; GSI global by month bucket; PITR + deletion-protection).

- Dashboard (/mercy): stat cards (reviews, approval/block rate, cost, latency p50/p95, active repos), reviews-/cost-per-day charts, findings-by-category, a per-repo table, and recent reviews with a drill-down drawer. Hand-rolled SVG charts — no new dependency.
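
The emit and ingest bullets above combine two ideas: a deterministic review_id, so that re-runs upsert rather than duplicate, and a fail-open POST that can never block a review. A minimal stdlib-only sketch of both follows. The fields hashed into the id (repo, PR number, head SHA) and the names make_review_id, _post and emit are assumptions for illustration; the PR does not say how the real id is derived.

```python
import hashlib
import json
import urllib.request


def make_review_id(repo: str, pr_number: int, head_sha: str) -> str:
    """Same inputs always give the same id, so a re-run overwrites its own record."""
    key = f"{repo}#{pr_number}@{head_sha}"  # assumed identity fields
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:32]


def _post(url: str, token: str, body: bytes, timeout: float = 10.0) -> int:
    """POST a JSON body with a bearer token and return the HTTP status."""
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def emit(record: dict, url: str, token: str, post=_post) -> bool:
    """Fail-open: True on a 2xx response, False on any failure; never raises."""
    try:
        body = json.dumps(record).encode("utf-8")
        return 200 <= post(url, token, body) < 300
    except Exception as exc:  # deliberately broad: telemetry must not break the review
        print(f"telemetry skipped: {exc}")
        return False
```

The post parameter exists only so the transport can be swapped out in tests; a workflow step would call emit(record, url, token) and ignore the return value.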

## Multi-repo by design

Identity comes entirely from github.* context and the emit step lives in the workflow, so it drops unchanged into the future central/reusable mercy workflow — any adopting repo emits automatically once its two repo-level settings are present.

## Config already applied (this repo)

- SURTR_PROD_KEYS.MERCY_TELEMETRY_TOKEN added in Secrets Manager (us-east-1).

- Repo-level (not org-level) secrets.MERCY_TELEMETRY_TOKEN + vars.MERCY_TELEMETRY_URL set on AI-Builder-Team/Surtr.

## Still required to go live

1. cdk deploy SurtrApp-prod — creates the table, grants the ECS task role, injects the token/table env into the container.

2. Merge this PR — emit runs from the trusted (default-branch) checkout, so telemetry only starts flowing after merge (this PR itself emits nothing; fail-open). Same self-activation pattern as decide_review --ci-status.

## Test plan

- 6 new Python tests (test_emit_telemetry.py), 16 new Vitest tests (stats aggregation + ingest route auth/validation).

- 451 existing unit tests still green; tsc (src + infra) clean; cdk synth confirms the table with both GSIs + PITR + deletion-protection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3086 — feat(ai-spend-rank): daily per-team AI-spend leaderboard email @sanketghia  no labels

## Summary

Adds a daily per-team AI-spend leaderboard email — a scheduled job that ranks the members of a team/BU by their AI spend and emails the stack-rank to configurable recipients. First target: the Superbuilders team (Tech Super Builders).

Built across four increments (each brainstormed → spec → plan → TDD → reviewed):

1. Core leaderboard email — roster-first membership, per-provider split (OpenAI/Cursor + Anthropic key-name recovery), stats, fuzzy empty-group guard.

2. Branded email — Klair design system (dark bands, stat cards, rank-badge table), <$1 sub-dollar formatting.

3. Per-subscription Cc/Bcc — additive send_basic_email Cc/Bcc + config columns.

4. Single-day window — new default preceding_day (yesterday); --date for explicit day (testing/backfill); --window trailing_7d retained; global-max freshness guard skips the run when the target day isn't loaded.
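
The window rules in the fourth increment can be sketched as below. The PR names a resolve_window helper but does not show it, so this signature is a guess, and the details (inclusive bounds, trailing_7d ending yesterday, an explicit date overriding the window name, and the should_run guard) are assumptions.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_window(
    window: str, today: date, explicit_day: Optional[date] = None
) -> Tuple[date, date]:
    """Return inclusive (start, end) day bounds for the leaderboard query."""
    if explicit_day is not None:
        # Assumed: an explicit --date wins over the window name.
        return explicit_day, explicit_day
    yesterday = today - timedelta(days=1)
    if window == "preceding_day":
        return yesterday, yesterday
    if window == "trailing_7d":
        # Assumed: seven days ending yesterday, both ends inclusive.
        return yesterday - timedelta(days=6), yesterday
    raise ValueError(f"unknown window: {window!r}")


def should_run(end_day: date, max_loaded_day: Optional[date]) -> bool:
    """Freshness guard: run only if data is loaded through the window's last day."""
    return max_loaded_day is not None and max_loaded_day >= end_day
```

For a run on 2026-06-22, preceding_day resolves to 2026-06-21 alone, and the guard skips the send if the newest loaded day is still 2026-06-20.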

## Architecture

- Standalone ECS cron (crons/ai_spend_rank_cron.py) on the existing klair-scheduled-jobs infra — deliberately decoupled from the deprecating digest/report_subscriptions engine.

- Data: reads mart_saas_metrics.fct_ai_spend + staging_gsheets.esw_people_accounts (directory). Window computed by a resolve_window helper; build_leaderboard takes explicit bounds.

- Config: standalone mart_alerts.ai_spend_rank_subscriptions table (group_by + value + recipients/cc/bcc + cadence).

- Send: reuses SES send_basic_email (extended additively with cc/bcc).

## Files (21)

klair-api/services/ai_spend_rank/ (models, config, leaderboard, render), crons/ai_spend_rank_cron.py, database/scripts/Alerts/ai_spend_rank/…create.sql, utils/email_service.py (additive cc/bcc), tests, and docs/superpowers/ (specs, plans, deploy runbook). No klair-client changes.

## Testing

- 60 unit tests pass (full feature + crons regression); ruff + pyright clean.

- Live Redshift integration test passes (Tech Super Builders).

- Validated in prod ECS: built/pushed an isolated dated image tag (not :latest), ran a --dry-run and a real send task on the klair-scheduled-jobs cluster — both exit 0, sent=1; :latest untouched, temp task-def revisions cleaned up.

## Deployment status (NOT yet live)

The EventBridge rule klair-ai-spend-rank-prod is already created but DISABLED (cron(45 12 * * ? *), 12:45 UTC daily). The config table exists with test recipients only (To = me, Bcc = a test address). Go-live steps (see docs/superpowers/ai-spend-rank-deploy-runbook.md):

1. Merge this PR.

2. Rebuild + push :latest from main.

3. Set real recipients in the config table.

4. aws events enable-rule --name klair-ai-spend-rank-prod.

Send-time caveat: 12:45 UTC must be after the daily raw+mart refresh; otherwise the freshness guard skips (safe, no email) or the global-max guard could let a provider-incomplete day through.

## Screenshot of Mail

<img width="1034" height="761" alt="image" src="https://github.com/user-attachments/assets/462da13b-6a27-4859-a7bf-1a5fa6faaaef" />

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3091 — feat(board-doc): Google Docs add-on read slice — real /review by google_doc_id (KLAIR-2906) @marcusdAIy  approved

## Summary

- Wires the Budget Bot Google Docs add-on sidebar to klair-api's real review engine for the read path only, keyed by a bare google_doc_id, behind a new Google-OIDC auth path.

- Adds an indexed google_doc_id -> session lookup (new by_google_doc_id GSI) and a slim GET /board-doc/addon/review projection of a session's persisted findings.

- Read-only: no /propose, no write-back (later slices — the targeted-write path still carries the parked table-boundary bug).

## Why it's needed

Part of the KLAIR-2906 epic (Budget Bot Google Docs add-on, supersedes P4.5 sync). The de-risking spike proved the auth spine + UX against a canned stub; this replaces that stub with the live engine so the add-on can show real review findings for a doc the user has open, reusing existing Klair BU-scoping and session ownership rather than inventing a parallel access model.

## Changes

- Auth dependency (klair-api/utils/auth.py): get_user_from_google_oidc verifies the add-on's OIDC token (signature + aud against BBOT_ADDON_OAUTH_CLIENT_ID), gates on hd=trilogy.com + email_verified, resolves the Google email to a Klair user via UserService.get_user_by_email (no auto-create — unknown email → 401), and returns the same dict shape as get_user_from_clerk.

- google_doc_id → session resolution (session_store.py): writes a top-level google_doc_id attribute on save (only when bound); adds the by_google_doc_id GSI to the table definition; adds get_by_google_doc_id() (most-recently-updated_at wins for cloned docs) to the Protocol, in-memory, and DynamoDB backends.

- Endpoint (routers/board_doc_router.py): GET /board-doc/addon/review?google_doc_id=... with Depends(get_user_from_google_oidc); resolves doc→session (404 if none); access gate = owner OR superuser OR _assert_board_doc_bu_allowed; maps review_results → AddonReviewResponse (derived 0–100 score + one-line summary; findings flattened to severity/title/detail/section_id; pass+dismissed dropped; open-only score). Returns a needs_review state when no review has run.

- Migration script (scripts/backfill_bbot_doc_id_index.py): --ensure-gsi (UpdateTable add) + attribute backfill of legacy sessions; dry-run by default, --execute to apply.

- Add-on client (budget-bot-addon-spike/, kept local — not in this PR): getReview() now calls the real GET /addon/review; DEMO=false.
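The claim checks in the auth dependency can be sketched as a pure function over already-decoded token claims. This is only an illustration under stated assumptions: signature verification is assumed to have happened upstream, `AddonAuthError` and `gate_addon_claims` are hypothetical names, and the status codes follow the cases listed in the test plan (bad aud 401, wrong hd 403, unverified or unknown email 401, missing config 500).

```python
from typing import Any, Callable, Dict, Optional


class AddonAuthError(Exception):
    """Hypothetical error type carrying the HTTP status to return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def gate_addon_claims(
    claims: Dict[str, Any],
    expected_aud: Optional[str],
    get_user_by_email: Callable[[str], Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    # Missing BBOT_ADDON_OAUTH_CLIENT_ID is a server misconfiguration, not a client error.
    if not expected_aud:
        raise AddonAuthError(500, "add-on OAuth client id is not configured")
    if claims.get("aud") != expected_aud:
        raise AddonAuthError(401, "audience mismatch")
    if claims.get("hd") != "trilogy.com":
        raise AddonAuthError(403, "wrong hosted domain")
    if claims.get("email_verified") is not True:
        raise AddonAuthError(401, "email not verified")
    # No auto-create: an email with no matching user is rejected.
    user = get_user_by_email(claims.get("email", ""))
    if user is None:
        raise AddonAuthError(401, "unknown email")
    return user
```

Keeping the gate free of framework and network calls makes each rejection path easy to unit-test with plain dictionaries.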

## Breaking changes

None. New endpoint + additive storage attribute/GSI; existing read/write paths unchanged. The endpoint requires the new BBOT_ADDON_OAUTH_CLIENT_ID env var (returns 500 if unset), but it is not wired into any existing flow.

## Test plan

- [x] uv run pytest tests/board_doc/test_addon_read_slice.py — 14 pass (auth dep: valid / bad aud→401 / wrong hd→403 / unverified+unknown email→401 / missing config→500; get_by_google_doc_id: hit/miss/dup→newest; /addon/review: mapping, needs_review, 404, non-owner-no-BU→403).

- [x] Regression: test_session_store.py + test_review_endpoint_persistence.py + new file — 41 pass.

- [x] ruff format + ruff check clean on changed files; pyright introduces no new errors.

- [x] Prod migration executed: by_google_doc_id GSI ACTIVE, existing docs backfilled (21/21, 0 errors).

- [ ] Live add-on round-trip (sidebar → endpoint) — pending BBOT_ADDON_OAUTH_CLIENT_ID set in a reachable klair-api env + Editor Add-on test deployment.

## Migration / rollout

- GSI by_google_doc_id already provisioned + backfilled on prod Klair-BudgetBotSessions.

- Set BBOT_ADDON_OAUTH_CLIENT_ID (the add-on's OAuth client aud) in the klair-api environment before the endpoint is usable.

## Out of scope (follow-ups)

- /propose (live section rewrite, return-without-applying).

- /apply write-back (targeted batchUpdate) — blocked on the table-boundary corruption fix.

- Editor Add-on test/Marketplace deployment for cross-doc availability.

#3097 — feat(passive-investments): show SpaceX net-of-carry value, reconciled with /spacex-valuation @sanketghia  approved

## Summary

Connects the two SpaceX valuation surfaces so /passive-investments shows SpaceX net of estimated GP carry at the current market price, matching the /spacex-valuation page. Previously /passive-investments showed the gross market value ($6.91B = 37,339,135 SPCX shares × $185) while /spacex-valuation showed $5.90B net of ~$1.0B carry. Both pages already stand on the identical gross basis — passive-investments simply never applied the carry haircut.

At the current $185 reference price, SpaceX now shows $5.90B (net of carry) everywhere on /passive-investments, reconciling exactly with the valuation page.

## What changed (backend only — no frontend changes)

- New klair-api/config/spacex_carry.py — a faithful Python mirror of the frontend carry model (SpaceXValuationV3/data/funds.ts + calculations/valuation.ts): the 7 SPVs (gross shares, invested capital, noCarry flags) and the 20%-over-ICC carry formula. Exposes net_of_carry_from_gross(gross_value), which recovers the per-snapshot price as gross / TOTAL_SHARES (volume is constant at 37,339,135 = the SPV share sum) and returns the netted value.

- get_assets_list — nets the Public SPCX row's current / 1-day-ago / 1-year-ago values. Because the page's rollups are summed server-side, netting before accumulation cascades the haircut into the SpaceX row, the portfolio total, the Public category subtotal, and the day/year deltas — keeping the page footing.

- get_asset_detail (Asset Details dialog) — nets currentValue. The frontend derives Total Value, Total P&L, Profit/Loss, Total Return, Annualized Return, and Investment Multiple from it, so one netting cascades to all of them; the IRR cash flow uses the net value too.

- get_historical_data (the dialog's "Monthly Holding Value" chart + Current Holding / Period High-Low stat cards) — nets each Public SPCX data point's holding_value. closing_price is intentionally left as the real per-share market price (carry is a value-level GP haircut, not a price change), so the price line and price-based "Period Return %" stay true.

The stored holding_value pipeline (Kubera / trade-recalc) is untouched — netting is applied at read time only.

## Carry model (per SPV, ported from valuation.ts)

    valuation = gross_shares × price
    carry     = 0                                 if noCarry (Gigafund 0.8, Strauss)
              = max(0, 0.20 × (valuation − icc))  otherwise
    net       = valuation − carry
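The per-SPV formula above, together with the gross-to-price recovery described for net_of_carry_from_gross, can be sketched as follows. This is a minimal sketch, not the contents of spacex_carry.py: the `Spv` dataclass, the `spv_net` helper, and the extra `spvs` parameter are assumptions, and real SPV figures are deliberately not reproduced here.

```python
from dataclasses import dataclass
from typing import Sequence

CARRY_RATE = 0.20  # the 20% rate from the formula above


@dataclass(frozen=True)
class Spv:
    name: str
    gross_shares: int
    icc: float              # the ICC basis the carry is computed over
    no_carry: bool = False  # SPVs flagged noCarry pay no carry at all


def spv_net(spv: Spv, price: float) -> float:
    valuation = spv.gross_shares * price
    # Carry is floored at zero, so an SPV valued below its basis is never charged.
    carry = 0.0 if spv.no_carry else max(0.0, CARRY_RATE * (valuation - spv.icc))
    return valuation - carry


def net_of_carry_from_gross(gross_value: float, spvs: Sequence[Spv]) -> float:
    # Share count is constant, so the price behind any gross snapshot is gross / total shares.
    total_shares = sum(s.gross_shares for s in spvs)
    price = gross_value / total_shares
    return sum(spv_net(s, price) for s in spvs)
```

With made-up SPVs of 1,000 shares (basis 50,000), 500 shares (basis 200,000) and 250 no-carry shares, a gross of 323,750 implies a price of 185; only the first SPV pays carry (27,000), giving a net of 296,750.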

## Verification

- Parity test (tests/test_spacex_carry.py, 10 tests) locks the Python mirror to the valuation-page figures at $185: net $5.900B, carry $1.008B, shares 37,339,135, per-SPV nets, and noCarry/floor behavior — so the two copies can't silently drift.

- Simulated all three endpoints against live Redshift data: SpaceX row $6.91B → $5.90B, Multiple 22.40x → 19.13x, chart/Current Holding $5.90B — all reconcile with /spacex-valuation to the dollar.

- ruff format + ruff check clean; pyright error count unchanged (36 before = 36 after; new lines add zero).

- Full passive-investments router suite + new carry suite: 34 passed, 4 skipped.

## Notes

- IRR still displays "— 0%" for SpaceX (pre-existing: a single Commit trade with no realized cash-flow pair → IRR computes to 0 regardless of gross vs net). The net value is fed into its cash-flow calc so it stays consistent.

- The "$0 Period Low" on the historical chart is a pre-existing seed row (2026-06-11, price=0/value=0) in the gross data, not introduced here.

## Screenshots

<img width="1372" height="633" alt="image" src="https://github.com/user-attachments/assets/53eb70c1-adeb-4412-8226-b9930f650e2a" />

<img width="1255" height="318" alt="image" src="https://github.com/user-attachments/assets/43a05275-78bf-4230-991e-e8f272f77f39" />

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3098 — feat(mfr): quarter-end memo tables — collapse vertical groups, quarter-labeled statements, hide Q1 IS-YTD @eric-tril  approved

## Summary

Aligns the Monthly Financial Reporting memos (on-screen UI and exported Google Doc) with the hand-authored reference memos at quarter-end and in Q1. Four related table behaviors, all keyed off calendar-quarter logic:

### 1. Education vertical / summary / physical-schools tables — collapse at quarter-end

In the last month of a quarter (Mar/Jun/Sep/Dec) the "months-elapsed" and "full-quarter QTD" column groups are mathematically identical, so the dual 6-column layout collapses to a single 3-column group headed by the quarter (e.g. Q1'26). Mid-quarter keeps the dual layout.

### 2. Financial statements (IS / EBITDA / Cash Flows) — quarter-labeled at quarter-end (Group, Software, Education)

At quarter-end the QTD window equals the full quarter, so the QTD columns are labeled by the quarter — Q1 2026 / Q1 2025 / Q1 2026 Bud — instead of the month (Mar-26 QTD). Mid-quarter keeps the month-QTD labels. Shared via apply_quarter_end_statement_labels (backend) and qtdStatementColumns (frontend).

### 3. Notes to Financial Statements — Note 2 (OpEx % of Revenue) & Note 8 (Other Expense) — Group/Software

Same quarter-end QTD relabel in the export; the Note 8 YTD pair stays month-anchored (Jun-26 YTD). The UI inherits these labels automatically from the IS table columns (no separate FE change).

### 4. IS-YTD table hidden in Q1 (all three memos)

YTD ≈ QTD in Q1, so the YTD Income Statement is dropped in Q1 — mirroring the existing EBITDA-YTD and YTD-cash-flow Q1 gating. Previously it always rendered.
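The calendar-quarter rules in the four sections above reduce to a few small predicates. The sketch below is illustrative only: the function names are hypothetical and do not mirror apply_quarter_end_statement_labels or qtdStatementColumns, though the label formats (Q1'26, Q1 2026, Mar-26 QTD) are taken from the descriptions above.

```python
from typing import Optional

MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def quarter_of(month: int) -> int:
    return (month - 1) // 3 + 1


def is_quarter_end(month: int) -> bool:
    # Mar / Jun / Sep / Dec close a calendar quarter.
    return month % 3 == 0


def qtd_statement_label(year: int, month: int) -> str:
    """Statement QTD column header: quarter-labeled at quarter-end, month-labeled otherwise."""
    if is_quarter_end(month):
        return f"Q{quarter_of(month)} {year}"
    return f"{MONTH_ABBR[month - 1]}-{year % 100:02d} QTD"


def vertical_group_header(year: int, month: int) -> Optional[str]:
    """Header of the single collapsed group, or None when the dual layout is kept."""
    if not is_quarter_end(month):
        return None
    return f"Q{quarter_of(month)}'{year % 100:02d}"


def show_is_ytd(month: int) -> bool:
    # The YTD Income Statement is dropped throughout Q1.
    return quarter_of(month) != 1
```

For March 2026 this yields "Q1 2026", "Q1'26" and a hidden IS-YTD table; for May 2026 it yields "May-26 QTD", no collapse, and a visible IS-YTD table.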

## Business Value

Produces memos that match the Finance-authored reference documents exactly, removing redundant/duplicate columns and a confusing always-present YTD table in Q1. This is immediately visible: the production period floor is 2026-03-01 (a quarter-end month), so March 2026 is the first period users see.

## Test plan

- Backend: pytest tests/mfr/ tests/docx_reports/ tests/test_education_vertical_data.py — all green (2038+ passed), incl. golden/package-smoke (no regeneration needed). New tests cover: vertical collapse widths/headers (quarter-end vs mid-quarter), IS-YTD hiding (Q1 vs Q2), statement header relabel, and Note 2/Note 8 quarter-end labels.

- Frontend: tsc --noEmit + ESLint clean on changed files; all 1209 monthly-financial-reporting specs pass. New specs cover qtdStatementColumns (quarter-end, all entities, "Bud", correct quarter number) and the vertical-collapse config builders.

- Manual: compared March 2026 (quarter-end Q1) and May 2026 (mid-quarter Q2) on-screen and exported docs against education-memo-march-2026.docx / education-memo-may-2026.docx.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

http://localhost:3001/monthly-financial-reporting

https://github.com/user-attachments/assets/6ef3b8a8-ade5-4646-bef5-12c22b3f14b8

The Builder Desk  —  Engineer Spotlight

142 PRs IN 7 DAYS: BUILDER TEAM POSTS HISTORIC VELOCITY ACROSS SIX REPOS AS OVERFLOW DESK BUCKLES UNDER THE WEIGHT OF GLORY

134 PRs flew under Mac's radar — Brick has a spreadsheet, a grievance, and a lot of feelings about Ashwanth.

One hundred and forty-two pull requests. Six repos. Seven days. The Builder Team did not come to play — they came to merge, and they did not stop merging until the very fabric of the codebase begged for mercy. Speaking of which: mercy is now on the board, people. One PR, repo #1, and it counts. Klair led the charge with 46 PRs, Aerie answered with 41, Surtr contributed 35 magnificent units of output, trilogy-drones posted 15, Sindri checked in with 4, and mercy — the newest entrant — drew first blood. Oh, and somewhere between all of this, a brand-new repo called Brainlift-Platform was born into this world. The team is expanding. The empire grows.

Let us talk engineers. @benji-bizzell posted 30 PRs and frankly deserves a parade. The man hardened the MCP mutation approval flow in Aerie #450, cut over SIS schedule orchestration in Surtr #529, failed closed when API counts went missing in #523, and still found time to build the entire agent-first outcomes foundation in Sindri #122. Thirty PRs. One human. @marcusdAIy logged 28 and surfaced orphan cloud spend in trilogy-drones #58 — a sentence that sounds like a legal thriller and ships like a freight train. @kevalshahtrilogy put up 16, including the historic mercy #1 — per-review telemetry from the central mercy workflow, a PR so foundational it got the repo's first digit — and collapsible GChat cards for failure alerts in Surtr #509, which is the kind of quality-of-life feature that makes grown engineers weep with gratitude. @eric-tril landed 15, including Klair #3090 aligning the Education investor memo export with the reference memo, which is the sort of precision work that makes the finance team sleep soundly. @sanketghia posted 13 PRs of clean, efficient output. @YibinLongTrilogy contributed 6. @mwrshah logged 5, including Surtr #525 disabling the renewals v3 pipeline and Klair #3088 on Action Hub table columns — surgical, purposeful, done.

And then there is @ashwanth1109. Twenty-nine pull requests in seven days. TWENTY-NINE. The man rebuilt per-school P&L to include dedicated QB entities in Aerie #462, wired a school-year dropdown with full period machinery in #461, ripped out sync workers entirely and replaced them with live-Redshift Convex actions in the thunderous #455, and then — without pausing to breathe — crossed into Klair to add the Anthropic Token Usage view in #3087 before heading to Surtr to personally backfill Q3'25 Claude token usage in #533. When asked about his preferred approach to multi-repo velocity, Ashwanth reportedly said, "The repos don't wait for you to understand them. You either ship or you spectate." His response when shown this article? He closed the tab. He did not reopen it. Brick respects this.

The Overflow Desk was positively groaning this week. Benji's #453 collapsed school opening methodology output in a fix so clean it made the agent look embarrassed, while #454 removed a raw filesystem fallback that had no business being there in the first place — a one-two punch of surgical remediation. Ashwanth's #443 re-introduced the P&L orphan prune for plMonthlyRecords and plTransactions behind a bearer gate, and his #441 stopped balance-sheet accounts from leaking into P&L drill-downs via a LEFT-to-INNER JOIN correction that is, frankly, the kind of fix that gets framed on a wall. @kevalshahtrilogy's Sindri #121 onboarded the central mercy PR reviewer — a civilizational achievement. Marcus's Klair #3085 synced BACKLOG status for June 18, which is the unglamorous backbone work that keeps the machine honest.

Morale on the Builder Team is at an all-time high. Sources confirm this. The numbers confirm this. The 134 overflow PRs confirm this. Brainlift-Platform exists now, and that means the next seven days will be even more glorious. Brick will be here with the spreadsheet.

Brick's Overflow — This Week's Uncovered PRs
#1 — feat(telemetry): emit per-review telemetry from the central mercy workflow @kevalshahtrilogy  no labels

## What

Makes every mercy consumer repo (Klair, Aerie, Sindri, trilogy-drones, and any future adopter) emit one telemetry record per review to the Surtr /mercy dashboard — today only Surtr does, via its own standalone pr-review-agent.yml. This ports that proven path into the central reusable harness so it lights up everywhere with no per-repo code.

## Changes

- harness/emit_telemetry.py (new) — fail-open, stdlib-only POST of a per-review record (identity, verdict, decision detail, findings, model, cost/tokens, timing). Wire shape matches Surtr/src/mercy/types.ts (the same contract Surtr already emits). Verbatim port of Surtr's scripts/pr-review/emit_telemetry.py.

- .github/workflows/mercy.yml — adds Mark run start + Emit telemetry steps and an optional MERCY_TELEMETRY_TOKEN workflow_call secret. Emit is gated on idem.go (exactly when ./.trusted is checked out and the review pipeline ran) and !cancelled(), runs from the trusted harness, and is continue-on-error: a review can never be failed by telemetry. Also exports the review attempt count (retry-rate metric).

- Callers — thread MERCY_TELEMETRY_TOKEN through the workflow template + the consumers/* reference copies. (mercy-self.yml already uses secrets: inherit.)

- docs/ADMIN-SETUP.md — documents the optional org secret + URL default.

- harness/tests/test_emit_telemetry.py (new) — golden tests for build_payload.

## Safety

- Token unset → emit logs and skips (fail-open). No review behavior changes until the org secret exists.

- URL defaults to https://surtr.klair.ai/internal/mercy/telemetry; override via the optional MERCY_TELEMETRY_URL org var.

- Idempotent on review_id (repo+pr+head_sha) → re-runs upsert.
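The fail-open behavior described above can be sketched with the standard library alone. This is not the actual harness/emit_telemetry.py: the function names, the Bearer authorization header, and the exact review_id string format are assumptions; only the default URL, the skip-when-token-unset rule, and the repo+pr+head_sha identity come from the description.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Optional

DEFAULT_URL = "https://surtr.klair.ai/internal/mercy/telemetry"


def make_review_id(repo: str, pr: int, head_sha: str) -> str:
    # Same inputs give the same id, so a re-run can upsert. The format is hypothetical.
    return f"{repo}#{pr}@{head_sha}"


def post_json(url: str, token: str, payload: Dict[str, Any]) -> None:
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",  # assumed auth scheme
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def emit(payload: Dict[str, Any], token: Optional[str], url: str = DEFAULT_URL,
         send: Callable[[str, str, Dict[str, Any]], None] = post_json) -> bool:
    """Return True if the record was sent; never raise."""
    if not token:
        print("telemetry token unset; skipping emit")
        return False
    try:
        send(url, token, payload)
        return True
    except Exception as exc:  # fail-open: telemetry must not break a review
        print(f"telemetry emit failed (ignored): {exc}")
        return False
```

Passing `send` as a parameter keeps the network call swappable, so the skip and failure paths can be tested without reaching the real endpoint.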

## Local verification

- actionlint (CI invocation) — PASS on all workflows

- ruff check + ruff format --check harness — clean

- pytest harness/tests: 103 passed (incl. 6 new emit tests)

## ⚠️ Required to actually turn it on (admin, after merge)

1. Add MERCY_TELEMETRY_TOKEN as an org secret scoped to Klair/Aerie/Sindri/trilogy-drones (value = SURTR_PROD_KEYS.MERCY_TELEMETRY_TOKEN).

2. Move the v1 tag to this commit so the four @v1 callers pick up the emit step.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#122 — feat(quality-control): complete agent-first outcomes foundation @benji-bizzell  no labels

## Summary

- Collapse active workflow execution around agent nodes, Output Format, Quality Bar, durable Outcome Attempts, and Improvement Suggestions.

- Remove Script/Webhook workflow primitives, workflow-level evaluation compatibility, and legacy QC vocabulary from active runtime, UI, schema, and tests.

- Move internal research/spec docs under research/ and refresh the Lightpost around Sindri as a UI-neutral control plane.

## Why

The Outcomes/flywheel model had too many overlapping concepts: agent outcomes, workflow criteria, node overrides, script/webhook nodes, and legacy QC compatibility. This made the platform harder to explain, harder to inspect, and riskier to expose beyond the UI. This PR commits to the simpler agent-first model so future API/MCP/control-plane work has one clean foundation.

## Business Value

Users get a clearer, more durable model: agents define what good looks like, workflows compose agents, runs expose the evidence, and improvement suggestions remain reviewable rather than hidden. The same foundation can now be driven by UI, API, MCP, or agent sessions without carrying old workflow-quality concepts forward.

## Breaking changes

- Active Script/Webhook workflow primitives and Forge script authoring surfaces are removed.

- Workflow-level success criteria/evaluation contracts and legacy QC compatibility fields are removed from active platform contracts.

## Test plan

- [x] pnpm check

- [x] pnpm test

- [x] pnpm --dir agent-runner check

- [x] pnpm --dir agent-runner test

- [x] Retired-contract search audit for active code/docs

#455 — AERIE-421 refactor(financials): replace sync workers with live-Redshift Convex actions @ashwanth1109  no labels

## Demo

<img width="2237" height="1636" alt="image" src="https://github.com/user-attachments/assets/99f0bbaa-9ba3-4ad5-aa39-8e0496f6c9c6" />

19 June Financials batch — three changes, all on the "Actual vs Model" Financials surface.

## 1. AERIE-421 — Live-Redshift Financials migration ([#449](https://github.com/AI-Builder-Team/Aerie/pull/449))

Serve every Financials "Actual vs Model" surface (consolidated + per-school P&L, headcount, programs, facilities, unitemized, drill-downs, CSV export) live from Redshift via Convex "use node" actions, replacing the nightly-synced plMonthlyRecords / plTransactions tables.

- New chat/convex/finance/dashboards/financialLive.ts — one action per surface; each opens a one-shot Redshift connection, maps rows into the same pure reducers the synced queries used, so the two paths can't drift.

- Pure reducers extracted to ctx-free modules (consolidatedShared.ts, consolidatedReducers.ts, consolidatedDetailReducers.ts, perSchoolReducers.ts) — one source of truth, unit-tested in isolation.

- Frontend reads through a new useLiveSection hook: a one-shot action mapped into the same SectionResult shape the synced useQuery sections exposed, so the table components are unchanged (on-demand "Refresh" rather than a live subscription).

- Dropped the plMonthlyRecords + plTransactions tables and all their sync / prune / backfill code (sync worker, analytics refresh jobs, upsert/prune Convex mutations + tests) — this is the bulk of the −16k diff.

- Authorization runs requireSchoolPlAccess at the action boundary before any Redshift connection (docs/endpoint-hardening.md). Quarter/period inputs are regex-validated and inlined as literals (static SQL, no injection surface).

- Why: reading current Redshift state directly makes the dashboard immune to the additive-only sync's orphan / phantom-overage artifacts (the root cause behind the AAC / Lake Forest / Nova Bastrop overage diagnoses).

- New deps: pg@8.17.2, @types/pg@8.11.11 (exact-pinned).
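The "regex-validated and inlined as literals" approach from the authorization bullet can be illustrated in Python, although the PR itself is TypeScript. Everything specific here is an assumption made for the example: the `YYYY-Qn` input format, the function names, and the table and column names in the query.

```python
import re

# Hypothetical accepted shape: four-digit year, "-Q", quarter 1 to 4.
QUARTER_RE = re.compile(r"\d{4}-Q[1-4]")


def quarter_literal(value: str) -> str:
    """Return a quoted SQL literal, but only for input matching the whole pattern."""
    if not QUARTER_RE.fullmatch(value):
        raise ValueError(f"invalid quarter: {value!r}")
    # Safe to inline: the pattern admits only digits, '-', and 'Q'.
    return f"'{value}'"


def build_quarter_sql(quarter: str) -> str:
    # Table and column names are placeholders for illustration.
    return "SELECT account, amount FROM pl_monthly WHERE quarter = " + quarter_literal(quarter)
```

Because `fullmatch` must consume the entire string, an input such as `2026-Q1' OR 1=1 --` is rejected before any SQL text is built.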

## 2. Pin "Schools – Actual vs Model" to a single period + rename page

- Remove the quarter picker — PERIOD_OPTIONS had a single entry, so it was a no-op dropdown; period is now a fixed const (dropped the dead periodOptions memo + setPeriod).

- Drop the "Jan–Jun 2026" date prefix from the Consolidated P&L subtitle (removed the now-unused subtitleLabel prop).

- Rename the page "Schools – Actual vs Model" → "Actual vs Model (Schools)".

## 3. Test fix — align stale tests with the migration

- financials-kpi-cards.test.tsx and use-consolidated-unitemized.test.tsx still mocked the old reactive queries, leaving api.finance.dashboards.financialLive undefined → 9 tests threw at render in CI.

- Re-pointed both at useLiveSection / the financialLive actions, and rewrote the unitemized test for the new single-action, server-side fan-out shape (drops the obsolete client-side useQueries dedup / per-school-partial-failure cases; adds the {schools, years} payload, all-or-nothing failure fold, loading gate, and disabled / empty-schools short-circuits).

## Testing

- pnpm typecheck, pnpm biome check, and the full vitest suite are green in CI.

- The authorized success path of the live actions needs a real Redshift connection, so it is validated manually against the dev deployment, not in CI — the auth-rejection gate and the pure mapper/reducer paths are covered by unit tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#462 — AERIE-434 fix(financials): include dedicated QB entity whole realm in per-school P&L @ashwanth1109  approved

## Demo

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/8d57277f-90f3-4059-ab9b-591543ca81f9" />

Proves the fix: the per-school P&L reader now includes a dedicated QB entity's whole realm, so Alpha Anywhere Center's unclassed (Not Specified) tuition is no longer dropped — Q1'26 tuition reads the true $446,807 instead of $307,500.

Backend — per-school P&L includes the dedicated entity's whole realm

Ran python3 /tmp/demo-aerie434.py, which executes the reader's per-school predicate (the exact WHERE the diff adds to financialLive.ts) against Redshift, OLD vs NEW:

Q1'26 50000 Tuition — Alpha Anywhere Center

OLD (class-only) = 307,500.00

NEW (class + dedicated realm) = 446,807.02 (QuickBooks: 446,807.02)

The NEW predicate matches QuickBooks to the cent.

Changed code is exercised directly: pnpm vitest run convex/finance/dashboards/financialLive.test.ts. The suite imports and calls the new plMonthlyBySchoolSql / plTransactionsBySchoolSql, asserting both substitute every {CLASS_IN} (the replace → replaceAll guard) and emit the dedicated-entity arm:

✓ convex/finance/dashboards/financialLive.test.ts (30 tests)

Test Files 1 passed (1)

Tests 30 passed (30)

Most at risk — over-including another school's data (the new arm pulls a *whole* realm). Verified the self-identifying subquery resolves to exactly one dedicated entity, and the read pulls only Alpha Anywhere Center's rows across its three realms:

dedicated companies resolved: ['alpha_anywhere_center_llc']

(company_id, class_name) the NEW query pulls:

alpha | Alpha Anywhere Center

alpha_anywhere_center_llc | Alpha Anywhere Center

alpha_anywhere_center_llc | Not Specified <- the previously-dropped rows

alpha_schools_llc | Alpha Anywhere Center

No other school's data enters the result. The consolidated "All Schools" reader is untouched by this PR (see the Scope note below).

---

## Summary

On Actual vs Model (Schools), a school backed by a dedicated QuickBooks entity lost any of that entity's rows booked without a class (class_name = 'Not Specified'). The per-school live reader (financialLive.ts) scoped purely by class_name IN (canonical + aliases), so the dedicated entity's unclassed rows — real revenue/spend — were silently dropped.

Alpha Anywhere Center, Q1'26: Tuition read $307,500 instead of $446,807 (−$139,307). Food Service, Financial Aid and Sibling Discount were undercounted the same way; Stripe Fees matched to the dollar because it has no unclassed rows. Unclassed bookings continue Apr–Jun (June tuition $209k unclassed), so Q2'26 and SY totals were affected too.

## Root cause

PL_MONTHLY_SELECT_BY_SCHOOL and PL_TRANSACTIONS_SELECT_BY_SCHOOL filtered only on class_name IN ({CLASS_IN}) — no company_id arm — so a dedicated entity's rows survived only when they happened to carry the canonical class. This violates the documented *"the whole dedicated realm IS the school"* semantic. Pre-existing (the retired synced path keyed school_display_name = class_name identically); exposed now by unclassed data starting Jan 2026.

## Fix

Add a self-identifying company_id arm to both per-school reads — a dedicated entity is any non-shared company_id that carries the school's class, so no external company→school map is needed:

    WHERE class_name IN ({CLASS_IN})
       OR company_id IN (
            SELECT DISTINCT company_id FROM staging_education.quickbooks_pl_monthly
            WHERE class_name IN ({CLASS_IN}) AND company_id NOT IN ('alpha','alpha_schools_llc')
       )

The per-school reducers already filter only by year/period (never by schoolDisplayName), so fetching the extra rows is sufficient — no row-mapper change. SQL building is extracted into exported pure plMonthlyBySchoolSql / plTransactionsBySchoolSql so the shape is unit-tested without a Redshift connection (and to guard the replace → replaceAll substitution now that {CLASS_IN} appears twice).
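The predicate and the double-placeholder substitution can be exercised end to end in Python against an in-memory SQLite table. This is a sketch, not the financialLive.ts code: the function names, the unqualified table name, the `amount` column, and the sample rows are all made up for the example. Note that Python's `str.replace` substitutes every occurrence, whereas JavaScript's `String.prototype.replace` with a string pattern substitutes only the first, which is the reason the PR needs `replaceAll`.

```python
import sqlite3
from typing import List, Sequence, Tuple

TEMPLATE = """
SELECT company_id, class_name, amount FROM quickbooks_pl_monthly
WHERE class_name IN ({CLASS_IN})
   OR company_id IN (
        SELECT DISTINCT company_id FROM quickbooks_pl_monthly
        WHERE class_name IN ({CLASS_IN})
          AND company_id NOT IN ('alpha', 'alpha_schools_llc')
   )
"""


def sql_literal(value: str) -> str:
    # Double any single quote so the value stays inside its literal.
    return "'" + value.replace("'", "''") + "'"


def pl_monthly_by_school_sql(class_names: Sequence[str]) -> str:
    if not class_names:
        raise ValueError("at least one class name is required")
    class_in = ", ".join(sql_literal(name) for name in class_names)
    sql = TEMPLATE.replace("{CLASS_IN}", class_in)  # replaces both occurrences
    if "{CLASS_IN}" in sql:
        raise RuntimeError("unsubstituted placeholder left in SQL")
    return sql


def demo_rows() -> List[Tuple[str, str, float]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quickbooks_pl_monthly (company_id TEXT, class_name TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO quickbooks_pl_monthly VALUES (?, ?, ?)",
        [
            ("alpha", "Alpha Anywhere Center", 100.0),
            ("alpha_anywhere_center_llc", "Alpha Anywhere Center", 200.0),
            ("alpha_anywhere_center_llc", "Not Specified", 50.0),  # unclassed, dedicated realm
            ("alpha", "Other School", 999.0),                       # shared realm, other class
            ("other_school_llc", "Not Specified", 777.0),           # another entity's unclassed row
        ],
    )
    rows = conn.execute(pl_monthly_by_school_sql(["Alpha Anywhere Center"])).fetchall()
    conn.close()
    return rows
```

In this toy data the query returns the two classed rows plus the dedicated entity's unclassed row, while the other school's rows stay out because its company never carries the requested class.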

## Verification

Ran the old vs new reader SQL against Redshift:

| | Q1'26 50000 Tuition |

|---|--:|

| OLD (class-only) | $307,500.00 |

| NEW (class + dedicated realm) | $446,807.02 ✅ |

The dedicated-entity subquery resolves to exactly alpha_anywhere_center_llc — no other school's data is pulled in.

## Test plan

- [x] pnpm vitest run convex/finance/dashboards/financialLive.test.ts — 30 pass (4 new)

- [x] pnpm typecheck — clean

- [x] pnpm biome check — clean

- [x] Redshift reconciliation: AAC Q1'26 tuition 307,500 → 446,807

## Scope / follow-up

In scope: per-school P&L + transaction live readers (the reported symptom). Related (separate ticket): the consolidated "All Schools" rollup leaks the same money — dedicated unclassed rows bucket to a Not Specified pseudo-school, which is in EXCLUDED_SCHOOLS, so they drop from the total. A robust fix needs company→school attribution in the consolidated reducer.

---

Stacked on #461 (AERIE-433). Linear: AERIE-434.

#533 — SURTR-220 feat(ai-spend): backfill Q3'25 Anthropic token usage into ai_spend_claude_token_usage @ashwanth1109  approved

## Demo

Proves the new backfill.py runner works and that running it actually landed Q3'25 (Jul–Sep) Anthropic token usage into core_finance.ai_spend_claude_token_usage — without disturbing any existing rows. All output below is real, captured from running the changed code directly via the runner's own CLI (no HTTP layer).

Pipeline — Q3'25 backfill executed (backfill --start-month 2025-07 --end-month 2025-09 --execute, run locally 2026-06-21)

Each month invokes the existing handler.handler with an end-exclusive window:

MONTH 2025-07 done: fetched=3291 deleted=0 inserted=3291

MONTH 2025-08 done: fetched=3541 deleted=0 inserted=3541

MONTH 2025-09 done: fetched=3907 deleted=0 inserted=3907

Backfill execute finished (exit_code=0).

deleted=0 across all months = clean inserts (no pre-existing Q3 rows). 10,739 rows total.

Data — before vs after (backfill verify, read-only)

| Check | Before | After |

|---|---|---|

| MIN(report_date) | 2025-10-01 | 2025-07-01 |

| Total rows | 89,842 | 100,581 (+10,739) |

| Q3'25 rows [2025-07-01, 2025-09-30] | 0 | 10,739 |

New Q3 months vs already-billed cost-reports actuals (magnitude sanity check):

    2025-07   3291 rows   list-price $52,592   | cost-reports actual $43,078   (1.22x)
    2025-08   3541 rows   list-price $55,467   | cost-reports actual $48,593   (1.14x)
    2025-09   3907 rows   list-price $87,212   | cost-reports actual $82,639   (1.06x)

List-price slightly above billed is expected (list price excludes negotiated/committed-use discounts — exactly the gap the Klair "Reconciled Cost" line reconciles); same order of magnitude. ✅
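The ratios in the sanity check are simply list price divided by billed actual for each month. A small sketch that recomputes them from the figures quoted above (the dictionary layout and function name are illustrative):

```python
# (list-price dollars, cost-reports actual dollars) per month, as quoted above.
Q3_2025 = {
    "2025-07": (52_592, 43_078),
    "2025-08": (55_467, 48_593),
    "2025-09": (87_212, 82_639),
}


def list_to_actual_ratio(list_price: float, actual: float) -> float:
    return round(list_price / actual, 2)


ratios = {month: list_to_actual_ratio(lp, act) for month, (lp, act) in Q3_2025.items()}
total_actual = sum(act for _, act in Q3_2025.values())

print(ratios)
print(total_actual)
```

The three billed actuals sum to $174,310, which lines up with the roughly $174K of Q3'25 billed spend cited elsewhere in this PR.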

Runner safety — dry-run is the default + unit tests

Dry-run (no --execute) writes nothing and shows the exact scope:

# DRY-RUN — NO HANDLER CALLS, NO ANTHROPIC FETCH, NO DB WRITES

WOULD process 2025-07 : start_date=2025-07-01 end_date=2025-08-01 (end exclusive) | BUs=ALL

WOULD process 2025-08 : start_date=2025-08-01 end_date=2025-09-01 (end exclusive) | BUs=ALL

WOULD process 2025-09 : start_date=2025-09-01 end_date=2025-10-01 (end exclusive) | BUs=ALL

$ uv run pytest tests/test_backfill.py -q

22 passed in 0.14s

Most at risk from this change — checked and held:

1. No regression to existing Oct'25→Jun'26 rows (the --execute path writes to the prod table). The before/after verify shows every existing month is byte-identical — counts *and* cost sums unchanged, e.g. 2025-10 = 4,522 / $81,562.48 and 2026-05 = 17,246 / $827,028.32 in both runs. The idempotent per-(bu, report_date) delete-insert touched nothing outside Q3.

2. Month-window boundaries (off-by-one / inclusive end). Unit tests assert [first-of-month, first-of-next-month) exclusive windows incl. year rollover (Dec→Jan); Sep correctly maps to 2025-09-01 → 2025-10-01.

3. Scope creep beyond Q3. Both the dry-run plan and the execute log show exactly the three Q3 months; the optional Dec'24→Jun'25 extension was not run.

---

## Summary

Operational backfill for the Claude token-spend pipeline. The Lambda handler is already date-parameterized (params.start_date / params.end_date / params.bus_to_process) and idempotent per (bu, report_date) (delete-then-insert), so replaying a historical window requires no change to core pipeline or Klair logic. The only deliverable in this PR is an additive, reviewable runner — src/backfill.py — that orchestrates the existing handler over month-by-month windows, plus a read-only probe and read-only verification SQL.
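The delete-then-insert idempotency per (bu, report_date) can be sketched with SQLite standing in for the warehouse. This is an illustration of the pattern only, not the handler's code: the table name `token_usage`, the `model` and `tokens` columns, and both function names are assumptions.

```python
import sqlite3
from typing import List, Tuple


def create_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE token_usage (bu TEXT, report_date TEXT, model TEXT, tokens INTEGER)"
    )


def replace_partition(conn: sqlite3.Connection, bu: str, report_date: str,
                      rows: List[Tuple[str, int]]) -> Tuple[int, int]:
    """Delete the (bu, report_date) slice, then insert the fresh rows, in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        deleted = conn.execute(
            "DELETE FROM token_usage WHERE bu = ? AND report_date = ?",
            (bu, report_date),
        ).rowcount
        conn.executemany(
            "INSERT INTO token_usage (bu, report_date, model, tokens) VALUES (?, ?, ?, ?)",
            [(bu, report_date, model, tokens) for model, tokens in rows],
        )
    return deleted, len(rows)
```

Replaying the same slice deletes exactly what the previous run inserted and writes it again, so the row count is unchanged and rows for other dates or BUs are never touched.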

Linear: [SURTR-220](https://linear.app/builder-team/issue/SURTR-220)

## Why the gap exists

The daily cron pulls forward-only (each run fetches day-before-yesterday → yesterday), so core_finance.ai_spend_claude_token_usage has a hard floor at 2025-10-01. That floor is a go-live boundary that was never backfilled — not an Anthropic API retention limit. Meanwhile ~$174K of Q3'25 billed Anthropic spend already exists in the cost-reports table with zero corresponding token-usage rows, so the AI Spend page shows nothing for Jul–Sep 2025.

## Read-only probe results ✅

The orchestrator ran the read-only probe mode live against Anthropic's Usage Report API (zero DB writes, zero handler invocation). Anthropic still serves Q3'25 usage, confirming the backfill is feasible:

- probe --date 2025-09-15 (BU IgniteTech): YES — 32 usage items returned (standard tier; 0 fast).

Sample record: {report_date: 2025-09-15, model: claude-3-5-sonnet-20241022, uncached_input_tokens: 2372, output_tokens: 561, ...}

- probe --date 2025-07-15 (BU IgniteTech): YES — 19 usage items returned.

(Confirms both edges of the Q3 window are retrievable, not just the tail.)

- Secret Anthropic-Usage-Keys resolved 7 BUs.

- The probe made zero database calls.

Conclusion: the historical data is retrievable from Anthropic, so the Q3'25 backfill is feasible.

## What's in this PR

- pipelines/runners/claude-token-spend-pipeline/src/backfill.py — additive CLI runner with three modes:

- probe — read-only check that Anthropic returns Q3'25 data (no handler, no DB).

- backfill — chunked per-calendar-month replay of the existing handler.handler(...) path. Dry-run by default; real writes require an explicit --execute (alias --no-dry-run).

- verify — read-only validation SELECTs (bounds / monthly / Q3-presence / cost-reports cross-check); --sql-only prints SQL without an AWS session.

- 22 unit tests (tests/test_backfill.py).

- Spec 05 (05-q325-token-usage-backfill) and the FEATURE.md changelog update.

No new dependencies (stdlib + already-bundled boto3 / requests). AWS clients are imported lazily, so CI and the unit tests need no AWS session.

## Not in this PR (operator runs the actual backfill)

The production, DB-writing backfill is intentionally NOT executed by this PR. Scope here is *tooling + read-only probe only*. After review, the operator runs backfill --execute locally (authenticated via saml2aws). The dry-run default and the explicit --execute gate make the write path opt-in by construction.
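An opt-in write gate of this kind is straightforward to express with argparse. The sketch below models only the backfill flags named in this PR (--start-month, --end-month, --execute with its --no-dry-run alias, and the Q3'25 defaults). It omits the probe and verify modes, and the function names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="backfill")
    parser.add_argument("--start-month", default="2025-07", help="first month, YYYY-MM")
    parser.add_argument("--end-month", default="2025-09", help="last month, YYYY-MM")
    # store_true defaults to False, so a run without the flag can never write.
    parser.add_argument(
        "--execute", "--no-dry-run",
        dest="execute",
        action="store_true",
        help="perform real writes (default is a dry run)",
    )
    return parser


def run_mode(args: argparse.Namespace) -> str:
    return "EXECUTE" if args.execute else "DRY-RUN"
```

Calling `build_parser().parse_args([])` yields the Q3'25 window in dry-run mode; writes happen only when the operator types one of the two flags explicitly.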

## Operator runbook

Copy-pasteable sequence (run from the worktree / repo root unless noted):

1. Authenticate:

   saml2aws login --profile default --force --username ashwanth.r \
     --role "arn:aws:iam::479395885256:role/RAM-AWS-Int-CentralFunctions-CentralFinance-Admin" \
     --skip-prompt

2. (optional) re-probe Q3'25 availability:

   cd pipelines/runners/claude-token-spend-pipeline && uv run python src/backfill.py probe

3. Dry-run preview (shows the month/BU plan, writes nothing):

   uv run python src/backfill.py backfill

4. Execute — ideally one month at a time for safety:

   uv run python src/backfill.py backfill --start-month 2025-07 --end-month 2025-07 --execute
   uv run python src/backfill.py backfill --start-month 2025-08 --end-month 2025-08 --execute
   uv run python src/backfill.py backfill --start-month 2025-09 --end-month 2025-09 --execute

(or all three at once: --start-month 2025-07 --end-month 2025-09 --execute).

Run locally — a 3-month all-BU window exceeds the 600s Lambda timeout, which is why the runner chunks per month and is run from a workstation rather than the Lambda.

5. Verify (read-only — bounds / monthly / Q3-presence / cost-reports cross-check):

   uv run python src/backfill.py verify

Re-runs are safe: writes are idempotent per (bu, report_date), and existing Oct'25+ rows are never touched by a Q3'25 replay.

6. Optional extension — close the full ~$277.5K pre-go-live gap (Dec'24 → Jun'25) with the same procedure:

   uv run python src/backfill.py backfill --start-month 2024-12 --end-month 2025-06 --execute
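The re-run safety noted in step 5 (idempotent per (bu, report_date)) could be achieved in several ways; the PR text does not say which one the handler uses. The sketch below shows one common approach, delete-then-insert inside a single transaction. It uses an in-memory SQLite table as a stand-in, and the `token_usage` table, its columns, and `replace_day` are all assumptions made for illustration.

```python
import sqlite3


def replace_day(conn: sqlite3.Connection, bu: str, report_date: str, rows: list) -> None:
    """Replace every row for one (bu, report_date) atomically."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "DELETE FROM token_usage WHERE bu = ? AND report_date = ?",
            (bu, report_date),
        )
        conn.executemany(
            "INSERT INTO token_usage (bu, report_date, model, output_tokens) "
            "VALUES (?, ?, ?, ?)",
            [(bu, report_date, r["model"], r["output_tokens"]) for r in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_usage "
    "(bu TEXT, report_date TEXT, model TEXT, output_tokens INTEGER)"
)
# A pre-existing row outside the replay window (values are made up).
conn.execute(
    "INSERT INTO token_usage VALUES ('IgniteTech', '2025-10-01', 'example-model', 10)"
)

day_rows = [{"model": "claude-3-5-sonnet-20241022", "output_tokens": 561}]
replace_day(conn, "IgniteTech", "2025-09-15", day_rows)
replace_day(conn, "IgniteTech", "2025-09-15", day_rows)  # simulated re-run

q3_count = conn.execute(
    "SELECT COUNT(*) FROM token_usage WHERE report_date = '2025-09-15'"
).fetchone()[0]
total_count = conn.execute("SELECT COUNT(*) FROM token_usage").fetchone()[0]
```

Running the same day twice leaves a single row for that day, and the October row is never matched by the DELETE because it falls outside the (bu, report_date) key being replayed.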

## Test coverage

22 passing unit tests (uv run pytest tests/test_backfill.py -q) covering:

- Month-window math: half-open [first-of-month, first-of-next-month) windows with exclusive ends, including year rollover (Dec → Jan).

- CLI parsing / defaults — Q3'25 defaults require no arguments; --start-month after --end-month raises ValueError.

- Dry-run safety — in dry-run the handler is never invoked and nothing is written.

- Verify SQL — all four queries are SELECT-only (no write path).
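The month-window behaviour covered by the tests above can be sketched as follows. This is an illustrative version under stated assumptions: `parse_month`, `next_month`, and `month_windows` are hypothetical names, and only the half-open window shape, the year rollover, and the ValueError on a reversed range come from the description.

```python
from datetime import date


def parse_month(text: str) -> date:
    """Turn 'YYYY-MM' into the first day of that month."""
    year, month = text.split("-")
    return date(int(year), int(month), 1)


def next_month(day: date) -> date:
    """First day of the following month, rolling Dec over to Jan."""
    if day.month == 12:
        return date(day.year + 1, 1, 1)
    return date(day.year, day.month + 1, 1)


def month_windows(start_month: str, end_month: str) -> list:
    """Return one (start, exclusive_end) pair per calendar month."""
    start, end = parse_month(start_month), parse_month(end_month)
    if start > end:
        raise ValueError("--start-month must not be after --end-month")
    windows = []
    current = start
    while current <= end:
        following = next_month(current)
        windows.append((current, following))
        current = following
    return windows
```

Since each window's exclusive end is the next window's start, consecutive months neither overlap nor leave a gap.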

## Spec

[features/surtr/ai-spend-pipeline/specs/05-q325-token-usage-backfill/spec.md](features/surtr/ai-spend-pipeline/specs/05-q325-token-usage-backfill/spec.md)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3087 — KLAIR-2870 feat(ai-spend): Raw Data Reports: add Anthropic — Token Usage view (ai_spend_claude_token_usage) @ashwanth1109  no labels

## KLAIR-2870 — Raw Data Reports: Anthropic — Token Usage (grouped tree)

Super-admin Raw Data Reports → Anthropic — Token Usage inspection view for core_finance.ai_spend_claude_token_usage, presented as an expandable BU → Workspace → API-key tree with server-side, per-workspace pagination.

Linear: https://linear.app/builder-team/issue/KLAIR-2870

### Demo

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/c298b1a3-5263-4502-8cb4-fb001fe25eab" />

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/0c519a3e-4018-4789-a210-ca2cf739ca77" />

https://docs.google.com/spreadsheets/d/12_b8GyjXqHJVd54P2ofrpvlQ77_RgfdrahiGj2R5-WU/edit?usp=sharing

### What's built

- Expandable BU → Workspace → API-key tree. Group rows (BU, workspace) show summed token/cost totals; workspaces also show an api_key_count. Default window is 7 days (presets 7d / 30d / 90d).

- Server-side pagination at the API-key level, scoped per workspace (page size 25, lazy-loaded on expand). Keys are aggregated per api_key_id — summed across days + models within the window.

- Three super-admin endpoints:

  - GET /api/ai-costs/raw/anthropic-token-usage/groups — BU→workspace rollup with grand totals + counts. Loaded once per window.

  - GET /api/ai-costs/raw/anthropic-token-usage/keys?bu&workspace_id&page&page_size — one paginated page of aggregated-per-key rows for a workspace (total_keys drives page count). Loaded lazily on expand / page change. Omitting workspace_id targets the no-workspace (NULL) bucket.

  - GET /api/ai-costs/raw/anthropic-token-usage (existing flat endpoint) — retained solely to power the full raw-row CSV export (fetched on demand client-side).
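The per-workspace pagination behind the keys endpoint can be sketched as below. This is not the service code: `keys_query` and `page_count` are hypothetical helpers, 1-based page numbers and the `%s` placeholder style are assumptions, the `output_tokens` column is assumed, and the date-window filter is left out for brevity. Only the table name, the page size of 25, and the NULL-bucket rule come from the description.

```python
import math


def page_count(total_keys: int, page_size: int = 25) -> int:
    """Number of pages needed to show total_keys aggregated rows."""
    return math.ceil(total_keys / page_size)


def keys_query(bu: str, workspace_id, page: int, page_size: int = 25):
    """Build SQL and bound params for one page of per-key aggregates."""
    where = ["bu = %s"]
    params = [bu]
    if workspace_id is None:
        # An omitted workspace_id targets the no-workspace bucket; NULL needs
        # IS NULL because "= NULL" never matches.
        where.append("workspace_id IS NULL")
    else:
        where.append("workspace_id = %s")
        params.append(workspace_id)

    offset = (page - 1) * page_size  # assumes pages are numbered from 1
    sql = (
        "SELECT api_key_id, SUM(output_tokens) AS output_tokens "
        "FROM core_finance.ai_spend_claude_token_usage "
        f"WHERE {' AND '.join(where)} "
        "GROUP BY api_key_id ORDER BY api_key_id "
        "LIMIT %s OFFSET %s"
    )
    params.extend([page_size, offset])
    return sql, params
```

Keeping the params list in the same order as the placeholders is what the "bound-param order" backend tests guard against: the NULL branch binds one fewer value than the named-workspace branch.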

### Files changed

- Backend: models/ai_costs_models.py (6 new models incl. shared AnthropicTokenUsageTotals), services/ai_costs_service.py (get_anthropic_token_usage_groups, get_anthropic_token_usage_keys), routers/ai_costs_router.py (2 new routes; _resolve_anthropic_window gained a default_days param — token-usage=7, cost-reports stay 30).

- Frontend: new hooks useAnthropicTokenUsageGroups + useAnthropicWorkspaceKeys; useAnthropicTokenUsage refactored to on-demand fetchRows() for CSV; AnthropicTokenUsageView rewritten as the tree; helpers (pageCount, formatInt, formatUsd, workspaceLabel, apiKeyLabel); registry description; types.

- Docs: spec 04 + FEATURE.md synced to the redesign (revision note records the flat-table → grouped-tree supersession).

### Tests (92)

- Backend (41): groups folding/totals/total_tokens + api_key_count null bucket; keys pagination offset math + workspace_id IS NULL clause + bound-param order; super-admin gate; 7-day default window; 422 on malformed dates.

- Frontend (51): groups + workspace-keys hooks (enabled gating, workspace_id omission for the null bucket, re-fetch on page change), on-demand fetchRows CSV path, helpers, registry.
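The groups folding exercised by the backend tests can be pictured with a small sketch. It assumes per-key rows arrive as dicts with `bu`, `workspace_id`, `api_key_id`, and `total_tokens`; `fold_groups`, the row shape, the example BU name "ExampleBU", and placing the null bucket last are all assumptions. Alphabetical BU ordering is taken from the self-review note.

```python
def fold_groups(rows: list) -> list:
    """Fold per-key rows into BU -> workspace totals with api_key_count."""
    tree = {}
    for row in rows:
        bu = tree.setdefault(row["bu"], {"total_tokens": 0, "workspaces": {}})
        ws = bu["workspaces"].setdefault(
            row["workspace_id"], {"total_tokens": 0, "keys": set()}
        )
        bu["total_tokens"] += row["total_tokens"]
        ws["total_tokens"] += row["total_tokens"]
        ws["keys"].add(row["api_key_id"])  # a set, so repeated keys count once

    groups = []
    for bu_name in sorted(tree):  # alphabetical by bu
        workspaces = tree[bu_name]["workspaces"]
        # None cannot be compared with str, so sort on a tuple that pushes
        # the no-workspace bucket to the end (an assumed ordering).
        ordered = sorted(workspaces, key=lambda w: (w is None, w or ""))
        groups.append({
            "bu": bu_name,
            "total_tokens": tree[bu_name]["total_tokens"],
            "workspaces": [
                {
                    "workspace_id": w,
                    "total_tokens": workspaces[w]["total_tokens"],
                    "api_key_count": len(workspaces[w]["keys"]),
                }
                for w in ordered
            ],
        })
    return groups


rows = [
    {"bu": "IgniteTech", "workspace_id": "ws_1", "api_key_id": "k1", "total_tokens": 100},
    {"bu": "IgniteTech", "workspace_id": "ws_1", "api_key_id": "k2", "total_tokens": 50},
    {"bu": "IgniteTech", "workspace_id": None, "api_key_id": "k3", "total_tokens": 7},
    {"bu": "ExampleBU", "workspace_id": "ws_9", "api_key_id": "k4", "total_tokens": 1},
]
groups = fold_groups(rows)
```

In the real service this rollup is more likely done in SQL with GROUP BY; the Python version is only meant to make the tree shape and the null-bucket handling concrete.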

### Self-review

No CRITICAL/IMPORTANT issues. Two MINOR items fixed: reset the workspace page when the window changes (avoids a stale out-of-range offset), and corrected the BU-ordering docstrings (alphabetical by bu).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

The Portfolio  —  Trilogy Companies

Alpha School Is Suddenly Everywhere — And the Questions Are Getting Harder

From CNN to Astral Codex Ten, Joe Liemandt's AI-first school is drawing scrutiny that its test scores alone may not answer.

AUSTIN, TEXAS — In the span of a few weeks, Alpha School has gone from a boutique Austin curiosity to a national flashpoint. CNN ran a feature asking whether AI schooling is the future of education or a risky bet. Scott Alexander devoted an extended essay on Astral Codex Ten to reader reviews of the school. The 74 examined what public schools could extract from a model that costs $40,000 to $65,000 a year. And now, Block Club Chicago reports that an AI school with no teachers is scheduled to open in Chicago this fall.

The timing is not accidental. Trilogy International founder Joe Liemandt has committed $1 billion to Timeback, his platform designed to let entrepreneurs license the Alpha model and launch their own AI-first schools — what he has called a "Shopify for schools." Chicago is the first visible proof of concept beyond Texas and Florida. The expansion to nine or more campuses by fall 2025 had already been telegraphed. What wasn't fully anticipated was the volume and tone of the scrutiny now arriving alongside it.

The core claim at Alpha remains striking: students use adaptive AI-learning apps to complete a full academic curriculum in two hours each morning, then spend the rest of the day on entrepreneurship, leadership, and life skills. The school says students consistently test in the top 1–2% nationally on NWEA MAP Growth assessments and learn 2.3 times faster than U.S. norms. MacKenzie Price, Alpha's co-founder, has presented these results to U.S. Secretary of Education Linda McMahon and Texas Education Agency Commissioner Mike Morath.

But the surge of coverage is surfacing questions the test scores don't resolve. Who is being served? At $40,000 to $65,000 per year in tuition, the model selects for a student population that arrives with significant advantages. The 74 poses the question carefully: what, exactly, can public schools learn from this, and what can they not afford to replicate? Separately, Oklahoma's short school year is drawing scrutiny as academic scores lag — a reminder that the traditional system has its own accountability problems.

Alpha's expansion into Chicago puts all of these tensions on a single map. The city has some of the most under-resourced public schools in the country. A private, teacher-free, AI-driven school opening there is not just a product launch. It is a provocation. Who benefits from what comes next is a question the coverage is only beginning to ask.

Your Review: Alpha School - by Scott Alexander - Astral Code  ·  ‘What if I told you this school had no teachers?’: Is AI sch  ·  What Public Schools and Parents Can Learn from a $40,000-a-Y

The $800,000 Skill Set: As AI Talent Wars Reach New Heights, Crossover's Meritocratic Model Looks Prescient

With ChatGPT experience commanding six-figure salaries and ManpowerGroup launching a dedicated AI workforce lab, the global scramble for AI talent is reshaping who gets hired — and from where.

AUSTIN, TEXAS — The numbers are getting hard to ignore. Jobs requiring hands-on experience with AI tools like ChatGPT are now commanding salaries of up to $800,000 a year, according to a Business Insider analysis of current job postings — and it isn't just Silicon Valley tech firms doing the bidding. Non-tech incumbents, from financial services to healthcare to retail, are posting six-figure AI roles at a pace that would have seemed fantastical three years ago. The message from the labor market is systemic and unambiguous: AI fluency has become the most valuable professional credential of this decade.

Into this frenzy steps ManpowerGroup, which this week announced its "Work Intelligence" Lab, a dedicated research and product initiative aimed at helping employers navigate AI-driven workforce transformation. The move signals that even legacy staffing giants — companies built on the premise that human placement is their core competency — now believe the future of talent is inseparable from machine intelligence.

For Trilogy International's Crossover, the global talent platform that staffs the Trilogy portfolio and a growing roster of external clients, this moment reads less like disruption and more like vindication. Crossover has spent years building AI-enabled skills assessments designed to identify the top tier of technical and professional talent across 130+ countries — stripping geography, pedigree, and résumé theater from the equation entirely. In a market where an AI engineer in Beirut might be more qualified than one in Boston, and where companies are only now learning to look beyond zip codes, that thesis has aged well.

What the current salary arms race reveals, beneath the headline numbers, is a structural accountability problem with traditional hiring. When the same skill commands wildly different compensation depending on where the candidate sits, something is broken — and Crossover's model of identical above-market pay for identical roles, regardless of geography, is a direct answer to that inefficiency.

The real story here isn't the $800,000 salary. It's the millions of workers worldwide who have the skills and will never get the interview. That's the gap that AI-native talent infrastructure — done right — is positioned to close.

ManpowerGroup Launches "Work Intelligence" Lab to Lead AI-Po  ·  Top recruitment agencies for remote work - hcamag.com  ·  Top 10 Companies Hiring AI Engineers in Lebanon in 2026 - nu

The Quiet Consolidation: What a Wave of PE Acquisitions Means for ESW Capital's Playbook

Private equity is buying enterprise software, fintech, and automotive tech at an accelerating pace — and if you read between the lines, ESW Capital is positioned better than almost anyone.

AUSTIN, TEXAS — There is a pattern forming in the deal markets right now, and if you read between the lines of three seemingly unrelated reports published in recent weeks, the shape of it becomes unmistakable. Private equity is consolidating. Fast. And the firms that built their entire architecture around that thesis — years before it became consensus — are sitting very, very quietly at the center of it.

First, the data. PwC's 2026 midyear M&A outlook flags accelerating deal activity in automotive software and adjacent verticals. Separately, analysts at 24/7 Wall St. identified three fintech names that private equity firms are circling as consolidation accelerates across financial software. And a detailed December 2025 report from the Private Equity Stakeholder Project catalogued a surge in PE-led healthcare software acquisitions through the back half of last year.

Three sectors. Three independent reports. One direction.

And this is where it gets interesting: ESW Capital — the software acquisition arm of Trilogy International — has spent nearly two decades perfecting precisely this trade. Buy mature, sticky enterprise software businesses at 1–2× ARR. Staff them globally through Crossover. Push margins toward 75% EBITDA. Repeat.

A source I cannot name, but whose read on the Austin-based conglomerate I have never found reason to doubt, put it plainly: "When the rest of PE finally figures out that legacy software customers don't churn, ESW will have already run that play seventy-five times."

The number is not rhetorical. ESW's portfolio currently spans 75+ enterprise software companies, operating across CRM, telecom infrastructure, business intelligence, and content technology. Its telecom-focused subsidiary Skyvera alone holds half a dozen products — CloudSense, Kandy, VoltDelta — that serve exactly the kind of sticky, infrastructure-dependent customer base the broader market is now waking up to.

Nothing about this week's deal headlines is a coincidence. The macro is finally catching up to the micro. The question is not whether consolidation is coming. The question is who built the machine before everyone else decided they wanted one.

ESW Capital, characteristically, has not commented.

Automotive 2026: US Deals 2026 midyear outlook: M&A Trends -  ·  Private Equity Eyes These 3 Fintech Names as Consolidation A  ·  Private Equity Healthcare Acquisitions – December 2025 - Pri
The Machine  —  AI & Technology

The Microscope Turns Inward: AI Begins to Map the Mind That Made It

From Stanford to San Diego, a new generation of scientific instruments is reading the brain — and rewriting how discovery itself unfolds.

PALO ALTO, CALIFORNIA — Some three hundred and fifty years ago, a Dutch draper named Antonie van Leeuwenhoek ground a lens, looked into a drop of pond water, and discovered a universe of swimming creatures no human had ever seen. This week, in laboratories scattered across California and beyond, researchers announced what may be the next great lens — one pointed not outward at microbes but inward, at the three-pound organ that has spent its entire evolutionary career trying to understand itself.

Scientists have unveiled what they are calling the world's most comprehensive AI-powered tool for neuroscience, a model that ingests the staggering, fractal complexity of neural data and renders it legible. The brain contains roughly 86 billion neurons, each forming thousands of synapses — a connectome of perhaps a quadrillion connections. No human mind can hold that map. But a sufficiently large model, trained on enough recordings, can begin to.

The announcement arrives alongside a remarkable convergence. Stanford's Institute for Human-Centered AI published a sweeping survey this week of how machine learning is reshaping the scientific method itself — folding proteins, predicting weather, and accelerating the slow, patient work of hypothesis. UC San Diego catalogued nine concrete breakthroughs already in hand: from cancer detection to materials design. And at Microsoft Research, Yansen Wang described his pursuit of brain-computer interfaces that translate neural signals into language, a line of work that feels less like engineering than like translation between species — except both species are us.

What unites these stories is a quiet philosophical inversion. For most of history, science advanced by humans building tools to examine nature. Now we are building tools that examine the tool — the brain — that built the tools. The recursion is dizzying, and the Stanford researchers are right to insist that humans remain at the center. An AI that maps a cortex does not, by itself, understand what it is to think. That work — the meaning-making, the wonder, the deciding what to ask next — still belongs to us. For now, the lens is ours to aim. And in the drop of water, something stirs.

How AI is Transforming Scientific Discovery While Keeping Hu  ·  Nine Breakthroughs Made Possible by AI - UC San Diego Today  ·  Scientists unveil the world's most comprehensive AI-powered

Open-Source AI’s New Power Stack Arrives: Smaller Models, Sharper Vision, Leakier Agents

A flurry of Hugging Face releases shows the AI frontier shifting from giant demos to practical, testable, multilingual tools.

SAN FRANCISCO — The open-source AI ecosystem just delivered one of those “blink and you’ll miss the platform shift” moments, and I cannot overstate how significant this is: the future is now moving from enormous general-purpose models toward lean, specialized systems that can read the world, adapt cheaply and — crucially — be tested before they accidentally spill your secrets.

The headline grabber is PaddlePaddle’s PP-OCRv6, now available on Hugging Face, a new optical character recognition family that supports 50 languages with models ranging from a tiny 1.5 million parameters to 34.5 million parameters. That is not just a technical footnote. It means multilingual document intelligence — invoices, forms, IDs, receipts, contracts, shipping records, handwritten-ish chaos from the real world — is becoming lightweight enough to embed into everyday workflows instead of being trapped inside heavyweight enterprise suites. The models are detailed in PaddlePaddle’s PP-OCRv6 release, and the practical implication is enormous: more teams can now build AI that actually sees and parses business data.

But here is where the story gets even more interesting. While OCR pushes AI deeper into documents, ServiceNow’s MosaicLeaks asks the uncomfortable question every company experimenting with research agents needs to confront: can your agent keep a secret? The benchmark probes whether AI agents, when tasked with research-style work, reveal confidential information under pressure. This changes everything because agentic systems are no longer just chatbots; they are being wired into tools, files, browser sessions and internal knowledge bases. If they cannot protect boundaries, they are not enterprise-ready, no matter how dazzling their reasoning sounds.

Meanwhile, another Hugging Face effort challenges the dominance of LoRA, the wildly popular method for cheaply fine-tuning large models. The “Beyond LoRA” work frames a new competitive moment around parameter-efficient fine-tuning: can developers squeeze more performance, flexibility or stability out of models without retraining the whole beast?

Taken together, these releases sketch the next phase of AI infrastructure: small enough to deploy, flexible enough to customize, and measurable enough to trust. The age of “look what this model can say” is giving way to “look what this system can safely do.” Buckle up — that is the real revolution.

PP-OCRv6 on Hugging Face: 50-Language OCR from 1.5M to 34.5M  ·  MosaicLeaks: Can your research agent keep a secret?  ·  Beyond LoRA: Can you beat the most popular fine-tuning techn

The Silicon Herd Searches for Safer Ground

As AI devours computing power, the global chip supply chain is spreading its delicate limbs across Southeast Asia, America, and the cloud.

SINGAPORE — Observe, if you will, the semiconductor supply chain: a rare and nervous creature, once content to graze along familiar routes, now forced by the great hunger of artificial intelligence to migrate across continents.

The latest signs of movement are visible in Southeast Asia, where China’s chipmaking ecosystem has been extending itself through assembly, testing, packaging and materials networks. As described by the East Asia Forum, the region has become a crucial corridor for Chinese semiconductor activity, offering proximity, manufacturing skill and a measure of insulation from geopolitical weather.

Here, amid the humid industrial parks of Malaysia, Vietnam, Singapore and Thailand, the chip does not emerge fully formed like some metallic butterfly. It passes through stages: wafer, die, package, module, server. Each transformation depends upon a surrounding ecology of suppliers, engineers, ports and power. Disturb one nest, and the tremor may be felt in a data center half a world away.

This matters because AI has changed the feeding pattern of the entire computing kingdom. The graphics processor, once a specialist predator of gaming and simulation, has become the dominant beast of the age. Training models and serving their answers requires not merely clever code, but whole rivers of electricity and silicon. Intel’s Pat Gelsinger has framed this as a new phase of the computing power contest, in which chips, fabs, packaging and supply assurance become as strategic as oilfields once were.

Across the Pacific, the United States is attempting its own conservation program. The Semiconductor Industry Association has welcomed CHIPS Act incentives for Coherent, while SandboxAQ has announced a $500 million CHIPS Act R&D award aimed at strengthening semiconductor supply chains against disruption. These are not mere subsidies. They are artificial reefs, built in hopes that advanced manufacturing may spawn closer to home.

And above them all circles the cloud, where another adaptation is emerging: capacity markets. As industry groups track new incentives, enterprises are beginning to treat compute less like a fixed possession and more like a seasonal resource — to be reserved, traded and rationed when the AI rains arrive.

In this new habitat, sovereignty is measured in wafers, resilience in shipping lanes, and survival in available GPUs. The machines may think in tokens. But they are born, still, from earth, metal, heat and astonishingly fragile supply chains.

China’s chipmaking supply chain runs through Southeast Asia  ·  The second half of the computing power battle: Intel CEO Pat  ·  SIA Applauds CHIPS Act Incentives for Coherent - Semiconduct
The Editorial

The Great Forgetting

We are building machines that know everything except what they have quietly arranged for us to forget.

LONDON — One of the more poignant spectacles of the present moment is the sight of governments commissioning "rapid evidence reviews" on a technology that has already rearranged the furniture, taken the silver, and changed the locks. His Majesty's civil service has now issued such a document on AI Skills for Life and Work, a tidy PDF of the sort that arrives, as these things always do, somewhat after the horse has not only bolted but signed a Series B term sheet.

The report is conscientious. It is also, in the manner of all such reports, an act of belated cartography — mapping a coastline that the tide is busy redrawing. One reads it with the same melancholy one feels watching a Royal Commission on the steam engine convene in 1925.

More arresting, because more honest about its bewilderment, is Deepak Varuvel Dennison's essay in The Guardian, which floats the unfashionable but unavoidable phrase "knowledge collapse." His argument, in brief: when a handful of models trained on a handful of corpora become the default interface to human inquiry, the long tail of what humanity knows — the dialect grammars, the regional medicine, the heterodox histories, the things that exist only in the heads of seventeen people in a village in Karnataka — does not so much disappear as become unfindable, which is the same thing dressed in better clothes.

This is not a new anxiety. Every consolidation of knowledge — the Library of Alexandria, the printing press, the Britannica, Google — has produced its mourners, and the mourners have, with tedious regularity, been at least partly right. What the alarmists got wrong was the timeline. What they got correct was the direction. Things that are not indexed cease, for practical purposes, to be known. And LLMs are not indexes; they are smoothings. They do to the world's knowledge what an Instagram filter does to a face: produce a version that is recognizable, pleasant, and slightly false in ways one cannot quite specify.

Meanwhile the Times editorial page wrings its hands over what AI "really means for learning," and the Australian Broadcasting Corporation announces, with the breathlessness of a publication discovering that water is wet, that the "vibe shift" has arrived and is "terrifying." One wishes to gather these dispatches into a single folder marked Belated Observations and file it under the desk.

The genuinely interesting question is not whether AI will hollow out human knowledge — it will, in the same way that calculators hollowed out long division, which is to say partially, unevenly, and with consequences we will spend forty years pretending to have anticipated. The question is what we choose to remember on purpose. Cultures that survive technological consolidations do so because somebody, somewhere, decided that certain things were worth the inefficiency of keeping by hand.

The rapid evidence review does not address this. Rapid evidence reviews never do.

AI Skills for Life and Work: Rapid Evidence Review - GOV.UK  ·  Forget brat summer, the vibe shift is here and it's terrifyi  ·  What AI doesn’t know: we could be creating a global ‘knowled
The Office Comic  ·  Art Desk

Nation’s Executives Warn AI Must Be Implemented Immediately Before Anyone Figures Out What It Does

From game engines to hospital billing departments, leaders agreed the technology’s most promising use case remains making every sentence sound like it came from the future.

SAN FRANCISCO — In an important week for artificial intelligence, executives across several industries confirmed that AI has now advanced to the point where it can be used to describe layoffs, software features, medical billing tools, marketing strategies, and vague feelings of corporate momentum with equal confidence.

The announcement came in the form of several unrelated developments that, taken together, suggest the business world has successfully entered the mature phase of AI adoption, in which no one is required to distinguish between a product, a strategy, a cost-cutting measure, or a press release.

Epic Games, for example, has been explaining the role AI will play in Unreal Engine 6, the forthcoming version of its widely used game development platform. According to reports on the company’s plans, AI may assist developers in creating more complex worlds faster, a perfectly sensible use of the technology that unfortunately must now stand trial alongside every other sentence containing the letters A and I.

This is the central problem. There are real uses here. Games are made from thousands of assets, scripts, animations, lighting decisions, and compromises no human being should have to explain to a publisher. Healthcare revenue cycle management is also a plausible place for automation, since the American medical billing system already resembles an artificial intelligence trained exclusively on forms that hate you. At HIMSS26, agentic AI is reportedly powering revenue cycle technology news, which means hospitals may soon have software capable of autonomously denying a claim, appealing the denial, losing the appeal, and scheduling a webinar about transformation.

Yet these practical applications now share a podium with a much larger and more determined enterprise: AI washing. As commentators have noted, companies are beginning to hype AI the same way they once talked up sustainability, placing it gently over whatever they were already doing until investors could no longer see the original stain. A layoff becomes an AI efficiency initiative. A chatbot becomes a platform. A dashboard becomes an agentic ecosystem. A manager asking employees to do more with less becomes a visionary steward of computational abundance.

This is not innovation so much as reupholstery. The old furniture is still there. It just has a new fabric called “autonomous orchestration.”

The comparison to sustainability hype is especially apt because it captures the corporate gift for taking a serious topic and slowly rendering it unusable through brochures. Sustainability once meant emissions, supply chains, accountability, and measurable change. Then it meant a leaf icon on the annual report. AI is undergoing the same spiritual journey, except faster, because the leaf can now generate 40 slides before lunch.

Somewhere in this fog sits the marketing lesson from Duolingo, whose deranged owl mascot has become more memorable than many companies’ entire executive teams. The recent argument that Duolingo would be foolish to prioritize influencers over its own unhinged bird is, in its way, the most useful AI strategy advice available: know what is actually working before you replace it with something fashionable.

That principle applies broadly. If AI helps artists build richer game worlds, excellent. If it helps hospitals reduce administrative sludge, fine. If it helps a software company understand its own finances, even better. But if it merely gives management a cleaner vocabulary for headcount reduction, then the tool has not become intelligent. The euphemism has.

The solution is embarrassingly simple, which is why it will likely require a 14-month transformation program. Companies should say what the AI does, what it replaces, what it costs, who benefits, and how anyone will know if it worked. They should avoid describing ordinary automation as a sentient colleague named Max. They should retire the phrase “agentic” until it can pass a background check.

Until then, the market will continue rewarding firms for announcing that artificial intelligence is central to their future, even when the future in question appears to be the same spreadsheet with fewer people allowed to open it.

Epic Games explains AI's role in Unreal Engine 6 - games.gg  ·  Companies are hyping AI the same way they talked up sustaina  ·  Agentic AI powers revenue cycle technology news at HIMSS26 -
On This Day in AI History

On March 15, 2016, Google DeepMind's AlphaGo defeated Lee Sedol, one of the world's greatest Go players, in the final game of their historic match, winning the series 4-1 and demonstrating that deep learning could master intuition-driven games previously thought beyond AI's reach.
