## Screenshots
<img width="1917" height="943" alt="image" src="https://github.com/user-attachments/assets/7782fe79-5ddf-48f8-ade9-b99341230c2c" />
## Summary
First per-product Review Agent check (C3.3 — Engineering/Product cost vs the 9.5% Trilogy benchmark), bundled with the Address-with-Claire fixes (B7.5 / B7.5b) so the new finding shape doesn't ship into a known-broken affordance.
Linear: [KLAIR-2644](https://linear.app/builder-team/issue/KLAIR-2644)
## Why it's needed
C3.3 is the gold-standard manual implementation that anchors the C3 family. The C3.4-C3.9 siblings (KLAIR-2647 → KLAIR-2652) are tagged drone-eligible and will follow this module's structure as their pattern reference — so getting the module shape, naming, helpers, tests, and section-id resolution right *here* is what makes the drone runs cheap and predictable later. This PR is also the cost-attribution baseline for the drone POC: hand-rolled paired-session bookends are recorded in the commit message (~20 minutes start-to-PR) so the C3.4 drone run can be compared against a known-good human baseline.
B7.5 / B7.5b address a regression the C3 rail would otherwise ship into: review findings tagged to the Financials section (which is tables-only — no narrative slot) silently no-op when the user clicks "Address with Claire", because the existing path would call regenerate_section on a tables-only section and the generator would re-emit the same tables. Shipping C3.3 without B7.5/B7.5b would leave the primary CTA broken on every Financials-targeted C3.3 finding. Bundled here on the user's call ([thread context](https://linear.app/builder-team/issue/KLAIR-2644)) to avoid the staggered regression window.
## Changes
### C3.3 — Engineering/Product per-product benchmark check
- klair-api/budget_bot/board_doc/review_checks/engineering_product_benchmark.py (new)
- Reads the Total → Engineering/Product row from the Benchmark by Product tab.
- Emits one finding per product column whose actual exceeds 9.5%.
- Verdict bands: pass ≤ 9.5%; warning in (9.5%, 14.5%]; critical > 14.5%.
- Rollup column (col 3 — `<BU> Consolidated`) → targets SectionType.FINANCIALS; other product columns → SectionType.PRODUCT_DETAIL with product_name (falls through to doc-level when the spec has no dedicated product section).
- All-products-pass emits a single BU-level pass finding so the scorecard reflects the check ran.
- Skip semantics distinguish "tab not loaded" / "row missing upstream" / "every product cell blank".
- klair-api/budget_bot/board_doc/review_checks/_helpers.py
- New shared helpers benchmarks_row_for(table, section, category) and benchmark_product_columns(header), designed so C3.4-C3.9 (drone-eligible) can fan out without re-inventing the indexing plumbing.
- klair-api/budget_bot/board_doc/review_checks/__init__.py — registers C3.3 with required_data=(BENCHMARK_BY_PRODUCT,).
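The verdict banding above can be sketched as follows. This is an illustrative reconstruction, not the module's actual code — the names (`Verdict`, `verdict_for`) are hypothetical; only the thresholds (pass ≤ 9.5%, warning inclusive up to +5pp, critical beyond) come from this PR:

```python
from enum import Enum

# Thresholds from this PR: pass <= 9.5%, warning in (9.5%, 14.5%], critical > 14.5%.
BENCHMARK_PCT = 9.5
WARNING_CEILING_PCT = BENCHMARK_PCT + 5.0  # 14.5 — the inclusive warning upper edge


class Verdict(Enum):
    PASS = "pass"
    WARNING = "warning"
    CRITICAL = "critical"


def verdict_for(actual_pct: float) -> Verdict:
    """Band a product's Engineering/Product cost % against the benchmark."""
    if actual_pct <= BENCHMARK_PCT:
        return Verdict.PASS
    if actual_pct <= WARNING_CEILING_PCT:
        return Verdict.WARNING
    return Verdict.CRITICAL
```

Note the inclusive upper edge: 14.5 exactly is still a warning, which is the "+5pp warning edge / +5.01pp critical step" boundary pinned in the tests below.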
### B7.5 — Tables-only awareness for Claire
- klair-api/budget_bot/board_doc/models.py
- New TABLES_ONLY_SECTION_TYPES = {FINANCIALS, CF_PLAN} constant + is_tables_only_section_type() helper, kept right next to the SectionType enum so additions can't drift.
- klair-api/budget_bot/board_doc/wizard_orchestrator.py
- Section inventory in the system prompt now marks tables-only sections with [TABLES-ONLY].
- New ## Section Authoring Constraints block teaches Claire (a) not to call regenerate_section on tables-only targets and (b) to route narrative remediation to PQR / MIPs / GM Commentary / per-product sections in that preference order.
- New _full_doc_findings_block renders a cross-section findings digest of every open actionable finding *outside* the focused section (one line each, severity-worst-first, capped at 4K chars / 40 lines). Pairs with the existing _focused_section_findings_block so cross-section asks no longer drop findings on the floor.
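The two mechanisms above can be sketched together — the tables-only classification and the capped digest. Everything here is an illustrative mirror, not the actual implementation: the trimmed `SectionType` enum, the `render_digest` name, and the finding dict shape are assumptions; `TABLES_ONLY_SECTION_TYPES`, the 40-line / 4K-char caps, and worst-severity-first ordering come from this PR:

```python
from enum import Enum


class SectionType(Enum):
    # Trimmed for illustration; the real enum lives in budget_bot/board_doc/models.py.
    FINANCIALS = "financials"
    CF_PLAN = "cf_plan"
    PQR = "pqr"
    GM_COMMENTARY = "gm_commentary"
    PRODUCT_DETAIL = "product_detail"


TABLES_ONLY_SECTION_TYPES = {SectionType.FINANCIALS, SectionType.CF_PLAN}


def is_tables_only_section_type(section_type: SectionType) -> bool:
    return section_type in TABLES_ONLY_SECTION_TYPES


# Digest capping as described: one line per finding, worst severity first,
# truncated at 40 lines / 4096 chars (limits taken from the PR text).
MAX_LINES, MAX_CHARS = 40, 4096
SEVERITY_ORDER = {"critical": 0, "warning": 1}


def render_digest(findings: list[dict]) -> str:
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER.get(f["severity"], 99))
    lines: list[str] = []
    used = 0
    for finding in ordered[:MAX_LINES]:
        line = f"- [{finding['severity']}] {finding['summary']}"
        if used + len(line) + 1 > MAX_CHARS:
            break
        lines.append(line)
        used += len(line) + 1  # +1 for the joining newline
    return "\n".join(lines)
```

The point of the cap is that a doc-wide digest stays a bounded prompt cost regardless of how many open findings a document accumulates.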
### B7.5b — FE polish for Address-with-Claire
- klair-client/src/screens/BoardDoc/utils/tablesOnlySectionTypes.ts (new) — FE mirror of the BE constant, kept in lockstep via the comment contract.
- klair-client/src/screens/BoardDoc/DocumentEditorPage.tsx
- handleAddressWithClaire now skips setFocusedSectionId + scrollToSection when the finding's target section is tables-only. The chat seed still carries the finding's section_id so Claire's prompt context anchors correctly server-side.
### Tests
- klair-api/tests/board_doc/test_engineering_product_benchmark.py (new — 15 tests): per-product fan-out, all four verdict boundaries (including the inclusive +5pp warning edge and the +5.01pp critical step), three skip paths, rollup vs per-product section-id resolution, and supporting-data shape.
- klair-api/tests/board_doc/test_b7_5_tables_only_and_doc_wide_findings.py (new — 21 tests): tables-only classification (including the parametrized "all writeable types stay writeable" guard), system-prompt directive ordering + omission-when-no-tables-only edge case, and the cross-section digest's focused-exclusion / severity-ordering / truncation semantics.
- klair-api/tests/board_doc/test_review_endpoint.py — _make_populated_data_package seeds BENCHMARK_BY_PRODUCT so C3.3 runs cleanly on the happy-path endpoint tests; the four affected assertions (findings count, expected check_id set, skipped-checks list, partial-completeness ran_ids) are updated accordingly.
- klair-client/src/screens/BoardDoc/__tests__/DocumentEditorPage.spec.tsx — the existing "Address with Claire opens chat, scrolls to section" test now targets a writeable gm_commentary section; a new B7.5b test pins the no-focus-shift behavior on a financials finding.
## Breaking changes
None. New check is additive (one more entry in REGISTRY); B7.5 only appends to the system prompt; B7.5b only conditionally skips a side effect. Pre-existing sessions / chat histories / serialised findings continue to round-trip unchanged.
## Test plan
- [x] pytest tests/board_doc -q — 1480 passed, 1 deselected, 0 failed.
- [x] pnpm vitest run src/screens/BoardDoc — 266 tests passed across 22 suites (two pre-existing suites with syntax errors are unrelated to this PR).
- [x] uv run ruff format / ruff check / pyright — clean.
- [x] pnpm tsc --noEmit / eslint --max-warnings 0 on touched FE files — clean.
- [ ] Manual smoke: run a review and verify the new C3.3 finding fires.
- [ ] Manual smoke: click "Address with Claire" on a finding (including a Financials-targeted one) and verify the remediation lands in a writeable narrative section rather than silently no-op'ing.
## Cost-attribution baseline (hand-rolled, for the C3.4 drone comparison)
- Paired-session start: 2026-05-13T08:48:54-05:00
- Paired-session end: 2026-05-13T09:09:10-05:00
- Elapsed: ~20m 16s
- C3.4 (KLAIR-2647) is the first drone target. After it runs, compare drone cost / wall-clock / iteration count against this baseline to ground the "is a drone shot worth it for this shape?" decision for C3.5-C3.9.