Today on The Anvil: Project Freedom pauses after Iran strikes the UAE, and new intel says Iran kept more than half its missiles despite Pentagon claims of 82% destruction. Agentic-coding infrastructure consolidates around team-level orchestration (Blitzy at $1.4B, JetBrains Air), AI design system drift gets empirically documented, exposed AI endpoints surface at internet scale, and a Spokane survey finds two-thirds of residents considering leaving the state.
Day 67-68 update on the Hormuz conflict (covered here since Day 54): Project Freedom, launched May 4 with 15,000 CENTCOM personnel and 100+ aircraft to escort merchant ships through Hormuz, was paused by Trump on May 5 'by mutual agreement' via Pakistani mediators after just ~48 hours of active operations. In that window: U.S. forces sank six Iranian fast-attack craft on day one, Iran then fired 15 missiles and 4 drones at the UAE (the Fujairah Petroleum Industries Zone attack that killed three Indian nationals and ignited fires), and targeted an Emirati state oil tanker, the first Iranian strike on the UAE since the April 8 ceasefire. The Iranian port blockade remains intact. Saudi Arabia, Qatar, Kuwait, Bahrain, Jordan, the GCC, EU, UK, France, Canada, and Germany all condemned the UAE strikes. Iranian state media and the parliamentary speaker framed the pause as a U.S. retreat. Rubio declared Operation Epic Fury complete.
Why it matters
The 48-hour cycle hardens a new operational pattern: kinetic pressure and diplomacy now run in parallel as continuous-pressure tools rather than sequential phases, consistent with the ceasefire-extension and talks-collapse arc covered in prior briefings. Two new structural signals today: Iran targeting the UAE (not just U.S. assets) meaningfully changes the allied escalation calculus in ways the tanker seizures (Epaminondas, MSC Francesca) did not; and the simultaneous U.S. intel finding that Iran retains 50%+ of its ballistic missile inventory, against the Pentagon's public 82% destruction claim, opens a credibility gap that will shape congressional and allied support going into the May 30 proposal window. The Hormuz toll authority from Iran's April 30 14-point proposal remains the structural flashpoint, and OFAC has already flagged sanctions exposure around it.
The U.S. and Gulf allies circulated a UN Security Council draft resolution threatening Iran with sanctions if it does not halt ship attacks, stop imposing 'illegal tolls' on Hormuz transit, and disclose all mine placements. The toll provision directly targets the Hormuz monetization mechanism in Iran's April 30 14-point proposal, the structural feature that OFAC has already formally warned creates sanctions exposure for any payer. Russian and Chinese vetoes are expected; Beijing issued its formal blocking directive against U.S. secondary sanctions on Iranian oil buyers nine days after Treasury's last round of designations.
Why it matters
Even with a guaranteed veto, formally codifying 'Hormuz tolls' as a sanctionable activity at the Security Council level locks in the U.S. legal posture and gives allied flag-states cover to refuse compliance. It's the diplomatic mirror of the kinetic pause: pressure stays on through legal and economic channels while military operations are throttled. Watch for parallel action by the IMO and London insurance markets.
Blitzy raised $200M led by Northzone at a $1.4B valuation. The platform reverse-engineers existing 1M to 100M-line codebases into knowledge graphs, then orchestrates thousands of agents in parallel for multi-week runs. It claims 66.5% on SWE-Bench Pro and 5x engineering velocity at Fortune 500 customers in financial services, insurance, and government.
Why it matters
This is the institutional bet that the defensible layer in agentic coding is orchestration over legacy codebase context, not the model. It targets a different surface than Cursor (which focuses on the developer's active session and whose Cursor SDK public beta shipped the same week): Blitzy's moat is in code graph extraction across existing 50M-line monoliths and regulated-industry compliance. For engineering leaders, the Blitzy/Cursor/JetBrains Air/Incredibuild Islo cluster arriving simultaneously confirms that harness-layer vendor lock-in is now the primary architecture decision, outweighing model-choice lock-in.
JetBrains released Air, a from-scratch IDE built for delegating independent tasks to multiple agents (Claude Agent, OpenAI Codex, Gemini CLI, Junie) running in parallel, each in an isolated Local, Git Worktree, or Docker environment, with explicit permission modes (Ask, Auto-Edit, Plan, Full Access) and a notification system for when agents need input. macOS only at launch.
Why it matters
Air treats agents as delegated workers with explicit permission tiers (Ask, Auto-Edit, Plan, Full Access) and per-agent sandbox isolation, the first IDE-native answer to the Cursor/Railway 9-second database wipe class of incident. This is the same architectural convergence as Augment Cosmos (shared agent memory, multi-model routing), Incredibuild Islo (persistent cloud execution with hardware-level isolation), and the Cursor SDK public beta, all arriving the same week. The competitive signal for Cursor specifically: JetBrains Air natively routes across Claude Agent, OpenAI Codex, Gemini CLI, and Junie, making multi-model routing a baseline expectation rather than a premium feature.
Coder launched Coder Agents in beta: a native AI coding agent designed to run entirely on self-hosted infrastructure, with no code or model interactions sent to third parties. It supports any model provider, with centralized governance and policy enforcement.
Why it matters
Closes the loop on the third leg of the agentic coding stack: the API-dependent tools (Cursor, Copilot), the orchestrators (Blitzy, JetBrains Air), and now the air-gapped self-hosted alternative. For defense, healthcare, and finance teams who couldn't adopt frontier coding tools at all, this collapses the choice between 'no AI' and 'send code to OpenAI.' Pairs cleanly with DeepSeek V4 and Xiaomi MiMo open weights as the local-model substrate.
A practical experiment with Figma Make: prompt 1 generates a design system; prompt 2 builds a dashboard from it. Element-level CSS inspection shows the dashboard violates the system's own tokens: hardcoded values, chart colors used as status colors, undefined spacing values, and missing critical token categories (text colors, borders, interactive states). The drift happens within a single session by the same tool.
Why it matters
This is empirical counter-evidence to the DESIGN.md / Tandemloop thesis covered earlier this week: strict design-system discipline is not optional even with AI tooling, and 'AI-generated system + AI-generated UI' compounds inconsistency rather than enforcing consistency. The specific failure modes (chart-color-as-status, hardcoded spacing, undefined spacing values, missing token categories for text colors/borders/interactive states) are auditable, which makes CI-style automated token-compliance scanners like the CLI validation hooks in Google's DESIGN.md spec mandatory infrastructure rather than nice-to-have. The Figma MCP server's Code Connect (covered in March) was designed to prevent exactly this design-system drift, but the Figma Make failure shows the problem persists even within a single tool's session. Practical takeaway: build the lint pass before you build the system, regardless of whether you're using Figma Make, Claude Design, or v0.
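One way to picture that lint pass is the minimal Python sketch below. It assumes a hypothetical token set and a regex-level CSS scan; neither comes from Figma Make or the DESIGN.md spec, and a production scanner would use a real CSS parser. It flags any color or spacing value that is not a declared token, which is the hardcoded-value failure mode described above.

```python
import re

# Hypothetical token set; a real system would load this from its own token file.
TOKENS = {
    "color": {"#1a73e8", "#d93025", "#188038"},
    "spacing": {"4px", "8px", "16px", "24px"},
}

# CSS properties mapped to the token category their values must come from.
PROPERTY_CATEGORY = {
    "color": "color",
    "background-color": "color",
    "border-color": "color",
    "margin": "spacing",
    "padding": "spacing",
    "gap": "spacing",
}

# Matches simple "property: value;" declarations.
DECLARATION = re.compile(r"([a-z-]+)\s*:\s*([^;{}]+);")


def lint_css(css: str) -> list[str]:
    """Return one message per declared value that is not a known token."""
    violations = []
    for prop, raw_value in DECLARATION.findall(css):
        category = PROPERTY_CATEGORY.get(prop)
        if category is None:
            continue  # property is not governed by the token set
        for value in raw_value.strip().lower().split():
            if value.startswith("var("):
                continue  # custom-property reference, assumed to be a token
            if value not in TOKENS[category]:
                violations.append(f"{prop}: {value} is not a {category} token")
    return violations


sample = ".card { padding: 8px 13px; color: #ff00aa; gap: var(--space-2); }"
report = lint_css(sample)
```

Run in CI, a non-empty report would fail the build, so drift is caught at the first generated screen rather than after a dashboard ships.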
CAISI signed pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI on May 5, expanding prior arrangements with OpenAI and Anthropic; all five major frontier labs are now under voluntary pre-release evaluation. Microsoft separately announced parallel partnerships with the UK AI Security Institute, including co-developed adversarial testing methodologies and contributions to MLCommons AILuminate. This lands the same week the Trump White House EO for pre-release federal vetting of frontier models leaked, triggered by measurable cybersecurity exploit capability in Anthropic's Mythos and GPT-5.5.
Why it matters
Voluntary pre-release evaluation has now reached all five frontier labs simultaneously, converging into a de facto licensing regime regardless of whether the EO formalizes it. The Pentagon's simultaneous onboarding of eight AI firms for classified deployments (OpenAI, Google, Microsoft, Amazon, Oracle, Nvidia, SpaceX, Reflection, with Anthropic notably excluded) gives the approved-model registry real procurement teeth. For builders: expect lengthening timelines between training completion and API availability, plus evaluation artifacts as standard enterprise procurement requirements. The measurable exploit-capability trigger (not abstract safety framing) is what makes this durable across administrations.
OpenAI promoted GPT-5.5 Instant to the ChatGPT default on May 5, claiming 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance), 37.3% fewer inaccurate claims on challenging conversations, 81.6% on CharXiv (up from 75.0% with GPT-5.3 Instant), and reduced 'moralizing.' It also introduced a 'memory sources' panel showing which user context shaped each response: partial observability, not full auditability.
Why it matters
Memory sources is the more interesting feature: it's the first user-facing acknowledgment that personalized context is now a meaningful production variable that needs to be debuggable. For enterprises, the partial observability is itself a problem: orchestration platforms with their own audit logs will now have to reconcile against ChatGPT's separate memory surface. The hallucination numbers are a real benchmark improvement but consistent with the trajectory; the governance surface is the real news.
FreightWaves confirms the formal consolidation of Amazon's freight, fulfillment, and parcel services into Amazon Supply Chain Services (ASCS), the AWS-for-logistics play first covered here May 4. Amazon now offers all businesses access to its 80,000+ trailers, 24,000 intermodal containers, 100 freighter aircraft, and the same AI demand-forecasting models that drive its internal operations. P&G, 3M, Lands' End, and American Eagle are early customers.
Why it matters
This is the productization of Amazon's most defensible internal capability: inventory placement and demand positioning across a continental network. The competitive geometry shifts: traditional 3PLs and freight brokers compete on rates and routes, but Amazon now competes on network design and forecast accuracy as a service. For asset-light freight tech (Loop, FourKites, Project44), this is either a major distribution channel via integration or an existential threat depending on whether ASCS exposes the forecasting models as embeddable APIs.
Two notable production deployments landed May 5. FourKites enhanced its Inventory Twin with AI that detects stock-out risk 14 days ahead, quantifies financial exposure by SKU and facility, ranks mitigation options, and executes approved transfers via integrated carrier booking, closing the S&OP-to-execution gap. Separately, NVIDIA published the architecture of its in-house cuOpt-powered multi-agent supply chain system: planning cycle compressed from 5 days to under 1 day, 83% fewer delayed orders, 6x more planning capacity. The system is open-sourced on GitHub via Brev Launchable. The hybrid pattern, in which LLM agents formulate optimization problems and GPU-accelerated solvers execute them, solves the constraint-hallucination failure mode of pure LLM agents.
Why it matters
Both deployments point at the same architectural insight: the value isn't in any single model, it's in coupling forecast/decision agents with deterministic optimizers and live execution surfaces. NVIDIA's open-sourcing of the pattern lowers the barrier substantially: a logistics or manufacturing team can now stand up the reference implementation rather than build from scratch. For the broader agentic supply chain wave covered through the week (Loop, 4flow, Locus, Sparrow XPL), this is the architectural reference point.
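The agent-plus-solver coupling can be sketched in a few lines of Python. This is an illustration only, not NVIDIA's implementation: the JSON schema, the `validate` and `solve` helpers, and the brute-force search standing in for a GPU solver such as cuOpt are all assumptions. The point it shows is the division of labor: the agent only describes the problem, and deterministic code checks the description and computes the answer.

```python
import json
from itertools import permutations


def validate(problem: dict) -> None:
    """Reject specs whose cost matrix does not match the declared orders and trucks."""
    orders, trucks, cost = problem["orders"], problem["trucks"], problem["cost"]
    if len(orders) != len(trucks):
        raise ValueError("sketch assumes exactly one order per truck")
    if len(cost) != len(orders) or any(len(row) != len(trucks) for row in cost):
        raise ValueError("cost matrix shape does not match orders x trucks")
    if any(c < 0 for row in cost for c in row):
        raise ValueError("costs must be non-negative")


def solve(problem: dict) -> tuple[dict[str, str], float]:
    """Exhaustively search assignments; a stand-in for a real optimizer."""
    validate(problem)
    orders, trucks, cost = problem["orders"], problem["trucks"], problem["cost"]
    best = min(
        permutations(range(len(trucks))),
        key=lambda perm: sum(cost[i][j] for i, j in enumerate(perm)),
    )
    total = sum(cost[i][j] for i, j in enumerate(best))
    return {orders[i]: trucks[j] for i, j in enumerate(best)}, total


# Hypothetical agent output: the model emits a problem description, not an answer.
agent_output = json.dumps({
    "orders": ["A", "B", "C"],
    "trucks": ["T1", "T2", "T3"],
    "cost": [[9, 2, 7], [6, 4, 3], [5, 8, 1]],
})
assignment, total = solve(json.loads(agent_output))
print(assignment, total)
```

Because the solver validates the spec before searching, a malformed or inconsistent problem from the agent fails loudly instead of producing a plausible-looking but infeasible plan.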
Separate surveys by the Association of Washington Businesses and Greater Spokane Inc. both found ~two-thirds of Spokane County residents and business owners considering relocating out of state, citing the estate tax hike, capital gains tax, and millionaires tax. GSI's Quality of Life Survey of 600 registered voters shows declining concern about homelessness and crime but rising concern about taxes and cost of living; pessimism about the region's direction sits at 59%. Janicki Industries, a Sedro-Woolley aerospace and defense manufacturer, separately confirmed it will expand into Utah, Idaho, and Montana rather than Washington, citing regulatory burden, while reducing its Seattle footprint. Idaho's population grew 10% since 2020 (second-fastest in the US); Washington and Oregon both grew under 1%.
Why it matters
Four independent data points (two surveys, one explicit corporate decision, one Census trend) all point along the same vector: capital and labor moving from Washington into Idaho and Montana. For Spokane specifically, this is the structural backdrop behind the gas-price record and the food-truck enforcement story. For Coeur d'Alene and Kootenai County, it's the demand signal driving the resort expansion, the Sherman Tower buildout, and Redman's $250K PAC bet on consolidating legislative control. The Inland Northwest economic story for the next 24 months is being shaped right now by which side of the state line a household sits on.
Rathdrum Mayor Mike Hill resigned May 4 after Rathdrum police requested a domestic battery investigation by the Kootenai County Sheriff's Office. Hill has not been arrested; the investigation is ongoing. Council President John Hodgkins assumes mayoral duties until council selects a permanent replacement by May 13. The transition lands during active budget and code discussions.
Why it matters
This is a mid-cycle leadership succession in a Kootenai County municipality, in the same week Rep. Redman drops $250K to consolidate legislative control through the May 19 Idaho primary. North Idaho's local political infrastructure is undergoing structural turnover concurrent with significant infrastructure spend (I-90 widening, Sherman Tower, Rocky Point Wildlife Crossing); watch council appointment dynamics for an early read on the post-Hill governance posture.
Coastal Orange County housing (Newport Beach, Corona del Mar, Laguna Beach, Laguna Niguel, Dana Point) remains highly competitive in spring 2026. Pending sales in fast-moving segments jumped 32% in 17 days. Structural drivers: limited buildable land, restrictive zoning, large cash-buyer population. Only 18% of OC households can afford the median-priced home. Adjacent realtor.com data shows California ADU/granny-flat listings concentrated in coastal high-cost metros, with multigenerational properties commanding a 65% national price premium.
Why it matters
The Lido Isle $10M lot trade and Bahnsen RIA acquisition (covered yesterday) were not isolated; they're consistent with a coastal OC market where supply scarcity decouples from national rate-driven cooling. The ADU premium is the more interesting design-engineering signal: zoning-driven demand for flexible multi-unit residential is a real product opportunity for builders working at the physical-digital seam (smart-home suites, NextGen revenue-generating units, modular ADU systems).
Bishop Fox released AIMap, an open-source platform that discovers exposed AI infrastructure (Ollama, MCP endpoints, inference proxies) at internet scale, fingerprints them, scores vulnerability, and runs protocol-specific attack tests. Initial scan: 175,000+ exposed Ollama instances, 8,000+ open MCP servers, 91% lacking authentication. Independent confirmation from Intruder, which scanned 2M hosts (1M+ exposed AI services) and found 31% of 5,200+ Ollama servers responding to unauthenticated API queries, plus widespread exposed API keys and misconfigured agent platforms enabling code execution.
Why it matters
Self-hosting LLMs to keep data in-house is producing a larger exposure surface than the API dependency it's meant to replace. The MCP exposure number is particularly alarming because MCP is the connective tissue that lets agents reach tools and data; an unauthenticated MCP endpoint is effectively a remote code execution primitive on whatever the agent has access to. Pairs with this week's CLI-Anything / OpenClaw skill-poisoning research as the second supply-chain layer current scanners don't cover. Treat AIMap as the nmap-equivalent for the AI deployment era.
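For teams auditing their own hosts, the unauthenticated-query check reduces to something like the Python sketch below. It is not how AIMap or Intruder work internally. It relies on Ollama's model-listing endpoint, `GET /api/tags`, and assumes that any authentication is enforced by a reverse proxy answering 401 or 403; the `classify` and `probe` helper names are hypothetical. Only run it against systems you own or are authorized to test.

```python
import json
import urllib.error
import urllib.request


def classify(status: int, body: bytes) -> str:
    """Classify the response to an unauthenticated GET of /api/tags."""
    if status in (401, 403):
        return "auth-required"
    if status == 200:
        try:
            payload = json.loads(body)
        except ValueError:
            return "unknown"  # not JSON, so probably not an Ollama API
        if isinstance(payload, dict) and "models" in payload:
            return "open"  # model list returned without credentials
    return "unknown"


def probe(base_url: str, timeout: float = 3.0) -> str:
    """Send one unauthenticated request to a host you are authorized to test."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return classify(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        return classify(err.code, b"")
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

For example, `probe("http://127.0.0.1:11434")` checks a local instance on Ollama's default port; a result of "open" means the API answered without credentials and should sit behind an authenticating proxy or a firewall.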
Agentic coding moves from IDE feature to team infrastructure
Blitzy's $1.4B raise, JetBrains Air's parallel-agent IDE, Coder Agents' self-hosted platform, Claude Code's .claude/commands standardization, and Opsera+Cursor's embedded guardrails all ship the same week. The pattern: agents are no longer 'AI in your editor'; they're orchestrated workers with sandboxes, permissions, and team-level workflow definitions.
Pre-deployment AI evaluation becomes a federal default
CAISI added Google, Microsoft, and xAI to its pre-release evaluation roster (joining OpenAI and Anthropic) the same week the Trump WH executive order on frontier model pre-vetting leaked. Pre-release safety testing is consolidating into a de facto licensing regime regardless of administration rhetoric.
Hormuz is now a managed crisis, not a crisis being resolved
Project Freedom launched, Iran hit the UAE within 48 hours, and Trump paused the operation citing 'great progress' while keeping the Iranian-port blockade in place. Meanwhile U.S. intel reaffirms Iran's nuclear timeline is unchanged. The pattern: kinetic operations and diplomacy now run in parallel as continuous-pressure tools, not sequential phases.
AI deployment is outrunning AI security
Bishop Fox's AIMap and Intruder's 2M-host scan surfaced the same problem within 24 hours of each other: 91% of the exposed endpoints in Bishop Fox's scan lacked authentication, and 31% of the Ollama servers Intruder checked answered unauthenticated API queries. Self-hosting AI to protect data is creating a larger exposure surface than the API dependency it's meant to replace.
Inland Northwest economic anxiety is now measurable
Two separate surveys (AWB, Greater Spokane Inc.) converge on two-thirds of Spokane County residents/businesses considering leaving Washington, Janicki Industries is explicitly choosing Idaho/Montana/Utah for expansion, and gas is at record highs. Idaho's 10% population growth since 2020 is the receiving end of the same vector.
What to Expect
2026-05-07: OC Business Expo at Renaissance Newport Beach Hotel, with 1,500+ entrepreneurs and 100+ exhibitors
2026-05-13: Rathdrum City Council selects permanent mayor replacement following Mike Hill resignation