🔨 The Anvil

Wednesday, May 6, 2026

14 stories · Standard format

Generated with AI from public sources. Verify before relying on for decisions.

🎧 Listen to this briefing or subscribe as a podcast →

Today on The Anvil: Project Freedom pauses after Iran strikes the UAE, and new intel says Iran kept more than half its missiles despite Pentagon claims of 82% destruction. Agentic-coding infrastructure consolidates around team-level orchestration (Blitzy at $1.4B, JetBrains Air), AI design system drift gets empirically documented, exposed AI endpoints surface at internet scale, and a Spokane survey finds two-thirds of residents considering leaving the state.

Iran Conflict

Trump Pauses Project Freedom After Iran Strikes UAE; Operation Lasted ~48 Hours

Day 67–68 update on the Hormuz conflict (covered here since Day 54): Project Freedom, launched May 4 with 15,000 CENTCOM personnel and 100+ aircraft to escort merchant ships through Hormuz, was paused by Trump on May 5 'by mutual agreement' via Pakistani mediators after just ~48 hours of active operations. In that window, U.S. forces sank six Iranian fast-attack craft on day one; Iran then fired 15 missiles and 4 drones at the UAE (the Fujairah Petroleum Industries Zone attack that killed three Indian nationals and ignited fires) and targeted an Emirati state oil tanker, the first Iranian strike on the UAE since the April 8 ceasefire. The Iranian port blockade remains intact. Saudi Arabia, Qatar, Kuwait, Bahrain, Jordan, the GCC, EU, UK, France, Canada, and Germany all condemned the UAE strikes. Iranian state media and the parliamentary speaker framed the pause as a U.S. retreat. Rubio declared Operation Epic Fury complete.

The 48-hour cycle hardens a new operational pattern: kinetic pressure and diplomacy now run in parallel as continuous-pressure tools rather than sequential phases, consistent with the ceasefire-extension and talks-collapse arc covered in prior briefings. Two new structural signals today: Iran targeting the UAE (not just U.S. assets) meaningfully changes the allied escalation calculus in ways the tanker seizures (Epaminondas, MSC Francesca) did not; and the simultaneous U.S. intel finding that Iran retains 50%+ of its ballistic missile inventory, against the Pentagon's public 82% destruction claim, opens a credibility gap that will shape congressional and allied support going into the May 30 proposal window. The Hormuz toll authority from Iran's April 30 14-point proposal remains the structural flashpoint, and OFAC has already warned that it creates sanctions exposure.

Verified across 6 sources: BBC News · Reuters · Al Jazeera · NPR · Washington Post · ISW / Critical Threats

US, Gulf Allies Float UN Resolution Against Iran's Hormuz Tolls and Mining; Russia/China Veto Likely

The U.S. and Gulf allies circulated a UN Security Council draft resolution threatening Iran with sanctions if it does not halt ship attacks, stop imposing 'illegal tolls' on Hormuz transit, and disclose all mine placements. The toll provision directly targets the Hormuz monetization mechanism in Iran's April 30 14-point proposal, the structural feature that OFAC has already formally warned creates sanctions exposure for any payer. Russian and Chinese vetoes are expected; Beijing issued its formal blocking directive against U.S. secondary sanctions on Iranian oil buyers nine days after Treasury's last round of designations.

Even with a guaranteed veto, formally codifying 'Hormuz tolls' as a sanctionable activity at the Security Council level locks in the US legal posture and gives allied flag-states cover to refuse compliance. It's the diplomatic mirror of the kinetic pause: pressure stays on through legal and economic channels while military operations are throttled. Watch for parallel action by IMO and London insurance markets.

Verified across 1 source: Washington Post / AP

AI Coding & Design Tools

Blitzy Raises $200M at $1.4B Valuation for Multi-Thousand-Agent Enterprise Codebase Platform

Blitzy raised $200M led by Northzone at a $1.4B valuation. The platform reverse-engineers existing 1M–100M-line codebases into knowledge graphs, then orchestrates thousands of agents in parallel for multi-week runs. Claims 66.5% on SWE-Bench Pro and 5x engineering velocity at Fortune 500 customers in financial services, insurance, and government.

This is the institutional bet that the defensible layer in agentic coding is orchestration over legacy codebase context, not the model. It targets a different surface than Cursor (which focuses on the developer's active session and whose Cursor SDK public beta shipped the same week): Blitzy's moat is in code graph extraction across existing 50M-line monoliths and regulated-industry compliance. For engineering leaders, the Blitzy/Cursor/JetBrains Air/Incredibuild Islo cluster arriving simultaneously confirms that harness-layer vendor lock-in is now the primary architecture decision, exceeding model-choice lock-in.

Verified across 1 source: SiliconANGLE

JetBrains Air: Multi-Agent IDE for Parallel Task Delegation Across Claude, Codex, Gemini, Junie

JetBrains released Air, a from-scratch IDE built for delegating independent tasks to multiple agents (Claude Agent, OpenAI Codex, Gemini CLI, Junie) running in parallel, each in an isolated Local, Git Worktree, or Docker environment, with explicit permission modes (Ask, Auto-Edit, Plan, Full Access) and a notification system for when agents need input. macOS only at launch.

Air treats agents as delegated workers with explicit permission tiers and per-agent sandbox isolation, making it the first IDE-native answer to the Cursor/Railway 9-second database wipe class of incident. This is the same architectural convergence as Augment Cosmos (shared agent memory, multi-model routing), Incredibuild Islo (persistent cloud execution with hardware-level isolation), and the Cursor SDK public beta, all arriving the same week. The competitive signal for Cursor specifically: JetBrains Air natively routes across Claude Agent, OpenAI Codex, Gemini CLI, and Junie, making multi-model routing a baseline expectation rather than a premium feature.

Verified across 1 source: Recca0120 (technical writeup)

Coder Agents Beta: Self-Hosted, Model-Agnostic Agentic Coding for Regulated Enterprises

Coder launched Coder Agents in beta: a native AI coding agent designed to run entirely on self-hosted infrastructure, with no code or model interactions sent to third parties. It supports any model provider, with centralized governance and policy enforcement.

Closes the loop on the third leg of the agentic coding stack: the API-dependent tools (Cursor, Copilot), the orchestrators (Blitzy, JetBrains Air), and now the air-gapped self-hosted alternative. For defense, healthcare, and finance teams who couldn't adopt frontier coding tools at all, this collapses the choice between 'no AI' and 'send code to OpenAI.' Pairs cleanly with DeepSeek V4 and Xiaomi MiMo open weights as the local-model substrate.

Verified across 1 source: Globe Newswire

AI Design System Drift: Figma Make Generates Systems and Then Violates Them in the Same Session

A practical experiment with Figma Make: prompt 1 generates a design system; prompt 2 builds a dashboard from it. Element-level CSS inspection shows the dashboard violates the system's own tokens: hardcoded values, chart colors used as status colors, undefined spacing values, and missing critical token categories (text colors, borders, interactive states). The drift happens within a single session by the same tool.

This is empirical counter-evidence to the DESIGN.md / Tandemloop thesis covered earlier this week: strict design-system discipline is not optional even with AI tooling, and 'AI-generated system + AI-generated UI' compounds inconsistency rather than enforcing consistency. The specific failure modes (chart colors used as status colors, hardcoded values, undefined spacing values, missing token categories for text colors, borders, and interactive states) are auditable, which makes CI-style automated token-compliance scanners like the CLI validation hooks in Google's DESIGN.md spec mandatory infrastructure rather than nice-to-have. The Figma MCP server's Code Connect (covered in March) was designed to prevent exactly this design-system drift, but the Figma Make failure shows the problem persists even within a single tool's session. Practical takeaway: build the lint pass before you build the system, regardless of whether you're using Figma Make, Claude Design, or v0.

Verified across 1 source: dev.to
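To make the "lint pass" idea from the Figma Make story concrete, here is a minimal sketch of a token-compliance check in Python. It is not the DESIGN.md validator or any Figma tooling: the token values, the `lint_css` name, and the line-by-line parsing are all assumptions made for illustration, and a real scanner would use a proper CSS parser and load tokens from the system's own token file.

```python
import re

# Hypothetical token set (illustrative values only).
ALLOWED_COLORS = {"#1a73e8", "#d93025", "#188038"}
ALLOWED_SPACING_PX = {4, 8, 16, 24, 32}
SPACING_PROPS = ("margin", "padding", "gap")

HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")
PX_VALUE = re.compile(r"\b(\d+)px\b")
VAR_REF = re.compile(r"var\([^)]*\)")


def lint_css(css: str) -> list[str]:
    """Return one message per hardcoded value that is not in the token set.

    Simplification: treats each line as at most one `property: value` pair.
    """
    problems = []
    for lineno, line in enumerate(css.splitlines(), start=1):
        if ":" not in line:
            continue
        prop, _, value = line.strip().partition(":")
        prop = prop.strip().lower()
        # References such as var(--space-2) already go through tokens.
        value = VAR_REF.sub("", value)
        for color in HEX_COLOR.findall(value):
            if color.lower() not in ALLOWED_COLORS:
                problems.append(f"line {lineno}: off-token color {color}")
        if prop.startswith(SPACING_PROPS):
            for px in PX_VALUE.findall(value):
                if int(px) not in ALLOWED_SPACING_PX:
                    problems.append(
                        f"line {lineno}: off-token spacing {px}px in {prop}"
                    )
    return problems
```

Run against generated CSS in CI, a non-empty result would fail the build. This sketch only catches hardcoded colors and spacing; semantic misuse such as chart colors standing in for status colors would need token-role metadata that it does not model.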

AI Developments

CAISI Adds Google, Microsoft, xAI to Pre-Deployment Evaluation Roster: Pre-Release Vetting Becomes the Default

CAISI signed pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI on May 5, expanding prior arrangements with OpenAI and Anthropic; all five major frontier labs are now under voluntary pre-release evaluation. Microsoft separately announced parallel partnerships with the UK AI Security Institute, including co-developed adversarial testing methodologies and contributions to MLCommons AILuminate. This lands the same week the Trump White House EO for pre-release federal vetting of frontier models leaked, an order triggered by measurable cybersecurity exploit capability in Anthropic's Mythos and GPT-5.5.

Voluntary pre-release evaluation has now reached all five frontier labs simultaneously, converging into a de facto licensing regime regardless of whether the EO formalizes it. The Pentagon's simultaneous onboarding of eight AI firms for classified deployments (OpenAI, Google, Microsoft, Amazon, Oracle, Nvidia, SpaceX, Reflection, with Anthropic notably excluded) gives the approved-model registry real procurement teeth. For builders: expect lengthening timelines between training completion and API availability, plus evaluation artifacts as standard enterprise procurement requirements. The measurable exploit-capability trigger (not abstract safety framing) is what makes this durable across administrations.

Verified across 3 sources: CNBC · BBC News · Microsoft

GPT-5.5 Instant Becomes ChatGPT Default: 52.5% Hallucination Reduction, Memory Sources Surfaced

OpenAI promoted GPT-5.5 Instant to the ChatGPT default on May 5, claiming 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance), 37.3% fewer inaccurate claims on challenging conversations, 81.6% on CharXiv (up from 75.0% with GPT-5.3 Instant), and reduced 'moralizing.' It also introduced a 'memory sources' panel showing which user context shaped each response: partial observability, not full auditability.

Memory sources is the more interesting feature: it's the first user-facing acknowledgment that personalized context is now a meaningful production variable that needs to be debuggable. For enterprises, the partial observability is itself a problem: orchestration platforms with their own audit logs will now have to reconcile against ChatGPT's separate memory surface. The hallucination numbers are a real benchmark improvement but consistent with the trajectory; the governance surface is the real news.

Verified across 3 sources: The Verge · VentureBeat · The New Stack

AI Supply Chain & Logistics

Amazon Supply Chain Services Officially Launches: Network as a Productized AWS-for-Logistics

FreightWaves confirms the formal consolidation of Amazon's freight, fulfillment, and parcel services into Amazon Supply Chain Services (ASCS), the AWS-for-logistics play first covered here May 4. Amazon now offers all businesses access to its 80,000+ trailers, 24,000 intermodal containers, 100 freighter aircraft, and the same AI demand-forecasting models that drive its internal operations. P&G, 3M, Lands' End, and American Eagle are early customers.

This is the productization of Amazon's most defensible internal capability: inventory placement and demand positioning across a continental network. The competitive geometry shifts: traditional 3PLs and freight brokers compete on rates and routes, but Amazon now competes on network design and forecast accuracy as a service. For asset-light freight tech (Loop, FourKites, Project44), this is either a major distribution channel via integration or an existential threat depending on whether ASCS exposes the forecasting models as embeddable APIs.

Verified across 1 source: FreightWaves

FourKites Inventory Twin and NVIDIA cuOpt: Two Production Patterns for Closing the Plan-Execute Gap

Two notable production deployments landed May 5. FourKites enhanced its Inventory Twin with AI that detects stock-out risk 14 days ahead, quantifies financial exposure by SKU and facility, ranks mitigation options, and executes approved transfers via integrated carrier booking, closing the S&OP-to-execution gap. Separately, NVIDIA published the architecture of its in-house cuOpt-powered multi-agent supply chain system: planning cycle compressed from 5 days to under 1 day, 83% fewer delayed orders, 6x more planning capacity. Open-sourced on GitHub via Brev Launchable. The hybrid pattern (LLM agents formulate optimization problems, GPU-accelerated solvers execute) solves the constraint-hallucination failure mode of pure LLM agents.

Both deployments point at the same architectural insight: the value isn't in any single model, it's in coupling forecast/decision agents with deterministic optimizers and live execution surfaces. NVIDIA's open-sourcing of the pattern lowers the barrier substantially: a logistics or manufacturing team can now stand up the reference implementation rather than build from scratch. For the broader agentic supply chain wave covered through the week (Loop, 4flow, Locus, Sparrow XPL), this is the architectural reference point.

Verified across 2 sources: Business Wire · Medium / James Fahey
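The hybrid pattern described in this story (an agent formulates the problem, a deterministic solver executes it) can be sketched in a few lines of Python. This is not NVIDIA's reference implementation and uses no cuOpt API: the facility names, the JSON shape, and the `validate` and `solve` helpers are hypothetical, and a plain greedy allocator stands in for a real optimizer. The point is the separation of duties: the agent's proposal is checked against a system of record before any solver runs, so a hallucinated constraint is rejected rather than optimized.

```python
import json

# Hypothetical system-of-record data: surplus units by facility.
KNOWN_SURPLUS = {"DC-East": 120, "DC-West": 80}


def validate(spec: dict) -> dict:
    """Reject proposals that cite unknown facilities or impossible quantities."""
    if spec["demand"] <= 0:
        raise ValueError("demand must be positive")
    for src in spec["sources"]:
        name, units = src["facility"], src["units"]
        if name not in KNOWN_SURPLUS:
            raise ValueError(f"unknown facility: {name}")
        if not 0 <= units <= KNOWN_SURPLUS[name]:
            raise ValueError(f"surplus out of range for {name}: {units}")
        if src["cost_per_unit"] < 0:
            raise ValueError(f"negative cost for {name}")
    return spec


def solve(spec: dict) -> dict[str, int]:
    """Fill demand from the cheapest sources first.

    For one destination with linear per-unit costs, this greedy order
    is cost-minimal; a real deployment would call an optimizer here.
    """
    remaining = spec["demand"]
    plan: dict[str, int] = {}
    for src in sorted(spec["sources"], key=lambda s: s["cost_per_unit"]):
        take = min(remaining, src["units"])
        if take:
            plan[src["facility"]] = take
            remaining -= take
    if remaining:
        raise ValueError(f"infeasible: {remaining} units unmet")
    return plan


# Stand-in for the structured output of an LLM planning agent.
proposed = json.loads(
    '{"demand": 150, "sources": ['
    '{"facility": "DC-East", "units": 120, "cost_per_unit": 2.0},'
    '{"facility": "DC-West", "units": 80, "cost_per_unit": 1.5}]}'
)
plan = solve(validate(proposed))
print(plan)
```

With the sample proposal, the cheaper DC-West surplus is used in full and DC-East covers the remainder. A proposal naming a facility that is not in `KNOWN_SURPLUS` raises `ValueError` before the solver is called.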

Spokane & North Idaho

Two Spokane Surveys: Two-Thirds of Residents and Businesses Considering Leaving Washington

Separate surveys by the Association of Washington Business and Greater Spokane Inc. both found ~two-thirds of Spokane County residents and business owners considering relocating out of state, citing the estate tax hike, capital gains tax, and millionaires tax. GSI's Quality of Life Survey of 600 registered voters shows declining concern about homelessness and crime but rising concern about taxes and cost of living; pessimism about the region's direction sits at 59%. Janicki Industries, a Sedro-Woolley aerospace and defense manufacturer, separately confirmed it will expand into Utah, Idaho, and Montana rather than Washington, citing regulatory burden, while reducing its Seattle footprint. Idaho's population grew 10% since 2020 (second-fastest in the US); Washington and Oregon both grew under 1%.

Four independent data points (two surveys, one explicit corporate decision, one Census trend) all point along the same vector: capital and labor moving from Washington into Idaho and Montana. For Spokane specifically, this is the structural backdrop behind the gas-price record and the food-truck enforcement story. For Coeur d'Alene and Kootenai County, it's the demand signal driving the resort expansion, the Sherman Tower buildout, and Redman's $250K PAC bet on consolidating legislative control. The Inland Northwest economic story for the next 24 months is being shaped right now by which side of the state line a household sits on.

Verified across 4 sources: Spokesman-Review · KHQ · KVI · KHQ (Idaho population)

Rathdrum Mayor Hill Resigns Amid Domestic Battery Investigation; Council Picks Replacement by May 13

Rathdrum Mayor Mike Hill resigned May 4 after Rathdrum police requested a domestic battery investigation by the Kootenai County Sheriff's Office. Hill has not been arrested; the investigation is ongoing. Council President John Hodgkins assumes mayoral duties until council selects a permanent replacement by May 13. The transition lands during active budget and code discussions.

This is a mid-cycle leadership succession in a Kootenai County municipality, in the same week Rep. Redman drops $250K to consolidate legislative control through the May 19 Idaho primary. North Idaho's local political infrastructure is undergoing structural turnover concurrent with significant infrastructure spend (I-90 widening, Sherman Tower, Rocky Point Wildlife Crossing); watch council appointment dynamics for an early read on the post-Hill governance posture.

Verified across 1 source: KXLY

Newport Beach

Coastal OC Housing Holds Tight: Pending Sales +32% in 17 Days Despite National Cooling

Coastal Orange County housing (Newport Beach, Corona del Mar, Laguna Beach, Laguna Niguel, Dana Point) remains highly competitive in spring 2026. Pending sales in fast-moving segments jumped 32% in 17 days. Structural drivers: limited buildable land, restrictive zoning, large cash-buyer population. Only 18% of OC households can afford the median-priced home. Adjacent realtor.com data shows California ADU/granny-flat listings concentrated in coastal high-cost metros, with multigenerational properties commanding a 65% national price premium.

The Lido Isle $10M lot trade and Bahnsen RIA acquisition (covered yesterday) were not isolated; they're consistent with a coastal OC market where supply scarcity decouples from national rate-driven cooling. The ADU premium is the more interesting design-engineering signal: zoning-driven demand for flexible multi-unit residential is a real product opportunity for builders working at the physical-digital seam (smart-home suites, NextGen revenue-generating units, modular ADU systems).

Verified across 2 sources: Missy Sells OC · Realtor.com

OSINT & Intelligence

Bishop Fox AIMap and Intruder Million-Host Scan: 91% of Exposed MCP Servers and 31% of Ollama Have No Auth

Bishop Fox released AIMap, an open-source platform that discovers exposed AI infrastructure (Ollama, MCP endpoints, inference proxies) at internet scale, fingerprints them, scores vulnerability, and runs protocol-specific attack tests. Initial scan: 175,000+ exposed Ollama instances, 8,000+ open MCP servers, 91% lacking authentication. Independent confirmation comes from Intruder, which scanned 2M hosts (1M+ exposed AI services) and found 31% of 5,200+ Ollama servers responding to unauthenticated API queries, plus widespread exposed API keys and misconfigured agent platforms enabling code execution.

Self-hosting LLMs to keep data in-house is producing a larger exposure surface than the API dependency it's meant to replace. The MCP exposure number is particularly alarming because MCP is the connective tissue that lets agents reach tools and data: an unauthenticated MCP endpoint is effectively a remote code execution primitive on whatever the agent has access to. Pairs with this week's CLI-Anything / OpenClaw skill-poisoning research as the second supply-chain layer current scanners don't cover. Treat AIMap as the nmap-equivalent for the AI deployment era.

Verified across 2 sources: Help Net Security · The Hacker News
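For teams that self-host, the "responds to unauthenticated API queries" finding is easy to check against your own deployment. The sketch below is not AIMap and does not reproduce Bishop Fox's or Intruder's methodology; it assumes only that Ollama's model-listing endpoint, `GET /api/tags`, returns a JSON object with a `models` list. The `classify` and `probe` names and the result labels are illustrative, and the probe should only be pointed at hosts you own or are authorized to test.

```python
import json
import urllib.error
import urllib.request


def classify(status: int, body: str) -> str:
    """Label a response to GET /api/tags.

    "open"          -> 200 with an Ollama-style {"models": [...]} payload
    "auth-required" -> 401 or 403, e.g. from a reverse proxy in front
    "unknown"       -> anything else
    """
    if status in (401, 403):
        return "auth-required"
    if status == 200:
        try:
            data = json.loads(body)
        except json.JSONDecodeError:
            return "unknown"
        if isinstance(data, dict) and isinstance(data.get("models"), list):
            return "open"
    return "unknown"


def probe(base_url: str, timeout: float = 3.0) -> str:
    """Send one unauthenticated request to a host you are allowed to test."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return classify(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as exceptions; classify by status code.
        return classify(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

For example, `probe("http://localhost:11434")` (11434 being Ollama's default port) returning "open" from a machine outside your trust boundary would mean the instance is in the exposed category the scans describe.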


The Big Picture

Agentic coding moves from IDE feature to team infrastructure: Blitzy's $1.4B raise, JetBrains Air's parallel-agent IDE, Coder Agents' self-hosted platform, Claude Code's .claude/commands standardization, and Opsera+Cursor's embedded guardrails all ship the same week. The pattern: agents are no longer 'AI in your editor'; they're orchestrated workers with sandboxes, permissions, and team-level workflow definitions.

Pre-deployment AI evaluation becomes a federal default: CAISI added Google, Microsoft, and xAI to its pre-release evaluation roster (joining OpenAI and Anthropic) the same week the Trump WH executive order on frontier model pre-vetting leaked. Pre-release safety testing is consolidating into a de facto licensing regime regardless of administration rhetoric.

Hormuz is now a managed crisis, not a crisis being resolved: Project Freedom launched, Iran hit the UAE within 48 hours, and Trump paused the operation citing 'great progress' while keeping the Iranian-port blockade in place. Meanwhile US intel reaffirms Iran's nuclear timeline is unchanged. The pattern: kinetic operations and diplomacy now run in parallel as continuous-pressure tools, not sequential phases.

AI deployment is outrunning AI security: Bishop Fox's AIMap and Intruder's million-host scan both surface the same finding within 24 hours: 91% of exposed MCP servers and 31% of Ollama instances have no auth. Self-hosting AI to protect data is creating a larger exposure surface than the API dependency it's meant to replace.

Inland Northwest economic anxiety is now measurable: Two separate surveys (AWB, Greater Spokane Inc.) converge on two-thirds of Spokane County residents and businesses considering leaving Washington; Janicki Industries is explicitly choosing Idaho, Montana, and Utah for expansion; and gas is at record highs. Idaho's 10% population growth since 2020 is the receiving end of the same vector.

What to Expect

2026-05-07 OC Business Expo at Renaissance Newport Beach Hotel: 1,500+ entrepreneurs, 100+ exhibitors
2026-05-13 Rathdrum City Council selects permanent mayor replacement following Mike Hill resignation
2026-05-19 Idaho legislative primary: Redman PAC has spent $199K supporting 20 GOP candidates
2026-05-30 Iran's 30-day proposal window closes: next negotiating inflection point
2026-06-01 GitHub Copilot AI Credits cutover: usage-based pricing replaces flat tiers

Every story, researched.

Every story verified across multiple sources before publication.

🔍 Scanned: 870 (across multiple search engines and news databases)
📖 Read in full: 169 (every article opened, read, and evaluated)
Published today: 14 (ranked by importance and verified across sources)

- The Anvil

🎙 Listen as a podcast

Subscribe in your favorite podcast app to get each new briefing delivered automatically as audio.

Apple Podcasts
Library tab → ••• menu → Follow a Show by URL → paste
Overcast
+ button → Add URL → paste
Pocket Casts
Search bar → paste URL
Castro, AntennaPod, Podcast Addict, Castbox, Podverse, Fountain
Look for Add by URL or paste into search

Spotify isn't supported yet; it only lists shows from its own directory. Let us know if you need it there.